(Hi! Here's a document I've been poking at, on and off, for a while. There's a lot more to say here, but I think it's ready to share with tor-dev for comment. Thanks!)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ 0. PRELIMINARIES.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
============== WHY?
If you've ever looked at even slightly serious software engineering methodology, you've seen some discussion of modularity: the idea that programs should be divided into a number of smaller modules; that complexity should be hidden within modules and not in the interfaces between them; that which modules can call which other modules should be limited; and other little rules like that.
We'd like modules isolated from one another for these reasons:
* Make them simpler to test
* Minimize what is allowed to touch what
* Make tor easier to understand
* Make tor easier to hack on
* Let us write parts of tor in other languages
* Make sandboxing more powerful
This is actually several problems all in one!
- OO/Isolation/Modularity design that doesn't actually split into multiple processes, but makes the callgraph much much simpler.
- Multiprocess sandbox framework with RPC implementing the above abstraction
- Finding the right stuff to carve out into separate sandboxes
- Finding the right stuff to carve into separate modules.
============ Steps/phases:
Over the past 6-8 months we've done a lot of work to simplify Tor's callgraph in preparation for this work; it is now simplified to a pretty extreme degree. Here are some of our next steps, some of which can be done in parallel.
1. Figure out which modules should be able to call each other. Isolate them based on their ability to do so. Refactor until they actually work that way.
   - In some cases we will need to split modules.
   - Hide big piles of modules behind single entry points.
   - Separate src/common and src/or into separate parts.
   - Add tools to enforce this separation.
   - Add tools to make moving Tor functions around safe and not too error-prone.
(See sections 1 and 3 below.)
2. Add backend abstractions as needed to minimize module coupling. These should be abstractions that are friendly to in- and multi-process implementations. We will need at least:
- publish/subscribe{,/acknowledge}.
(See https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern. The 'acknowledge' abstraction allows the publisher to wait for every subscriber to acknowledge receipt. More on this in section 4 below.)
- handles
(We frequently need handles to objects, where the handles need to be persistent while the objects themselves can go away. We use several patterns for this now; we should just have one handle pattern instead. It would need separate implementations for in-process and cross-process access. See section 2 below for more information.)
- Need a way to mark functions as "internal to module".
We're going to have a larger number of multi-file modules. Right now, our only restrictions for making functions less visible are "static" and the fact that we don't allow src/common to call src/or. Both of these should be cleaned up. (One possible C convention for marking module-internal functions is sketched just after this list.)
3. Write abstractions for splitting code into multiple processes that talk over IPC.
(See section 5 below)
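As an illustration of the "internal to module" marking in step 2, here is one possible C convention. The header name and macro names are hypothetical (the dirvote module is just used as an example), not an existing Tor interface; a build-time check such as the nm-based tool in section 3 would enforce it.

  /* dirvote_internal.h -- hypothetical "internal" header for a multi-file
   * "dirvote" module.  Only files inside the module define
   * DIRVOTE_MODULE_INTERNAL before including this header, so any outside
   * caller fails to compile (and shows up in the callgraph checks). */
  #ifndef DIRVOTE_MODULE_INTERNAL
  #error "This header is internal to the dirvote module."
  #endif

  /* With GCC/clang, also hide the symbols from other compilation units
   * once the module is built as its own library. */
  #if defined(__GNUC__) || defined(__clang__)
  #define DIRVOTE_INTERNAL __attribute__((visibility("hidden")))
  #else
  #define DIRVOTE_INTERNAL
  #endif

  DIRVOTE_INTERNAL int dirvote_count_votes_impl(void);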
========================= Metrics
We can measure our success in a number of ways:
* How large are the strongly connected components among modules calling one another? Right now the largest SCC is 52 C files.
* How many layer/module violations can we find? This needs to be an automated process.
* How many exported functions from each module can we remove?
* How much can we remove from the main linux sandbox code? How many functions can we isolate to purpose-built modules?
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ 1. THE MODULES
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
========================= Modules in src/or
LOWEST LEVEL
These should only get called from higher-level components. They shouldn't call anything of a higher level.
Types and data structures:
   - part of configuration
   - parts of statefile
   - routerset
   - policy (some)
   - torcert

Various backends and utilities:
   - confparse
   - geoip
   - fp_pair?
   - reasons
   - replaycache
   - buffers

Nodelist data structures, excluding routerinfo:
   - nodelist
   - microdesc
   - networkstatus
   - routerparse.c (parts)
SLIGHTLY HIGHER
These should only get called from higher-level components, and only call lower-level components. When they currently call a higher-level piece, they should use a callback or a publish/subscribe API or something.
Nothing in here should really be tor-specific.
mainloop:
   - connections (common, generic parts)
   - main (nonspecific parts)
   - config (maybe)
   - workqueue
   - scheduler
   - periodic
   - cpuworker (parts)
   - hibernate? (parts?)
HIGHER THAN THAT
These modules are where we start making Tor do Tor. They should only call lower-level modules.

onionrouting:
   - channels
   - onion*
   - cpuworker (parts)
   - connection_or
   - connection_edge
   - circuitlist
   - command
   - relay
   - parts of router.c
   - transports
HIGHER STILL
These modules can call downwards, though they should really be made as independent of one another as possible.
We should enumerate which of these can call which others; mostly, they should leave one another alone.
hidden services:
   - rend*

controller:
   - control.c

auxiliary:
   - rephist

authority:
   - dirvote
   - dirserver (parts)
   - dirauth
   - dircollate
   - keypin

relay:
   - ext_orport
   - parts of router.c
   - routerkeys.c

cache:
   - parts of directory
   - dirserver (parts)
   - routerlist (parts)
   - routerparse.c (parts)

exit:
   - dns

client:
   - dnsserv
   - most circuitbuild, circuituse stuff
   - circpathbias
   - circuitstats

directory:
   - directory
   - dirserver
HIGHEST LEVEL
Here are the parts of the application that know all the other parts, and call down. Nothing else calls up to these modules.
application (highest level):
   - main (parts)
   - parts of config
   - parts of statefile
   - ntmain
   - status
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ 2. INFRASTRUCTURE: HANDLES.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
======================== Handles in C
* We need a light-weight handle mechanism. (A mechanism where we can have a reference to X that can still live even when X dies, where lookups of X through the reference are safe even when X is freed, and where lookups of X through the reference are FAST.)
* It's okay if the things being handled need to be marked as 'handleable'.
* When the handle is in a separate process, we probably need a table-based implementation:
   * a unique handle (64/96-bit int) for each thingie, plus a hashtable to look objects up by handle.
* When the reference is local, lookup can be as fast as a pointer dereference.
* Thread safety can be important but is not always required.
For an example implementation, see #18362.
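To make this concrete, here is a minimal in-process sketch of what such a handle API could look like in plain C. The names are illustrative only (this is not the #18362 interface), error handling is omitted, and it is not thread-safe; a cross-process version would replace the pointer with a 64/96-bit key into a hashtable.

  #include <stdlib.h>

  /* Each "handleable" object owns one status record that can outlive it. */
  typedef struct handle_status_t {
    void *object;      /* Set to NULL when the object is freed. */
    unsigned refcount; /* Live handles, plus 1 while the object exists. */
  } handle_status_t;

  typedef struct handle_t {
    handle_status_t *status;
  } handle_t;

  /* Called from the object's constructor. */
  static handle_status_t *
  handle_status_new(void *object)
  {
    handle_status_t *st = calloc(1, sizeof(*st));
    st->object = object;
    st->refcount = 1;
    return st;
  }

  /* Make a new handle for the object behind 'st'. */
  static handle_t *
  handle_new(handle_status_t *st)
  {
    handle_t *h = calloc(1, sizeof(*h));
    h->status = st;
    ++st->refcount;
    return h;
  }

  /* FAST lookup: one pointer dereference; NULL if the object is gone. */
  static void *
  handle_get(const handle_t *h)
  {
    return h->status->object;
  }

  static void
  handle_status_decref(handle_status_t *st)
  {
    if (--st->refcount == 0)
      free(st);
  }

  /* Called from the object's destructor. */
  static void
  handle_status_object_freed(handle_status_t *st)
  {
    st->object = NULL;
    handle_status_decref(st);
  }

  /* Release one handle. */
  static void
  handle_free(handle_t *h)
  {
    handle_status_decref(h->status);
    free(h);
  }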
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ 3. INFRASTRUCTURE: ENFORCING ISOLATION.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
=== Making isolation happen.
It'll do us no good to declare that modules are isolated if they actually aren't. Once we have defined the allowed call patterns, we need to make sure that calls that fall outside of those patterns aren't permitted.
We can build something pretty simple out of 'nm', I believe.
See ticket #18617. This is a work in progress.
=== Moving functions around.
I'm working on a tool to move functions from one file to another in a deterministic way based on annotations in the original file. This lets us version our movement, and have the movement happen deterministically so that the patches are easier to verify. If we integrate it with our callgraph-examination tools above, we can make sure that our movement plans actually make the code more conformant with our modularity plans.
See ticket #18618. This is a work in progress.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ 4. INFRASTRUCTURE: PUBLISH/SUBSCRIBE
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Many of our modularity violations happen because when some property changes, or some event occurs, we need to inform a wide variety of other modules. The publish/subscribe{,/acknowledge} pattern, and the similar observer pattern, were made for this.
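As a sketch of the in-process half of this abstraction, here is what a minimal publish/subscribe-with-acknowledge API could look like in C. All names and signatures are illustrative assumptions, not an existing Tor API; a multi-process version would marshal the event over IPC instead of walking a local list.

  #include <stdlib.h>

  /* A published event: a type identifier plus an opaque payload. */
  typedef struct pubsub_event_t {
    int type;
    void *data;
  } pubsub_event_t;

  /* Subscriber callback; returns 0 to acknowledge receipt. */
  typedef int (*pubsub_subscriber_fn)(const pubsub_event_t *ev, void *arg);

  typedef struct pubsub_subscription_t {
    pubsub_subscriber_fn fn;
    void *arg;
    struct pubsub_subscription_t *next;
  } pubsub_subscription_t;

  /* One channel per event type.  Modules subscribe without ever naming the
   * publisher, which is what breaks the module-to-module call edges. */
  typedef struct pubsub_channel_t {
    pubsub_subscription_t *subscribers;
  } pubsub_channel_t;

  static void
  pubsub_subscribe(pubsub_channel_t *ch, pubsub_subscriber_fn fn, void *arg)
  {
    pubsub_subscription_t *s = calloc(1, sizeof(*s));
    s->fn = fn;
    s->arg = arg;
    s->next = ch->subscribers;
    ch->subscribers = s;
  }

  /* Publish-and-acknowledge: returns 0 only if every subscriber acknowledged. */
  static int
  pubsub_publish(pubsub_channel_t *ch, const pubsub_event_t *ev)
  {
    int result = 0;
    pubsub_subscription_t *s;
    for (s = ch->subscribers; s; s = s->next) {
      if (s->fn(ev, s->arg) != 0)
        result = -1;
    }
    return result;
  }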
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ 5. ADVANCED INFRASTRUCTURE: MULTIPROCESS SANDBOXING
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
======================== Multiprocess abstraction / design
We'll consider multiprocess for these cases:
* Easier/stronger/more-portable sandboxing.
* Writing subcomponents in languages whose runtime/link rules don't play so nicely with C.
* Enforcing modularity when modularity delivers security.
For sandboxing, here are some high-risk privileges we should make sure that other code doesn't have.
* Filesystem calls -- anything that can open a file.
* Exec -- anything that can launch a process.
* Access to identity or signing keys.
* Invoking getaddrinfo() [because it uses a fairly huge segment of the underlying operating system].
And here is some higher-risk code that we could most safely isolate from the main modules.
* Anything that parses an object.
* Consensus diff processing.
Our basic abstractions are those described above (pub/sub, handles, hooks), with the addition of:
* Blocking RPC
* Capabilities????
* Message queues.
Controversial:

* We should NOT require that most of this be blindingly fast. Correct is perfectly adequate. Fast is only needed in the critical path.
Here's the general architecture I have in mind:

* At the highest level, have a launcher/supervisor process whose only job is to start the other processes, monitor them, and notice if they go down.
* Additionally, have a communications-hub process whose role is to pass messages between other processes, performing access control on the messages to make sure that each message is only sent by a process that is allowed to send it, to a process that it is allowed to send to. Processes may only communicate with the hub.
* Messages should use a simple RPC format. I vote for protobufs or capnproto. They sure have a speed/simplicity tradeoff though.
* Pipes or socketpairs should get used on Unix. Windows can use Windows named pipes, which aren't quite fds. Windows will need to have its pipes in a different thread from the socket-based libevent loop (if there is one).
* We need a way to pass fds or handles back and forth. Both Windows and Unix can do this (via DuplicateHandle() on Windows and sendmsg() on PF_UNIX sockets on Unix). The protocol needs to handle this explicitly.
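As a sketch of the Unix side of that last point, here is roughly what passing an fd over an AF_UNIX socketpair looks like (standard SCM_RIGHTS usage; the Windows DuplicateHandle() path would look quite different, and the framing around it is up to the RPC protocol):

  #include <string.h>
  #include <sys/socket.h>
  #include <sys/uio.h>

  /* Send one file descriptor over an AF_UNIX socket, alongside a single
   * byte of ordinary payload so the message is never empty.
   * Returns 0 on success, -1 on failure. */
  static int
  send_fd(int sock, int fd_to_send)
  {
    char payload = 'F';
    struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    union {
      char buf[CMSG_SPACE(sizeof(int))];
      struct cmsghdr align;   /* Forces correct alignment of buf. */
    } u;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;          /* "This message carries fds." */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
  }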
Here's the development plan I would suggest:

* In parallel, work on improving Tor's modularity as the earlier sections of this document suggest. Unless we get more modular, we won't be able to make anything actually separate besides the privilege-restriction sandboxing parts above (like isolating keys and FS access).
* Research what exactly Windows and OS X allow.
* Write the top-level supervisor process and hub process. (A bare-bones supervisor loop is sketched after this list.)
* Ensure that the APIs we expose can be written in C in a fairly mechanism-agnostic way, so that we can migrate to this architecture with minimal additional effort.
* Move privilege-restriction sandboxing parts into subprocesses.
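And as a bare-bones illustration of the supervisor idea on Unix, here is roughly what its main loop could look like. The worker binary names are made up, and a real version would add rate-limited restarts, logging, privilege dropping, and a Windows equivalent.

  #include <stdio.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  #define N_WORKERS 2

  /* Start one worker process; returns its pid, or -1 on failure. */
  static pid_t
  launch_worker(const char *path)
  {
    pid_t pid = fork();
    if (pid == 0) {
      execl(path, path, (char *)NULL);
      _exit(127); /* exec failed */
    }
    return pid;
  }

  int
  main(void)
  {
    /* Hypothetical worker binaries: the message hub and one sandboxed module. */
    const char *workers[N_WORKERS] = { "./tor-hub", "./tor-dircache" };
    pid_t pids[N_WORKERS];
    int i;

    for (i = 0; i < N_WORKERS; ++i)
      pids[i] = launch_worker(workers[i]);

    /* Supervise: whenever a worker exits, restart it. */
    for (;;) {
      int status;
      pid_t dead = waitpid(-1, &status, 0);
      if (dead < 0)
        break; /* No children left, or a signal we don't handle here. */
      for (i = 0; i < N_WORKERS; ++i) {
        if (pids[i] == dead) {
          fprintf(stderr, "worker %s exited; restarting\n", workers[i]);
          pids[i] = launch_worker(workers[i]);
        }
      }
    }
    return 0;
  }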
And here is an alternative plan:

* Investigate the chromium sandbox code; see if it can be extracted, or how much of it can be extracted.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ A. APPENDIX.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
=== progress so far
I've gotten these done so far:
* The beginnings of a build-time callgraph enforcement mechanism using nm and readelf.
* A handle-based-reference module. (See ticket #18362, but cpunks doesn't like it.)
* A draft refactoring tool that lets you annotate code with what should move where, so that you can deterministically try the code movement and see what happens, and so you can see what effects it will have on the module-level callgraph.
On Mon, Mar 28, 2016 at 6:49 AM, Rob van der Hoeven robvanderhoeven@ziggo.nl wrote:
>> Add backend abstractions as needed to minimize module coupling. These should be abstractions that are friendly to in- and multi-process implementations. We will need at least:
>>
>> publish/subscribe{,/acknowledge}.
>>
>> (See https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern. The 'acknowledge' abstraction allows the publisher to wait for every subscriber to acknowledge receipt. More on this in section 4 below.)

> Maybe ZeroMQ can do this. See:
>
> https://en.wikipedia.org/wiki/ZeroMQ
>
> and:
ZeroMQ and its competitors are pretty good, but overkill. They're designed to work in a distributed environment with unreliable network connections, whereas for this application I'm only thinking about splitting a single Tor instance across multiple processes on the same host.
> Question: how are these modules you write about implemented? Do you plan to make each module a Dll? Will it be possible to only load a Dll if its functions are needed? I ask this because I currently have Tor running on my router and many of its functions (hidden services, node, etc.) are not needed.
I haven't been working on the problem from that angle, but I would expect that making the code more modular will make it easier to compile only the modules that are required.
best wishes,
Nick Mathewson nickm@alum.mit.edu writes:
> ZeroMQ and its competitors are pretty good, but overkill. They're designed to work in a distributed environment with unreliable network connections, whereas for this application I'm only thinking about splitting a single Tor instance across multiple processes on the same host.
ZeroMQ has an "INPROC" transport that works for inter-thread communication (and it's way faster than the networked ones, even unix-sockets, at least a few years back when I benchmarked some things involving ZeroMQ in C++).
Recently someone leaked an enormous amount of docs (2.6 TiB) to journalists [1]. It's still hard to do such a thing even over the plain old Internet. It's highly possible that these docs were transferred on a physical hard drive, despite that being really *risky*.

Anyway, in the framework of anonymous whistleblowing (i.e. SecureDrop and Tor specifically), it seems to be an interesting case. I'm wondering about the following aspects:

o Even if we use exit mode/non-anonymous onions (RSOS), is such leaking reliable? The primary issue here is the time of transmission. It's much longer than any time period we have in Tor.

o What is going to happen to the connection after the HS republishes its descriptor? Long after? [This one is probably fine if we are not using IPs, but...]

o Most importantly, is transferring data on a >1 TiB scale (or just transferring data for days) safe at all? At the very least the source should not change their location/RP/circuits. Or they need to pack all this stuff into chunks and send them separately; it's not obvious how that can be done properly. So at what point should the source stop the transmission (size/time/etc.), change location or the guard, or pick a new RP?
--
[1] http://panamapapers.sueddeutsche.de/articles/56febff0a1bb8d3c3495adf4/
--
Happy hacking,
Ivan Markin
How do you transmit an elephant? One byte at a time...
But on a serious note, it's possible to transfer 2.6TB over Tor in small pieces (such as file by file or via torrent). Given the size, however, I'd suspect they mailed hard drives after establishing contact with journalists. Even on a fairly fast connection, 2.6TB would take quite a while...
~Griffin
On 4/04/2016 10:31 AM, Griffin Boyce wrote:
> How do you transmit an elephant? One byte at a time...

rsync is a beautiful thing. Have different clients / nodes accessing separate file paths. If the transfer drops out / is too slow, start up rsync again.
On 4/3/16, Griffin Boyce griffin@cryptolab.net wrote:
> How do you transmit an elephant? One byte at a time...
>
> But on a serious note, it's possible to transfer 2.6TB over Tor in small pieces (such as file by file or via torrent). Given the size, however, I'd suspect they mailed hard drives after establishing contact with journalists. Even on a fairly fast connection, 2.6TB would take quite a while...
That amount of data would take 27 days at 10Mbps. Few would be willing to sit supervising in a hotseat that long when they can physically mail 3TB for $100 and 8TB for $230. Though they might spend 3 days pushing 100Mbps via shells, etc. Overlay networks move data reasonably well, and reliability could be handled by chunking protocols. Available link speeds (and thus path speeds) are likely to be the limiting factor: 10Mbps limits you to 100GiB a day. Though at 1Mbps, DVD torrenting on, say, I2P seems to be a thing.
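(For reference, a back-of-the-envelope check of those figures, assuming 2.6 TiB of data over a sustained 10 Mbit/s path:)

  \frac{2.6 \times 2^{40}\ \text{bytes} \times 8\ \text{bits/byte}}{10^{7}\ \text{bits/s}}
    \approx 2.3 \times 10^{6}\ \text{s} \approx 27\ \text{days},
  \qquad
  10^{7}\ \text{bits/s} \times 86400\ \text{s/day}
    = 1.08 \times 10^{11}\ \text{bytes/day} \approx 100\ \text{GiB/day}.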
Hi.
My general feeling here is that it's more useful for me to tell you how I think people should share files than it would be for me to answer your questions; sorry, not sorry.
Alice and Bob can share lots of files, and they can do so with their Tor onion services. They should be able to exchange files without both of them having to be online at the same time. Are you sure you've chosen the right model for file sharing?
If you want reliability then you should avoid single points of failure such as a single Tor circuit or a single onion service; further, the high-availability property might be important for certain types of file-sharing situations.
If Alice and Bob share a confidential, authenticated communications channel then they can use that to exchange key material and secret connection information. That should be enough to bootstrap the exchange of large amounts of documents:
- Alice is clueful about distributed content-addressable ciphertext storage, so she decides to operate a Tahoe-LAFS storage grid over onion services.
- Alice uploads her ciphertext to the Tahoe grid.
- Alice sends Bob the secret grid connection information and the cryptographic capability to read her files.
In this situation Alice really doesn't care where her storage nodes are hosted, or whether the virtual server hosting provider can be depended on not to get hacked or receive a national security letter. Why does Alice give zero fucks? Ciphertext. "They" have her ciphertext and it's useless without a key compromise. Anyone who hacks the storage servers she is operating gets to see some interesting and useful metadata, such as the size of the files and what time they are read; that is not nearly as bad as a total loss of confidentiality.
https://gnunet.org/sites/default/files/lafs.pdf
However, what if Alice decides that Bob is a useless human being and she should instead publicize the documents herself? She writes her own badass adversary-resistant distributed ciphertext storage system and convinces several organizations worldwide to operate storage servers in various countries, and thus several legal jurisdictions.
She can now gleefully upload ciphertext via onion services to the storage servers and then simply publicize the key material for the specific files she wishes to share with the world or an individual. She can make this system censorship-resistant by using an erasure encoding for storing the ciphertext. For instance, Tahoe-LAFS uses Reed-Solomon encoding such that any K of N shares can be used to reconstruct the ciphertext of the file. In this case, if an adversary wanted to censor Alice's ciphertext publication they would have to DOS-attack N-K+1 servers (with Tahoe's default 3-of-10 encoding, that means taking out 8 of the 10 servers).
> Recently someone leaked an enormous amount of docs (2.6 TiB) to journalists [1]. It's still hard to do such a thing even over the plain old Internet. It's highly possible that these docs were transferred on a physical hard drive, despite that being really *risky*.

No, that's not necessarily correct; if the drives contain ciphertext and the key was not compromised then the situation would not be risky.