[Taking this discussion to tor-dev.]
On Sun, Sep 1, 2013 at 6:32 AM, George Kadianakis desnacked@riseup.net wrote:
Kevin P Dyer kpdyer@gmail.com writes:
Hi George/David,
Hi Kevin,
I spoke with Roger at USENIX. He said you're the pluggable transport (PT) gatekeepers. Please bear with me while I get up to speed.
My goals:
1. I want Format-Transforming Encryption (FTE) [4] to be a "deployed" PT in the PT TBB.
2. I want FTE to be integrated seamlessly with your existing deployment process.
My initial roadblocks:
=== Building/Testing Tor on Linux/OSX/Windows
I'm trying to understand exactly how the current build/release process works for tor. For the PT TBB there seem to be a few resources [1,2,3]. However, is there canonical documentation on how the release process works? I'm especially interested in what you guys are doing to produce builds on Windows. Are you using virtualization or do you have a few physical build machines?
Prior to doing anything with FTE, I'd love to be able to create my own build environment that produces the current obfs2+obfs3+flash_proxy bundles [6] across all 4 OS/architecture configurations.
Building PTTBBs is mainly done by David these days. He has documented his process here: https://gitweb.torproject.org/pluggable-transports/bundle.git (For example for Windows you would look here:
https://gitweb.torproject.org/pluggable-transports/bundle.git/blob/HEAD:/bun... )
The release process is not standardized. David is doing PTTBB releases in a best-effort manner.
Right. At first glance, it looks like this process will increase in complexity (and take more of David's time) as the number of PTs increases.
I need to better understand the build process, then.
=== Implementing Managed Mode in FTE
I've implemented preliminary functionality for "managed" mode in FTE. However, I think I'm confused about the role of managed mode.
Say I add "Bridge fte IP:port" to torrc. Is "IP:port" supposed to be a tor bridge, or a server-side PT service? If it's the former, then it seems that the PT is completely responsible for managing a list of its own PT servers. If it is the latter, I can't figure out how to capture, via the "managed" environment variables, the "IP:port" information that the user enters in Vidalia.
It is the latter. IP:port points to the server-side PT service. It does *not* point to the ORPort of the bridge (we don't care about it).
Concretely, if I have "Bridge fte IP1:port" and "Bridge fte IP2:port" in my torrc, how do "IP1:port" and "IP2:port" get propagated to my PT via the managed interface?
The IP:port is *not* passed to your transport using the managed mode. Instead, IP:port is passed to your transport using the SOCKS protocol. That is, when Tor wants to connect to the bridge, it does a SOCKS handshake to your transport, and asks your transport to connect to IP:port.
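The SOCKS step described above can be sketched roughly as follows. This is only an illustration, assuming the transport advertises a SOCKS4 listener and that Tor sends a plain SOCKS4 CONNECT request; the function name `parse_socks4_connect` and the example address 192.0.2.7:8080 are hypothetical and not taken from FTE or Tor.

```python
import socket
import struct


def parse_socks4_connect(request: bytes):
    """Return (ip, port) from a SOCKS4 CONNECT request, or raise ValueError.

    Layout: VN (1 byte, must be 4), CD (1 byte, 1 = CONNECT),
    DSTPORT (2 bytes, big-endian), DSTIP (4 bytes), USERID, NUL.
    """
    if len(request) < 9:
        raise ValueError("request too short")
    version, command, port = struct.unpack("!BBH", request[:4])
    if version != 4 or command != 1:
        raise ValueError("not a SOCKS4 CONNECT request")
    if b"\x00" not in request[8:]:
        raise ValueError("unterminated user ID field")
    ip = socket.inet_ntoa(request[4:8])
    return ip, port


# Hypothetical request, as might be sent for "Bridge fte 192.0.2.7:8080".
example = struct.pack("!BBH", 4, 1, 8080) + socket.inet_aton("192.0.2.7") + b"\x00"
print(parse_socks4_connect(example))  # ('192.0.2.7', 8080)
```

The transport would then open its own obfuscated connection to the returned address, so each "Bridge fte ..." line in torrc reaches the PT one SOCKS request at a time.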
Is this documented anywhere?
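For orientation, here is a minimal sketch of the managed-mode side, assuming the environment variable names and stdout lines of the pluggable transport proposal [5] (`TOR_PT_MANAGED_TRANSPORT_VER`, `TOR_PT_CLIENT_TRANSPORTS`, `VERSION`, `CMETHOD`, `CMETHODS DONE`). The helper `negotiate_client` and the listener address 127.0.0.1:5555 are hypothetical. Note that nothing in this exchange carries a bridge address.

```python
def negotiate_client(environ, supported, listen_addrs):
    """Return the stdout lines a managed client-side transport might emit.

    environ:      mapping such as os.environ
    supported:    set of transport names this program implements (hypothetical)
    listen_addrs: name -> "host:port" of the local SOCKS listener (hypothetical)
    """
    versions = environ.get("TOR_PT_MANAGED_TRANSPORT_VER", "").split(",")
    if "1" not in versions:
        return ["VERSION-ERROR no-version"]
    lines = ["VERSION 1"]

    requested = environ.get("TOR_PT_CLIENT_TRANSPORTS", "").split(",")
    if "*" in requested:
        # Assumption: "*" asks for every transport the program supports.
        requested = sorted(supported)
    for name in requested:
        if name in supported:
            # Tor learns only where the local SOCKS listener is.
            lines.append("CMETHOD %s socks4 %s" % (name, listen_addrs[name]))
    lines.append("CMETHODS DONE")
    return lines


env = {"TOR_PT_MANAGED_TRANSPORT_VER": "1", "TOR_PT_CLIENT_TRANSPORTS": "fte"}
for line in negotiate_client(env, {"fte"}, {"fte": "127.0.0.1:5555"}):
    print(line)
```

Under these assumptions the environment only tells the transport which methods to offer; the bridge addresses arrive later, over the SOCKS listener announced in the `CMETHOD` line.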
=== How do we invoke PTs?
I had this discussion with Roger, but I don't see any open tickets or clear discussion on this already. If we have N>1 PTs and at least one bridge per PT, how do we select which PT (and which bridge associated with that PT) to use? Determinism is bad because then only one PT is used. Booting up all PTs is bad, especially if (say) the PTs make network connections prior to any incoming SOCKS connections. Selecting a random PT is potentially bad, too, depending upon how hostile and persistent and stateful the adversary is.
That's an interesting question. I'm not sure if the process of Tor picking bridges is deterministic or not. I should test it out. David might know.
(A good scenario would be that Tor treats bridges like guards and selects some at random to build circuits.)
We should definitely try to flesh this out.
We should probably chat about this. (Maybe you already have and I'm out of the loop?) It is especially important as the number of PTs increases.
I'm happy to take this discussion (or a subset of it) public, if you think it'll help others. I just didn't want to spam tor-dev/tor-assistants with this initial email.
Yes, let's take it public. Feel free to CC tor-dev in your next reply.
Done.
BTW, on the topic of deploying your PT, have you seen: https://lists.torproject.org/pipermail/tor-dev/2013-August/005231.html ?
FTE seems to be missing out on the code quality front. The Python code is quite complex and undocumented. There is also some C++ code in the codebase that is also complex and undocumented. We have decided that we won't ship C/C++ code in PTs, except if it's dead easy to review or if it has gotten heavily scrutinized. Any chance that the C++ code could be written in a memory-safe language?
Roughly, FTE has an offline mode (building DFAs, needs to be done once) and an online mode (transporting data, deployed to everyone). In terms of C++ code, there are ~300 lines of C++ used in online mode that implement performance-critical algorithms. Implementing this code in Python would slow FTE down by at least an order of magnitude. I'll document why I made this decision.
Alternatively, any suggestions for a memory-safe language that allows hooks for Python and affords the same performance as C++?
In terms of reducing complexity of the Python code, and increasing code documentation, do you have concrete suggestions? It would be great if you could raise a few issues on FTE's github [7]. My time is limited and I would prefer to focus on the things you care about.
Thanks, Kevin
[1] https://gitweb.torproject.org/pluggable-transports/bundle.git
[2] https://lists.torproject.org/pipermail/tor-dev/2013-June/005056.html
[3] https://gitweb.torproject.org/builders/tor-browser-bundle.git
[4] https://kpdyer.com/fte/
[5] https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/180-pluggable...
[6] https://www.torproject.org/docs/pluggable-transports.html.en