Good day, tor-dev;
As some of you know, I've been working on a patchset for Tor that lets it participate in Pluggable Transports 2.0 configuration. The main piece is the new JSON Parameter Block SOCKS method (that's what I've been calling it in the absence of a more official name), implemented almost but not quite as described in section 3.3.4 of PT2 draft 2 [1]. In other words, this is basically #21816 [2].
[1] https://www.pluggabletransports.info/assets/PTSpecV2Draft2.pdf
[2] https://trac.torproject.org/projects/tor/ticket/21816
I'm attaching a draft patchset which adds this functionality, with the intent of getting feedback and making whatever cleanups or other modifications are still necessary to get it merged into Tor. With a patched Tor, I have successfully completed circuits through an obfs4 bridge using both obfs4proxy (PT1) and a version of shapeshifter-dispatcher (PT2). I've tried to follow the local style, but the preferred implementation strategies aren't always clear, and of course I'd appreciate any reports of other problems.
A forked Git repository is also available on Bitbucket [5][6], which will be updated as I make remaining changes.
[5] https://bitbucket.org/DasyatidPrime/tor-rtt2017-21816.git (Git)
[6] https://bitbucket.org/DasyatidPrime/tor-rtt2017-21816/src (Web)
More implementation details are below if you're interested in this; thanks for your attention. I'll try to be around on IRC more during the week, so feel free to ping me there as well.
-RTT
... Details:
Not visible above, related to the target functionality:
- I'm assuming the SOCKS method includes a response analogous in structure to the one in RFC 1929; PT2 draft 2 doesn't specify one. I've cleared this with blanu, and it's intended for the next draft of the PT2 specification.
- Similarly, the length prefix is in fact big-endian; the example in PT2 draft 2 is wrong, though the text is correct.
- We're looking into getting an IANA assignment for the method number. Technically this probably meets the requirements for the private use block, but I feel like interop might be easier later on, and it could simplify code paths in places to have a registered number (mainly if it's possible to decouple the method negotiation from the configuration-version plumbing). I believe this is still in limbo.
- For the PT2 side, I've been testing against my branch of shapeshifter-dispatcher [3] compiled with my branch of shapeshifter-ipc [4], since some breaking changes in upstream Shapeshifter were otherwise preventing it from interoperating with Tor. I'm planning to help merge fixes into Shapeshifter upstream as I can. My current understanding is that there aren't any other PT2 managed transport implementations to test against.
[3] https://github.com/OperatorFoundation/shapeshifter-dispatcher/tree/rtt2017
[4] https://github.com/OperatorFoundation/shapeshifter-ipc/tree/rtt2017
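To make the framing points above concrete, here is a minimal Python sketch. It is not taken from the patchset (which is C) and is only an illustration. The big-endian byte order and the RFC 1929-style reply come from the notes above; the 4-byte width of the length prefix, the reply version value, and all function and constant names are assumptions made for the example.

```python
import json
import struct

# Assumptions (not confirmed by this message or by PT2 draft 2):
# a 4-byte length prefix, and a two-byte reply shaped like RFC 1929's
# (version, status) pair.
LENGTH_PREFIX = struct.Struct(">I")  # ">" = big-endian, "I" = unsigned 32-bit
ASSUMED_REPLY_VERSION = 0x01         # hypothetical; RFC 1929 itself uses 0x01
STATUS_SUCCESS = 0x00                # RFC 1929: a zero status means success


def encode_parameter_block(params):
    """Serialize params as compact JSON and prepend a big-endian length."""
    body = json.dumps(params, separators=(",", ":"), sort_keys=True)
    raw = body.encode("ascii")  # json.dumps escapes non-ASCII by default
    return LENGTH_PREFIX.pack(len(raw)) + raw


def parse_auth_reply(reply):
    """Return True if a two-byte RFC 1929-style reply reports success."""
    if len(reply) != 2:
        raise ValueError("expected exactly two bytes")
    version, status = reply[0], reply[1]
    if version != ASSUMED_REPLY_VERSION:
        raise ValueError("unexpected reply version")
    return status == STATUS_SUCCESS
```

With these assumptions, a 14-byte JSON body is preceded by the bytes 00 00 00 0e; a little-endian reading of the same prefix would claim a body of over 200 MB, which is one quick way to spot a byte-order mismatch between implementations.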
Issues on my radar currently (comments appreciated):
- We probably want unit tests for the (limited) JSON encoding functions, and for the factored-out RFC 1929 encoding functions. Anything else that looks feasibly testable here?
- It's not clear to me whether negotiating a PT2 configuration version still allows PT1-style RFC 1929 parameter encoding so that managed transports can support the new configuration version and the new SOCKS method separately. I've assumed it might be possible, so far, but that's not being tested against anything.
- I currently restrict parameters to ASCII to avoid either writing a JSON encoder that can spit out invalid JSON or writing a JSON encoder that has to validate incoming UTF-8. The impression I've gotten is that this is probably okay, but if there are counterexamples, I can put in UTF-8 passthrough.
- The commit sequence isn't the cleanest. How high a priority is it to reorder/combine patch hunks to make a cleaner one?
- We still need a 'changes' file. (What would be an appropriate heading for this? Is this a minor feature, for instance?)
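As an illustration of the ASCII-only restriction and of what a unit test for the encoder might check, here is a small Python sketch. It is not the C code from the patchset; the function names are hypothetical, and it assumes parameters form a flat string-to-string mapping. It rejects non-ASCII input outright, so the encoder never has to validate UTF-8 and cannot emit invalid JSON.

```python
import json


def encode_json_string_ascii(value):
    """Encode value as a JSON string literal, rejecting non-ASCII input."""
    out = ['"']
    for ch in value:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("non-ASCII character in parameter")
        if ch == '"' or ch == "\\":
            out.append("\\" + ch)          # escape quote and backslash
        elif code < 0x20 or code == 0x7F:
            out.append("\\u%04x" % code)   # escape control characters
        else:
            out.append(ch)
    out.append('"')
    return "".join(out)


def encode_json_object_ascii(params):
    """Encode a flat str -> str mapping as a JSON object, keys sorted."""
    members = [
        encode_json_string_ascii(key) + ":" + encode_json_string_ascii(params[key])
        for key in sorted(params)
    ]
    return "{" + ",".join(members) + "}"


def round_trips(params):
    """Check the output against an independent JSON parser."""
    return json.loads(encode_json_object_ascii(params)) == params
```

A round-trip check against an independent parser, plus a rejection case for non-ASCII input, is roughly the shape of test I have in mind for the real encoder.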
A few other questions:
- Is there an effective way of doing automated testing of the SOCKS state machine currently in Tor? I didn't see anything obvious in the test directory. This seems like the most fragile part, especially since both the original and modified versions are not very explicit in their state machine nature and are split between multiple files.
- Can there ever be more than one managed_proxy_t per transport name? More generally, is there a relational diagram of the main Tor data structures somewhere? A lot of the plumbing for state and configuration information feels kind of fragile.