Hi Chang,
On 29 May 2013, at 06:22, Chang Lan changlan9@gmail.com wrote:
Given that ScrambleSuit is being deployed, improving protocol obfuscation will be my main focus. HTTP impersonation is really useful, since there are numerous HTTP proxies outside the censored region, while the number of bridges is quite limited. What I'm going to be doing during the summer is implementing a good enough HTTP impersonation based on the pluggable transports specification. There are still many open questions indeed. Discussions are more than welcome!
There certainly are quite a few open questions, so it would be good to start planning early. Implementing HTTP is a deceptively difficult project.
I'd suggest starting by reading the HTTP specification in detail, particularly the parts that deal with caching:
http://tools.ietf.org/html/rfc2616
For comparison, HTTP/1.0 is also worth looking at:
http://tools.ietf.org/html/rfc1945
Some issues that you will need to deal with are:
- Individual HTTP requests may be re-ordered if they are over different TCP connections.
- Responses may be truncated without an error being reported to higher layers (which is why HTTP includes length fields as an option).
- HTTP doesn't give the same congestion avoidance as TCP.
- Proxies can both cache and modify data they transmit.
- Proxies deviate from what is permitted by the specification.
- (and others)
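To make the re-ordering and truncation points concrete, here is a minimal sketch in Python of one way to cope with them: each chunk carries a sequence number and an explicit length, so the receiver can detect a short body and put chunks back in order. The frame layout and the names `encode_frame`, `decode_frame` and `Reassembler` are assumptions made up for this illustration; they are not taken from the pluggable transports specification. A real design would also have to authenticate the frames and make them look like plausible HTTP content.

```python
import struct

# Hypothetical frame layout: 4-byte sequence number, 4-byte payload length,
# then the payload itself (all in network byte order).
HEADER = struct.Struct("!II")


def encode_frame(seq: int, payload: bytes) -> bytes:
    """Prefix the payload with its sequence number and length."""
    return HEADER.pack(seq, len(payload)) + payload


def decode_frame(data: bytes):
    """Return (seq, payload), or None if the frame was truncated in transit."""
    if len(data) < HEADER.size:
        return None
    seq, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        return None  # body shorter than advertised: treat as truncated
    return seq, payload


class Reassembler:
    """Deliver payloads in sequence order even if frames arrive out of order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}

    def receive(self, data: bytes) -> list:
        """Accept one frame; return the payloads that are now deliverable."""
        frame = decode_frame(data)
        if frame is None:
            return []  # the caller would have to arrange a retransmission
        seq, payload = frame
        if seq >= self.next_seq:
            # Keep the first copy only; a cache may replay an old response.
            self.pending.setdefault(seq, payload)
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

Feeding frame 1 before frame 0 returns nothing for the first call and both payloads for the second; a frame that lost its last byte is dropped rather than delivered short.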
When dealing with these, you will need to ensure you don't introduce any new ways for a censor to efficiently and reliably distinguish your protocol from HTTP.
I think it would also be a good idea to implement scanning resistance. Since it will be over TCP, you can't hide that something is listening, but you can ensure that if the initial request does not demonstrate knowledge of a valid secret, the response does not disclose that it is a Tor bridge.
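As a rough sketch of what "demonstrating knowledge of a valid secret" could look like, the Python below has the client send a random nonce plus an HMAC over it, and the server fall back to a decoy response when the check fails. Everything here is an assumption for illustration: the shared secret is presumed to be handed out together with the bridge address, and the names `make_token`, `token_is_valid` and `handle_first_request` are hypothetical. It does not address replay of an observed token by the censor, nor how to embed the token in a request so that it is not itself a distinguisher; both would need solving in a real design.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def make_token(secret: bytes) -> bytes:
    """Client side: a fresh random nonce followed by HMAC-SHA256(secret, nonce)."""
    nonce = os.urandom(NONCE_LEN)
    tag = hmac.new(secret, nonce, hashlib.sha256).digest()
    return nonce + tag


def token_is_valid(secret: bytes, token: bytes) -> bool:
    """Server side: recompute the tag and compare in constant time."""
    if len(token) != NONCE_LEN + TAG_LEN:
        return False
    nonce, tag = token[:NONCE_LEN], token[NONCE_LEN:]
    expected = hmac.new(secret, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


def handle_first_request(secret: bytes, token: bytes) -> str:
    """Decide how to treat a new connection based on its first request.

    "tunnel" means start relaying Tor traffic; "decoy" means answer exactly
    as an ordinary web server would, revealing nothing about the bridge.
    """
    if token_is_valid(secret, token):
        return "tunnel"
    return "decoy"
```

The important property is that the "decoy" branch behaves the same for every kind of invalid input (missing token, wrong length, wrong tag), so a scanner learns nothing from probing.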
As you start implementing, you should have some way of testing. Initially this can be a direct connection from your pluggable transport client to your pluggable transport server. You can set up an OP (onion proxy, i.e. a Tor client) and a bridge on the same machine (set your bridge not to advertise itself), and get your OP to talk to your bridge via your pluggable transport.
However, you shouldn't keep to this setup for very long, as it won't test how your pluggable transport works with a proxy. So you should put a caching proxy (e.g. Squid) between your pluggable transport server and client, and make sure they keep working. You can try configuring Squid in ways to stress your pluggable transport, and also replace Squid with a proxy server you create (e.g. based on one of the many Python HTTP proxies http://proxies.xhaus.com/python/). This proxy server could behave pathologically, and test the corner cases of your pluggable transport.
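A pathological proxy of the kind described above does not need to be large. The following Python sketch uses only the standard library; it forwards GET requests and then deliberately truncates or corrupts the response while still advertising the original Content-Length. The class name `PathologicalProxy`, the `mangle` helper, the mode names and the port are all made up for this example, and it handles only plain-HTTP GET, so treat it as a starting point rather than a complete test harness.

```python
import random
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def mangle(body: bytes, mode: str, rng: random.Random) -> bytes:
    """Return the body as a misbehaving proxy might deliver it."""
    if mode == "truncate":
        return body[: len(body) // 2]  # silently drop the second half
    if mode == "corrupt" and body:
        i = rng.randrange(len(body))   # flip every bit of one byte
        return body[:i] + bytes([body[i] ^ 0xFF]) + body[i + 1:]
    return body                        # any other mode: pass through


class PathologicalProxy(BaseHTTPRequestHandler):
    mode = "truncate"
    rng = random.Random(0)  # fixed seed so that test runs are repeatable

    def do_GET(self):
        # A client configured to use a proxy sends the absolute URL as the path.
        try:
            with urllib.request.urlopen(self.path, timeout=10) as upstream:
                status = upstream.status
                body = upstream.read()
        except (OSError, ValueError):
            # Covers network errors, upstream HTTP errors and malformed URLs.
            self.send_error(502)
            return
        self.send_response(status)
        # Advertise the original length, then possibly send something else.
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(mangle(body, self.mode, self.rng))
        self.close_connection = True


def serve(host: str = "127.0.0.1", port: int = 8888) -> None:
    """Run the proxy until interrupted."""
    HTTPServer((host, port), PathologicalProxy).serve_forever()
```

Calling `serve()` and pointing the pluggable transport client at 127.0.0.1:8888 as its HTTP proxy would exercise the truncation case; setting `PathologicalProxy.mode = "corrupt"` beforehand switches to byte corruption. Further modes (delays, re-ordering, stale cached replies) can be added in the same style.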
When working on your experiments, automate the setup, the running of the tests, and the processing of the results. This not only makes your life easier but also makes your experiments repeatable. The scripts and configuration files should be checked into version control. Your goal should be that someone can check out your code, install a few standard packages via apt-get or yum, run a single command, and get the same results. There are tools to help do this (e.g. http://software-carpentry.org/4_0/data/mgmt.html and http://software-carpentry.org/4_0/data/bein.html) but just using make and shell scripts might be fine.
There's a lot to think about here, so we don't need answers to everything now, but if you have any questions or comments do let me know.
Best wishes,
Steven