On Thu, Oct 11, 2012 at 5:38 AM, Mike Perry <mikeperry@torproject.org> wrote:
Also at: https://gitweb.torproject.org/user/mikeperry/torspec.git/blob/mapaddress-che...
Title: Internal Mapaddress for Tor Configuration Testing
Author: Mike Perry
Created: 08-10-2012
Status: Open
Target: 0.2.4.x+
Overview
This proposal describes a method by which we can replace the https://check.torproject.org/ testing service with an internal XML document provided by the Tor client.
Motivation
The Tor Check service is a central point of failure in terms of Tor usability. If it is ever down or out of sync with the set of exit nodes on the Tor network, user experience is degraded considerably. Moreover, the check itself is very time-consuming. Users must wait seconds or more for the result to come back. Worse still, if the user's software *was* in fact misconfigured, the check.torproject.org DNS resolution and request leak out onto the network.
Design Overview
The system will have three parts: an internal hard-coded IP address mapping (127.84.111.114:80), a hard-coded mapaddress to a DNS name (selftest.torproject.org:80), and a DirPortFrontPage-style simple HTTP server that serves an XML document for both addresses.
XML and HTTP are both reasons for some unhappiness here. Both of them pull in a fair amount of complexity that I'd prefer not to need. (Yes, Tor already has a sort of an HTTP implementation, but at least clients aren't currently required to run what amounts to a local HTTP server.)
I seriously wonder whether the benefits of HTTP (easier to access from within a locked-down web browser environment) aren't actually the _defects_ of HTTP here: it's easier to poke it from a web page.
I understand that your design takes some steps to prevent browser-based attacks on this, but I'm not currently sure how to become sure that it solves them all. Right now, I'm nervous.
Upon receipt of a request to the IP address mapping, the system will create a new 128 bit randomly generated nonce and provide it in the XML document.
Requests to http://selftest.torproject.org/ must include a valid, recent nonce as the GET URL path. Upon receipt of a valid nonce, it is removed from the list of valid nonces. Nonces are only valid for 60 seconds or until SIGNAL NEWNYM, whichever comes first.
So, I'm not totally sure what the nonce field is for. The idea as I understand it is that when you connect to the IPv4 address, you get a nonce, and later when you connect to the hostname, you provide that nonce, and Tor tells you "yes" if you gave it the same nonce.
What does that protect against? My first thought is that you're trying to prevent the case where a malicious local DNS server maps "selftest.torproject.org" to some IP address in their control, and then just runs a server at that IP address to say "yes I'm Tor". But that doesn't make sense, since you could just make one of those that said "yes I'm Tor" no matter what you say for the nonce.
Also, how useful is the followup DNS check? If it's checking that DNS leaks aren't happening... You're going to need torbrowser or something of equivalent complexity for this to work at all; isn't it easier then for torbrowser to make sure that it set up SOCKS correctly?
The list of pending nonces should not be allowed to grow beyond 10 entries.
This means that any webpage could flush out the list of pending nonces. Does that matter?
The timeout period and nonce limit should be configurable in torrc.
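To make the nonce lifecycle above concrete, here is a minimal sketch in Python. It assumes that "not allowed to grow beyond 10 entries" means the oldest pending nonce is evicted when the list is full; the proposal does not say whether the list evicts or refuses, and the class and method names are hypothetical.

```python
import secrets
import time
from collections import OrderedDict


class NonceStore:
    """Hypothetical pending-nonce list; names and eviction policy are assumptions."""

    def __init__(self, max_pending=10, lifetime=60.0, clock=time.monotonic):
        self.max_pending = max_pending   # nonce limit from the proposal
        self.lifetime = lifetime         # validity period in seconds
        self.clock = clock
        self._pending = OrderedDict()    # nonce -> issue time, oldest first

    def _expire(self):
        # Drop every nonce older than the validity period.
        now = self.clock()
        for nonce, issued in list(self._pending.items()):
            if now - issued > self.lifetime:
                del self._pending[nonce]

    def issue(self):
        """Handle a request to the IP address mapping: mint a new nonce."""
        self._expire()
        while len(self._pending) >= self.max_pending:
            self._pending.popitem(last=False)  # assumed policy: evict oldest
        nonce = secrets.token_hex(16)          # 128 random bits as hex
        self._pending[nonce] = self.clock()
        return nonce

    def redeem(self, nonce):
        """Handle a request to the hostname: each nonce is accepted once."""
        self._expire()
        return self._pending.pop(nonce, None) is not None

    def newnym(self):
        """SIGNAL NEWNYM invalidates everything still pending."""
        self._pending.clear()


store = NonceStore()
mine = store.issue()
for _ in range(10):
    store.issue()          # e.g. requests triggered by some web page
print(store.redeem(mine))  # False under the assumed evict-oldest policy
```

Under that assumed policy, ten extra requests to the IP address are enough to push a legitimate nonce out of the list before it can be redeemed.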
Design: XML document format for http://127.84.111.114
[...]
Security Considerations
XML was chosen over JSON due to the risks of the identifier leaking in a way that could enable websites to track the user[1].
Well, that's a nuclear-powered-flyswatter!
If I read that page right, the problem with using JSON is that it can be parsed and executed as JavaScript, and the advantage of XML is that it's unlikely to be syntactically correct JavaScript.
If that's the issue, I'd strongly suggest that instead of going with a more complex data format, we add a layer of encoding over the JSON, or use an even simpler format.
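As a rough illustration of what a layer of encoding could look like, here is a sketch in Python that wraps the JSON in base64. Base64 is only one possible choice and is my assumption, as are the field names `is_tor` and `nonce`; nothing in the proposal specifies either.

```python
import base64
import json


def encode_status(status):
    """Serialize to JSON, then base64-encode so the body is not a JSON literal."""
    raw = json.dumps(status, sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_status(body):
    """Reverse the encoding on the client side."""
    return json.loads(base64.b64decode(body))


# Hypothetical status document; the real field names are not given here.
status = {"is_tor": True, "nonce": "00" * 16}
body = encode_status(status)
assert decode_status(body) == status
```

The encoded body uses only the base64 alphabet, so it contains no braces, brackets, or quotes for a script parser to treat as an object or array literal. Whether that is sufficient against every inclusion trick is a separate question that this sketch does not answer.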
Because there are many exceptions and circumvention techniques to the same-origin policy, we have also opted for strict controls on dns-nonce lifetimes and usage, as well as validation of the Host header and SOCKS4A request hostnames.
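For reference, the Host header validation mentioned above might look roughly like the following Python sketch. The exact matching rules are an assumption (the proposal text quoted here does not spell them out), and `host_header_ok` is a hypothetical name.

```python
# The two names the design overview says the internal server answers for.
ALLOWED_HOSTS = {"127.84.111.114", "selftest.torproject.org"}


def host_header_ok(value):
    """Accept only the two expected hosts, with an optional explicit port 80."""
    host = value.strip().lower()
    if host.endswith(":80"):
        host = host[:-3]  # drop the explicit default port before comparing
    return host in ALLOWED_HOSTS


print(host_header_ok("selftest.torproject.org"))               # True
print(host_header_ok("selftest.torproject.org.evil.example"))  # False
```

An exact-match check like this rejects lookalike suffixes and unexpected ports, but it says nothing about requests that never reach the internal server in the first place.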
Of course, this all comes down to the fact that we're using http. Can we spell out why we need HTTP for this?