(This email got way out of hand from a basic 'I'll bounce an idea here',
here's to hoping I haven't made some huge oversight.)
I've been thinking about the https frontend ever since I read about the
basic problem when I started looking into Tor dev, though I never took the
time to read the actual proposal. Once I had a basic idea of how to solve
the core problem, I sat down and read it, and it turns out the proposal
already has 90% of what I could think of, but I'm glad I took some time to
think from a (I hope) fresh perspective.
So anyway, on point. I think Designs #2 & #3 are the best ideas for Proposal
203 (probably leaning toward #2 more). They're basically the same concept
anyway. I came to the same conclusion that we definitely need a shared key
to be distributed per bridge address for this to work in any fashion;
ideally these keys could be rotated frequently. I also totally agree with
the server being a key implementation detail: ideally we want something
drop-in that could go alongside an existing website. As for content, I think
mock corporate login pages are a neat idea, while mock private forums are
not.
Regarding authentication and distinguishability, I don't agree with trying
to distinguish Tor clients from non-Tor based on anything the client
initially sends, as any sort of computation that isn't webserver-y could be
a timing attack or otherwise. I have some specific ideas around how we can
implement this to address the issues/concerns outlined in the current
proposal.
I think the best course of action is to use a webserver's core
functionalities to our advantage. I have not made much consideration for
client implementation. But here are some thoughts on how we could
potentially achieve our goals:
- Shared secrets are shared with users whenever bridge IPs are
exchanged. It is necessary for these to be large random values and not
user-like passwords (as one of the Authorize proposals also mentions). This
exchange would ideally give a domain name for the bridge so that we're not
trying to connect to an IP, but to reduce user error the domain and key
should be concatenated and base64'd so it's a single copy/paste for the
user, without them trying to navigate to a url thinking it's a Tor enabled
link or something.
- The user's Tor client (assuming they added the bridge) connects to the
server over https (tls) to the root domain. It should also download all the
resources attached to the main page, emulating a web browser for the
initial document.
- The server should reply with its normal root page. This page can be
dynamically generated, there's no requirement for it to be static. The only
requirement is that one of the linked documents (css, js, img) be served
with a header that allows decent caching (>1hr). The far future file could
be the document itself but it doesn't have to be.
- (This part is probably way too trashy to the server performance,
I'm winging it as I can think of it.)
- For all files included in the main document, whichever has the
furthest future cache header, we'll call that file F.
- If we have precomputed the required values (see below) for F, then
we are ok to move to the next step (see next bullet point), otherwise,
serve all of the files with cache headers under one hour.
- If F doesn't have the pre-computations ready, this is the time to
spin off a subprocess to start calculating stuff (at lowish cpu priority
probably).
- The subprocess should calculate an intensive function (e.g. scrypt)
of hash(contents of F xored with the shared key) for (X...Y) iterations,
inclusive. X and Y should be chosen so that X takes on the order of
seconds to compute while Y is a couple of thousand iterations above it.
Store a 'map' of numberOfIterations => { result, hmac(result + tls cert
identifier) }. The hmac should be keyed with the shared secret. The tls
cert identifier should probably be its public key or signature? The
subprocess should store these results in fast cache (hopefully in memory).
- So we have our file F, and a precomputed value Z which was the
function applied Y times and has a hmac H. We set a cookie on the client:
base64("Y || random padding || H")
- The server should remember which IPs were given this Y value.
This cookie should pretty much look like any session cookie that comes out
of rails, drupal, asp, anyone who's doing cookie sessions correctly. Once
the cookie is added to the headers, just serve the document as usual.
Essentially this should all be possible in an apache/nginx module as the
page content shouldn't matter.
- Here's a core idea: the server has a handler set up for each of the Z
values, hex encoded is probably best (longer!) e.g. /FFFEF421516AB3B2E42...
- The webserver should be set up to accept secure websocket upgrades
to these urls and route the connection to the local Tor socket.
- If the iteration value for the given url is not the same as the one
given to the ip trying the path, or the iteration value doesn't match Y,
the connection should be dropped/rejected. (This can be legitimate)
- If the connection is accepted, the current Y value should be
decremented. If Y < X for the current F then we should rotate our keys.
(This is a bit of a question, we could manipulate one of the files, but
that interferes with the website and could cause distinguishability).
- Basically after Y-X Tor clients (not related to how many https
users are served), we should be rotating our keys in case the keys leaked,
or changing handlers to stop old handlers being used.
- When rotating keys we should be sure to not accept requests on the
old handlers, by either removing them (404) or by 403ing them, whatever.
The decrementing of Y is to try to make replay attacks less feasible,
although that would mean tls was broken if they were able to get the
initial value, but fuck, who knows with BREACH & CRIME et cetera.
- (Best read the rest before reading this part: To reduce key churn,
or allow long term guard-like functionality, the old handlers could be
saved and remain unique to a single ip. By sending a cookie from client to
server carrying a unique id that is only accepted from that ip, the server
could know to use an old shared key or something so the client wouldn't
blacklist the bridge. Or the client could know not to blacklist previously
successful bridges by remembering their tls cert or something. I haven't
really thought much about this, but it's probably manageable.)
- The idea here is that the webserver (apache/nginx) is working
EXACTLY as a normal webserver should, unless someone hits these exact urls,
which they should have a negligible chance of doing unless they have the
current shared secret. There might be a timing attack here, but in that
case we can just add a million other handlers that all lead to a 403? (But
either way, if someone's spamming thousands of requests then you should be
able to ip block, and rotating keys should help reduce the feasibility of
timing attacks or brute forcing?)
- So, how does the client figure out the url to use for wss://? Using
the cache headers, the client should be able to determine which file is F.
If all files are served with a cache header under one hour, then we wait a
time period T. Realistically, if the Tor client knows this is a bridge, the
only reason this wait should happen is if precomputing is happening, so it
should just choose another bridge to use... or wait minutes and notify the
user that it's for good reason.
- Assuming we get a valid F, we look at our cookies. For all cookies,
if they're base64, convert to binary, then try treating the first K bytes
(we should have an upper bound for Y, let's say it's probably an 8 byte
unsigned long) as a number I. We replicate the computations that the server
would have done for I iterations to get our Zc(lient).
- Using this Zc, and the cert provided by the server, we can compute
our local Hc. If Hc doesn't match the last (Length of hmac used) bytes in
the cookie then try the next cookie.
- If no cookie matches, then we either have an old key or we're being
MITMd (the computation was ok but the cert didn't match). In these cases,
we should fake some user navigation for a couple of pages then close the
connection and blacklist the bridge (run for the hills and don't blow the
bridge!).
- If we get a match, then we know Zc, so we upgrade the connection to
wss://domain/Zc which should be a valid secure websocket connection (usable
as tcp), unless another ip was already accepted on this iteration value, in
which case the server should reject us. If we get rejected at this stage,
we know the server had good reason (trying to stop replays) so we just
retry from the start and cross our digital fingers. (If bridges are
sufficiently private then this should be a non-issue as it will likely only
happen with 2 Tor clients connecting within the same second or so)
- At this point there should be an encrypted tcp tunnel between the Tor
client and the bridge's apache/nginx, and an unencrypted connection between
the webserver and the bridge's Tor socket. Should be able to just talk Tor
protocol now and get on with things.
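
To make the derivation in the list above a bit more concrete, here is a
rough Python sketch of the precomputation, the cookie, and the client-side
check. It is only an illustration under assumptions that the text above does
not pin down: SHA-256 stands in for "hash" and for the hmac, the key is
repeated cyclically for the xor with F, the shared key doubles as the scrypt
salt, the scrypt parameters are toy values, the iteration count is an 8 byte
big-endian integer, and the padding is 16 random bytes. All function names
are made up for the example.

```python
import base64
import hashlib
import hmac
import os
import struct

HMAC_LEN = 32  # SHA-256 digest size (assumed hmac hash)


def seed(f_contents: bytes, key: bytes) -> bytes:
    """hash(F xor key); the key is repeated cyclically over F (assumption)."""
    xored = bytes(b ^ key[i % len(key)] for i, b in enumerate(f_contents))
    return hashlib.sha256(xored).digest()


def step(value: bytes, key: bytes) -> bytes:
    """One iteration of the intensive function (scrypt, toy parameters)."""
    return hashlib.scrypt(value, salt=key, n=2**4, r=8, p=1, dklen=32)


def precompute(f_contents, key, cert_id, x, y):
    """Server side: map iterations -> (Z, H) for every count in X..Y."""
    table = {}
    value = seed(f_contents, key)
    for n in range(1, y + 1):
        value = step(value, key)
        if n >= x:
            tag = hmac.new(key, value + cert_id, hashlib.sha256).digest()
            table[n] = (value, tag)
    return table


def make_cookie(iterations: int, tag: bytes, pad_len: int = 16) -> str:
    """base64(Y || random padding || H)."""
    raw = struct.pack(">Q", iterations) + os.urandom(pad_len) + tag
    return base64.b64encode(raw).decode("ascii")


def client_path(cookie, f_contents, key, cert_id, max_iterations=100_000):
    """Client side: return the hex handler path, or None if no match."""
    try:
        raw = base64.b64decode(cookie, validate=True)
    except ValueError:
        return None  # not base64, so not our cookie
    if len(raw) < 8 + HMAC_LEN:
        return None
    (iterations,) = struct.unpack(">Q", raw[:8])
    if not 1 <= iterations <= max_iterations:
        return None  # upper bound on Y, so a bogus cookie can't stall us
    value = seed(f_contents, key)
    for _ in range(iterations):
        value = step(value, key)
    expected = hmac.new(key, value + cert_id, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, raw[-HMAC_LEN:]):
        return None  # old key, or the cert we saw isn't the server's
    return "/" + value.hex().upper()
```

A client would call client_path() once per cookie it received and use the
first non-None result as the path for the wss:// upgrade. A mismatch on
either the key or the certificate identifier gives None, which is the
"blacklist the bridge" case described above.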
So to summarise,
- Using general web tools to negotiate, secret paths, headers and cookies
- Proof of work-ish system using the shared key to establish a unique url
- Checking for a MITM and allowing key rotation by using our shared key
with a hmac to determine:
- If the provided certificate matches what the server thought it gave
us
- If our result is correct with our current key
- Assuming it's sound, I think the serverside could be implemented as an
apache module that could be a relatively easy drop in.
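
The server-side bookkeeping behind the replay protection and key rotation
(which ip was handed which iteration value, decrementing Y on each accepted
connection, rotating once Y < X) could look roughly like the following
Python sketch. The class and method names are hypothetical, it keeps
everything in memory, and it ignores expiry of stale entries; a real
apache/nginx module would need to handle all of that.

```python
class HandlerLedger:
    """Tracks the current iteration value Y and who was issued which value."""

    def __init__(self, x: int, y: int):
        self.x = x          # lowest precomputed iteration count
        self.current = y    # iteration count currently being handed out
        self.issued = {}    # ip -> iteration count sent in that ip's cookie

    def issue(self, ip: str) -> int:
        """Remember that this ip was given the current iteration value."""
        self.issued[ip] = self.current
        return self.current

    def accept(self, ip: str, iterations: int) -> bool:
        """Decide whether a websocket upgrade on a handler url is allowed.

        `iterations` is the count that the requested url corresponds to.
        """
        if self.issued.get(ip) != iterations or iterations != self.current:
            return False    # wrong ip, or someone already used this value
        del self.issued[ip]
        self.current -= 1   # each value is good for one connection only
        return True

    def needs_rotation(self) -> bool:
        """True once Y has dropped below X and the keys should be rotated."""
        return self.current < self.x
```

With X=3 and Y=4, two clients that both fetch the page get the value 4; the
first one to upgrade is accepted, the second is rejected and has to retry
from the start, at which point it is handed 3.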
Concerns:
- Distinguishability of client https & websockets implementation.
- Content for servers
- Everything above, as I'm sure there's obviously some critical flaw I'm
overlooking!
- Amount of work it would take on client side :(
Rym