Filename: 189-authorize-cell.txt
Title: AUTHORIZE and AUTHORIZED cells
Author: George Kadianakis
Created: 04 Nov 2011
Status: Open
1. Overview
Proposal 187 introduced the concept of the AUTHORIZE cell, a cell whose purpose is to make Tor bridges resistant to scanning attacks.
This is achieved by having the bridge and the client share a secret out-of-band and then use AUTHORIZE cells to validate that the client indeed knows that secret before proceeding with the Tor protocol.
This proposal specifies the format of the AUTHORIZE cell and also introduces the AUTHORIZED cell, a way for bridges to announce to clients that the authorization process is complete and successful.
2. Motivation
AUTHORIZE cells should be able to perform a variety of authorization protocols based on a variety of shared secrets. This forces the AUTHORIZE cell to have a dynamic format based on the authorization method used.
AUTHORIZED cells are used by bridges to signal the end of a successful bridge client authorization and the beginning of the actual link handshake. AUTHORIZED cells have no other use and for this reason their format is very simple.
Both AUTHORIZE and AUTHORIZED cells are to be used under censorship conditions and they should look innocuous to any adversary capable of monitoring network traffic.
As an attack example, an adversary could passively monitor the traffic of a bridge host, looking at the packets directly after the TLS handshake and trying to deduce from their packet size if they are AUTHORIZE and AUTHORIZED cells. For this reason, AUTHORIZE and AUTHORIZED cells are padded with a random amount of padding before sending.
3. Design
3.1. AUTHORIZE cell
The AUTHORIZE cell is a variable-sized cell.
The generic AUTHORIZE cell format is:
   AuthMethod   [1 octet]
   MethodFields [...]
   PadLen       [2 octets]
   Padding      ['PadLen' octets]
where:
'AuthMethod' is the authorization method to be used.
'MethodFields' depends on the authorization method used. It is a meta-field hosting an arbitrary number of fields.
'PadLen' specifies the amount of padding in octets.
'Padding' is 'PadLen' octets of random content.
3.2. AUTHORIZED cell format
The AUTHORIZED cell is a variable-sized cell.
The AUTHORIZED cell format is:
   'AuthMethod' [1 octet]
   'PadLen'     [2 octets]
   'Padding'    ['PadLen' octets]
where all fields have the same meaning as in section 3.1.
3.3. Cell parsing
Implementations MUST ignore the contents of 'Padding'.
Implementations MUST reject an AUTHORIZE or AUTHORIZED cell where the 'Padding' field is not 'PadLen' octets long.
Implementations MUST reject an AUTHORIZE cell with an 'AuthMethod' they don't recognize.
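The three parsing rules above can be sketched as follows. This is only an illustration, not part of the proposal: the big-endian encoding of 'PadLen', the METHOD_FIELDS_LEN registry (with its single method 0x01 and its 32-octet 'MethodFields'), and the CellRejected exception are all assumptions made for the example. The proposal assigns no method numbers and leaves the 'MethodFields' layout to each method.

```python
import struct


class CellRejected(Exception):
    """Raised when a payload violates the parsing rules of section 3.3."""


# Hypothetical registry: AuthMethod -> length of MethodFields in octets.
# The proposal does not define any methods; 0x01 / 32 is made up.
METHOD_FIELDS_LEN = {0x01: 32}


def parse_authorize_payload(payload: bytes):
    """Return (auth_method, method_fields) or raise CellRejected."""
    if len(payload) < 1:
        raise CellRejected("empty payload")
    auth_method = payload[0]
    # Rule 3: reject an AuthMethod we don't recognize.
    if auth_method not in METHOD_FIELDS_LEN:
        raise CellRejected("unrecognized AuthMethod")
    fields_len = METHOD_FIELDS_LEN[auth_method]
    pad_len_at = 1 + fields_len
    if len(payload) < pad_len_at + 2:
        raise CellRejected("payload too short")
    method_fields = payload[1:pad_len_at]
    # PadLen is 2 octets; network byte order is assumed here.
    (pad_len,) = struct.unpack("!H", payload[pad_len_at:pad_len_at + 2])
    # Rule 2: Padding must be exactly PadLen octets long.
    if len(payload) - (pad_len_at + 2) != pad_len:
        raise CellRejected("Padding is not PadLen octets long")
    # Rule 1: the contents of Padding are never inspected.
    return auth_method, method_fields


def parse_authorized_payload(payload: bytes) -> int:
    """Return the AuthMethod octet of an AUTHORIZED payload."""
    if len(payload) < 3:
        raise CellRejected("payload too short")
    auth_method, pad_len = struct.unpack("!BH", payload[:3])
    if len(payload) - 3 != pad_len:
        raise CellRejected("Padding is not PadLen octets long")
    return auth_method
```

What a bridge does after raising CellRejected is deliberately left out; the sketch only decides whether a payload is well formed.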
4. Discussion
4.1. Why not let the pluggable transports do the padding, like they are supposed to do for the rest of the Tor protocol?
The arguments of section "Alternative design: Just use pluggable transports" of proposal 187 apply here as well:
All bridges who use client authorization will also need camouflaged AUTHORIZE/AUTHORIZED cells.
4.2. How should multiple round-trip authorization protocols be handled?
Protocols that require multiple round-trips between the client and the bridge should use AUTHORIZE cells for communication.
The format of the AUTHORIZE cell is flexible enough to support messages from the client to the bridge and the inverse.
In the end of a successful multiple round-trip protocol, an AUTHORIZED cell must be issued from the bridge to the client.
4.3. AUTHORIZED seems useless. Why not use VPADDING instead?
As noted in proposal 187, the Tor protocol uses VPADDING cells for padding; any other use of VPADDING makes the Tor protocol kludgy.
In the future, and in the example case of a v3 handshake, a client can optimistically send a VERSIONS cell along with the final AUTHORIZE cell of an authorization protocol. That allows the bridge, in the case of successful authorization, to also process the VERSIONS cell and begin the v3 handshake promptly.
On 2011-11-04, George Kadianakis desnacked@gmail.com wrote:
Filename: 189-authorize-cell.txt
Title: AUTHORIZE and AUTHORIZED cells
Author: George Kadianakis
Created: 04 Nov 2011
Status: Open
1. Overview
Proposal 187 introduced the concept of the AUTHORIZE cell, a cell whose purpose is to make Tor bridges resistant to scanning attacks.
This is achieved by having the bridge and the client share a secret out-of-band and then use AUTHORIZE cells to validate that the client indeed knows that secret before proceeding with the Tor protocol.
This proposal specifies the format of the AUTHORIZE cell and also introduces the AUTHORIZED cell, a way for bridges to announce to clients that the authorization process is complete and successful.
2. Motivation
AUTHORIZE cells should be able to perform a variety of authorization protocols based on a variety of shared secrets. This forces the AUTHORIZE cell to have a dynamic format based on the authorization method used.
AUTHORIZED cells are used by bridges to signal the end of a successful bridge client authorization and the beginning of the actual link handshake. AUTHORIZED cells have no other use and for this reason their format is very simple.
Both AUTHORIZE and AUTHORIZED cells are to be used under censorship conditions and they should look innocuous to any adversary capable of monitoring network traffic.
I wrote the following in my reply to proposal 190, but it probably belongs here instead:
| An adversary who MITMs the TLS connection and receives a Tor AUTHORIZE
| cell will know that the client is trying to connect to a Tor bridge.
|
| Should the client send a string of the form "GET
| /?q=correct+horse+battery+staple\r\n\r\n" instead of an AUTHORIZE
| cell, where "correct+horse+battery+staple" is a semi-plausible search
| phrase derived from the HMAC in some way?
As an attack example, an adversary could passively monitor the traffic of a bridge host, looking at the packets directly after the TLS handshake and trying to deduce from their packet size if they are AUTHORIZE and AUTHORIZED cells. For this reason, AUTHORIZE and AUTHORIZED cells are padded with a random amount of padding before sending.
What distribution should this 'random amount' be chosen from?
3. Design
3.1. AUTHORIZE cell
The AUTHORIZE cell is a variable-sized cell.
The generic AUTHORIZE cell format is:
   AuthMethod   [1 octet]
   MethodFields [...]
   PadLen       [2 octets]
   Padding      ['PadLen' octets]
where:
'AuthMethod', is the authorization method to be used.
'MethodFields', is dependent on the authorization Method used. It's a meta-field hosting an arbitrary amount of fields.
'PadLen', specifies the amount of padding in octets.
'Padding', is 'PadLen' octets of random content.
3.2. AUTHORIZED cell format
The AUTHORIZED cell is a variable-sized cell.
The AUTHORIZED cell format is:
   'AuthMethod' [1 octet]
   'PadLen'     [2 octets]
   'Padding'    ['PadLen' octets]
where all fields have the same meaning as in section 3.1.
3.3. Cell parsing
Implementations MUST ignore the contents of 'Padding'.
Implementations MUST reject an AUTHORIZE or AUTHORIZED cell where the 'Padding' field is not 'PadLen' octets long.
Implementations MUST reject an AUTHORIZE cell with an 'AuthMethod' they don't recognize.
What does "reject" mean here?
4. Discussion
4.1. Why not let the pluggable transports do the padding, like they are supposed to do for the rest of the Tor protocol?
The arguments of section "Alternative design: Just use pluggable transports" of proposal 187, apply here as well:
All bridges who use client authorization will also need camouflaged AUTHORIZE/AUTHORIZED cell.
What does "camouflaged" mean here?
4.2. How should multiple round-trip authorization protocols be handled?
s/multiple round/multiple-round/ # it's part of a phrase acting as an ad-something
Protocols that require multiple round-trips between the client and the bridge should use AUTHORIZE cells for communication.
.-1s/round-trips/round trips/ # it's part of a top-level noun phrase
The format of the AUTHORIZE cell is flexible enough to support messages from the client to the bridge and the inverse.
s/inverse/reverse/
When can an AUTHORIZE cell be sent, and by whom?
Can a bridge which requires client authorization perform reachability and bandwidth self-tests? If so, how?
In the end of a successful multiple round-trip protocol, an AUTHORIZED cell must be issued from the bridge to the client.
.-1s/multiple round/multiple-round/ # it's part of a phrase acting as an ad-something
s/In/At/
4.3. AUTHORIZED seems useless. Why not use VPADDING instead?
As noted in proposal 187, the Tor protocol uses VPADDING cells for padding; any other use of VPADDING makes the Tor protocol kludgy.
In the future, and in the example case of a v3 handshake, a client can optimistically send a VERSIONS cell along with the final AUTHORIZE cell of an authorization protocol. That allows the bridge, in the case of successful authorization, to also process the VERSIONS cell and begin the v3 handshake promptly.
Robert Ransom
On Fri, Nov 4, 2011 at 4:10 PM, Robert Ransom rransom.8774@gmail.com wrote:
On 2011-11-04, George Kadianakis desnacked@gmail.com wrote:
Filename: 189-authorize-cell.txt
Title: AUTHORIZE and AUTHORIZED cells
Author: George Kadianakis
Created: 04 Nov 2011
Status: Open
1. Overview
Proposal 187 introduced the concept of the AUTHORIZE cell, a cell whose purpose is to make Tor bridges resistant to scanning attacks.
This is achieved by having the bridge and the client share a secret out-of-band and then use AUTHORIZE cells to validate that the client indeed knows that secret before proceeding with the Tor protocol.
This proposal specifies the format of the AUTHORIZE cell and also introduces the AUTHORIZED cell, a way for bridges to announce to clients that the authorization process is complete and successful.
2. Motivation
AUTHORIZE cells should be able to perform a variety of authorization protocols based on a variety of shared secrets. This forces the AUTHORIZE cell to have a dynamic format based on the authorization method used.
AUTHORIZED cells are used by bridges to signal the end of a successful bridge client authorization and the beginning of the actual link handshake. AUTHORIZED cells have no other use and for this reason their format is very simple.
Both AUTHORIZE and AUTHORIZED cells are to be used under censorship conditions and they should look innocuous to any adversary capable of monitoring network traffic.
I wrote the following in my reply to proposal 190, but it probably belongs here instead:
| An adversary who MITMs the TLS connection and receives a Tor AUTHORIZE
| cell will know that the client is trying to connect to a Tor bridge.
|
| Should the client send a string of the form "GET
| /?q=correct+horse+battery+staple\r\n\r\n" instead of an AUTHORIZE
| cell, where "correct+horse+battery+staple" is a semi-plausible search
| phrase derived from the HMAC in some way?
Seems to me at that point we are hosed anyway. If you see correct+horse+battery+staple and the response is garbled data, not an HTTP response, it's probably something unusual. Bridge descriptors should include enough information for Tor to ensure that the TLS connection is safe. If we are protecting against passive scanning then we just need to make it look like a webserver. One good way of doing that: ask people who have webservers to run bridges, and have Tor simply pass any confused HTTP requests to the actual webserver. (These shouldn't be popular sites.)
Sincerely, Watson Ladd
On 04/11/11 21:37, Watson Ladd wrote:
On Fri, Nov 4, 2011 at 4:10 PM, Robert Ransom rransom.8774@gmail.com wrote:
| Should the client send a string of the form "GET
| /?q=correct+horse+battery+staple\r\n\r\n" instead of an AUTHORIZE
| cell, where "correct+horse+battery+staple" is a semi-plausible search
| phrase derived from the HMAC in some way?
Seems to me at that point we are hosed anyway. If you see correct+horse+battery+staple and the response is garbled data, not an HTTP response, it's probably something unusual. Bridge descriptors should include enough information for Tor to ensure that the TLS connection is safe.
What if the GET request can be anything innocuous (e.g. robots.txt, index.html) and a valid document is sent back. But the headers include an ETag derived in some way from the document content (or just the URL), the shared secret and the bridge's TLS cert. If there's a MITM then the client will compute a different ETag (due to the wrong cert) and can close the connection. Otherwise it can immediately initiate the full authorisation sequence.
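One possible reading of the ETag idea, as a rough sketch only. The message does not say how the ETag is "derived in some way", so the HMAC-SHA256 construction, the truncation to 32 hex digits, and the names derive_etag and etag_matches are all assumptions. The sketch binds the tag to the URL rather than the document content.

```python
import hashlib
import hmac


def derive_etag(shared_secret: bytes, cert_der: bytes, url: bytes) -> str:
    """Derive a quoted ETag value from the secret, the TLS cert and the URL."""
    # Hash each input first so the HMAC message has a fixed, unambiguous layout.
    cert_digest = hashlib.sha256(cert_der).digest()
    url_digest = hashlib.sha256(url).digest()
    tag = hmac.new(shared_secret, cert_digest + url_digest,
                   hashlib.sha256).hexdigest()
    # Truncated so that it looks like an ordinary ETag (an arbitrary choice).
    return '"' + tag[:32] + '"'


def etag_matches(shared_secret: bytes, observed_cert_der: bytes,
                 url: bytes, received_etag: str) -> bool:
    """Client side: recompute the ETag from the certificate actually seen."""
    expected = derive_etag(shared_secret, observed_cert_der, url)
    return hmac.compare_digest(expected.encode(), received_etag.encode())
```

With a MITM in the path, the client sees the attacker's certificate, so its recomputed value differs from the ETag the real bridge sent and etag_matches returns False.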
(NB. I'm not a cryptographer; feel free to tell me where the flaw in my logic lies)
Julian
On Fri, Nov 4, 2011 at 8:01 PM, Julian Yon julian@yon.org.uk wrote:
On 04/11/11 21:37, Watson Ladd wrote:
On Fri, Nov 4, 2011 at 4:10 PM, Robert Ransom rransom.8774@gmail.com wrote:
| Should the client send a string of the form "GET
| /?q=correct+horse+battery+staple\r\n\r\n" instead of an AUTHORIZE
| cell, where "correct+horse+battery+staple" is a semi-plausible search
| phrase derived from the HMAC in some way?
Seems to me at that point we are hosed anyway. If you see correct+horse+battery+staple and the response is garbled data, not an HTTP response, it's probably something unusual. Bridge descriptors should include enough information for Tor to ensure that the TLS connection is safe.
What if the GET request can be anything innocuous (e.g. robots.txt, index.html) and a valid document is sent back. But the headers include an ETag derived in some way from the document content (or just the URL), the shared secret and the bridge's TLS cert. If there's a MITM then the client will compute a different ETag (due to the wrong cert) and can close the connection. Otherwise it can immediately initiate the full authorisation sequence.
(NB. I'm not a cryptographer; feel free to tell me where the flaw in my logic lies)
ETag is a great idea. We can then have bridges run their own web content or specify a page to serve up (with suitably redirected links) or forward real requests to an actual webserver. This way every bridge can hide differently: serving tor.eff.org everywhere would be a dead giveaway.
Sincerely, Watson Ladd
On 11/04/2011 09:19 PM, Watson Ladd wrote:
On Fri, Nov 4, 2011 at 8:01 PM, Julian Yon julian@yon.org.uk wrote:
What if the GET request can be anything innocuous (e.g. robots.txt, index.html) and a valid document is sent back. But the headers include an ETag derived in some way from the document content (or just the URL), the shared secret and the bridge's TLS cert. If there's a MITM then the client will compute a different ETag (due to the wrong cert) and can close the connection. Otherwise it can immediately initiate the full authorisation sequence.
ETag is a great idea. We can then have bridges run their own web content or specify a page to serve up (with suitably redirected links) or forward real requests to an actual webserver. This way every bridge can hide differently: serving tor.eff.org everywhere would be a dead giveaway.
I love this line of thinking. But what if the MitM calls your bluff and returns his own cookie, ETag header and a 302 Redirect to the same page? What would the client do then? If the client did observe the redirect as a browser would, he may be unable to try again for some time. Otherwise, it would tend to confirm the status of the server as a Tor node.
Seems like what we want is something like TLS channel bindings to detect the MitM. This is standardized:
http://tools.ietf.org/html/rfc5056
Microsoft IE+IIS implemented it; it thwarts their MitM tool:
http://blogs.msdn.com/b/fiddler/archive/2010/10/15/fiddler-https-decryption-...
Tossing out an idea here: maybe this would work better backwards.
What if the client were to choose any innocuous-looking URL to request before initiating the handshake? Then he could bury an HMAC for a message including that URL in the TLS client_hello.random. The HMAC key could be derived from the AUTHORIZE secret.
The client_random is supposed to contain a (generous) 28 random bytes transmitted in the clear. AFAICT, it's mainly to prevent replay attacks and is most important in special cases like session resumption and certs with fixed DH parameters. The encrypted premaster secret adds significant client-supplied entropy to the handshake too. Replacing some of the true random bytes with an HMAC formed from a secret key should not reduce the entropy (as perceived by the active attacker) too much I think.
The message input to the HMAC should probably include the rest of the client_random.
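To make the idea concrete, here is a rough sketch of how the 28 bytes might be filled. Nothing here is specified in this thread: the 12/16 split between fresh random bytes and tag, the key-derivation label, HMAC-SHA256, and both function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

RANDOM_LEN = 28  # the random bytes of client_hello.random discussed above
TAG_LEN = 16     # assumed split: 12 fresh random bytes + 16-byte tag


def _tag_key(auth_secret: bytes) -> bytes:
    # Hypothetical derivation of the HMAC key from the AUTHORIZE secret.
    return hmac.new(auth_secret, b"client_random tag key",
                    hashlib.sha256).digest()


def make_client_random(auth_secret: bytes, url: bytes) -> bytes:
    """Build 28 bytes: fresh randomness followed by a truncated HMAC."""
    nonce = os.urandom(RANDOM_LEN - TAG_LEN)
    # The HMAC input covers the rest of the client_random and the URL.
    tag = hmac.new(_tag_key(auth_secret), nonce + url,
                   hashlib.sha256).digest()[:TAG_LEN]
    return nonce + tag


def check_client_random(auth_secret: bytes, url: bytes,
                        random_bytes: bytes) -> bool:
    """Bridge side: verify the tag once the requested URL is known."""
    if len(random_bytes) != RANDOM_LEN:
        return False
    nonce, tag = random_bytes[:-TAG_LEN], random_bytes[-TAG_LEN:]
    expected = hmac.new(_tag_key(auth_secret), nonce + url,
                        hashlib.sha256).digest()[:TAG_LEN]
    return hmac.compare_digest(expected, tag)
```

Whether a 16-byte tag next to 12 random bytes is an acceptable trade-off is exactly the kind of question that needs the double-checking asked for here; the sketch takes no position on it.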
The client could also include some unpredictable stuff in the request (e.g., some meaningless cookies). This could prevent any net unpredictability loss in the client_random, so even if an attacker knew the AUTHORIZE secret it would not enable any additional attacks on the TLS handshake.
I would like other people to double-check my reasoning on this obviously.
- Marsh
On Fri, Nov 4, 2011 at 11:35 PM, Marsh Ray marsh@extendedsubset.com wrote:
On 11/04/2011 09:19 PM, Watson Ladd wrote:
On Fri, Nov 4, 2011 at 8:01 PM, Julian Yon julian@yon.org.uk wrote:
What if the GET request can be anything innocuous (e.g. robots.txt, index.html) and a valid document is sent back. But the headers include an ETag derived in some way from the document content (or just the URL), the shared secret and the bridge's TLS cert. If there's a MITM then the client will compute a different ETag (due to the wrong cert) and can close the connection. Otherwise it can immediately initiate the full authorisation sequence.
ETag is a great idea. We can then have bridges run their own web content or specify a page to serve up (with suitably redirected links) or forward real requests to an actual webserver. This way every bridge can hide differently: serving tor.eff.org everywhere would be a dead giveaway.
I love this line of thinking. But what if the MitM calls your bluff and returns his own cookie, ETag header and a 302 Redirect to the same page? What would the client do then? If the client did observe the redirect as a browser would, he may be unable to try again for some time. Otherwise, it would tend to confirm the status of the server as a Tor node.
Seems like what we want is like TLS channel bindings to detect the MitM. This is standardized. http://tools.ietf.org/html/rfc5056 Microsoft IE+IIS implemented it, it thwarts their MitM tool:
http://blogs.msdn.com/b/fiddler/archive/2010/10/15/fiddler-https-decryption-...
Tossing out an idea here: maybe this would work better backwards.
What if the client were to choose any innocuous-looking URL to request before initiating the handshake? Then he could bury an HMAC for a message including that URL in the TLS client_hello.random. The HMAC key could be derived from the AUTHORIZE secret.
I don't know enough about TLS to comment on this. But I do know Telex used a covert channel in TLS to good effect. Maybe we can do some kind of similar stunt.
The client_random is supposed to contain a (generous) 28 random bytes transmitted in the clear. AFAICT, it's mainly to prevent replay attacks and is most important in special cases like session resumption and certs with fixed DH parameters. The encrypted premaster secret adds significant client-supplied entropy to the handshake too. Replacing some of the true random bytes with an HMAC formed from a secret key should not reduce the entropy (as perceived by the active attacker) too much I think.
The message input to the HMAC should probably include the rest of the client_random.
The client could also include some unpredictable stuff in the request (e.g., some meaningless cookies). This could prevent any net unpredictability loss in the client_random, so even if an attacker knew the AUTHORIZE secret it would not enable any additional attacks on the TLS handshake.
I would like other people to double-check my reasoning on this obviously.
- Marsh
On 05/11/11 04:35, Marsh Ray wrote:
I love this line of thinking. But what if the MitM calls your bluff and returns his own cookie, ETag header and a 302 Redirect to the same page? What would the client do then? If the client did observe the redirect as a browser would, he may be unable to try again for some time. Otherwise, it would tend to confirm the status of the server as a Tor node.
The problem is more general. What should the client do under any circumstance when it's unable to authenticate the bridge? Assuming a degree of justified paranoia, you probably want to leave it as long as possible before retrying. You may not even want to risk connecting to a *different* bridge, as it could be your own connection being intercepted and then you're just giving your adversary more information.
Seems like what we want is like TLS channel bindings to detect the MitM. This is standardized. http://tools.ietf.org/html/rfc5056 Microsoft IE+IIS implemented it, it thwarts their MitM tool:
http://blogs.msdn.com/b/fiddler/archive/2010/10/15/fiddler-https-decryption-...
I don't know enough about this. I'll have to read the documents before I can comment.
J
Julian Yon julian@yon.org.uk writes:
On 04/11/11 21:37, Watson Ladd wrote:
On Fri, Nov 4, 2011 at 4:10 PM, Robert Ransom rransom.8774@gmail.com wrote:
| Should the client send a string of the form "GET
| /?q=correct+horse+battery+staple\r\n\r\n" instead of an AUTHORIZE
| cell, where "correct+horse+battery+staple" is a semi-plausible search
| phrase derived from the HMAC in some way?
Seems to me at that point we are hosed anyway. If you see correct+horse+battery+staple and the response is garbled data, not an HTTP response, it's probably something unusual. Bridge descriptors should include enough information for Tor to ensure that the TLS connection is safe.
What if the GET request can be anything innocuous (e.g. robots.txt, index.html) and a valid document is sent back. But the headers include an ETag derived in some way from the document content (or just the URL), the shared secret and the bridge's TLS cert. If there's a MITM then the client will compute a different ETag (due to the wrong cert) and can close the connection. Otherwise it can immediately initiate the full authorisation sequence.
(NB. I'm not a cryptographer; feel free to tell me where the flaw in my logic lies)
Julian
There are some things in these HTTP solutions that make me nervous.
In the "GET /?q=correct+horse+battery+staple\r\n\r\n" client-side case we will have to build HTTP header spoofing into the tor client, which is not fun since modern browsers send loads of HTTP headers in their first GET.
In the Etags/Cookies/whatever server-side case we will probably have to build some sort of 'valid document'/'innocuous webpage' generator into the Tor bridge, which is also not fun. Fortunately, we might be able to reuse such a construction when Bridge Passwords fail: https://lists.torproject.org/pipermail/tor-dev/2011-October/002996.html
Still, I would very much enjoy if we could find a way to authenticate the bridge using the shared secret without relying on HTTP covert channel wizardry.
I've been thinking of having bridges that use Bridge Passwords, "tag" their SSL certificate, say the Serial Number, with their password, and have clients validate them before proceeding with AUTHORIZE cells.
On 05/11/11 03:21, George Kadianakis wrote:
There are some things in these HTTP solutions that make me nervous.
In the "GET /?q=correct+horse+battery+staple\r\n\r\n" client-side case we will have to build HTTP header spoofing into the tor client, which is not fun since modern browsers send loads of HTTP headers in their first GET.
A valid concern. Also applies to the ETag proposal.
In the Etags/Cookies/whatever server-side case we will probably have to build some sort of 'valid document'/'innocuous webpage' generator into the Tor bridge, which is also not fun. Fortunately, we might be able to reuse such a construction when Bridge Passwords fail: https://lists.torproject.org/pipermail/tor-dev/2011-October/002996.html
My thought was that it's not too hard to proxy to a real webserver for content and inject/modify headers as required.
Still, I would very much enjoy if we could find a way to authenticate the bridge using the shared secret without relying on HTTP covert channel wizardry.
We're really talking about steganography here rather than a true covert channel. I believe the purpose is to avoid bridge enumeration due to the initial connection having a fingerprint? So you need an invisible method of authentication. It may be that distributing more information to bridge users out-of-band is actually the best approach. But to me the advantage of a technical solution is increased resistance to social engineering.
I've been thinking of having bridges that use Bridge Passwords, "tag" their SSL certificate, say the Serial Number, with their password, and have clients validate them before proceeding with AUTHORIZE cells.
That's certainly subtle. You're left with the problem of what the client should do if it can't authenticate the bridge. It still needs to send something down the pipe that it opened, and the server still needs to respond to that, otherwise the unused connection will look suspicious.
J
Julian Yon julian@yon.org.uk writes:
On 05/11/11 03:21, George Kadianakis wrote:
There are some things in these HTTP solutions that make me nervous.
In the "GET /?q=correct+horse+battery+staple\r\n\r\n" client-side case we will have to build HTTP header spoofing into the tor client, which is not fun since modern browsers send loads of HTTP headers in their first GET.
A valid concern. Also applies to the ETag proposal.
In the Etags/Cookies/whatever server-side case we will probably have to build some sort of 'valid document'/'innocuous webpage' generator into the Tor bridge, which is also not fun. Fortunately, we might be able to reuse such a construction when Bridge Passwords fail: https://lists.torproject.org/pipermail/tor-dev/2011-October/002996.html
My thought was that it's not too hard to proxy to a real webserver for content and inject/modify headers as required.
Still, I would very much enjoy if we could find a way to authenticate the bridge using the shared secret without relying on HTTP covert channel wizardry.
We're really talking about steganography here rather than a true covert channel. I believe the purpose is to avoid bridge enumeration due to the initial connection having a fingerprint? So you need an invisible method of authentication. It may be that distributing more information to bridge users out-of-band is actually the best approach. But to me the advantage of a technical solution is increased resistance to social engineering.
I've been thinking of having bridges that use Bridge Passwords, "tag" their SSL certificate, say the Serial Number, with their password, and have clients validate them before proceeding with AUTHORIZE cells.
That's certainly subtle. You're left with the problem of what the client should do if it can't authenticate the bridge. It still needs to send something down the pipe that it opened, and the server still needs to respond to that, otherwise the unused connection will look suspicious.
J
Thanks for the ideas and the interest guys!
I think it's time to reroute this thread towards comments on proposal 189 and scanning resistance, that is, resistance against adversaries who scan hosts to find bridges.
We will surely need a different thread and a different proposal for resistance against censors using MITM attacks to detect bridges.
I improved the original proposal based on the comments of Robert. Inlining:
Filename: 189-authorize-cell.txt
Title: AUTHORIZE and AUTHORIZED cells
Author: George Kadianakis
Created: 04 Nov 2011
Status: Open
1. Overview
Proposal 187 introduced the concept of the AUTHORIZE cell, a cell whose purpose is to make Tor bridges resistant to scanning attacks.
This is achieved by having the bridge and the client share a secret out-of-band and then use AUTHORIZE cells to validate that the client indeed knows that secret before proceeding with the Tor protocol.
This proposal specifies the format of the AUTHORIZE cell and also introduces the AUTHORIZED cell, a way for bridges to announce to clients that the authorization process is complete and successful.
2. Motivation
AUTHORIZE cells should be able to perform a variety of authorization protocols based on a variety of shared secrets. This forces the AUTHORIZE cell to have a dynamic format based on the authorization method used.
AUTHORIZED cells are used by bridges to signal the end of a successful bridge client authorization and the beginning of the actual link handshake. AUTHORIZED cells have no other use and for this reason their format is very simple.
Both AUTHORIZE and AUTHORIZED cells are to be used under censorship conditions and they should look innocuous to any adversary capable of monitoring network traffic.
As an attack example, an adversary could passively monitor the traffic of a bridge host, looking at the packets directly after the TLS handshake and trying to deduce from their packet size if they are AUTHORIZE and AUTHORIZED cells. For this reason, AUTHORIZE and AUTHORIZED cells are padded with a random amount of padding before sending.
3. Design
3.1. AUTHORIZE cell
The AUTHORIZE cell is a variable-sized cell.
The generic AUTHORIZE cell format is:
   AuthMethod   [1 octet]
   MethodFields [...]
   PadLen       [2 octets]
   Padding      ['PadLen' octets]
where:
'AuthMethod' is the authorization method to be used.
'MethodFields' depends on the authorization method used. It is a meta-field hosting an arbitrary number of fields.
'PadLen' specifies the amount of padding in octets. Implementations SHOULD pick 'PadLen' to be a random integer from 1 to 3141 inclusive.
'Padding' is 'PadLen' octets of random content.
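A minimal sketch of building an AUTHORIZE payload with this layout. The uniform choice of 'PadLen', its big-endian encoding, and the function name build_authorize_payload are assumptions; the text above only says "a random integer from 1 to 3141 inclusive" and leaves the 'MethodFields' contents to the method.

```python
import os
import secrets
import struct

MIN_PAD = 1
MAX_PAD = 3141


def build_authorize_payload(auth_method: int, method_fields: bytes) -> bytes:
    """Return AuthMethod | MethodFields | PadLen | Padding."""
    # secrets.randbelow(n) is uniform over [0, n), so this covers
    # MIN_PAD..MAX_PAD inclusive. Uniformity is an assumption.
    pad_len = MIN_PAD + secrets.randbelow(MAX_PAD - MIN_PAD + 1)
    return (
        struct.pack("!B", auth_method)   # AuthMethod   [1 octet]
        + method_fields                  # MethodFields [...]
        + struct.pack("!H", pad_len)     # PadLen       [2 octets]
        + os.urandom(pad_len)            # Padding      ['PadLen' octets]
    )
```

For example, build_authorize_payload(0x01, b"\x00" * 32) yields between 36 and 3176 octets, where method 0x01 and the 32-octet field are made-up values.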
3.2. AUTHORIZED cell format
The AUTHORIZED cell is a variable-sized cell.
The AUTHORIZED cell format is:
   'AuthMethod' [1 octet]
   'PadLen'     [2 octets]
   'Padding'    ['PadLen' octets]
where all fields have the same meaning as in section 3.1.
3.3. Cell parsing
Implementations MUST ignore the contents of 'Padding'.
Implementations MUST reject an AUTHORIZE or AUTHORIZED cell where the 'Padding' field is not 'PadLen' octets long.
Implementations MUST reject an AUTHORIZE cell with an 'AuthMethod' they don't recognize.
4. Discussion
4.1. What's up with the [1,3141] padding bytes range?
The upper limit is larger than the Ethernet MTU so that AUTHORIZE and AUTHORIZED cells are not always transmitted in a single packet. Other than that, it's indeed pretty much arbitrary.
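As a back-of-the-envelope check of that claim, the sketch below counts how many 'PadLen' values push an AUTHORIZED payload past 1500 octets, the usual Ethernet MTU. It assumes 'PadLen' is drawn uniformly and ignores the cell header and the TLS, TCP and IP overhead, all of which would only raise the share.

```python
ETHERNET_MTU = 1500  # octets; the common Ethernet payload limit
FIXED_OCTETS = 3     # AuthMethod [1 octet] + PadLen [2 octets]
MIN_PAD, MAX_PAD = 1, 3141


def fraction_exceeding_mtu() -> float:
    """Share of PadLen values whose AUTHORIZED payload exceeds the MTU."""
    over = sum(
        1
        for pad_len in range(MIN_PAD, MAX_PAD + 1)
        if FIXED_OCTETS + pad_len > ETHERNET_MTU
    )
    return over / (MAX_PAD - MIN_PAD + 1)
```

Under these assumptions, a little over half of the possible 'PadLen' values (1644 of 3141) cannot fit in one packet.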
4.2. Why not let the pluggable transports do the padding, like they are supposed to do for the rest of the Tor protocol?
The arguments of section "Alternative design: Just use pluggable transports" of proposal 187 apply here as well:
All bridges that use client authorization will also need padded AUTHORIZE and AUTHORIZED cells.
4.3. How should multiple-round-trip authorization protocols be handled?
Protocols that require multiple round trips between the client and the bridge should use AUTHORIZE cells for communication.
The format of the AUTHORIZE cell is flexible enough to support messages from the client to the bridge and the reverse.
At the end of a successful multiple-round-trip protocol, an AUTHORIZED cell must be issued from the bridge to the client.
4.4. AUTHORIZED seems useless. Why not use VPADDING instead?
As noted in proposal 187, the Tor protocol uses VPADDING cells for padding; any other use of VPADDING makes the Tor protocol kludgy.
In the future, and in the example case of a v3 handshake, a client can optimistically send a VERSIONS cell along with the final AUTHORIZE cell of an authorization protocol. That allows the bridge, in the case of successful authorization, to also process the VERSIONS cell and begin the v3 handshake promptly.
4.5. What should actually happen when a bridge rejects an AUTHORIZE cell?
When a bridge detects a badly formed or malicious AUTHORIZE cell, it should assume that the other side is an adversary scanning for bridges. The bridge should then act accordingly to avoid detection.
This proposal does not try to specify how a bridge can avoid detection by an adversary.
On Sun, Nov 06, 2011 at 01:45:43AM +0100, George Kadianakis wrote:
3.1. AUTHORIZE cell
The AUTHORIZE cell is a variable-sized cell.
The generic AUTHORIZE cell format is:
   AuthMethod   [1 octet]
   MethodFields [...]
   PadLen       [2 octets]
   Padding      ['PadLen' octets]
Why include PadLen and Padding? A variable-sized cell already says how big it is. So the client can pick a size for the variable-sized cell (the client has to anyway), and then any unused space is unused.
The AUTHORIZED cell format is:
   'AuthMethod' [1 octet]
   'PadLen'     [2 octets]
   'Padding'    ['PadLen' octets]
Same here.
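A sketch of what this alternative could look like, under several assumptions that are not in the proposal: a variable-length cell header of CircID [2 octets], Command [1 octet], Length [2 octets] in network byte order, a made-up command number, and a hypothetical registry giving each method's 'MethodFields' length. Without 'PadLen', the receiver reads the fields it expects and treats whatever the cell's own Length leaves over as unused.

```python
import struct

# Hypothetical values; neither is assigned by the proposal.
HYPOTHETICAL_AUTHORIZE_COMMAND = 200
METHOD_FIELDS_LEN = {0x01: 32}  # AuthMethod -> MethodFields length


def frame_variable_cell(circ_id: int, command: int, payload: bytes) -> bytes:
    """Prefix payload with CircID [2], Command [1], Length [2] (assumed)."""
    return struct.pack("!HBH", circ_id, command, len(payload)) + payload


def split_authorize_without_padlen(cell: bytes):
    """Return (auth_method, method_fields, unused_octets) for one cell."""
    if len(cell) < 5:
        raise ValueError("cell shorter than its header")
    _circ_id, _command, length = struct.unpack("!HBH", cell[:5])
    payload = cell[5:5 + length]
    if len(payload) != length:
        raise ValueError("truncated cell")
    if not payload or payload[0] not in METHOD_FIELDS_LEN:
        raise ValueError("missing or unrecognized AuthMethod")
    auth_method = payload[0]
    fields_len = METHOD_FIELDS_LEN[auth_method]
    if len(payload) < 1 + fields_len:
        raise ValueError("payload too short for MethodFields")
    method_fields = payload[1:1 + fields_len]
    # Everything after MethodFields is unused space and is ignored.
    unused = len(payload) - 1 - fields_len
    return auth_method, method_fields, unused
```

The sender still picks the total cell size at random; the only difference is that the amount of padding is implied by the Length field instead of being carried in a separate 'PadLen'.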
--Roger