Filename: 202-improved-relay-crypto.txt Title: Two improved relay encryption protocols for Tor cells Author: Nick Mathewson Created: 19 Jun 2012 Status: Open
Overview:
This document describes two candidate designs for a better Tor relay encryption/decryption protocol, designed to stymie tagging attacks and better accord with best practices in protocol design.
My hope is that readers will examine these protocols, evaluate their security and practicality, improve on them, and help to pick one for implementation in Tor.
In section 1, I describe Tor's current relay crypto protocol, its drawbacks, how it fits in with the rest of Tor, and some requirements/desiderata for a replacement. In sections 2 and 3, I propose two alternative replacements for this protocol. In section 4, I discuss their pros and cons.
1. Background and Motivation
1.0. A short overview of Tor's current protocols
The core pieces of the Tor protocol are the link protocol, the circuit extend protocol, the relay protocol, and the stream protocol. All are documented in [TorSpec].
Briefly:
- The link protocol handles all direct pairwise communication between nodes. Everything else is transmitted over it. It uses TLS.
- The circuit extend protocol uses public-key crypto to set up multi-node virtual tunnels, called "circuits", from a client through one or more nodes.
*** The relay protocol uses established circuits to communicate from a client to a node on a circuit. That's the one we'll be talking about here. ***
- The stream protocol is tunneled over relay protocol; clients use it to tell servers to open anonymous TCP connections, to send data, and so forth. Servers use it to report success or failure opening anonymous TCP connections, to send data from TCP connections back to clients, and so forth.
In more detail: The link protocol's job is to connect two computers with an encrypted, authenticated stream, to authenticate one or both of them to the other, and to provide a channel for passing "cells" between them. The circuit extend protocol's job is to set up circuits: persistent tunnels that connect a Tor client to an exit node through a series of (typically three) hops, each of which knows only the previous and next hop, and each of which has a set of keys that they share only with the client. Finally, the relay protocol's job is to allow a client to communicate bidirectionally with the node(s) on the circuit, once their shared keys have been established.
(We'll ignore the stream protocol for the rest of this document.)
Note on terminology: Tor nodes are sometimes called "servers", "relays", or "routers". I'll use all these terms more or less interchangeably. For simplicity's sake, I will call the party who is constructing and using a circuit "the client" or "Alice", even though nodes sometimes construct circuits too.
Tor's internal packets are called "cells". All the cells we deal with here are 512 bytes long.
The nodes in a circuit are called its "hops"; most circuits are 3 hops long. This doesn't include the client: when Alice builds a circuit through nodes Bob_1, Bob_2, and Bob_3, the first hop is "Bob_1" and the last hop is "Bob_3".
1.1. The current relay protocol and its drawbacks
[This section describes Tor's current relay protocol. It is not a proposal; rather it is what we do now. Sections 2 and 3 have my proposed replacements for it.]
A "relay cell" is a cell that is generated by the client to send to a node, or by a node to send to the client. It's called a "relay" cell because a node that receives one may need to relay it to the next or previous node in the circuit (or to handle the cell itself).
A relay cell moving towards the client is called "inbound"; a cell moving away is called "outbound".
When a relay cell is constructed by the client, the client adds one layer of crypto for each node that will process the cell, and gives the cell to the first node in the circuit. Each node in turn then removes one layer of crypto, and either forwards the cell to the next node in the circuit or acts on that cell itself.
When a relay cell is constructed by a node, it encrypts it and sends it to the preceding node in the circuit. Each node between the originating node and the client then encrypts the cell and passes it back to the preceding node. When the client receives the cell, it removes layers of crypto until it has an unencrypted cell, which it then acts on.
In the current protocol, the body of each relay cell contains, in its unencrypted form:
Relay command [1 byte] Zeros [2 bytes] StreamID [2 bytes] Digest [4 bytes] Length [2 bytes] Data [498 bytes]
(This only adds up to 509 bytes. That's because the Tor link protocol transfers 512-byte cells, and has a 3 byte overhead per cell. Not how we would do it if we were starting over at this point.)
At every node of a circuit, the node relaying a cell encrypts/decrypts it with AES128-CTR. If the cell is outbound and the "zeros" field is set to all-zeros, the node additionally checks whether 'digest' is correct. A correct digest is the first 4 bytes of the running SHA1 digest of: a shared secret, concatenated with all the relay cells destined for this node on this circuit so far, including this cell. If _that's_ true, then the node accepts this cell. (See section 6 of [TorSpec] for full detail; see section A.1 for a more rigorous description.)
The current approach has some actual problems. Notably:
* It permits tagging attacks. Because AES_CTR is an XOR-based stream cipher, an adversary who controls the first node in a circuit can XOR anything he likes into the relay cell, and then see whether he/she encounters an correspondingly defaced cell at some exit that he also controls.
That is, the attacker picks some pattern P, and when he would transmit some outbound relay cell C at hop 1, he instead transmits C xor P. If an honest exit receives the cell, the digest will probably be wrong, and the honest exit will reject it. But if the attacker controls the exit, he will notice that he has received a cell C' where the digest doesn't match, but where C' xor P _does_ have a good digest. The attacker will then know that his two nodes are on the same circuit, and thereby be able to link the user (whom the first node sees) to her activities (which the last node sees).
Back when we did the Tor design, this didn't seem like a big deal, since an adversary who controls both the first and last node in a circuit is presumed to win already based on traffic correlation attacks. This attack seemed strictly worse than that, since it was trivially detectable in the case where the attacker _didn't_ control both ends. See section 4.4 of the Tor paper [TorDesign] for our early thoughts here; see Xinwen Fu et al's 2009 work for a more full explanation of the in-circuit tagging attack [XF]; and see "The 23 Raccoons'" March 2012 "Analysis of the Relative Severity of Tagging Attacks" mail on tor-dev (and the ensuing thread) for a discussion of why we may want to care after all, due to attacks that use tagging to amplify route capture. [23R]
It also has some less practical issues.
* The digest portion is too short. Yes, if you're an attacker trying to (say) change an "ls *" into an "rm *", you can only expect to get one altered cell out of 2^32 accepted -- and all future cells on the circuit will be rejected with similar probability due to the change in the running hash -- but 1/2^32 is a pretty high success rate for crypto attacks.
* It does MAC-then-encrypt. That makes smart people cringe.
* Its approach to MAC is H(Key | Msg), which is vulnerable to length extension attack if you've got a Merkle-Damgard hash (which we do). This isn't relevant in practice right now, since the only parties who see the digest are the two parties that rely on it (because of the MAC-then-encrypt).
1.2. Some feature requirements
Relay cells can originate at the client or at a relay. Relay cells that originate at the client are given to the first node in the circuit, and constructed so that they will be decrypted and forwarded by the first M-1 nodes in the circuit, and finally decrypted and processed by the Mth node, where the client chooses M. (Usually, the Mth node is the the last one, which will be an exit node.) Relay cells that originate at a hop in the circuit are given to the preceding node, and eventually delivered to the client.
Tor provides a so called "leaky pipe" circuit topology [TorDesign]: a client can send a cell to any node in the circuit, not just the last node. I'll try to keep that property, although historically we haven't really made use of it.
In order to implement telescoping circuit construction (where the circuit is built by iteratively asking the last node in the circuit to extend the circuit one hop more), it's vital that the last hop of the circuit be able to change.
Proposal 188 [Prop188] suggests that we implement a "bridge guards" feature: making some (non-public) nodes insert an extra hop into the circuit after themselves, in a way that will make it harder for other nodes in the network to enumerate them. We therefore want our circuits to be one-hop re-extensible: when the client extends a circuit from Bob1 to Bob2, we want Bob1 to be able to introduce a new node "Bob1.5" into the circuit such that the circuit runs Alice->Bob1->Bob1.5->Bob2. (This feature has been called "loose source routing".)
Any new approach should be able to coexist on a circuit with the old approach. That is, if Alice wants to build a circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a revised relay protocol, then Alice should be able to build a circuit such that she can have Bob1 and Bob3 process each cell with the current protocol, and Bob2 process it with a revised protocol. (Why? Because if all nodes in a circuit needed to use the same relay protocol, then each node could learn information about the other nodes in the circuit from which relay protocol was chosen. For example, if Bob1 supports the new protocol, and sees that the old relay protocol is in use, and knows that Bob2 supports the new one, then Bob1 has learned that Bob3 is some node that does not support the new relay protocol.)
Cell length needs to be constant as cells move through the network. For historical reasons mentioned above in section 1.1, the length of the encrypted part of a relay cell needs to be 509 bytes.
1.3. Some security requirements
Two adjacent nodes on a circuit can trivially tell that they are on the same circuit, and the first node can trivially tell who the client is. Other than that, we'd like any attacker who controls a node on the circuit not to have a good way to learn any other nodes, even if he/she controls those nodes. [*]
Relay cells should not be malleable: no relay should be able to alter a cell between an honest sender and an honest recipient in a way that they cannot detect.
Relay cells should be secret: nobody but the sender and recipient of a relay cell should be able to learn what it says.
Circuits should resist transparent, recoverable tagging attacks: if an attacker controls one node in a circuit and alters a relay cell there, no non-adjacent point in the circuit should be able to recover the relay cell as it would have received it had the attacker not altered it.
The above properties should apply to sequences of cells too: an attacker shouldn't be able to change what sequence of cells arrives at a destination (for example, by removing, replaying, or reordering one or more cells) without being detected.
(Feel free to substitute whatever formalization of the above requirements makes you happiest, and add whatever caveats are necessary to make you comfortable. I probably missed at least two critical properties.)
[*] Of course, an attacker who controls two points on a circuit can probably confirm this through traffic correlation. But we'd prefer that the cryptography not create other, easier ways for them to do this.
1.4. A note on algorithms
This document is deliberately agnostic concerning the choice of cryptographic primitives -- not because I have no opinions about good ciphers, MACs, and modes of operation -- but because experience has taught me that mentioning any particular cryptographic primitive will prevent discussion of anything else.
Please DO NOT suggest algorithms to use in implementing these protocols yet. It will distract! There will be time later!
If somebody _else_ suggests algorithms to use, for goodness' sake DON'T ARGUE WITH THEM! There will be time later!
2. Design 1: Large-block encryption
In this approach, we use a tweakable large-block cipher for encryption and decryption, and a tweak-chaining function TC.
2.1. Chained large-block what now?
We assume the existence of a primitive that provides the desired properties of a tweakable[Tweak] block cipher, taking blocks of any desired size. (In our case, the block size is 509 bytes[*].)
It also takes a Key, and a per-block "tweak" parameter that plays the same role that an IV plays in CBC, or that the counter plays in counter mode.
The Tweak-chaining function TC takes as input a previous tweak, a tweak chaining key, and a cell; it outputs a new tweak. Its purpose is to make future cells undecryptable unless you have received all previous cells. It could probably be something like a MAC of the old tweak and the cell using the tweak chaining key as the MAC key.
(If the initial tweak is secret, I am not sure that TC needs to be keyed.)
[*] Some large-block cipher constructions use blocks whose size is the multiple of some underlying cipher's block size. If we wind up having to use one of those, we can use 496-byte blocks instead at the cost of 2.5% wasted space.
2.2. The protocol
2.2.1. Setup phase
The circuit construction algorithm needs to produce forward and backward keys Kf and Kb, the forward and backward tweak chaining keys TCKf and TCKb, as well as initial tweak values Tf and Tb.
2.2.2. The cell format
We replace the "Digest" and "Zeros" fields of the cell with a single Z-byte "Zeros" field to determine when the cell is recognized and correctly decrypted; its length is a security parameter.
2.2.3. The decryption operations
For a relay to handle an inbound RELAY cell, it sets Tb_next to TC(TCKb, Tb, Cell). Then it encrypts the cell using the large block cipher keyed with Kb and tweaked with Tb. Then it sets Tb to Tb_next.
For a relay to handle an outbound RELAY cell, it sets Tf_next to TC(TCKf, Tf, Cell). Then it decrypts the cell using the large block cipher keyed with Kf and tweaked with Tf. Then it sets Tf to Tf_next. Then it checks the 'Zeros' field on the cell; if that field is all [00] bytes, the cell is for us.
2.3. Security discussion
This approach is fairly simple (at least, no more complex than its primitives) and achieves some of our security goals. Because of the large block cipher approach, any change to a cell will render that cell undecryptable, and indistinguishable from random junk. Because of the tweak chaining approach, if even one cell is missed or corrupted or reordered, future cells will also decrypt into random junk.
The tagging attack in this case is turned into a circuit-junking attack: an adversary who tries to mount it can probably confirm that he was indeed first and last node on a target circuit (assuming that circuits which turn to junk in this way are rare), but cannot recover the circuit after that point.
As a neat side observation, note that this approach improves upon Tor's current forward secrecy, by providing forward secrecy while circuits are still operational, since each change to the tweak should make previous cells undecryptable if the old tweak value isn't recoverable.
The length of Zeros is a parameter for what fraction of "random junk" cells will potentially be accepted by a router or client. If Zeros is Z bytes long, then junk cells will be accepted with P < 2^-(8*Z + 7). (The '+7' is there because the top 7 bits of the Length field must also be 0.)
There's no trouble using this protocol in a mixed circuit, where some nodes speak the old protocol and some speak the large-block-encryption protocol.
3. Design 2: short-MAC-and-pad
In this design, we behave more similarly to a mix-net design (such as Mixmaster or Mixminion's headers). Each node checks a MAC, and then re-pads the cell to its chosen length before decoding the cell.
This design uses as a primitive a MAC and a stream cipher. It might also be possible to use an authenticating cipher mode, if we can find one that works like a stream cipher and allows us to efficiently output authenticators for the stream so far.
NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be replaced with an authenticated encryption mode without too much loss of generality.
3.1. The protocol
3.1.1 Setup phase
The circuit construction algorithm needs to produce forward and backward keys Kf and Kb, forward and backward stream cipher IVs IVf and IVb, and forward and backward MAC keys Mf and Mb.
Additionally, the circuit construction algorithm needs a way for the client to securely (and secretly? XXX) tell each hop in the circuit a value 'bf' for the number of bytes of MAC it should expect on outbound cells and 'bb' for the number of bytes it should use on cells it's generating. Each node can get a different 'bf' and 'bb'. These values can be 0: if a node's bf is 0, it doesn't authenticate cells; if its bb is 0, it doesn't originate them.
The circuit construction algorithm also needs a way to tell each the client to securely (and secretly? XXX) tell each hop in the circuit whether it is allowed to be the final destination for relay cells.
Set the stream Sf and the stream Sb to empty values.
3.1.2. The cell format
The Zeros and Digest field of the cell format are removed.
3.1.3. The relay operations
Upon receiving an outbound cell, a node removes the first b bytes of the cell, and puts them aside as 'M'. The node then computes between 0 and 2 MACs of the stream consisting of all previously MAC'd data, plus the remainder of the cell:
If b>0 and there is a next hop, the node computes M_relay.
If this node was told to deliver traffic, or it is the last node in the circuit so far, the node computes M_receive.
M_relay is computed as MAC(stream | "relay"); M_receive is computed as MAC(stream | "receive").
If M = M_receive, this cell is for the node; it should process it.
If M = M_relay, or if b == 0, this cell should be relayed.
If a MAC was computed and neither of the above cases was met, then the cell is bogus; the node should discard it and destroy the circuit.
The node then removes the first bf bytes of the cell, and pads the cell at the end with bf zero bytes. Finally, the node decrypts the whole remaining padded cell with the stream cipher.
To handle an inbound cell, the node simply does a stream cipher with no checking.
3.1.4. Generating inbound cells.
To generate an inbound cell, a node makes a 509-bb byte RELAY cell C, encrypts that cell with Kb, appends the encrypted cell to Sb, and prepends M_receive(Sb) to the cell.
3.1.5. Generating outbound cells
Generating an outbound cell is harder, since we need to know what padding the earlier nodes will generate in order to know what padding the later nodes will receive and compute their MACs, but we also need to know what MACs we'll send to the later nodes in order to compute which MACs we'll send to the earlier ones.
Mixnet clients have needed to do this for ages, though, so the algorithms are pretty well settled. I'll give one below in A.3.
3.2. Security discussion
This approach is also simple and (given good parameter choices) can achieve our goals. The trickiest part is the algorithm that clients must follow to package cells, but that's no so bad.
It's not necessary to check MACs on inbound traffic, because nobody but the client can tell scrambled messages from good ones, and the client can be trusted to keep the client's own secrets.
With this protocol, if the attacker tries to do a tagging attack, the circuit should get destroyed by the next node in the circuit that has a nonzero "bf" value, with probability == 1-2^-(8*bf). (If there are further intervening honest nodes, they also have a chance to detect the attack.)
Similarly, any attempt to replay, or reorder outbound cells should fail similarly.
The "bf" values could reveal to each node its position in the circuit and the client preferences, depending on how we set them; on the other hand, having a fixed bf value would reveal to the last node the length of the circuit. Neither option seems great.
This protocol doesn't provide any additional forward secrecy beyond what Tor has today. We could fix that by changing our use of the stream cipher so that instead of running in counter mode between cells, we use a tweaked stream cipher and change the tweak with each cell (possibly based on the unused portion of the MAC).
This protocol does support loose source routing, so long as the no padding bytes are added by any router-added nodes.
In a circuit, every node has at least one relay cell sent to it: even non-exit nodes get a RELAY_EXTEND cell.
4. Discussion
I'm not currently seeing a reason to strongly prefer one of these approaches over another.
In favor of large-block encryption: - The space overhead seems smaller: we need to use up fewer bytes in order to get equivalent looking security.
(For example, if we want to have P < 2^64 that a cell altered by hop 1 could be accepted by hop 2 or hop 3, *and* we want P < 2^64 that a cell altered by hop 2 could be accepted by hop 3, the large-block approach needs about 8 bytes for the Zeros field, whereas the short-MAC-and-pad approach needs 16 bytes worth of MACs.)
- We get forward secrecy pretty easily.
- The format doesn't leak anything about the length of the circuit, or limit it.
- We don't need complicated logic to set the 'b' parameters.
- It doesn't need tricky padding code.
In the favor of short-MAC-and-pad: - All of the primitives used are much better analyzed and understood. There's nothing esoteric there. The format itself is similar to older, well-analyzed formats.
- Most of the constructions for the large-block-cipher approach seem less efficient in CPU cycles than a good stream cipher and a MAC. (But I don't want to discuss that now; see section 1.4 above!)
Unclear:
- Suppose that an attacker controls the first and last hop of a circuit, and tries an end-to-end tagging attack. With large-block encryption, the tagged cell and all future cells on the circuit turn to junk after the tagging attack, with P~~100%. With short-MAC-and-pad, the circuit is destroyed at the second hop, with P ~~ 1- 2^(-8*bf). Is one of these approaches materially worse for the attacker?
- Can we do better than the "compute two MACs" approach for distinguishing the relay and receive cases of the short-MAC-and-pad protocol?
- To what extent can we improve these protocols?
- If we do short-MAC-and-pad, should we apply the forward security hack alluded to in section 3.2?
5. Acknowledgments
Thanks to the many reviewers of the initial drafts of this proposal. If you can make any sense of what I'm saying, they deserve much of the credit.
A. Formal description
Note that in all these cases, more efficient descriptions exist.
A.1. The current Tor relay protocol.
Relay cell format:
Relay command [1 byte] Zeros [2 bytes] StreamID [2 bytes] Digest [4 bytes] Length [2 bytes] Data [498 bytes]
Circuit setup:
(Specified elsewhere; the client negotiates with each router in a circuit the secret AES keys Kf, Kb, and the secret 'digest keys' Df, and Db. They initialize AES counters Cf and Cb to 0. They initialize the digest stream Sf to Df, and Sb to Db.)
HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]):
1. If the Zeros field of Cell is not [00 00], return False. 2. Let Cell' = Cell with its Digest field set to [00 00 00 00]. 3. Let S' = Stream | Cell'. 4. If SHA1(S') = the Digest field of Cell, set Stream to S', and return True. 5. Otherwise return False.
HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
1. Set the Zeros and Digest field of Cell' to [00] bytes. 2. Set Stream to Stream | Cell'. 3. Construct Cell from Cell' by setting the Digest field to SHA1(Stream), and taking all other fields as-is. 4. Return Cell.
HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out]) 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and counter 'Ctr'. Increment 'Ctr' by the length of the cell.
HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out]) 1. Same as ENCRYPT.
Router operation, upon receiving an inbound cell -- that is, one sent towards the client.
1. ENCRYPT(cell, Kb, Cb) 2. Send the decrypted contents towards the client.
Router operation, upon receiving an outbound cell -- that is, one sent away from the client.
1. DECRYPT(cell, Kf, Cf) 2. If CHECK(Cell, Sf) is true, this cell is for us. Do not relay the cell. 3. Otherwise, this cell is not for us. Send the decrypted cell to the next hop on the circuit, or discard it if there is no next hop.
Router operation, to create a relay cell that will be delivered to the client.
1. Construct a Relay cell Cell' with the relay command, length, stream ID, and body fields set as appropriate. 2. Let Cell = CONSTRUCT(Cell', Sb). 3. ENCRYPT(Cell, Kb, Cb) 4. Send the encrypted cell towards the client.
Client operation, receiving an inbound cell.
For each hop H in a circuit, starting with the first hop and ending (possibly) with the last:
1. DECRYPT(Cell, Kb_H, Cb_H)
2. If CHECK(Cell, Sb_H) is true, this cell was sent from hop H. Exit the loop, and return the cell in its current form.
If we reach the end of the loop without finding the right hop, the cell is bogus or corrupted.
Client operation, sending an outbound cell to hop H.
1. Construct a Relay cell Cell' with the relay command, length, stream ID, and body fields set as appropriate. 2. Let Cell = CONSTRUCT(Cell', Sf_H) 3. For i = H..1: A. ENCRYPT(Cell, Sf_i, Cf_i) 4. Deliver Cell to the first hop in the circuit.
A.2. The large-block-cipher protocol
Same as A.1, except for the following changes.
The cell format is now: Zeros [Z bytes] Length [2 bytes] StreamID [2 bytes] Relay command [1 byte] Data [504-Z bytes]
Ctr is no longer a counter, but a Tweak value.
Each key is now a tuple of (Key_Crypt, Key_TC)
Streams are no longer used.
HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]) 1. If the Zeros field of Cell contains only [00] bytes, return True. 2. Otherwise return false.
HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out]) 1. Let Cell be Cell', with its "Zeros" field set to Z [00] bytes. 2. Return Cell'.
HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Tweak [in,out]) 2. Encrypt Cell using Key and Tweak 1. Let Tweak' = TC(Key_TC, Tweak, Cell) 3. Set Tweak to Tweak'.
HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Tweak [in,out]) 1. Let Tweak' = TC(Key_TC, Tweak, Cell) 2. Decrypt Cell using Key and Tweak 3. Set Tweak to Tweak'.
A.3. The short-MAC-and-pad protocol.
Define M_relay(K,S) as MAC(K, S|"relay"). Define M_receive(K,S) as MAC(K, S|"receive"). Define Z(n) as a series of n [00] bytes. Define BODY_LEN as 509
The cell message format is now:
Relay command [1 byte] StreamID [2 bytes] Length [2 bytes] Data [variable bytes]
Helper function: CHECK(Cell [in], b [in], K [in], S [in,out]): Let M = Cell[0:b] Let Rest = Cell[b:...] If b == 0: Return (nil, Rest) Let S' = S | Rest If M == M_relay(K,S')[0:b]: Set S = S' Return ("relay", Rest) If M == M_receive(K,S')[0:b]: Set S = S' Return ("receive", Rest) Return ("BAD", nil)
HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out]) 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and counter 'Ctr'. Increment 'Ctr' by the length of the cell.
HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out]) 1. Same as ENCRYPT.
Router operation, upon receiving an inbound cell: 1. ENCRYPT(cell, Kb, Cb) 2. Send the decrypted contents towards the client.
Router operation, upon receiving an outbound cell: 1. Let Status, Msg = CHECK(Cell, bf, Mf, Sf) 2. If Status == "BAD", drop the cell and destroy the circuit. 3. Let Cell' = Msg | Z(BODY_LEN - len(Msg)) 4. DECRYPT(Cell', Kf, Cf) [*] 5. If Status == "receive" or (Status == nil and there is no next hop), Cell' is for us: process it. 6. Otherwise, send Cell' to the next node.
Router operation, sending a cell towards the client: 1. Let Body = a relay cell body of BODY_LEN-bb_i bytes. 2. Let Cell' = ENCRYPT(Body, Kb, Cb) 3. Let Sb = Sb | Cell' 4. Let M = M_receive(Mb, Sb)[0:b] 5. Send the cell M | Cell' back towards the client.
Client operation, upon receiving an inbound cell:
For each hop H in the circuit, from first to last:
1. Let Status, Msg = CHECK(Cell, bb_i, Mb_i, Sb_i) 2. If Status = "relay", drop the cell and destroy the circuit. (BAD is okay; it means that this hop didn't originate the cell.) 3. DECRYPT(Msg, Kb_i, Cb_i) 4. If Status = "receive", this cell is from hop i; process it. 5. Otherwise, set Cell = Msg.
Client operation, sending an outbound cell:
Let BF = the total of all bf_i values.
1. Construct a relay cell body Msg of BODY_LEN-BF bytes. 2. For each hop i, let Stream_i = ENCRYPT(Kf_i,Z(CELL_LEN),Cf_i) 3. Let Pad_0 = "". 4. For i in range 1..N, where N is the number of hops: Let Pad_i = Pad_{i-1} | Z(bf_i) Let S_last = the last len(Pad_i) bytes of Stream_i. Let Pad_i = Pad_i xor S_last Now Pad_i is the padding as it will stand after node i has processed it.
5. For i in range N..1, where N is the number of hops: If this is the last hop, let M_* = M_receive. Else let M_* = M_relay.
Let Body = Msg xor the first len(Msg) bytes of Stream_i
Let M = M_*(Mf, Body | Pad_(i-1))
Set Msg = M[:bf_i] | Body
6. Send Msg outbound to the first relay in the circuit.
[*] Strictly speaking, we could omit the pad-and-decrypt operation once we know we're the final hop.
R. References
[Prop188] Tor Proposal 188: Bridge Guards and other anti-enumeration defenses https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/188-bridge-gu...
[TorSpec] The Tor Protocol Specification https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=tor-spec.tx...
[TorDesign] Dingledine et al, "Tor: The Second Generation Onion Router", https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf
[Tweak] Liskov et al, "Tweakable Block Ciphers", http://www.cs.berkeley.edu/~daw/papers/tweak-crypto02.pdf
[XF] Xinwen Fu et al, "One Cell is Enough to Break Tor's Anonymity"
[23R] The 23 Raccoons, "Analysis of the Relative Severity of Tagging Attacks" http://archives.seul.org/or/dev/Mar-2012/msg00019.html (You'll want to read the rest of the thread too.)
On 6/19/12, Nick Mathewson nickm@freehaven.net wrote:
Filename: 202-improved-relay-crypto.txt
Any new approach should be able to coexist on a circuit with the old approach. That is, if Alice wants to build a circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a revised relay protocol, then Alice should be able to build a circuit such that she can have Bob1 and Bob3 process each cell with the current protocol, and Bob2 process it with a revised protocol. (Why? Because if all nodes in a circuit needed to use the same relay protocol, then each node could learn information about the other nodes in the circuit from which relay protocol was chosen. For example, if Bob1 supports the new protocol, and sees that the old relay protocol is in use, and knows that Bob2 supports the new one, then Bob1 has learned that Bob3 is some node that does not support the new relay protocol.)
This feature is unsafe to use. Each client must use the same circuit-extension protocol for every relay on every circuit it builds.
2.1. Chained large-block what now?
We assume the existence of a primitive that provides the desired properties of a tweakable[Tweak] block cipher, taking blocks of any desired size. (In our case, the block size is 509 bytes[*].)
It also takes a Key, and a per-block "tweak" parameter that plays the same role that an IV plays in CBC, or that the counter plays in counter mode.
The Tweak-chaining function TC takes as input a previous tweak, a tweak chaining key, and a cell; it outputs a new tweak. Its purpose is to make future cells undecryptable unless you have received all previous cells. It could probably be something like a MAC of the old tweak and the cell using the tweak chaining key as the MAC key.
No. In every tweakable block cipher construction which I have seen proposed, an attacker who knows the key and has one plaintext and its corresponding ciphertext can recover the tweak.
Varying the tweak would allow an honest recipient to fail to decrypt a cell if any previous cell was altered, but cells are not undecryptable if only the tweak is unknown.
Robert Ransom
On Tue, Jun 19, 2012 at 2:06 PM, Robert Ransom rransom.8774@gmail.com wrote:
On 6/19/12, Nick Mathewson nickm@freehaven.net wrote:
Filename: 202-improved-relay-crypto.txt
Any new approach should be able to coexist on a circuit with the old approach. That is, if Alice wants to build a circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a revised relay protocol, then Alice should be able to build a circuit such that she can have Bob1 and Bob3 process each cell with the current protocol, and Bob2 process it with a revised protocol. (Why? Because if all nodes in a circuit needed to use the same relay protocol, then each node could learn information about the other nodes in the circuit from which relay protocol was chosen. For example, if Bob1 supports the new protocol, and sees that the old relay protocol is in use, and knows that Bob2 supports the new one, then Bob1 has learned that Bob3 is some node that does not support the new relay protocol.)
This feature is unsafe to use. Each client must use the same circuit-extension protocol for every relay on every circuit it builds.
Do you mean that every client must use at most one circuit-extension protocol on all circuits, or do you mean that every circuit must by built by at most one circuit-extension protocol?
And why?
And how would you have either*of those without having the choice of circuit extension protocol used at any hop in the circuit leak to the attacker which other nodes might be later in the circuit? (In other words, if I'm a guard, and I support an improved protocol, and I know the client supports it, and the middle node supports it, but the client does not use it, I can deduce that the exit node does not support the improved protocol.)
yrs,
On 6/19/12, Nick Mathewson nickm@freehaven.net wrote:
Filename: 202-improved-relay-crypto.txt
1.4. A note on algorithms
This document is deliberately agnostic concerning the choice of cryptographic primitives -- not because I have no opinions about good ciphers, MACs, and modes of operation -- but because experience has taught me that mentioning any particular cryptographic primitive will prevent discussion of anything else.
Not particularly agnostic, because you're specifying a different set of primitives than the protocol actually requires. See below.
Please DO NOT suggest algorithms to use in implementing these protocols yet. It will distract! There will be time later!
OK, fine. I won't prove that the cryptographic primitives which your protocols really require can be implemented in terms of existing low-level primitives and constructions.
If somebody _else_ suggests algorithms to use, for goodness' sake DON'T ARGUE WITH THEM! There will be time later!
Design 1: Large-block encryption
In this approach, we use a tweakable large-block cipher for encryption and decryption, and a tweak-chaining function TC.
2.1. Chained large-block what now?
We assume the existence of a primitive that provides the desired properties of a tweakable[Tweak] block cipher, taking blocks of any desired size. (In our case, the block size is 509 bytes[*].)
It also takes a Key, and a per-block "tweak" parameter that plays the same role that an IV plays in CBC, or that the counter plays in counter mode.
The Tweak-chaining function TC takes as input a previous tweak, a tweak chaining key, and a cell; it outputs a new tweak. Its purpose is to make future cells undecryptable unless you have received all previous cells. It could probably be something like a MAC of the old tweak and the cell using the tweak chaining key as the MAC key.
(If the initial tweak is secret, I am not sure that TC needs to be keyed.)
[*] Some large-block cipher constructions use blocks whose size is the multiple of some underlying cipher's block size. If we wind up having to use one of those, we can use 496-byte blocks instead at the cost of 2.5% wasted space.
You're talking about a particular cryptographic primitive here. (Some underlying ciphers (e.g. Skipjack) operate on 8-byte blocks, so you would only reduce the large-block block cipher to 504-byte blocks at the cost of about 1% wasted space.)
Since you've crossed the ‘mentioning specific primitives’ line by suggesting a kludge to support AES-biIGE, I might as well point out that large-block block cipher constructions can generally be modified to extract entropy from the message and key during encryption/decryption into an extra output (which may need to be kept secret). This extra output can be hashed with a longer-term ‘chaining key’ and/or the previous large-block block cipher key to produce a new large-block block cipher key (or to produce a new tweak for the large-block block cipher, if you insist on using underlying primitives (mumble *AES* mumble mumble) which are too inefficient to use without per-key precomputation).
Design 2: short-MAC-and-pad
In this design, we behave more similarly to a mix-net design (such as Mixmaster or Mixminion's headers). Each node checks a MAC, and then re-pads the cell to its chosen length before decoding the cell.
This design uses as a primitive a MAC and a stream cipher. It might also be possible to use an authenticating cipher mode, if we can find one that works like a stream cipher and allows us to efficiently output authenticators for the stream so far.
NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be replaced with an authenticated encryption mode without too much loss of generality.
Here you are assuming that MAC(a | b) can be efficiently computed from PreMAC(a) and b for some function PreMAC, and that either the MAC does not require a per-message nonce or PreMAC is independent of the nonce. (In particular, you are assuming HMAC or a Poly1305-AES-like MAC, and forbidding some faster and/or safer polynomial-evaluation MACs.)
Many MACs (possibly not polynomial-evaluation MACs, though) can be used to authenticate the stream of cells: let p be the full output of the MAC for the preceding cell, and compute MAC(cell_index, p | cell_data) as the MAC for the current cell (then stick a possibly truncated copy of that on the cell for transmission). (And now, instead of specifying a particular authenticated encryption primitive as Nick did, I'm specifying a particular AEAD primitive with a(n extra) ‘chaining’ output. AE and AEAD are primitives themselves, not things that must be kludged up as cipher modes for a (small-block) block cipher (unless you're NSA and you're trying to force-feed everyone AES).)
Robert Ransom