On Thu, Mar 31, 2011 at 5:52 AM, Robert Ransom rransom.8774@gmail.com wrote:
Hi! I'm going to wait on a full review of your create/extend proposal till it's done, but I though I could potentially answer some questions and offer some comments:
1) I think CREATE cells need to get a field asking for a specific way of handling RELAY cells. If we ever update the relay crypto (that is, the stuff described in 5.5 and 6.1 today) to do something other than AES128_CTR for the encryption, SHA1 for the digest, we'll need a way to ask for it. I suppose that we could do that via adding a new handshake type for a new desired link protocol, but it seems like the two could be more or less orthogonal.
2) I don't get the rationale behind the variable-length type names. Elsewhere in our protocol we'd just use a 1- or 2-byte type field. Do we win a lot by letting these be longer?
3) By convention, everything in our protocol is in "network" (big-endian) byte-order.
4) There needs to be an authenticated way for the client to know which handshake types and link specifier types a given node supports. This could be as simple as a series of statements like "Tor versions xyz and later will support handshake type Y", or as complex as advertising them in router descriptors.
5) I like the idea of separating the link specifiers from the handshake type in EXTENDED cells.
6) I also think it'd be smart to figure out the actual link specifier and handshake types that we want to move to before we decide that we are confident that this format is right. Those can be a different proposal, though.
7) Here's a first cut of what I think might go in a link specifier format:
* V4Address -- an ipv4 address, set to 0 if there is no IPv4 address for the node [4 bytes] * V4ORPort [2 bytes] * V6Address -- an ipv6 address, set to 0 if there is no IPv6 address for the node [16 bytes] * V6ORPort [2 bytes] * SHA256 of RSA1024 identity key [32 bytes]
When we decide what longer identity keys should look like, we can add a new link specifier format that supports those.
8) This is totally back-of-the-envelope stuff, but it might be a good starting point for crypto discussion.
Here's a first cut of what I think might go in an improved RSA handshake:
* First 8 bytes of the SHA256 hash of the onion key [8 bytes] (This is here so that onion key rotation can work without having to sometimes try the wrong onion key incorrectly.) * PK-encrypted: * Padding [PK_PAD_LEN bytes] * SHA256 hash of all remaining fields. [32 bytes] * Symmetric key seed [16 bytes] * The first part of g^x. [as much will fit in the PK-encrypted portion] * Symmetrically encrypted: * The rest of g^x. * 0 bytes for padding.
Here's a first cut of what I think might go in a hypothetical diffie-hellman based handshake, for use with something like Curve25519. (I'm using g^x and g^y notation here as if this were diffie-hellman in Z_p, since I don't yet trust myself to write ECC stuff correctly. I'm assuming that the node's public onion key is g^x.)
* SHA256 of all remaining fields. [32 bytes] * First 8 bytes of the SHA256 hash of the onion key (8 bytes) * g^y1 [DH_LEN bytes] * Encrypted using a symmetric key based on g^(x*y1): * g^y2 [DH_LEN bytes] * 0 bytes for padding
In both cases, we'll want a new key derivation function.
yrs,
On Thu, Apr 07, 2011 at 05:18:12PM -0400, Nick Mathewson wrote:
- This is totally back-of-the-envelope stuff, but it might be a good
starting point for crypto discussion.
Here's a first cut of what I think might go in an improved RSA handshake:
- First 8 bytes of the SHA256 hash of the onion key [8 bytes] (This is here so that onion key rotation can work without having
to sometimes try the wrong onion key incorrectly.)
- PK-encrypted:
- Padding [PK_PAD_LEN bytes]
- SHA256 hash of all remaining fields. [32 bytes]
- Symmetric key seed [16 bytes]
- The first part of g^x. [as much will fit in the PK-encrypted portion]
- Symmetrically encrypted:
- The rest of g^x.
- 0 bytes for padding.
Here's a first cut of what I think might go in a hypothetical diffie-hellman based handshake, for use with something like Curve25519. (I'm using g^x and g^y notation here as if this were diffie-hellman in Z_p, since I don't yet trust myself to write ECC stuff correctly. I'm assuming that the node's public onion key is g^x.)
- SHA256 of all remaining fields. [32 bytes]
- First 8 bytes of the SHA256 hash of the onion key (8 bytes)
- g^y1 [DH_LEN bytes]
- Encrypted using a symmetric key based on g^(x*y1):
- g^y2 [DH_LEN bytes]
- 0 bytes for padding
The phrase that jumps to mind is, "Danger Will Robinson!". ;-) If we're redesigning the AKE (authenticated key agreement) bits, we probably shouldn't just make up our own stuff.
- Ian
On Thu, Apr 7, 2011 at 6:04 PM, Ian Goldberg iang@cs.uwaterloo.ca wrote: [...]
The phrase that jumps to mind is, "Danger Will Robinson!". ;-) If we're redesigning the AKE (authenticated key agreement) bits, we probably shouldn't just make up our own stuff.
Indeed! I am hoping that by threatening to do so, I can get the cryptographers on the list to take an interest and tell us what to do instead. ;)
(For background on why we would want to do crypto migration at atll, see https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-cry... , for which there was never really enough comment. See also proposal 176, which is totally Made Of Crypto.)
On Thu, Apr 7, 2011 at 5:18 PM, Nick Mathewson nickm@freehaven.net wrote: [...]
Here's a first cut of what I think might go in a hypothetical diffie-hellman based handshake
I'm deliberately *not* using MQV, HMQV, FHMQV, etc etc here. They're faster than the "Just do DH twice" thing I wrote up, but the patent situation seems unfavorable from what I can tell. Also, curve25519 is about 5x faster than our current 1024-bit DH, and about 11 times faster than the 1536-bit DH we'd probably want to move towards for an upgraded variant of current our RSA+DH handshake. So replacing an RSA and a DH with two ECC DH operations seems a find thing to do, assuming that we decide that curve25519 is a good idea for us.
In both cases, we'll want a new key derivation function.
Oh! Also, for a bit of redundancy, I'm thinking that the symmetric crypto parts of the improved onion handshakes ought to be with a less malleable mode of operation than the counter-mode stuff we do now. Perhaps we could make use of an all-or-nothing mode of operation like LIONESS or biIGE. (They're both slower than counter mode, but for purposes of CREATE cells, I don't think the hit will matter in comparison with the cost of the public-key operations.)
On Thu, Apr 07, 2011 at 06:13:45PM -0400, Nick Mathewson wrote:
Oh! Also, for a bit of redundancy, I'm thinking that the symmetric crypto parts of the improved onion handshakes ought to be with a less malleable mode of operation than the counter-mode stuff we do now. Perhaps we could make use of an all-or-nothing mode of operation like LIONESS or biIGE. (They're both slower than counter mode, but for purposes of CREATE cells, I don't think the hit will matter in comparison with the cost of the public-key operations.)
This is another thing that triggers my crypto-spidey-sense. The particular problem that I'm thinking of is that for MAC-then-encrypt, only some modes of operation are secure (CTR is, CBC is not). In some ways, the malleability of CTR is a strength, and I'd be concerned that something else might be able to be leveraged in an attack.
Steven.
On Thu, Apr 07, 2011 at 11:20:00PM +0100, Steven J. Murdoch wrote:
On Thu, Apr 07, 2011 at 06:13:45PM -0400, Nick Mathewson wrote:
Oh! Also, for a bit of redundancy, I'm thinking that the symmetric crypto parts of the improved onion handshakes ought to be with a less malleable mode of operation than the counter-mode stuff we do now. Perhaps we could make use of an all-or-nothing mode of operation like LIONESS or biIGE. (They're both slower than counter mode, but for purposes of CREATE cells, I don't think the hit will matter in comparison with the cost of the public-key operations.)
This is another thing that triggers my crypto-spidey-sense. The particular problem that I'm thinking of is that for MAC-then-encrypt, only some modes of operation are secure (CTR is, CBC is not). In some ways, the malleability of CTR is a strength, and I'd be concerned that something else might be able to be leveraged in an attack.
But we're currently doing "encrypt", not "MAC-then-encrypt". And we should be doing "encrypt-then-MAC", in my opinion, which ensures the ciphertext can't be undetectably messed with.
In any event, yes, crypto-spidey-sense.
- Ian
On Thu, Apr 07, 2011 at 06:13:45PM -0400, Nick Mathewson wrote:
Oh! Also, for a bit of redundancy, I'm thinking that the symmetric crypto parts of the improved onion handshakes ought to be with a less malleable mode of operation than the counter-mode stuff we do now.
Yes. Absolute necessity.
Perhaps we could make use of an all-or-nothing mode of operation like LIONESS or biIGE. (They're both slower than counter mode, but for purposes of CREATE cells, I don't think the hit will matter in comparison with the cost of the public-key operations.)
A MAC (or a cipher mode that includes integrity like GCM) would be a good start.
- Ian
On Thu, 7 Apr 2011 17:18:12 -0400 Nick Mathewson nickm@freehaven.net wrote:
On Thu, Mar 31, 2011 at 5:52 AM, Robert Ransom rransom.8774@gmail.com wrote:
Hi! I'm going to wait on a full review of your create/extend proposal till it's done, but I though I could potentially answer some questions and offer some comments:
- I think CREATE cells need to get a field asking for a specific way
of handling RELAY cells. If we ever update the relay crypto (that is, the stuff described in 5.5 and 6.1 today) to do something other than AES128_CTR for the encryption, SHA1 for the digest, we'll need a way to ask for it. I suppose that we could do that via adding a new handshake type for a new desired link protocol, but it seems like the two could be more or less orthogonal.
I thought the desired relay ciphersuite should be specified in the handshake data, so that circuit handshakes could encrypt the relay ciphersuite specifier chosen by the client until we make a more thoughtful decision as to whether it's safe to expose that to the preceding relay.
- I don't get the rationale behind the variable-length type names.
Elsewhere in our protocol we'd just use a 1- or 2-byte type field. Do we win a lot by letting these be longer?
I don't know, but we certainly don't lose anything -- I don't expect to overflow the 509 bytes currently available for EXTEND-ish cells until we start using a post-quantum cryptosystem, and I assume the PQ encryption systems with short enough keys that we would be willing to use them purely for performance reasons are patented, so we can't switch to them for at least a few years.
- By convention, everything in our protocol is in "network"
(big-endian) byte-order.
Good.
- There needs to be an authenticated way for the client to know which
handshake types and link specifier types a given node supports. This could be as simple as a series of statements like "Tor versions xyz and later will support handshake type Y", or as complex as advertising them in router descriptors.
The protocols that a relay is willing to support do need to be specified in its descriptor.
- I like the idea of separating the link specifiers from the
handshake type in EXTENDED cells.
s/EXTENDED/EXTEND2/
There is no way to avoid separating the part which the relay receiving the EXTEND2 cell must process from the part which the relay being extended to must process. Splitting up those two pieces further is bad: in my earlier drafts, I made the mistake of specifying the format of link specifiers by separating the connection address from an identity key fingerprint. That would have precluded link specifiers that contain an encryption key instead of the hash of a signature key (see below for more details).
- I also think it'd be smart to figure out the actual link specifier
and handshake types that we want to move to before we decide that we are confident that this format is right. Those can be a different proposal, though.
See below, but yes, they should be in separate proposals.
- Here's a first cut of what I think might go in a link specifier format:
- V4Address -- an ipv4 address, set to 0 if there is no IPv4 address
for the node [4 bytes]
- V4ORPort [2 bytes]
- V6Address -- an ipv6 address, set to 0 if there is no IPv6 address
for the node [16 bytes]
- V6ORPort [2 bytes]
- SHA256 of RSA1024 identity key [32 bytes]
When we decide what longer identity keys should look like, we can add a new link specifier format that supports those.
There is no reason to fix the identity key length in that link specifier. Your link specifier format is clearly tied to TLS-over-TCP link protocols, so the SHA256 hash might as well be computed over any public key (or, alternatively, certificate) that the relay presents in its certificate chain. It would be useful to include some indication of which certificate in the chain needs to match the hash, though.
But when we switch to a UDP-based link protocol, we will want to include an encryption key (which serves as a link handshake-authentication key) in the link specifier, so that the first CREATE cell can be sent in the UDP packet that opens the link. This means that the first CREATE cell on a newly opened link won't get forward secrecy from the link protocol, but we need to design circuit handshakes to provide forward secrecy even if the attacker sees *both* messages in the handshake anyway, so I don't consider this a problem.
- This is totally back-of-the-envelope stuff, but it might be a good
starting point for crypto discussion.
Here's a first cut of what I think might go in an improved RSA handshake:
- First 8 bytes of the SHA256 hash of the onion key [8 bytes] (This is here so that onion key rotation can work without having
to sometimes try the wrong onion key incorrectly.)
- PK-encrypted:
- Padding [PK_PAD_LEN bytes]
- SHA256 hash of all remaining fields. [32 bytes]
- Symmetric key seed [16 bytes]
- The first part of g^x. [as much will fit in the PK-encrypted portion]
- Symmetrically encrypted:
- The rest of g^x.
- 0 bytes for padding.
If you really want to keep using RSA, I would suggest just ‘grizzling’ the data (affix a random nonce and an integrity check computed over the nonce and data, then BEAR/LION/LIONESS-encrypt with a fixed key), then RSA-encrypting the ‘grizzle’. According to http://www.cl.cam.ac.uk/~rja14/Papers/grizzle.pdf, that is provably secure, for some definitions of “provably” and “secure”. (I haven't followed the reference to the Jakobsson-Stern-Yung paper yet.) That appears to me to be a ‘plaintext-aware’ encryption algorithm (in the terms of http://freehaven.net/anonbib/#tap:pet2006).
But in that case, we would also benefit from replacing the Diffie-Hellman handshake with the obvious RSA-only one. The client generates an ephemeral RSA keypair, encrypts its public key, a random 256-bit string, and the desired relay ciphersuite specifier with the relay's public ‘onion key’, and sends the encrypted blob in the EXTEND-ish cell; the relay encrypts a random string with the client's ephemeral public key, and sends the encrypted blob back to the client; the resulting shared secret key is the SHA256 hash of the client's random 256-bit string followed by the string chosen and sent by the relay. This has two advantages: (1) an attacker who grabs a relay's onion key after the fact can no longer benefit from performing index-calculus precomputations in a fixed DH group, but must attack each client's RSA key separately, and (2) computing x^65537 (or better, x^3) is faster than computing g^x for large random x. Some schemes very close to this one are described (and proved secure, again for some values of “proved” and “secure”) in http://eprint.iacr.org/1999/012.
Here's a first cut of what I think might go in a hypothetical diffie-hellman based handshake, for use with something like Curve25519. (I'm using g^x and g^y notation here as if this were diffie-hellman in Z_p, since I don't yet trust myself to write ECC stuff correctly. I'm assuming that the node's public onion key is g^x.)
- SHA256 of all remaining fields. [32 bytes]
- First 8 bytes of the SHA256 hash of the onion key (8 bytes)
- g^y1 [DH_LEN bytes]
- Encrypted using a symmetric key based on g^(x*y1):
- g^y2 [DH_LEN bytes]
- 0 bytes for padding
In both cases, we'll want a new key derivation function.
I see no MAC here -- the symmetrically encrypted blob *must* be MAC-ed. (I suggest using Salsa20 encryption, and using 32 extra bytes of Salsa20 output as a key for a Poly1305 MAC.) Also, the SHA256 at the beginning is unnecessary, and we will want to include a relay ciphersuite specifier in the encrypted blob. Other than those changes, that is what I had in mind.
This is essentially TAP using a better DH group and a stronger encryption algorithm, and it clearly inherits TAP's security proof.
I have another possible design, mainly to force us to stop calling the circuit handshake-authentication key an ‘onion key’: The client sends an ephemeral DH public key and a relay ciphersuite specifier to the relay, and the relay replies with an EDH public key of its own and a hash of the client's message, signed with the relay's circuit HA key. (I assume this is basically what our TLS configuration does at the link layer.) I doubt that we will want to use this, but I'm quite sure that it meets all of the circuit-extension handshake requirements that I listed in torspec.git/proposals/ideas/xxx-crypto-requirements.txt , and the HA key for this protocol cannot be accurately referred to as an ‘onion key’ because nothing is ever encrypted with it.
Robert Ransom