Hi,
So: a bunch of us were discussing Prop224 Onion addresses, and their UX-malleability.
Specifically: that there are small bit fields in the current Prop224 Onion Address schema (eg: version, and other future structure?) which can be tweaked or amended without otherwise changing the functionality of the address, or without much changing what the user sees in the (say) browser address bar.
This is a point of significant concern because of issues like phishing and passing-off - by analogy: t0rpr0ject.0rg versus torproject.org - and other games that can be played with a prop224 address now, or in future, to game user experience.
We discussed the existing "hash the public key before base-32 encoding" approach, but hashing breaks the prop224 key blinding.
Ian Goldberg - thank you Ian - offered this attractive solution: apply a *reversible* "All Or Nothing Transform" (AONT) to the entire Prop224 Onion Address, prior to Base32 Encoding.
This way, even a single-bit mutation of (say) version number will have a "diffusion" effect, impacting ~ N/2 of the bits whilst having O(1) cost and being reversible so as not to impact the rest of Prop224.
The result would be onion addresses which are less "tamperable" / more deterministic, so that closer to one-and-only-one published onion address will correspond to an onion endpoint.
What does the panel think?
- alec
On Sun, Mar 26, 2017 at 02:24:41PM +0200, Alec Muffett wrote:
> What does the panel think?
One thing I thought of later is that, assuming the version field is "under" the AONT, then there is *no* visible version field in the final address, so you would have to commit to "For any possible future onion address of this fixed length, the first thing you have to do to decode it is this particular AONT." This seems a bit suboptimal to me. And since the version field basically *is* the tweakable field in the current prop224 addresses, maybe this actually isn't so useful after all for this version of the spec?
On Sun, Mar 26, 2017 at 04:19:58PM -0400, Ian Goldberg wrote:
> One thing I thought of later is that, assuming the version field is "under" the AONT, then there is *no* visible version field in the final address, so you would have to commit to "For any possible future onion address of this fixed length, the first thing you have to do to decode it is this particular AONT." This seems a bit suboptimal to me. And since the version field basically *is* the tweakable field in the current prop224 addresses, maybe this actually isn't so useful after all for this version of the spec?
<talking-to-myself>
We could leave the version field outside the AONT, though, but commit to changing the parameters of the AONT (in particular, the domain separation constant?) if we change the version number, so that an adversary changing the version number to "2" would just cause the client to throw an error (before version 2 exists) or be an invalid address (after version 2 exists)?
Then the address would look something like:
base32( AONT_1( pubkey || checksum ) || version=0x01 )
where AONT_1 is an unkeyed invertible function from 34(?)-byte strings to 34(?)-byte strings.
(Of course, then all addresses would end in "b", or something like that.)
</talking-to-myself>
> We could leave the version field outside the AONT, though, but commit to changing the parameters of the AONT (in particular, the domain separation constant?) if we change the version number, so that an adversary changing the version number to "2" would just cause the client to throw an error (before version 2 exists) or be an invalid address (after version 2 exists)?
To add an aside from a discussion with Teor: the entire "version" field could be reduced to a single - probably "zero" - bit, in a manner perhaps similar to the distinctions between Class-A, Class-B, Class-C... addresses in old IPv4.
Thus: if the first bit in the address is zero, then there is no version, and we are at version 0 of the format.
If the first bit is one, we are using v1+ of the format and all bets are off, except that the obvious thing then to do is count the number of 1-bits (up to some limit) and declare that to be the version number. Once we're up to 3 or 4 or 7 or 8 one-bits, then shift version encoding totally.
Teor will correct me if I misquote him, but the advantage here was:
a) the version number is 1 bit, ie: small, for the foreseeable / if we get it right
b) in pursuit of smallness, we could maybe dump the hash in favour of an AONT + eyeballs, which would give back a bunch of extra bits
result: shorter addresses, happier users.
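A minimal Python sketch of the "count the leading 1-bits" version scheme described above. The function name, the choice to read the bits from the most significant end of the first byte, and the default limit of 8 are all assumptions for illustration; the thread does not fix any of them.

```python
def leading_ones_version(first_byte: int, limit: int = 8) -> int:
    """Return the number of leading 1-bits in first_byte, capped at limit.

    A leading 0-bit means version 0. Hypothetical helper: the bit order and
    the limit are assumptions, not part of any agreed format.
    """
    version = 0
    mask = 0x80  # start at the most significant bit
    while version < limit and first_byte & mask:
        version += 1
        mask >>= 1
    return version
```

For example, a first byte of 0b0xxxxxxx would decode as version 0 and 0b110xxxxx as version 2.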
On Mon, Mar 27, 2017 at 12:27:33AM +0200, Alec Muffett wrote:
> b) in pursuit of smallness, we could maybe dump the hash in favour of an AONT + eyeballs, which would give back a bunch of extra bits
> result: shorter addresses, happier users.
You indeed do not require a checksum under an AONT, but you do require redundancy if you want to catch typos. Something like
base32( AONT( pubkey || 0x0000 ) || version)
is fine. If you want "version" to be a single bit, then the AONT would have to operate on non-full bytes, which is a bit (ha!) annoying, but not terrible. In that case, "0x0000" would actually be 15 bits of 0, and version would be 1 bit. This would only save 1.4 base32 characters, though. If you took off some more bits of the redundancy (down to 8 bits?), you would be able to shave one more base32 char. And indeed, if you make the redundancy just a single byte of 0x00, then the extra 0-bit for the "version" actually fits neatly in the one leftover bit of the base32 encoding, I think, so the AONT is back to working on full bytes.
But is a single byte of redundancy enough? It will let through one out of every 256 typos. (I thought we had spec'd 2 bytes for the checksum now, but maybe I misremember? I'm also assuming we're using a simple 256-bit encoding of the pubkey, rather than something more complex that saves ~3 bits.)
(Heading to the airport.)
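The "fits neatly" claim can be checked with a little arithmetic: base32 carries 5 bits per character. A small Python sketch follows; the helper name is made up for illustration, and the layouts are just the ones mentioned above (256-bit pubkey, then redundancy bits, then a 1-bit version).

```python
def base32_chars(bits: int) -> int:
    """Number of unpadded base32 characters needed to carry the given bits."""
    return -(-bits // 5)  # ceiling division, 5 bits per character


# 256-bit pubkey + 15 bits of zeros + 1 version bit
fifteen_bit_layout = 256 + 15 + 1
# 256-bit pubkey + one 0x00 byte of redundancy + 1 version bit
one_byte_layout = 256 + 8 + 1

print(fifteen_bit_layout, base32_chars(fifteen_bit_layout))
print(one_byte_layout, base32_chars(one_byte_layout), one_byte_layout % 5)
```

With one byte of redundancy the total is 265 bits, an exact multiple of 5, so the single version bit uses up the one bit that would otherwise be left over after the 33 full bytes.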
On Mon, Mar 27, 2017 at 01:59:42AM -0400, Ian Goldberg wrote:
> (Heading to the airport.)
OK, here are the details of this variant of the proposal. Onion addresses are 54 characters in this variant, and the typo-resistance is 13 bits (1/8192 typos are not caught).
Encoding:
raw is a 34-byte array. Put the ed25519 key into raw[0..31] and 0x0000 into raw[32..33]. Note that there are really only 13 bits of 0's for redundancy, plus the 0 bit for the version, plus 2 unused bits in raw[32..33].
Do the AONT. Here G is a hash function mapping 16-byte inputs to 18-byte outputs, and H is a hash function mapping 18-byte inputs to 16-byte outputs. Reasonable implementations would be something like:
    G(input) = SHA3-256("Prop224Gv0" || input)[0..17]
    H(input) = SHA3-256("Prop224Hv0" || input)[0..15]

    raw[16..33] ^= G(raw[0..15])
    # Clear the last few bits, since we really only want 13 bits of redundancy
    raw[33] &= 0xf8
    raw[0..15] ^= H(raw[16..33])
Then base32-encode raw[0..33]. The 56-character result will always end in "a=" (the two unused bits at the end of raw[33]), so just remove that part.
Decoding:
Base32-decode the received address into raw[0..33]. Depending on your base32 decoder, you may have to stick the "a=" at the end of the address first. The low two bits were unused; be sure the base32 decoder sets them to 0. The next lowest bit (raw[33] & 0x04) is the version bit. Ensure that ((raw[33] & 0x04) == 0); if not, this is a different address format version you don't understand.
Undo the AONT:
    raw[0..15] ^= H(raw[16..33])
    raw[16..33] ^= G(raw[0..15])
    # Clear the last few bits, as above
    raw[33] &= 0xf8
Check the redundancy by ensuring that raw[32..33] = 0x0000. If not, there was a typo in the address. (Note again that since we explicitly cleared the low 3 bits of raw[33], there are really only 13 bits of checking here.)
raw[0..31] is then the pubkey suitable for use in Ed25519. As before (and independently of the AONT stuff), you could sanity-check it to make sure that (a) it is not the identity element, and (b) L times it *is* the identity element. (L is the order of the Ed25519 group.) Checking (a) is important; checking (b) isn't strictly necessary for the reasons given before, but is still a sensible thing to do. If you don't check (b), you actually have to check in (a) that the pubkey isn't one of 8 bad values, not just the identity. So just go ahead and check (b) to rest easier. ;-)
This version contains two calls to SHA3, as opposed to the one such call in the non-AONT (but including a checksum) version. The benefit is that it satisfies Alec's (and others') desire that there cannot be any bits an attacker could twiddle that would leave both the key the same and the address looking OK to someone who just spot-checks, say, the beginning and/or the end.
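The encoding and decoding steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names are made up, lowercase output is a guess at the presentation, and the Ed25519 sanity checks (identity element, multiplying by L) are deliberately left out. G and H use the SHA3-256 constructions suggested above.

```python
import base64
import hashlib


def G(data: bytes) -> bytes:
    """Map the 16-byte left half to an 18-byte mask."""
    return hashlib.sha3_256(b"Prop224Gv0" + data).digest()[:18]


def H(data: bytes) -> bytes:
    """Map the 18-byte right half to a 16-byte mask."""
    return hashlib.sha3_256(b"Prop224Hv0" + data).digest()[:16]


def _xor(a, b) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode_onion(pubkey: bytes) -> str:
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    raw = bytearray(pubkey + b"\x00\x00")
    raw[16:34] = _xor(raw[16:34], G(bytes(raw[0:16])))
    raw[33] &= 0xF8  # version bit (0x04) and the two unused bits stay 0
    raw[0:16] = _xor(raw[0:16], H(bytes(raw[16:34])))
    text = base64.b32encode(bytes(raw)).decode("ascii").lower()
    return text[:-2]  # drop the constant trailing "a="


def decode_onion(address: str) -> bytes:
    if len(address) != 54:
        raise ValueError("expected 54 base32 characters")
    raw = bytearray(base64.b32decode(address.upper() + "A="))
    if raw[33] & 0x04:
        raise ValueError("unknown address format version")
    raw[0:16] = _xor(raw[0:16], H(bytes(raw[16:34])))
    raw[16:34] = _xor(raw[16:34], G(bytes(raw[0:16])))
    raw[33] &= 0xF8  # these three bits now hold mask output; clear them
    if raw[32] != 0 or raw[33] != 0:
        raise ValueError("redundancy check failed (typo in address?)")
    return bytes(raw[0:32])
```

A round trip such as decode_onion(encode_onion(key)) should return the original 32-byte key, and flipping the lowest bit of the last character sets the version bit and is rejected before the AONT is undone.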
Ian Goldberg iang@cs.uwaterloo.ca writes:
> This version contains two calls to SHA3, as opposed to the one such call in the non-AONT (but including a checksum) version. The benefit is that it satisfies Alec's (and others') desire that there cannot be any bits an attacker could twiddle that would leave both the key the same and the address looking OK to someone who just spot-checks, say, the beginning and/or the end.
Hey people,
thanks for the R&D here. I'm currently trying to balance the tradeoffs here and decide whether to go ahead and implement this feature.
My main worry is the extra complexity this brings to our address encoding/decoding process and to our specification, as well as when explaining the scheme to people.
Other than that, this seems like a reasonable improvement for a weird phishing scenario. I'm calling it weird because I'm not sure how an attacker can profit from being able to provide two addresses that correspond to the same key, but I can probably come up with a few scenarios if I think about it. Furthermore, this solution assumes a sloppy victim that does a partial spot-check (if the victim verified the whole address this design would make no difference).
BTW, isn't this phishing threat also possible in bitcoin (which is also using a 4-byte checksum that can be bruteforced)? Have there been any attacks of this nature?
Anyhow my first intuition is to just do this, as it seems like an improvement and it's probably not a huge amount of work. It can probably be done pretty cleanly if we abstract away the whole AONT construction and the custom-ish base32 encoding/decoding. I'm just worrying about putting more stuff in our already overloaded development bucket.
Is there a name for this AONT construction btw?
Thanks again :)
On Mon, Apr 03, 2017 at 03:04:47PM +0300, George Kadianakis wrote:
> Is there a name for this AONT construction btw?
As my student Nik noticed, this isn't *technically* an AONT, since diffusion only happens "to the left", but that's where we want to randomize things if any bit of the address changes.
But if we're down to just pubkey + checksum + *1 bit of version*, then I'm not totally sold on the point of the AONT, since there are exactly 0 bits that can be twiddled while not changing the pubkey. *Note*: this is assuming that if we ever change the version number, *then* we do an AONT or something so that version 0 and version 1 addresses that have the same pubkey end up looking totally different (at least at the left end).
On 3 April 2017 at 13:04, George Kadianakis desnacked@riseup.net wrote:
> I'm calling it weird because I'm not sure how an attacker can profit from being able to provide two addresses that correspond to the same key, but I can probably come up with a few scenarios if I think about it.
Hi George!
I'll agree it's a weird edge case :-)
I think the reason my spider-sense is tingling is because years of cleaning up after intrusions has taught me that sysadmins and human beings are very bad at non-canonical address formats, especially where they combine them with either blacklisting, or else case-statements-with-default-conditions.
If one creates scope for saying "the address is <foo>.onion but you can actually use <foo'>.onion or <foo''>.onion which are equivalent" - then someone will somehow leverage that either a) for hackery, or b) for social engineering.
Compare:
* http://017700000001
* http://2130706433
* http://0177.0.0.1 <- this one tends to surprise people
* http://127.0.0.1
…and the sort of fun shenanigans that can be done with those "equivalent forms"
People who've been trained not to type [X] into their browser, might be convinced to type [X']
It's a lot easier for people to cope with there being one-and-only-one viable form for any given hostname or address-representation.
-a
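To make the "equivalent forms" concrete, here is a small Python sketch of the classic inet_aton-style parsing rules (a leading "0x" means hex, a leading "0" means octal, and the last part fills all remaining bytes). It is a simplified re-implementation written for illustration, not the code of any particular resolver or browser, and the function names are invented.

```python
def _parse_part(text: str) -> int:
    """Parse one dotted component using the old C-style base prefixes."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered, 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)
    return int(lowered, 10)


def classic_ipv4_to_int(host: str) -> int:
    """Return the 32-bit address that a classic parser would derive from host."""
    parts = [_parse_part(p) for p in host.split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError("too many components")
    *leading, last = parts
    value = 0
    for part in leading:
        if part > 0xFF:
            raise ValueError("component out of range")
        value = (value << 8) | part
    remaining_bits = 8 * (4 - len(leading))
    if last >= 1 << remaining_bits:
        raise ValueError("final component out of range")
    return (value << remaining_bits) | last


forms = ["017700000001", "2130706433", "0177.0.0.1", "127.0.0.1"]
print({form: classic_ipv4_to_int(form) for form in forms})
```

All four strings come out as the same 32-bit value, which is exactly why a blacklist keyed on only one spelling is easy to sidestep.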
On Mon, Apr 03, 2017 at 02:53:17PM +0100, Alec Muffett wrote:
> It's a lot easier for people to cope with there being one-and-only-one viable form for any given hostname or address-representation.
But as I said to Alec in AMS, anyone on the internet can register "facebook.mydomain.com" and have the A record point to the same thing as facebook.com. So there are always alternate names for any given website. TLS, of course, is designed to protect against these shenanigans.
Prop224 *also* (mostly) protects against these shenanigans, because even if there were two onion addresses that resolved to the same pubkey, the daily blinded version incorporates the original onion address (not just the pubkey, right? *Right?*), so the alternate address-with-same-pubkey won't actually point anywhere. However, an adversary can upload a descriptor there; I'm not sure what the implications of that are just now.
The other thing to remember is that didn't we already say that
facebookgbiyeqv3ebtjnlntwyvjoa2n7rvpnnaryd4a.onion
and
face-book-gbiy-eqv3-ebtj-nlnt-wyvj-oa2n-7rvp-nnar-yd4a.onion
will mean the same thing? So we're already past the "one (st)ring to rule them all" point?
- Ian
On 3 Apr 2017 3:48 p.m., "Ian Goldberg" iang@cs.uwaterloo.ca wrote:
> The other thing to remember is that didn't we already say that
> facebookgbiyeqv3ebtjnlntwyvjoa2n7rvpnnaryd4a.onion
> and
> face-book-gbiy-eqv3-ebtj-nlnt-wyvj-oa2n-7rvp-nnar-yd4a.onion
> will mean the same thing? So we're already past the "one (st)ring to rule them all" point?
That's a great point, and I'm definitely interested and in favour of readability.
How about this, though: I know that Tor doesn't want to be in the business of site reputation, but what if (eg) Protonmail offers an Onion "Safe Browsing" extension some day, of known-bad Onions for malware reasons?
There's quite a gulf between stripping hyphens from a candidate onion address and doing strcmp(), versus either drilling into the candidate address to compute the alternative forms to check against the blacklist, or even requiring the blacklist to be 8x larger?
-a
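The cheap end of that gulf (strip hyphens, then compare strings) might look like the Python sketch below. The function names and the blocklist contents are hypothetical, and lowercasing is an extra assumption on top of the hyphen stripping discussed above.

```python
def canonical_onion(host: str) -> str:
    """Strip readability hyphens and normalise case before comparing."""
    return host.strip().lower().replace("-", "")


def is_blocked(host: str, blocklist: set) -> bool:
    """Plain set lookup on the canonical form (the 'strcmp' case)."""
    return canonical_onion(host) in blocklist


# Hypothetical blocklist holding canonical (hyphen-free) names only.
blocklist = {"facebookgbiyeqv3ebtjnlntwyvjoa2n7rvpnnaryd4a.onion"}
print(is_blocked("face-book-gbiy-eqv3-ebtj-nlnt-wyvj-oa2n-7rvp-nnar-yd4a.onion", blocklist))
```

If several distinct encodings mapped to the same key, this lookup would no longer be enough: either the checker would have to decode each candidate and enumerate its alternative forms, or the list would have to carry every form.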
On Mon, Apr 03, 2017 at 04:40:52PM +0100, Alec Muffett wrote:
> That's a great point, and I'm definitely interested and in favour of readability.
> How about this, though: I know that Tor doesn't want to be in the business of site reputation, but what if (eg) Protonmail offers an Onion "Safe Browsing" extension some day, of known-bad Onions for malware reasons?
That's a quite good motivating example, thanks!
> There's quite a gulf between stripping hyphens from a candidate onion address and doing strcmp(), versus either drilling into the candidate address to compute the alternative forms to check against the blacklist, or even requiring the blacklist to be 8x larger?
Yes, that's true. I'm definitely in favour of the "multiply by L (the order of the group) and check that you get the identity element; error with 'malformed address' if you don't" to get rid of the torsion point problem.
If the daily descriptor uploaded to the point Hash(onionaddr, dailyrand) contained Hash(onionaddr, dailyrand) *in* it (and is signed by the master onion privkey, of course), then tor could/should check that it reached that location through the "right" onion address.
I'm afraid the details of what's in that daily descriptor are not in my brain at the moment. Does it contain its own (daily blinded) name under the signature?
- Ian
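A rough Python sketch of the check suggested above: the descriptor carries its own location Hash(onionaddr, dailyrand), and the client recomputes that value from the address it actually used. Everything concrete here is an assumption made for illustration (SHA3-256 as the hash, the "|" separator, the function names), and signature verification with the master onion key is omitted entirely.

```python
import hashlib
import hmac


def descriptor_location(onion_address: str, daily_rand: bytes) -> bytes:
    """Stand-in for Hash(onionaddr, dailyrand); the real construction may differ."""
    return hashlib.sha3_256(onion_address.encode("ascii") + b"|" + daily_rand).digest()


def reached_via_right_address(address_used: str, daily_rand: bytes,
                              location_in_descriptor: bytes) -> bool:
    """Compare the location named inside the descriptor with the one we derived."""
    expected = descriptor_location(address_used, daily_rand)
    return hmac.compare_digest(expected, location_in_descriptor)
```

Under these assumptions, a descriptor published for one spelling of an address would fail the check when fetched via a different spelling, even if both spellings decode to the same pubkey.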
On 3 April 2017 at 16:59, Ian Goldberg iang@cs.uwaterloo.ca wrote:
>> How about this, though: I know that Tor doesn't want to be in the business of site reputation, but what if (eg) Protonmail offers a Onion "Safe Browsing" extension some day, of known-bad Onions for malware reasons?
> That's a quite good motivating example, thanks!
#Yay; I'm also thinking of other plugins (in the cleartext world, HTTPSEverywhere is the best example) which provide value to the user by mechanically mutating URIs which match some canonical DNS domain name. Because Onion addresses are more like Layer-2 addresses*, development of similar plugins benefits far more from enforced "canonicality" (sp?) than equally-functional DNS equivalents would need: there is no means to "group" three disparate Onion addresses together just because they are all owned by (say) Facebook, and if each address has 8 possible representations then that's 24 rules to match against...
>> There's quite a gulf between stripping hyphens from a candidate onion address and doing strcmp(), versus either drilling into the candidate address to compute the alternative forms to check against the blacklist, or even requiring the blacklist to be 8x larger?
> Yes, that's true. I'm definitely in favour of the "multiply by L (the order of the group) and check that you get the identity element; error with 'malformed address' if you don't" to get rid of the torsion point problem.
I heard that at AMS and it sounds a fabulous idea, although I am still too much of an EC noob to appreciate it fully. :-)
> If the daily descriptor uploaded to the point Hash(onionaddr, dailyrand) contained Hash(onionaddr, dailyrand) *in* it (and is signed by the master onion privkey, of course), then tor could/should check that it reached that location through the "right" onion address.
That sounds great, and I think it sounds an appropriate response, but again I am a Prop224 and EC noob. :-)
I would like, for two paragraphs, to go entirely off-piste and ask a possibly irrelevant and probably wrong-headed question:
/* BEGIN PROBABLY WRONG SECTION */
I view Onions as Layer-2 addresses, and one popular attack on Ethernet Layer 2 is ARP-spoofing. Imagine $STATE_ACTOR exfiltrates the private key material from $ONIONSITE and wants to silently and partially MITM the existing site without wholesale owning or tampering with it. Can they gain any benefit from multiple ("hardware MAC-address") keys colliding to one address?
Is there any greater benefit to $STATE_ACTOR from this than (say) publishing lots of fake/extra introduction points for $ONIONSITE and using those to interpose themselves into communications?
/* END PROBABLY WRONG SECTION */
> I'm afraid the details of what's in that daily descriptor are not in my brain at the moment. Does it contain its own (daily blinded) name under the signature?
<punt/> George?
-a
--
* Layer-2 analogy: https://twitter.com/AlecMuffett/status/802161730591793152
Following the Layer-2 Addressing analogy means that Ian, here:
If the daily descriptor uploaded to the point Hash(onionaddr, dailyrand) contained Hash(onionaddr, dailyrand) *in* it (and is signed by the master onion privkey, of course), then tor could/should check that it reached that location through the "right" onion address.
…has essentially just invented what Solaris (for one) calls "IP Strict Destination Multihoming":
http://www.informit.com/articles/article.aspx?p=101138&seqNum=4
-a :-)
Ian Goldberg iang@cs.uwaterloo.ca writes:
On Mon, Apr 03, 2017 at 04:40:52PM +0100, Alec Muffett wrote:
On 3 Apr 2017 3:48 p.m., "Ian Goldberg" iang@cs.uwaterloo.ca wrote:
The other thing to remember is that didn't we already say that
facebookgbiyeqv3ebtjnlntwyvjoa2n7rvpnnaryd4a.onion
and
face-book-gbiy-eqv3-ebtj-nlnt-wyvj-oa2n-7rvp-nnar-yd4a.onion
will mean the same thing? So we're already past the "one (st)ring to rule them all" point?
That's a great point, and I'm definitely interested and in favour of readability.
How about this, though: I know that Tor doesn't want to be in the business of site reputation, but what if (eg) Protonmail offers an Onion "Safe Browsing" extension some day, of known-bad Onions for malware reasons?
That's quite a good motivating example, thanks!
There's quite a gulf between stripping hyphens from a candidate onion address and doing strcmp(), versus either drilling into the candidate address to compute the alternative forms to check against the blacklist, or even requiring the blacklist to be 8x larger?
Yes, that's true. I'm definitely in favour of the "multiply by L (the order of the group) and check that you get the identity element; error with 'malformed address' if you don't" to get rid of the torsion point problem.
Hello again,
this is the second subthread of the AONT thread that grew too big for its own good, and it's about ed25519.
The topic of this subthread is the above ed25519 verification of onion addresses that Ian suggested a few times already.
So the idea is that before you use an onionaddress (as a client or whatever), you should extract its ed25519 pubkey and multiply it by the group order and make sure you get back the identity element to ensure that there are no torsion components to the key.
I'm pretty weak on crypto so I have some questions about this defence:
- Why are we doing this? Are we doing this because if we allow torsion components in the keys, someone could basically create multiple equivalent keys for each legit ed25519 key, using the Z/8Z torsion scalar as the tweak?
Or is the reason to defend against small subgroup attacks? I think not, because from my understanding these attacks mainly apply to DH protocols which is not what we are doing with onion addresses.
- Is this something that we should be doing for _any_ received ed25519 ever, even in other parts of the protocol?
- Should we do this verification also for received x25519 (DH) keys? It seems like RFC7748 is instead suggesting we ensure that the DH output is not all-zeroes. Are these two defences equivalent for our purposes?
Thanks for the help :)
(Also, please let me know if there are any other action items from the AONT thread that I missed.)
On Thu, Apr 06, 2017 at 03:37:35PM +0300, George Kadianakis wrote:
Hello again,
this is the second subthread of the AONT thread that grew too big for its own good, and it's about ed25519.
The topic of this subthread is the above ed25519 verification of onion addresses that Ian suggested a few times already.
So the idea is that before you use an onionaddress (as a client or whatever), you should extract its ed25519 pubkey and multiply it by the group order and make sure you get back the identity element to ensure that there are no torsion components to the key.
And also check that the key is not itself the identity element.
I'm pretty weak on crypto so I have some questions about this defence:
- Why are we doing this? Are we doing this because if we allow torsion components in the keys, someone could basically create multiple equivalent keys for each legit ed25519 key, using the Z/8Z torsion scalar as the tweak?
Yes, this one. Also it would be good to be alerted if someone's publishing malformed onion addresses for some reason.
Or is the reason to defend against small subgroup attacks? I think not, because from my understanding these attacks mainly apply to DH protocols which is not what we are doing with onion addresses.
Correct.
- Is this something that we should be doing for _any_ received ed25519 ever, even in other parts of the protocol?
Whenever you receive a value that is supposed to be in the EC group, it's safest to either (a) check that it is, and fail if not, or (b) perform the operation in a way that will behave well in either case. If you really have received an arbitrary 256-bit value from the Internet, and you want it to be in the ed25519 group, you really should do the "multiply by l" check, since not only might it have a torsion component (addressable by other means such as TSR in the other thread), but it may not be in the group at all, but rather in the "twist" group. Multiplying by l will simultaneously check both issues. But sometimes being in the wrong group isn't terrible; see the next paragraph.
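The "multiply by l" check can be sketched in pure Python as below. The curve arithmetic is adapted from the reference code in RFC 8032; the wrapper names `has_prime_order` and `check_onion_pubkey` are made up for illustration. This is a slow, non-constant-time sketch meant only to show the logic, not something to ship. Note that a 32-byte value whose y-coordinate is not on the curve fails at decompression, before any multiplication happens.

```python
p = 2**255 - 19                                     # field prime
L = 2**252 + 27742317777372353535851937790883648493  # prime group order
d = -121665 * pow(121666, p - 2, p) % p             # curve constant
SQRT_M1 = pow(2, (p - 1) // 4, p)                   # a square root of -1 mod p
IDENTITY = (0, 1, 1, 0)                             # extended coords (X, Y, Z, T)


def point_add(P, Q):
    A = (P[1] - P[0]) * (Q[1] - Q[0]) % p
    B = (P[1] + P[0]) * (Q[1] + Q[0]) % p
    C = 2 * P[3] * Q[3] * d % p
    D = 2 * P[2] * Q[2] % p
    E, F, G, H = B - A, D - C, D + C, B + A
    return (E * F % p, G * H % p, F * G % p, E * H % p)


def point_mul(s, P):
    Q = IDENTITY
    while s > 0:
        if s & 1:
            Q = point_add(Q, P)
        P = point_add(P, P)
        s >>= 1
    return Q


def point_equal(P, Q):
    # Compare x/z and y/z without inversions.
    return ((P[0] * Q[2] - Q[0] * P[2]) % p == 0
            and (P[1] * Q[2] - Q[1] * P[2]) % p == 0)


def recover_x(y, sign):
    if y >= p:
        return None
    x2 = (y * y - 1) * pow(d * y * y + 1, p - 2, p) % p
    if x2 == 0:
        return None if sign else 0
    x = pow(x2, (p + 3) // 8, p)
    if (x * x - x2) % p != 0:
        x = x * SQRT_M1 % p
    if (x * x - x2) % p != 0:
        return None                                 # y is not on the curve
    if (x & 1) != sign:
        x = p - x
    return x


def point_decompress(encoded: bytes):
    if len(encoded) != 32:
        return None
    y = int.from_bytes(encoded, "little")
    sign = y >> 255
    y &= (1 << 255) - 1
    x = recover_x(y, sign)
    if x is None:
        return None
    return (x, y, 1, x * y % p)


def has_prime_order(P) -> bool:
    """True if P is not the identity and l*P is the identity."""
    if point_equal(P, IDENTITY):
        return False
    return point_equal(point_mul(L, P), IDENTITY)


def check_onion_pubkey(encoded: bytes) -> bool:
    P = point_decompress(encoded)
    return P is not None and has_prime_order(P)
```

A key with any torsion component fails because l is odd: multiplying by l clears the prime-order part but leaves the component of order 2, 4 or 8 behind.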
- Should we do this verification also for received x25519 (DH) keys? It seems like RFC7748 is instead suggesting we ensure that the DH output is not all-zeroes. Are these two defences equivalent for our purposes?
The x25519 DH operation automatically clears the torsion component, since it insists your private key be a multiple of 8. If someone sends you a point on the twist instead of in the expected group, the DH will fail (you won't end up with the same shared key as the other party), but you will be saved from the small subgroup attack because of that "8". (That invariant of being a multiple of 8 is what you lose during multiplicative blinding, and what TSR restores.)
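The "multiple of 8" invariant comes from how X25519 decodes its private scalar. A minimal sketch of that clamping step, following the scalar decoding described in RFC 7748 (the function name `clamp` is just a label used here):

```python
def clamp(private_key: bytes) -> int:
    """Decode a 32-byte X25519 private key into its clamped scalar."""
    if len(private_key) != 32:
        raise ValueError("X25519 private keys are 32 bytes")
    k = bytearray(private_key)
    k[0] &= 248    # clear the low 3 bits: the scalar becomes a multiple of 8
    k[31] &= 127   # clear the top bit
    k[31] |= 64    # set bit 254
    return int.from_bytes(k, "little")
```

Since every clamped scalar is divisible by 8, multiplying a point by it wipes out any component of order 1, 2, 4 or 8, which is why the DH operation needs no separate torsion check.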
- Ian
On Mon, Apr 03, 2017 at 10:48:26AM -0400, Ian Goldberg wrote:
The other thing to remember is that didn't we already say that
facebookgbiyeqv3ebtjnlntwyvjoa2n7rvpnnaryd4a.onion
and
face-book-gbiy-eqv3-ebtj-nlnt-wyvj-oa2n-7rvp-nnar-yd4a.onion
will mean the same thing?
Did we? I admit that I haven't been paying enough attention to anything lately, but last I checked, we thought that was a terrible idea because people can make a bunch of different versions of the address, and use them as tracking mechanisms for users. (For example, I put two versions of the same address on my two different pages, and now when somebody goes to that onion address, I can distinguish which page they came from. In the extreme versions of this idea, I give a unique version of my address to the target, and then I can spot him when he uses it.)
Ultimately the problem is that the browser is too good at giving away the hostname that it thinks it's going to -- in various headers, in cross-site isolation, etc etc.
So, if we have indeed decided to allow many versions of format for onion addresses, I hope we thought through this attack and decided it was worth it. :)
--Roger
Ian Goldberg iang@cs.uwaterloo.ca writes:
On Mon, Apr 03, 2017 at 02:53:17PM +0100, Alec Muffett wrote:
On 3 April 2017 at 13:04, George Kadianakis desnacked@riseup.net wrote:
I'm calling it weird because I'm not sure how an attacker can profit from being able to provide two addresses that correspond to the same key, but I can probably come up with a few scenarios if I think about it.
Hi George!
I'll agree it's a weird edge case :-)
I think the reason my spider-sense is tingling is because years of cleaning up after intrusions has taught me that sysadmins and human beings are very bad at non-canonical address formats, especially where they combine them with either blacklisting, or else case-statements-with-default-conditions.
If one creates scope for saying "the address is <foo>.onion but you can actually use <foo'>.onion or <foo''>.onion which are equivalent" - then someone will somehow leverage that either a) for hackery, or b) for social engineering.
Compare:
- http://017700000001
- http://2130706433
- http://0177.0.0.1 <- this one tends to surprise people
- http://127.0.0.1
…and the sort of fun shenanigans that can be done with those "equivalent forms"
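For anyone surprised by that list, here is a small Python sketch of the classic inet_aton()-style parsing rules that make all four strings equivalent. It is a from-scratch illustration (the function name `parse_legacy_ipv4` is made up), not the code of any particular resolver or browser.

```python
def parse_legacy_ipv4(text: str) -> str:
    """Parse 1 to 4 dot-separated parts, each decimal, octal (leading 0)
    or hex (leading 0x), and return the dotted-quad form."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected 1 to 4 parts")
    nums = []
    for part in parts:
        if not (part.isascii() and part.isalnum()):
            raise ValueError(f"bad part: {part!r}")
        if part.lower().startswith("0x"):
            nums.append(int(part, 16))
        elif part.startswith("0") and len(part) > 1:
            nums.append(int(part, 8))
        else:
            nums.append(int(part, 10))
    *head, last = nums
    if any(n > 0xFF for n in head):
        raise ValueError("leading parts must fit in one byte")
    # The final part fills all remaining bytes of the 32-bit address.
    if last >= 256 ** (4 - len(head)):
        raise ValueError("final part too large")
    value = last
    for i, n in enumerate(head):
        value |= n << (8 * (3 - i))
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

A filter that only string-matches "127.0.0.1" misses the other three spellings, which is the canonicalisation failure being warned about.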
People who've been trained not to type [X] into their browser, might be convinced to type [X']
It's a lot easier for people to cope with there being one-and-only-one viable form for any given hostname or address-representation.
But as I said to Alec in AMS, anyone on the internet can register "facebook.mydomain.com" and have the A record point to the same thing as facebook.com. So there are always alternate names for any given website. TLS, of course, is designed to protect against these shenanigans.
Hey,
sorry for the slow responses to this thread. Got lots of post-meeting backlog to handle, and I'm also working on the various ed25519 stuff.
Specifically, I'm now working on the suggested check of multiplying any received curve25519 point with the group order and ensuring the result is the identity element.
Prop224 *also* (mostly) protects against these shenanigans, because even if there were two onion addresses that resolved to the same pubkey, the daily blinded version incorporates the original onion address (not just the pubkey, right? *Right?*), so the alternate address-with-same-pubkey won't actually point anywhere. However, an adversary can upload a descriptor there; I'm not sure what the implications of that are just now.
Actually, I *don't* think that the blind factor of the derived key incorporates the actual onion address. Citing the proposal:
Let the basepoint be written as B. Assume B has prime order l, so lB=0. Let a master keypair be written as (a,A), where a is the private key and A is the public key (A=aB)
To derive the key for a nonce N and an optional secret s, compute the blinding factor h as H(A | s, B, N), and let:
Perhaps we can add another component to h as follows: h = H(A, s, B, N, ONIONADDRESS) where ONIONADDRESS is a string representation of the service's onion address.
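A rough Python sketch of that suggestion, to show the effect rather than the wire format. Everything about the encoding here is an assumption: SHA3-256 as H, the 2-byte length prefix on each field, and the ASCII encoding of the address are all illustrative choices, and the proposal's actual H and input layout may differ.

```python
import hashlib


def blinding_factor(A: bytes, s: bytes, B: bytes, N: bytes,
                    onion_address: str) -> bytes:
    """Illustrative h = H(A, s, B, N, ONIONADDRESS).

    Each field is length-prefixed (an assumption of this sketch) so that
    field boundaries cannot be shifted without changing the hash.
    """
    h = hashlib.sha3_256()
    for field in (A, s, B, N, onion_address.encode("ascii")):
        h.update(len(field).to_bytes(2, "big"))
        h.update(field)
    return h.digest()
```

With the address string mixed in, two different spellings of an address for the same public key A yield different blinding factors, so they derive different blinded keys and the alternate spelling does not lead to the service's descriptor.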
I think this code is already implemented, but this might be worth fixing anyhow. I'll make a ticket.
The other thing to remember is that didn't we already say that
facebookgbiyeqv3ebtjnlntwyvjoa2n7rvpnnaryd4a.onion
and
face-book-gbiy-eqv3-ebtj-nlnt-wyvj-oa2n-7rvp-nnar-yd4a.onion
will mean the same thing? So we're already past the "one (st)ring to rule them all" point?
I don't think we have actually decided on such a feature yet. It was suggested but the tradeoffs are not clearly skewed to the "let's do it" direction.
Cheers!
On 27 Mar (04:58:34), Ian Goldberg wrote:
On Mon, Mar 27, 2017 at 01:59:42AM -0400, Ian Goldberg wrote:
To add an aside from a discussion with Teor: the entire "version" field could be reduced to a single - probably "zero" - bit, in a manner perhaps similar to the distinctions between Class-A, Class-B, Class-C... addresses in old IPv4.
Thus: if the first bit in the address is zero, then there is no version, and we are at version 0 of the format
If the first bit is one, we are using v1+ of the format and all bets are off, except that the obvious thing then to do is count the number of 1-bits (up to some limit) and declare that to be version number. Once we're up to 3 or 4 or 7 or 8 one-bits, then shift version encoding totally.
Teor will correct me if I misquote him, but the advantage here was:
a) the version number is 1 bit, ie: small, for the foreseeable / if we get it right
b) in pursuit of smallness, we could maybe dump the hash in favour of an AONT + eyeballs, which would give back a bunch of extra bits
result: shorter addresses, happier users.
You indeed do not require a checksum under an AONT, but you do require redundancy if you want to catch typos. Something like
base64( AONT( pubkey || 0x0000 ) || version)
is fine. If you want "version" to be a single bit, then the AONT would have to operate on non-full bytes, which is a bit (ha!) annoying, but not terrible. In that case, "0x0000" would actually be 15 bits of 0, and version would be 1 bit. This would only save 1.4 base32 characters, though. If you took off some more bits of the redundancy (down to 8 bits?), you would be able to shave one more base32 char. And indeed, if you make the redundancy just a single byte of 0x00, then the extra 0-bit for the "version" actually fits neatly in the one leftover bit of the base32 encoding, I think, so the AONT is back to working on full bytes.
But is a single byte of redundancy enough? It will let through one out of every 256 typos. (I thought we had spec'd 2 bytes for the checksum now, but maybe I misremember? I'm also assuming we're using a simple 256-bit encoding of the pubkey, rather than something more complex that saves ~3 bits.)
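The length arithmetic in the two paragraphs above is easy to check mechanically. The sketch below assumes a 256-bit pubkey and 5 bits per base32 character; the layout labels are mine, with the 8-bit-version row standing in for the one-byte-version baseline and the 13-bit row matching the 54-character variant described later in the thread.

```python
def base32_chars(bits: int) -> int:
    """Characters needed to carry `bits` bits at 5 bits per character."""
    return -(-bits // 5)   # ceiling division without floats


KEY_BITS = 256
layouts = {
    "16-bit redundancy + 8-bit version": (16, 8),
    "15-bit redundancy + 1-bit version": (15, 1),
    "13-bit redundancy + 1-bit version": (13, 1),
    "8-bit redundancy + 1-bit version": (8, 1),
}

for name, (redundancy, version) in layouts.items():
    total = KEY_BITS + redundancy + version
    print(f"{name}: {total} bits, {base32_chars(total)} chars, "
          f"1 in {2 ** redundancy} typos undetected")
```

Going from an 8-bit to a 1-bit version saves 7 bits, i.e. 7/5 = 1.4 characters, which is the figure quoted above; with 8 bits of redundancy the total is 265 bits, exactly 53 characters.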
(Heading to the airport.)
OK, here are the details of this variant of the proposal. Onion addresses are 54 characters in this variant, and the typo-resistance is 13 bits (1/8192 typos are not caught).
Encoding:
raw is a 34-byte array. Put the ed25519 key into raw[0..31] and 0x0000 into raw[32..33]. Note that there are really only 13 bits of 0's for redundancy, plus the 0 bit for the version, plus 2 unused bits in raw[32..33].
Do the AONT. Here G is a hash function mapping 16-byte inputs to 18-byte outputs, and H is a hash function mapping 18-byte inputs to 16-byte outputs. Reasonable implementations would be something like:
    G(input) = SHA3-256("Prop224Gv0" || input)[0..17]
    H(input) = SHA3-256("Prop224Hv0" || input)[0..15]

    raw[16..33] ^= G(raw[0..15])
    # Clear the last few bits, since we really only want 13 bits of redundancy
    raw[33] &= 0xf8
    raw[0..15] ^= H(raw[16..33])
Then base32-encode raw[0..33]. The 56-character result will always end in "a=" (the two unused bits at the end of raw[33]), so just remove that part.
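The encoding steps above translate fairly directly into Python. This is a sketch of the variant as described in this mail, using `hashlib.sha3_256` for G and H with the suggested prefixes; the helper names (`xor`, `encode_address`) and the choice of lowercase output are mine.

```python
import base64
import hashlib


def G(data16: bytes) -> bytes:
    """16-byte input -> 18-byte output."""
    return hashlib.sha3_256(b"Prop224Gv0" + data16).digest()[:18]


def H(data18: bytes) -> bytes:
    """18-byte input -> 16-byte output."""
    return hashlib.sha3_256(b"Prop224Hv0" + data18).digest()[:16]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode_address(pubkey: bytes) -> str:
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte ed25519 public key")
    raw = bytearray(pubkey + b"\x00\x00")          # raw[32..33] = 0x0000
    raw[16:34] = xor(raw[16:34], G(bytes(raw[0:16])))
    raw[33] &= 0xF8                                # 13 redundancy bits, version 0, 2 unused
    raw[0:16] = xor(raw[0:16], H(bytes(raw[16:34])))
    encoded = base64.b32encode(bytes(raw)).decode("ascii").lower()
    return encoded[:-2]                            # drop the constant trailing "a="
```

Because both halves are masked by a hash of the other half, changing any single bit of the key changes roughly half the bits of the encoded string.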
Decoding:
Base32-decode the received address into raw[0..33]. Depending on your base32 decoder, you may have to stick the "a=" at the end of the address first. The low two bits were unused; be sure the base32 decoder sets them to 0. The next lowest bit (raw[33] & 0x04) is the version bit. Ensure that (raw[33] & 0x04 == 0); if not, this is a different address format version you don't understand.
I do understand the problem (I think) with the version field being longer than a single bit, but this proposal causes some problems on the engineering and protocol side. Here is why:
The current plan is to put the HS protocol version in the address because when we fetch the descriptor from an HSDir, we use a URL of the form "/tor/hs/<version>/<z>" where <z> is the blinded key.
The reason we put the version number in the URL as above is that we might NOT use a 32-byte key in a future version when looking up the descriptor, so the version tells us what <z> is. Second, imagine a world in a few years where we have v3, v4 and v5 all living happily together and the addresses are all 54 characters. On the client side, it would be really bad to do a fetch for every possible version and see which one works; having the version in the address prevents that.
I get that the solution to "which version to look up" with this proposed change is: if it is a v3 address, the version bit is 0; else, if that bit is 1, try to decode it as v4, and repeat from v4 up to vN until you get something that works. However, this "locks" us in an interesting position, which is that every new version needs a "new" address scheme. And that is the part I'm unsure about here... Are we just going to let our future selves deal with the version field problem in v4+? :)
A protocol version change can be as benign as changing a single field in the descriptor, which can lead to minor changes in parsing the cells, for instance. Do we really want to go and think of a new address scheme every time we want to improve the protocol and have to bump the version? Risking lots of bikeshedding, security implications and so on _every time_?
The extra complexity here seems intense for what we really win overall with this construction?
Thanks! David
Undo the AONT:
    raw[0..15] ^= H(raw[16..33])
    raw[16..33] ^= G(raw[0..15])
    # Clear the last few bits, as above
    raw[33] &= 0xf8
Check the redundancy by ensuring that raw[32..33] = 0x0000. If not, there was a typo in the address. (Note again that since we explicitly cleared the low 3 bits of raw[33], there are really only 13 bits of checking here.)
raw[0..31] is then the pubkey suitable for use in Ed25519. As before (and independently of the AONT stuff), you could sanity-check it to make sure that (a) it is not the identity element, and (b) L times it *is* the identity element. (L is the order of the Ed25519 group.) Checking (a) is important; checking (b) isn't strictly necessary for the reasons given before, but is still a sensible thing to do. If you don't check (b), you actually have to check in (a) that the pubkey isn't one of 8 bad values, not just the identity. So just go ahead and check (b) to rest easier. ;-)
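Putting the decoding steps together (base32-decode, version bit, undo the AONT, redundancy check), a Python sketch might look like this. As with any sketch of this variant, G and H use `hashlib.sha3_256` with the prefixes suggested earlier; the function name `decode_address` and the error messages are mine, and the Ed25519 sanity checks (a) and (b) are left out.

```python
import base64
import hashlib


def G(data16: bytes) -> bytes:
    return hashlib.sha3_256(b"Prop224Gv0" + data16).digest()[:18]


def H(data18: bytes) -> bytes:
    return hashlib.sha3_256(b"Prop224Hv0" + data18).digest()[:16]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def decode_address(addr: str) -> bytes:
    """Return the 32-byte pubkey, or raise ValueError."""
    if len(addr) != 54:
        raise ValueError("expected 54 base32 characters")
    # Re-attach "a=": this also forces the two unused low bits to 0.
    raw = bytearray(base64.b32decode(addr.upper() + "A="))
    if raw[33] & 0x04:
        raise ValueError("unknown address format version")
    # Undo the AONT in the reverse order of encoding.
    raw[0:16] = xor(raw[0:16], H(bytes(raw[16:34])))
    raw[16:34] = xor(raw[16:34], G(bytes(raw[0:16])))
    raw[33] &= 0xF8
    if raw[32] != 0 or raw[33] != 0:
        raise ValueError("redundancy check failed: probably a typo")
    return bytes(raw[0:32])
```

Invalid base32 characters also surface as `ValueError`, since `binascii.Error` is a subclass of it.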
This version contains two calls to SHA3, as opposed to the one such call in the non-AONT (but including a checksum) version. The benefit is Alec's (and others') desire that there cannot be any bits an attacker could twiddle that would leave both the key the same and the address looking OK to someone who just spot-checks say the beginning and/or the end.
--
Ian Goldberg
Professor and University Research Chair
Cheriton School of Computer Science
University of Waterloo
_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On 05 Apr (09:50:38), David Goulet wrote:
Another thing about this I just thought of. This AONT construction seems wise to use. But it's still not entirely clear to me why we need a 1bit version field. Taking this:
base64( AONT( pubkey || 0x0000 ) || version)
If the version is 1 byte, then only the end of the address can be mangled with and if it is, the tor client won't be able to fetch the descriptor because of how the URL is constructed (correct version number is needed).
So I really don't see the phishing attack here being successful at all...?
Can you enlighten what attack we are trying to avoid here that we require a 1bit version field?
Thanks! David
-- zKn486Kfk5AQCOU9QOcbno9j3/MtJxuqpxCY2RSsPoo=
On Wed, Apr 05, 2017 at 10:02:07AM -0400, David Goulet wrote:
Another thing about this I just thought of. This AONT construction seems wise to use. But it's still not entirely clear to me why we need a 1bit version field. Taking this:
base64( AONT( pubkey || 0x0000 ) || version)
If the version is 1 byte, then only the end of the address can be mangled with and if it is, the tor client won't be able to fetch the descriptor because of how the URL is constructed (correct version number is needed).
So I really don't see the phishing attack here being successful at all...?
Can you enlighten what attack we are trying to avoid here that we require a 1bit version field?
I believe the danger Alec was wanting to avoid was that someone (not the onion service owner) could take an existing onion address, bump the version number (which wouldn't change the vanity beginning of the address), and upload the very same descriptor to the resulting blinded address (under the new version number). Then the modified address would work just like the original.
As mentioned elsewhere in the thread, this is solved if that descriptor contains (under the signature by the "master" onion key) the actual onion address you were expected to use to get there. Does it? If so, I think we don't have to worry about this problem at all.
On 5 April 2017 at 15:11, Ian Goldberg iang@cs.uwaterloo.ca wrote:
I believe the danger Alec was wanting to avoid was that someone (not the onion service owner) could take an existing onion address, bump the version number (which wouldn't change the vanity beginning of the address), and upload the very same descriptor to the resulting blinded address (under the new version number). Then the modified address would work just like the original.
In a nutshell, yes.
I've been having a discussion with Taylor Campbell off-list, and I wrote:
… let me try something on you: The year is 2019. What would _you_ do in order to surface to the user that the onion address in front of them, one with a given public key which they've previously used and trusted before such that the leftmost 32 bytes, base32 encoded, are familiar to them, is actually a downgraded version-2 format of address against which a bug is being exploited by (say) the FBI rather than the more secure version-3 form which they were expecting and had previously used, when all of the information pertinent to versions and checksums is at the right-hand-end of the encoded address?
This is basically where I am coming from.
My thinking: Make it brittle. Mix the version (etc) into the represented form so that if one messes with a single bit, one perceptibly impacts the entire string representation of the onion address. How would you attack this?
...and also:
do we want to be teaching users that:
--- eh2tndsmiher4dqv266z5ii2xkt6brx2llwliq3jim233e5c5bc5, and
--- eh2tndsmiher4dqv266z5ii2xkt6brx2llwliq3jim233e5d5bc5
...are actually the same thing, but if and only if they differ in the N-5'th character?
...and:
… up front I'll just say that my perspective of this class of threat comes from observations like
a) people are creative, and if you give them malleability they will use it to create onion addresses including embedded "poop-emoji" and the like.
b) people generalise, so that having learned that $SOME_CHARACTER in an onion address is malleable, they will assume that most/all of them are and subsequently fall for phishing attacks.
c) people are, as a group, given the entire Tor prop224 ecosystem, infinitely more creative than I can be at finding ways to exploit it, therefore it makes sense to screw down the crypto to present as small an attack surface as possible.
...and:
An old programmer maxim is that one should provide for Zero, One, or an Infinite number of any resource.
Since we do not desire an infinite number of representations of an onion address (per Roger) - and zero would not be useful, we should shoot for one, and only one.
Not a cryptographic argument, but I think it's a human one. :-)
There's a lot more, but I don't want to bury folk with a huge multi-message e-mail exchange; plus there is a lot of useful context "up-thread". :-)
As mentioned elsewhere in the thread, this is solved if that descriptor contains (under the signature by the "master" onion key) the actual onion address you were expected to use to get there. Does it? If so, I think we don't have to worry about this problem at all.
I hope it does. That sounds very much like what I expect to see in other network stacks. :-)
-a
Ian Goldberg iang@cs.uwaterloo.ca writes:
On Wed, Apr 05, 2017 at 10:02:07AM -0400, David Goulet wrote:
Another thing about this I just thought of. This AONT construction seems wise to use. But it's still not entirely clear to me why we need a 1bit version field. Taking this:
base64( AONT( pubkey || 0x0000 ) || version)
If the version is 1 byte, then only the end of the address can be mangled with and if it is, the tor client won't be able to fetch the descriptor because of how the URL is constructed (correct version number is needed).
So I really don't see the phishing attack here being successful at all...?
Can you enlighten what attack we are trying to avoid here that we require a 1bit version field?
I believe the danger Alec was wanting to avoid was that someone (not the onion service owner) could take an existing onion address, bump the version number (which wouldn't change the vanity beginning of the address), and upload the very same descriptor to the resulting blinded address (under the new version number). Then the modified address would work just like the original.
As mentioned elsewhere in the thread, this is solved if that descriptor contains (under the signature by the "master" onion key) the actual onion address you were expected to use to get there. Does it? If so, I think we don't have to worry about this problem at all.
Hello people,
the AONT thread has grown to an immense size and includes all sorts of discussions, so I will split it into two smaller threads with just action items so that we move this forward ASAP (as this interacts with our current implementation efforts).
From skimming the thread, this seems like the general discussion flow:
- "Let's do AONT so that no one can tweak the onion address while keeping the same blinded pubkey so that people can't create multiple onion addresses that point to the same key and look almost the same"
- "Hm, but there are no bits to tweak apart from the version field"
- "But maybe v4->v3 downgrade attacks are possible using the version field, so let's include the whole onionaddress (including version) into the blinded key derivation"
- But then maybe in 2020 an attacker is able to replay a v4 descriptor into an HSDir as a v3 descriptor, and then do a downgrade attack by persuading a victim to fetch the v3 descriptor (see Ian/Alec latest mails)
And I think then we ended up with:
"Then let's include the canonical onionaddress (including version) into the descriptor so that clients can verify that they used the onionaddress that the onionservice was intending for them to use"
So I guess the current suggested plan is to add an extra descriptor field with the onionaddress (or its hash) into the _encrypted parts_ of the descriptor so that clients can do this extra verification to defend against downgrade attacks.
I think this seems like a reasonable defence here, and safer and more engineering-friendly than the AONT stuff (see David's email). We should just make sure that this plan does not interact badly with things like onionbalance and future name systems.
Do you think this makes sense? If yes, I will write a spec patch in the next few days.
And I think this sums up the discussion wrt onion address encoding. I'm going to start a new thread about the ed25519-related suggestions that were thrown into this thread.
Cheers!
George Kadianakis desnacked@riseup.net writes:
And here is a torspec branch that specifies this behavior: https://gitweb.torproject.org/user/asn/torspec.git/commit/?h=prop224-desc-ph...
We basically add the canonical onion address in the inner encrypted layer of the descriptor, and expect the client to verify it. I made this feature optional in case we ever decide it was a bad idea.
Please let me know if you think this behavior is worthwhile merging upstream and implementing.
Thanks! :)
On 11/04/17 11:45, George Kadianakis wrote:
We basically add the canonical onion address in the inner encrypted layer of the descriptor, and expect the client to verify it. I made this feature optional in case we ever decide it was a bad idea.
Is the version number also included in the blinded key derivation? I haven't been keeping up with prop224 developments, so apologies if that's already been settled, but in your previous email it sounded like it was one of the suggestions but not one of the action items.
If the version number is included in the descriptor but not in the blinded key derivation, can a service publish descriptors for multiple protocol versions? Would there be a conflict if the HS directories store the descriptors under the same blinded key?
Cheers, Michael
Michael Rogers michael@briarproject.org writes:
On 11/04/17 11:45, George Kadianakis wrote:
We basically add the canonical onion address in the inner encrypted layer of the descriptor, and expect the client to verify it. I made this feature optional in case we ever decide it was a bad idea.
Is the version number also included in the blinded key derivation? I haven't been keeping up with prop224 developments, so apologies if that's already been settled, but in your previous email it sounded like it was one of the suggestions but not one of the action items.
That's a fine question, and it made me think deeper about our options.
I think both of the following suggestions from my previous email aim to protect from the same attacks:
  a) Include version number in blinded key derivation formula
  b) Include canonical onion address in descriptor
Both (a) and (b) above aim to protect against scenarios where an attacker (without private keys) takes a legitimate onion address, tweaks its metadata bits (version/whatever), and creates a different-but-equivalent onion address that has the exact same behavior as the original one (points to the same 25519 key, same descriptor, etc.). See Alec's and Ian's emails for a demonstration of how this can be exploited:
  https://lists.torproject.org/pipermail/tor-dev/2017-April/012160.html
  https://lists.torproject.org/pipermail/tor-dev/2017-April/012159.html
I'm pretty sure that both (a) and (b) defend against Alec's and Ian's attack since:
- With (a), the different-but-equivalent onion address would produce a different blinded key from the original onion address. The attacker would not be able to forge a signature for the descriptor since they don't know the private part of the new blinded key, so the descriptor would not be accepted by the HSDir or the client.
- With (b), the different-but-equivalent onion address would work, but when the client fetches the descriptor, the client would verify the new "canonical-onion-addr" field, notice that they reached this onion service from another address, and reject the descriptor. It's like we are including an SSL CN field in our HS descriptors.
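A rough sketch of why option (a) separates the two addresses. This is not the prop224 key-blinding formula: real blinding applies a hash-derived factor to the Ed25519 key, whereas the function below only computes a hash, and its name, label string and field layout are all made up for illustration. The point is just that once the version byte feeds the derivation, a tweaked version yields a different blinding parameter and hence a different blinded key, for which the attacker holds no private key.

```python
import hashlib


def toy_blinding_param(pubkey: bytes, version: int, period: int) -> bytes:
    """Hypothetical stand-in for a blinding factor that also covers the version."""
    h = hashlib.sha3_256()
    h.update(b"toy-key-blind")           # made-up domain separation label
    h.update(pubkey)
    h.update(bytes([version]))           # option (a): version is an input
    h.update(period.to_bytes(8, "big"))  # time period number
    return h.digest()


pubkey = bytes(range(32))  # placeholder, not a real Ed25519 key
original = toy_blinding_param(pubkey, version=3, period=17000)
tweaked = toy_blinding_param(pubkey, version=4, period=17000)

# Same key and period, different version: the parameters no longer match.
print(original != tweaked)
```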
I considered (b) easier to understand and reason about, and it also seems to protect against a wider variety of descriptor replay attacks, and that's why I suggested we go with (b).
My main fear with (b) is that in the future we might come up with a new load-balancing scheme of sorts that would get screwed by the "canonical-onion-addr" field. It does not seem to pose a problem with onionbalance or stealth-auth kind of schemes, so all is good so far. Please let me know if you can think of a scenario where (b) is a bad idea.
If the version number is included in the descriptor but not in the blinded key derivation, can a service publish descriptors for multiple protocol versions? Would there be a conflict if the HS directories store the descriptors under the same blinded key?
Yes it's possible to publish descs for multiple protocol versions, since we use a different URL for each version. Quoting from spec:
Hidden service descriptors conforming to this specification are uploaded with an HTTP POST request to the URL /tor/hs/<version>/publish relative to the hidden service directory's root, and downloaded with an HTTP GET request for the URL /tor/hs/<version>/<z> where <z> is a base64 encoding of the hidden service's blinded public key and <version> is the protocol version which is "3" in this case.
Also the HSDirs store the descriptors using both the publickey and the version as indices, so this should not be a problem.
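For readers who prefer code, the URL layout and the (version, key) indexing described above could look roughly like this. It is a sketch under assumptions: the class and function names are hypothetical, the in-memory dict is only a model of whatever an HSDir really does, and the exact base64 variant used for <z> is not settled by the quoted text.

```python
import base64
from typing import Dict, Optional, Tuple


def fetch_path(version: int, blinded_pubkey: bytes) -> str:
    """Build /tor/hs/<version>/<z>, with <z> the base64 of the blinded key."""
    z = base64.b64encode(blinded_pubkey).decode("ascii")  # variant is an assumption
    return f"/tor/hs/{version}/{z}"


class ToyHSDirStore:
    """Toy model of an HSDir cache indexed by (version, blinded public key)."""

    def __init__(self) -> None:
        self._descs: Dict[Tuple[int, bytes], bytes] = {}

    def publish(self, version: int, blinded_pubkey: bytes, desc: bytes) -> None:
        self._descs[(version, blinded_pubkey)] = desc

    def fetch(self, version: int, blinded_pubkey: bytes) -> Optional[bytes]:
        return self._descs.get((version, blinded_pubkey))


blinded = bytes(32)  # placeholder blinded key
store = ToyHSDirStore()
store.publish(3, blinded, b"v3 descriptor")
store.publish(4, blinded, b"v4 descriptor")

# Same blinded key, two versions: neither path nor stored entry collides.
print(fetch_path(3, blinded) != fetch_path(4, blinded))
print(store.fetch(3, blinded), store.fetch(4, blinded))
```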
---
Thanks for all the feedback people.
Greetings from Athens!
On 11 Apr (13:45:41), George Kadianakis wrote:
We basically add the canonical onion address in the inner encrypted layer of the descriptor, and expect the client to verify it. I made this feature optional in case we ever decide it was a bad idea.
Yeah, I was puzzled about the optional idea, but I agree that being able to roll back without having clients freak out could be a good idea.
Also, if we ever come up with stealth client authorization one day, that field could be affected in some ways... unsure but still, worth being more safe than very sorry and stuck with that field :).
Last comment. Can we maybe add a sentence somewhere that explains what the client should do with this field? I assume it is as simple as doing a strcmp() against the original address you requested the descriptor for, but still.
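Something like the following is presumably what the strcmp() check would amount to on the client side. This is a guess at the behaviour, not text from the spec branch: the function names are hypothetical, the case-folding step is an assumption, and the addresses are placeholders. A missing field is accepted because the feature was described as optional.

```python
from typing import Optional


def normalize(addr: str) -> str:
    """Trim whitespace and lowercase the address (normalisation is an assumption)."""
    return addr.strip().lower()


def descriptor_matches_request(requested_addr: str,
                               canonical_addr: Optional[str]) -> bool:
    """Return True if the descriptor may be used for the requested address."""
    if canonical_addr is None:
        # The field is optional, so its absence is not treated as an error.
        return True
    # Otherwise the address the client asked for must equal the canonical one.
    return normalize(requested_addr) == normalize(canonical_addr)


# Placeholder addresses, not real onion addresses.
print(descriptor_matches_request("exampleaddr3.onion", "exampleaddr3.onion"))
print(descriptor_matches_request("exampleaddr4.onion", "exampleaddr3.onion"))
print(descriptor_matches_request("exampleaddr3.onion", None))
```

A client would reject the descriptor in the second case, since it was reached through an address other than the one the service declared.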
Thanks! David
Please let me know if you think this behavior is worthwhile merging upstream and implementing.
Thanks! :)
Date: Sun, 26 Mar 2017 14:24:41 +0200
From: Alec Muffett alec.muffett@gmail.com
This is a point of significant concern because of issues like phishing and passing-off - by analogy: t0rpr0ject.0rg versus torproject.org - and other games that can be played with a prop224 address now, or in future, to game user experience. [...] The result would be onion addresses which are less "tamperable" / more deterministic, that closer to one-and-only-one published onion address will correspond to an onion endpoint.
What does the panel think?
What is the threat model an AONT defends against here, and what security properties do we aim to provide against that threat?
Here are a few candidates. Suppose I own 0123456789deadbeef2.onion, where 2 is the onion version number.
T1. Adversary does not know 0123456789deadbeef2.onion but controls all onion service directories.
    (SP1) Adversary can't discover 0123456789deadbeef2.onion or thereby distinguish descriptors for 0123456789deadbeef2.onion from other descriptors simply by controlling what is in the directories.
      -> With or without AONT, since the onion service descriptors are encrypted, the adversary can't learn their content anyway.

T2. Adversary knows 0123456789deadbeef2.onion and controls all Tor nodes except for the onion service server and client.
    (SP2) Adversary cannot impersonate 0123456789deadbeef2.onion.
      -> With or without AONT, adversary can't make onion descriptor signatures that are verified by the 0123456789deadbeef2.onion key unless they have broken Ed25519.
    (SP3) Adversary cannot impersonate 0123456789deadbeefN.onion for any N *other* than 2.
      -> With or without AONT, if the signature on the onion descriptor always covers the complete .onion address, including the version number, the adversary can't do this without also being able to forge signatures for 0123456789deadbeef2.onion anyway and thus break Ed25519.
    (SP4) Adversary cannot DoS 0123456789deadbeef2.onion.
      -> With or without AONT, if adversary knows legitimate .onion address key, they can already remove any onion descriptors with signatures verified by the .onion address key, even if the signatures are decrypted. So we can't provide this security property anyway as long as the adversary knows the legitimate .onion address.

T3. Adversary (a) knows 0123456789deadbeef2.onion, (b) can spend compute to find a private key whose public key has some chosen bits, and (c) can submit descriptors to onion directories.
    (SP5) Adversary cannot match all except replacement of l by 1, o by 0, &c.
      -> With or without AONT, this confusion is already excluded by base32 encoding.
    (SP6) Adversary cannot match all except long enough suffix.
      -> Finding priv to fix prefix of Ed25519_priv2pub(priv) || cksum is almost surely just as hard as finding priv to fix prefix of AONT(Ed25519_priv2pub(priv) || cksum || version) or any other arrangement of cksum and version.
(This assumes the AONT has low AT cost to evaluate -- but if you choose an AONT with high AT cost, that will severely penalize legitimate users of onion services, and also limit vanity onions to major corporations like Facebook and Google.)
So what security properties does an AONT give against what threat models? I'm probably missing something obvious here, but I expect it will be helpful to articulate exactly what function it serves, for future readers.