Hello list,
we've had discussions over the past years about how to encode prop224 onion addresses. Here is the latest thread: https://lists.torproject.org/pipermail/tor-dev/2016-December/011734.html
Bikeshedding is over; it's time to finally pick a scheme! My suggested scheme basically follows from the discussion on that thread, and is heavily based on the Bitcoin address format: https://en.bitcoin.it/wiki/Base58Check_encoding https://en.bitcoin.it/wiki/Technical_background_of_version_1_Bitcoin_address...
Here is the suggested scheme:
  onion_address = base32(version + pubkey + checksum)
  checksum = SHA3(".onion checksum" + version + pubkey)

  where:
    pubkey is 32 bytes (ed25519)
    version is one byte
    checksum is _truncated_ to two bytes
With the above construction, onion_address ends up being 56 characters long (excluding the ".onion"):
  tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4gg.onion
  tcrdnadkefvbdm3u56kz6lfh6v5lr24fpog5vzsy4n3djr2ymueu34ws.onion
  tcdw7lwmtp5pbwj2w7wf6amxdhmc62qitj2teu376r5s2fqke4r3uiq6.onion
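A minimal Python sketch of the construction above, for illustration only. The message does not say which SHA3 variant is meant or which bytes of the digest are kept, so SHA3-256 and "the first two bytes" are assumptions here, as is the function name encode_draft_address. The all-zero key is a placeholder, not a real ed25519 key.

```python
import base64
import hashlib

CHECKSUM_PREFIX = b".onion checksum"


def encode_draft_address(pubkey: bytes, version: int = 0x98) -> str:
    """Draft layout from this message: base32(version + pubkey + checksum)."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte ed25519 public key")
    ver = bytes([version])
    # Assumption: SHA3-256, truncated to its first two bytes.
    checksum = hashlib.sha3_256(CHECKSUM_PREFIX + ver + pubkey).digest()[:2]
    # 1 + 32 + 2 = 35 bytes = 280 bits, which is exactly 56 base32 characters,
    # so the encoder emits no '=' padding.
    return base64.b32encode(ver + pubkey + checksum).decode("ascii").lower()


address = encode_draft_address(bytes(32))  # placeholder key, illustration only
print(address + ".onion", len(address))
```

With the version byte 0x98 the result starts with 't', matching the example addresses.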
If people like the above suggestion, I will take the effort to engrave it in prop224.
Here is the discussion section. Please provide feedback!
[D1] How to use version field:
The version field is one byte long. If we use it as an integer we can encode 256 values in it; if we use it as a bitmap we could encode properties and such.
My suggestion is to simply use it as an integer like Bitcoin does. So we can assign value \x01 to normal onion services, and in the future we can assign more version tags if we need to. For example, we can give a different version field to onion services in the testnet. We can also reserve a range of values for application-specific purposes.
[D1.1] Default version value:
The next question is what version value to assign to normal onion services. In the above scheme where:
onion_address = base32(version + pubkey + checksum)
the value of 'version' basically determines the first two characters of the onion address. In Bitcoin, they've made it such that the default version value basically prefixes addresses with "1"; so all normal Bitcoin addresses start with 1 as in 14tDWDT9zqDufWZmiLqoaT9qJyHi7RRZPE
What should we do in Tor? My suggestion is to use '\x98' as the default version value which prefixes all addresses with 't' (as in Tor). Check the examples I cited above.
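The prefix arithmetic can be checked with a short Python sketch (helper names are hypothetical). Each base32 character carries 5 bits, so the first character is fixed by the top 5 bits of the version byte, while the second character mixes the low 3 bits of the version with the top 2 bits of the pubkey. Strictly speaking, the version therefore fixes the first character and narrows the second to four candidates.

```python
B32_ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"


def first_char(version: int) -> str:
    # The first base32 character encodes the top 5 bits of the first byte.
    return B32_ALPHABET[version >> 3]


def second_char_options(version: int) -> str:
    # The low 3 bits of the version are the high bits of the second character;
    # the remaining 2 bits come from the start of the pubkey.
    high = (version & 0b111) << 2
    return "".join(B32_ALPHABET[high | low] for low in range(4))


print(first_char(0x98), second_char_options(0x98))
```

This agrees with the examples above, which start with "tb" or "tc".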
An alternative is to turn the scheme to: onion_address = base32(pubkey + checksum + version) where the version byte is at the end with no effect at usability.
A heavier alternative would be to have two bytes of version so that we can just prefix them all with 'tor'...
[D2] Checksum strength:
In the suggested scheme we use a hash-based checksum of two bytes (16 bits). This means that in case of an address typo, we have 1/65536 probability to not detect the error (false negative). It also means that after 256 typos we will have 50% probability to miss an error (happy birthday!).
I feel like the above numbers are pretty good given the small checksum size.
The alternative would be to make the checksum four bytes (like in Bitcoin). This would _greatly_ increase the strength of our checksum but it would also increase our address length by 4 base32 characters (and also force us to remove leading padding from base32 output). This is what these 60-character addresses look like:
  tc2dty3zowj6oyhbyb5n3a2h3luztlx22hy2cwdvn37omsv7quy7rxiysn3a.onion
  tbdczrndtadzdhb6iyemnxf7f4i6x7yojnunarlrvt2virtmrecmwgx5golq.onion
  tc6pcgyorusw3jj5tosxakmcwfmcend2q4g2qnbjtkhuuh4dcgvs4rl4rdaa.onion
You probably don't notice the size difference compared to the 56-character addresses, which perhaps is an argument for adopting a four byte checksum. Let me know what you think about this.
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to.
For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background
If we wanted to get a more compact encoding, we could adopt base58 or make our own adaptation of it. In this proposal I'm using base32 for everything, but I could be persuaded that now is the time to use a better encoding.
Let me know what you think!
Thanks :)
George Kadianakis desnacked@riseup.net writes:
Hello list,
<snip>
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to. For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background If we wanted to get a more compact encoding, we could adopt base58 or make our own adaptation of it. In this proposal I'm using base32 for everything, but I could be persuaded that now is the time to use a better encoding.
Oops, pressed "Send" a bit too quickly as always...
Just to give you a better idea here, I did some calculations about the compactness of base58.
It seems that if we use Bitcoin's base58 we will be able to encode a 37-byte address (32 byte pubkey, one version byte and 4 bytes of checksum) into 51 base58 characters, instead of 60 base32 characters.
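The character counts can be reproduced with a small Python sketch (the helper name is hypothetical). It computes an upper bound on the number of characters needed for a payload of a given size; actual base58 output can be shorter for some inputs, because base58 treats the payload as one big integer.

```python
import math


def encoded_length(num_bytes: int, base: int) -> int:
    """Upper bound on the characters needed to encode num_bytes in a base."""
    return math.ceil(num_bytes * 8 / math.log2(base))


# 35 bytes = version + pubkey + 2-byte checksum
# 37 bytes = version + pubkey + 4-byte checksum
for payload in (35, 37):
    print(payload, encoded_length(payload, 32), encoded_length(payload, 58))
```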
Comparison:
(base32):
  tc2dty3zowj6oyhbyb5n3a2h3luztlx22hy2cwdvn37omsv7quy7rxiysn3a.onion
  tbdczrndtadzdhb6iyemnxf7f4i6x7yojnunarlrvt2virtmrecmwgx5golq.onion

(base58):
  tkb8klf9zgwqnogidda76mzpl6tszzy36hwxmsssznydyxyb9kf.onion
  touecgu8rmjxexxipud5bdku4mkfqezyd4dz1jvhtvqvbtlvytj.onion
On 2017-01-23 07:50, George Kadianakis wrote:
George Kadianakis desnacked@riseup.net writes:
Hello list,
<snip>
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to.
I am generally in favor of keeping the same encoding unless there is an unmistakable and objectively advantageous reason to switch. It throws users off when there is an "unnecessary" switch. Additionally, .onion addresses of variable lengths might be confusing.
For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background
Is the better "UX" the fact that "A set of 58 alphanumeric symbols consisting of easily distinguished uppercase and lowercase letters (0OIl are not used)"? Currently, the addresses are too long to memorize, hard to type out, and not pronounceable enough, to consider such properties.
But for the sake of discussion, if we were to consider some usability properties, I think base32 is "easier to use" because it doesn't use both upper and lower case letters.
Base 32 (RFC 4648 Base32 alphabet): ABCDEFGHIJKLMNOPQRSTUVWXYZ234567
Base 58: 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
My justification is that it would be harder to memorize upper/lower case letters in addresses, that it's hard to type with alternating cases, and there isn't a good way to distinguish the two when you pronounce it.
That being said, I stand by my original stance that the addresses are too long to memorize, type, or pronounce, so this shouldn't be a huge consideration. So I vote to keep the base32 encoding for the reason of keeping it the same as it was before.
...but I could be persuaded that now is the time to use a better encoding.
I won't be persuading you. :) Thanks for doing good work!
Cheers, Linda
On Mon, Jan 23, 2017 at 03:36:07PM +0200, George Kadianakis wrote:
[D2] Checksum strength:
In the suggested scheme we use a hash-based checksum of two bytes (16 bits). This means that in case of an address typo, we have 1/65536 probability to not detect the error (false negative). It also means that after 256 typos we will have 50% probability to miss an error (happy birthday!).
That doesn't sound right to me. We're not comparing onion addresses with each other; we're looking at them one at a time. Birthday would come in, for example, if we're asking "How many onion addresses would we need to see before two have the same checksum?". But if each false negative happens with probability 1/65536 (which is correct), then it's just the straightforward 32768 typos before we have 50% probability to miss an error.
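For reference, the numbers can be worked out with a short Python sketch, assuming each typo is an independent event that slips past a 16-bit checksum with probability 1/65536 (the names are hypothetical). Under that assumption, 32768 typos is the point where the expected number of missed errors reaches one half; the probability of having missed at least one is a little lower there, and reaches 50% somewhat later.

```python
import math

P_MISS = 1 / 65536  # chance that one random typo passes a 16-bit checksum


def prob_at_least_one_miss(typos: int) -> float:
    # Complement of "every one of the typos was caught".
    return 1 - (1 - P_MISS) ** typos


# Smallest number of typos for which the miss probability reaches 50%.
typos_for_half = math.ceil(math.log(0.5) / math.log(1 - P_MISS))

print(f"after 256 typos:   {prob_at_least_one_miss(256):.4f}")
print(f"after 32768 typos: {prob_at_least_one_miss(32768):.4f}")
print(f"typos needed for 50%: {typos_for_half}")
```

Either way, the conclusion stands: the two-byte checksum is far stronger than the birthday-style estimate of 256 typos suggested.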
I feel like the above numbers are pretty good given the small checksum size.
Agree.
The alternative would be to make the checksum four bytes (like in Bitcoin). This would _greatly_ increase the strength of our checksum but it would also increase our address length by 4 base32 characters (and also force us to remove leading padding from base32 output). This is how these 60-character addresses look like: tc2dty3zowj6oyhbyb5n3a2h3luztlx22hy2cwdvn37omsv7quy7rxiysn3a.onion tbdczrndtadzdhb6iyemnxf7f4i6x7yojnunarlrvt2virtmrecmwgx5golq.onion tc6pcgyorusw3jj5tosxakmcwfmcend2q4g2qnbjtkhuuh4dcgvs4rl4rdaa.onion You probably don't notice the size difference compared to the 56-character addresses, which perhaps is an argument for adopting a four byte checksum. Let me know what you think about this.
Seems unnecessary to me.
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to. For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background If we wanted to get a more compact encoding, we could adopt base58 or make our own adaptation of it. In this proposal I'm using base32 for everything, but I could be persuaded that now is the time to use a better encoding.
Using base58 is likely to be fraught, since DNS names are "supposed" to be case-insensitive; this onion address takes the place of a DNS name, and who knows what software will just assume it's also case-insensitive.
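A small Python sketch of why case folding is a problem for base58 but not for base32 (the alphabets are the RFC 4648 Base32 alphabet and Bitcoin's base58 alphabet; the helper name is hypothetical). If some piece of software lowercases the hostname, a base32 address keeps all of its information, while a base58 address does not.

```python
BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def distinct_after_lowercasing(alphabet: str) -> int:
    # Number of symbols that remain distinguishable once case is folded.
    return len(set(alphabet.lower()))


print(len(BASE32), distinct_after_lowercasing(BASE32))
print(len(BASE58), distinct_after_lowercasing(BASE58))
```

Base32 keeps all 32 symbols, whereas the 58 base58 symbols collapse to 35 distinct ones.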
- Ian
Hi George,
George Kadianakis:
What should we do in Tor? My suggestion is to use '\x98' as the default version value which prefixes all addresses with 't' (as in Tor). Check the examples I cited above. An alternative is to turn the scheme to: onion_address = base32(pubkey + checksum + version) where the version byte is at the end with no effect at usability. A heavier alternative would be to have two bytes of version so that we can just prefix them all with 'tor'...
Yes, it is definitely a good idea to introduce a version octet, though it seems pretty redundant to me to prefix onion addresses with 't'/'tor'. I think that the version octet should increment as you described above, and that it should be placed at the end of the address. This would make addresses more distinguishable from each other.
[D2] Checksum strength:
In the suggested scheme we use a hash-based checksum of two bytes (16 bits). This means that in case of an address typo, we have 1/65536 probability to not detect the error (false negative). It also means that after 256 typos we will have 50% probability to miss an error (happy birthday!). I feel like the above numbers are pretty good given the small checksum size. The alternative would be to make the checksum four bytes (like in Bitcoin). This would _greatly_ increase the strength of our checksum but it would also increase our address length by 4 base32 characters (and also force us to remove leading padding from base32 output). This is how these 60-character addresses look like:
Is that necessary? Two bytes seem to be more than enough for typo-level error.
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to. For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background
I personally consider both base64 and base58 to have poor UX, and I agree with Linda. Mostly it's because they are case-sensitive, which makes them too hard to type in. Also, base58 has a non-integer bit capacity per character, which makes implementations way more complicated and error-prone (we've seen enough bugs even in b32 and b64 implementations).
---
I had an idea recently that having variable-length flexible addresses in fashion similar to TLVs in OTR protocol would be nice. In that case there are no more length constraints at all, so we may use keys of different types/sizes (pq?), embed authentication data, etc, etc.
  Type:   1 byte
  Length: 1 byte (= up to 255 bytes = 2040 bits)
  Value:  Length bytes

0x01 0x20 [0x01..0xff] 0x33 0x02 [0x11 0x99] ".onion"
 T    T        T        T    T        T
 |    |        |        |    |        +-- two-byte checksum
 |    |        |        |    +----------- length of the checksum
 |    |        |        +---------------- checksum type
 |    |        +------------------------- ed25519 pk
 |    +---------------------------------- size of pk (32 bytes)
 +--------------------------------------- prop224 identity key type
So its length is now 1+1+32 + 1+1+2 = 38 bytes = 61 base32 chars with one (1) unused bit. E.g.:
  obdczrndtadzdhb6iyemnxf7f4i6x7yojnunarlrvt2virtmrecmwgx5golqe.onion
Despite using more bytes (type/length), it provides freedom for future adjustments (e.g. another checksum/key algo). Also, these TLVs are commutative, so changing their order has no effect (maybe it should, like in DER?). As a side effect, plain old onion addresses can be encoded here too (even with a checksum).
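A rough Python sketch of the TLV idea, for illustration only. The type values 0x01 and 0x33 are taken from the example above; the function names, the placeholder key and the placeholder checksum bytes are hypothetical, and the checksum algorithm itself is left out because it is not specified here.

```python
import base64

KEY_TYPE_ED25519 = 0x01  # "prop224 identity key type" in the example above
CHECKSUM_TYPE = 0x33     # "checksum type" in the example above


def tlv(type_id: int, value: bytes) -> bytes:
    """Encode one record: 1 byte type, 1 byte length, then the value."""
    if not 0 <= type_id <= 0xFF or len(value) > 0xFF:
        raise ValueError("type and length must each fit in one byte")
    return bytes([type_id, len(value)]) + value


def encode_tlv_address(pubkey: bytes, checksum: bytes) -> str:
    raw = tlv(KEY_TYPE_ED25519, pubkey) + tlv(CHECKSUM_TYPE, checksum)
    # 38 bytes do not fill a whole number of base32 groups, so the standard
    # encoder appends '=' padding, which is stripped here.
    return base64.b32encode(raw).decode("ascii").rstrip("=").lower()


def parse_tlvs(raw: bytes) -> list:
    """Split a byte string back into (type, value) records."""
    records, offset = [], 0
    while offset < len(raw):
        if offset + 2 > len(raw):
            raise ValueError("truncated TLV header")
        type_id, length = raw[offset], raw[offset + 1]
        value = raw[offset + 2 : offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        records.append((type_id, value))
        offset += 2 + length
    return records


raw = tlv(KEY_TYPE_ED25519, bytes(32)) + tlv(CHECKSUM_TYPE, b"\x11\x99")
address = encode_tlv_address(bytes(32), b"\x11\x99")
print(len(raw), len(address), parse_tlvs(raw))
```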
I'm not sure whether it's as reasonable as it seems to me.
-- Ivan Markin
On 24 Jan 2017, at 00:36, George Kadianakis desnacked@riseup.net wrote:
...
[D1.1] Default version value:
The next question is what version value to assign to normal onion services. In the above scheme where: onion_address = base32(version + pubkey + checksum) the value of 'version' basically determines the first two characters of the onion address. In Bitcoin, they've made it such that the default version value basically prefixes addresses with "1"; so all normal Bitcoin addresses start with 1 as in 14tDWDT9zqDufWZmiLqoaT9qJyHi7RRZPE What should we do in Tor? My suggestion is to use '\x98' as the default version value which prefixes all addresses with 't' (as in Tor). Check the examples I cited above.
As Linda said, using a common prefix makes it much harder for people to distinguish addresses.
(People check the start, then the end, and tend to ignore the middle.)
An alternative is to turn the scheme to: onion_address = base32(pubkey + checksum + version) where the version byte is at the end with no effect at usability.
Using a common suffix makes it somewhat harder for people to distinguish addresses.
I suggest:
onion_address = base32(pubkey + version + checksum)
That way, the identical part of the address is in an area people typically ignore when doing comparisons.
A heavier alternative would be to have two bytes of version so that we can just prefix them all with 'tor'…
This is even worse for distinguishability.
[D2] Checksum strength:
In the suggested scheme we use a hash-based checksum of two bytes (16 bits). This means that in case of an address typo, we have 1/65536 probability to not detect the error (false negative). It also means that after 256 typos we will have 50% probability to miss an error (happy birthday!). I feel like the above numbers are pretty good given the small checksum size.
Two bytes or 1/65536 is quite fine. 1/256 would even be acceptable.
The alternative would be to make the checksum four bytes (like in Bitcoin). This would _greatly_ increase the strength of our checksum but it would also increase our address length by 4 base32 characters (and also force us to remove leading padding from base32 output). This is how these 60-character addresses look like: tc2dty3zowj6oyhbyb5n3a2h3luztlx22hy2cwdvn37omsv7quy7rxiysn3a.onion tbdczrndtadzdhb6iyemnxf7f4i6x7yojnunarlrvt2virtmrecmwgx5golq.onion tc6pcgyorusw3jj5tosxakmcwfmcend2q4g2qnbjtkhuuh4dcgvs4rl4rdaa.onion You probably don't notice the size difference compared to the 56-character addresses, which perhaps is an argument for adopting a four byte checksum. Let me know what you think about this.
Four bytes seems unnecessary, we only gain a very small advantage from adding those extra bytes to every address.
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to. For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background If we wanted to get a more compact encoding, we could adopt base58 or make our own adaptation of it. In this proposal I'm using base32 for everything, but I could be persuaded that now is the time to use a better encoding.
As far as I understand it, .onion domain registrations require addresses that conform to DNS rules: in particular, they must be case-insensitive, and within DNS component length and total length restrictions.
So base58 and base64 are out.
T
-- Tim Wilson-Brown (teor)
teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org
Hello George,
George Kadianakis wrote:
<snip>
[D1] How to use version field:
The version field is one byte long. If we use it as an integer we can encode 256 values in it; if we use it as a bitmap we could encode properties and such. My suggestion is to simply use it as an integer like Bitcoin does. So we can assign value \x01 to normal onion services, and in the future we can assign more version tags if we need to. For example, we can give a different version field to onion services in the testnet. We can also reserve a range of values for application-specific purposes.
[D1.1] Default version value:
The next question is what version value to assign to normal onion services. In the above scheme where: onion_address = base32(version + pubkey + checksum) the value of 'version' basically determines the first two characters of the onion address. In Bitcoin, they've made it such that the default version value basically prefixes addresses with "1"; so all normal Bitcoin addresses start with 1 as in 14tDWDT9zqDufWZmiLqoaT9qJyHi7RRZPE What should we do in Tor? My suggestion is to use '\x98' as the default version value which prefixes all addresses with 't' (as in Tor). Check the examples I cited above. An alternative is to turn the scheme to: onion_address = base32(pubkey + checksum + version) where the version byte is at the end with no effect at usability. A heavier alternative would be to have two bytes of version so that we can just prefix them all with 'tor'...
The version field is useful and leaves room for many things we might need to do. I think it would be better to place it at the end of the address. I don't think all addresses should start with the same prefix, to be honest: this would make them slightly less distinguishable. As far as possible, users should be able to tell onion addresses apart, since they are reused over the long term. Bitcoin is the opposite: the recommended practice is to use each address once, with a different one every time, so users only need to see a string that looks and reads like a Bitcoin address and make sure it is copied (scanned) from/to the right place.
[D2] Checksum strength:
In the suggested scheme we use a hash-based checksum of two bytes (16 bits). This means that in case of an address typo, we have 1/65536 probability to not detect the error (false negative). It also means that after 256 typos we will have 50% probability to miss an error (happy birthday!). I feel like the above numbers are pretty good given the small checksum size.
Yes, the numbers are very good.
The alternative would be to make the checksum four bytes (like in Bitcoin). This would _greatly_ increase the strength of our checksum but it would also increase our address length by 4 base32 characters (and also force us to remove leading padding from base32 output). This is how these 60-character addresses look like: tc2dty3zowj6oyhbyb5n3a2h3luztlx22hy2cwdvn37omsv7quy7rxiysn3a.onion tbdczrndtadzdhb6iyemnxf7f4i6x7yojnunarlrvt2virtmrecmwgx5golq.onion tc6pcgyorusw3jj5tosxakmcwfmcend2q4g2qnbjtkhuuh4dcgvs4rl4rdaa.onion You probably don't notice the size difference compared to the 56-character addresses, which perhaps is an argument for adopting a four byte checksum. Let me know what you think about this.
I don't think so. I think our best bet is a checksum of 2 bytes, this offers sufficient strength for our use cases.
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to. For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background If we wanted to get a more compact encoding, we could adopt base58 or make our own adaptation of it. In this proposal I'm using base32 for everything, but I could be persuaded that now is the time to use a better encoding.
Let me know what you think!
When talking about strings of more than 50 characters, I think memorizing is very hard (for most users at least), regardless of whether the encoding is base58 or base32. What I think is more important is that onion addresses should not mix upper-case and lower-case characters; they should look as much as possible like regular DNS hostnames. For this reason I would go for base32 here.
s7r s7r@sky-ip.org writes:
Hello George,
George Kadianakis wrote:
Hello list,
we've had discussions over the past years about how to encode prop224 onion addresses. Here is the latest thread: https://lists.torproject.org/pipermail/tor-dev/2016-December/011734.html
Bikeshedding is over; it's time to finally pick a scheme! My suggested scheme
<snip>
The version field is useful and allows room for much stuff that we might need to do. I think it would be better to place it at the end of the address. I don't think all addresses should start with the same prefix tbh - this will make them slightly less distinguishable (as much as possible users should be able to differentiate onion addresses, which are re-usable for long term, as opposite to Bitcoin where the recommended way is to use 1 address 1 time, different one every time and the users just need to see a string that looks and reads like a Bitcoin address and just make sure it's copied (scanned) from/to the right place).
OK thanks for the useful discussion. I identified at least three feedback points:
+ Screw base58 it's not gonna work. We stick to base32. Usability will be "restored" with a proper name system.
+ Move version byte to the end of the address to avoid constant prefix. Moving version byte to the middle as teor suggested would cause forward-compatibility issues.
+ My checksum calculations were wrong. Checksum is strong! 2 bytes are enough.
And given the above, here is the new microproposal:
  onion_address = base32(pubkey || checksum || version)
  checksum = SHA3(".onion checksum" || pubkey || version)

  where:
    pubkey is 32 bytes ed25519 pubkey
    version is one byte (default value for prop224: '\x03')
    checksum hash is truncated to two bytes
Here are a few example addresses (with broken checksum):
  l5satjgud6gucryazcyvyvhuxhr74u6ygigiuyixe3a6ysis67ororad.onion
  btojiu7nu5y5iwut64eufevogqdw4wmqzugnoluw232r4t3ecsfv37ad.onion
  vckjr6bpchiahzhmtzslnl477hdfvwhzw7dmymz3s5lp64mwf6wfeqad.onion
Checksum strength: The checksum has a false negative rate of 1/65536.
Address handling: Clients handling onion addresses first parse the version field, then extract pubkey, then verify checksum.
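The microproposal and the handling order above can be sketched in Python as follows. This is only an illustration under stated assumptions: "SHA3" is taken to mean SHA3-256, "truncated" is taken to mean the first two bytes of the digest, and the function names are hypothetical. The key used at the end is a dummy byte string, not a real ed25519 key.

```python
import base64
import hashlib

CHECKSUM_PREFIX = b".onion checksum"
VERSION = 0x03  # default value for prop224 in the microproposal


def _checksum(pubkey: bytes, version: int) -> bytes:
    # Assumption: SHA3-256, truncated to its first two bytes.
    data = CHECKSUM_PREFIX + pubkey + bytes([version])
    return hashlib.sha3_256(data).digest()[:2]


def encode_onion_address(pubkey: bytes, version: int = VERSION) -> str:
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte ed25519 public key")
    raw = pubkey + _checksum(pubkey, version) + bytes([version])
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def decode_onion_address(address: str) -> bytes:
    """Return the public key, or raise ValueError if the address is invalid."""
    label = address.lower()
    if label.endswith(".onion"):
        label = label[: -len(".onion")]
    if len(label) != 56:
        raise ValueError("expected 56 base32 characters")
    raw = base64.b32decode(label.upper())  # raises a ValueError subclass if malformed
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34]
    # Order from the microproposal: version first, then pubkey, then checksum.
    if version != VERSION:
        raise ValueError("unsupported address version")
    if checksum != _checksum(pubkey, version):
        raise ValueError("checksum mismatch (typo?)")
    return pubkey


dummy_key = bytes(range(32))  # illustration only
address = encode_onion_address(dummy_key)
print(address, decode_onion_address(address) == dummy_key)
```

With version '\x03' in the last byte, the final base32 character is always 'd', which matches the example addresses above.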
Let me know how you feel about this one. If people like it I will transcribe it to prop224.
Thanks again Ivan, Ian, Linda, teor, s7r, Chelsea :)
On 24 Jan (14:27:43), George Kadianakis wrote:
<snip>
I like this quite a bit! Simple, easy, and trivial to understand. With a 56-character address settled, it will then be time to improve the UX/UI with all sorts of possible tricks to make addresses easier to remember, copy-paste, visualize, or whatnot.
Unless some feedback NACKs this, I say push it into the proposal soon. I'll personally start implementing the scheme this week.
Thanks! David
Thanks again Ivan, Ian, Linda, teor, s7r, Chelsea :) _______________________________________________ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On 2017-01-24 08:00, David Goulet wrote:
<snip>
+1; I like this proposal.
The hyphens help, but I don't think that is the only solution (though I have no alternatives at the moment, except having short and pronounceable names, which doesn't have the technical properties we want) and we should be careful as it can introduce complexity/forward compatibility issues.
Hello,
David Goulet wrote:
<snip>
I like this quite a bit! Simple, easy, and trivial to understand. A 56-character address; after that it will be time to improve UX/UI with all sorts of possible tricks to make them easier to remember, copy-paste, visualize, or what not.
Unless some feedback NACKs this, I say push that into the proposal soon. I'll personally start implementing that scheme this week.
I like the proposal in this form - Yes for all points.
I also dislike it being possible to have multiple addresses (versions) for the same public key; that would create implementation and usability problems.
I wouldn't go for the hyphens, but even if we decide at a later point that this was a good idea we can handle it at an upper layer, like with a browser tool or something; it's outside the scope of this microproposal. We all know only a naming system will really fix this issue from all points of view, so let's stick to that.
Thanks for this! Really great work.
Hi,
s7r:
I wouldn't go for the hyphens, but even if we decide at a later point that this was a good idea we can handle it at an upper layer, like with a browser tool or something; it's outside the scope of this microproposal. We all know only a naming system will really fix this issue from all points of view, so let's stick to that.
I don't think a naming system can fix this issue from all points of view. There will always be use cases where a naming system can't be used, like short-lived, non-public onion services, created for example with OnionShare or Tails Server. There are cases where there is no secure channel pre-established between the server and the client, so the onion address is exchanged in person, by reading it off the screen or writing it on a sheet of paper. Or there is a secure channel established on another device, for example Signal on the users' phones. Then they would still have to type the onion address.
Also, a browser is not the only client accessing onion services. I would like some more appreciation of the fact that not all onion services are publicly accessible web services.
Cheers
David Goulet dgoulet@ev0ke.net writes:
On 24 Jan (14:27:43), George Kadianakis wrote:
s7r s7r@sky-ip.org writes:
<snip>
I like this quite a bit! Simple, easy, and trivial to understand. A 56-character address; after that it will be time to improve UX/UI with all sorts of possible tricks to make them easier to remember, copy-paste, visualize, or what not.
Unless some feedback NACKs this, I say push that into the proposal soon. I'll personally start implementing that scheme this week.
Thanks! David
Hello,
I made a torspec branch that alters prop224 accordingly: https://gitweb.torproject.org/user/asn/torspec.git/commit/?h=prop224-onion-a...
I will merge this to torspec RSN if I don't hear any grave objections.
Cheers!
Hello!
I have some more thoughts on versioning, specifically in regards to the possibility of not including the version in the onion address and using only the version field in the descriptor.
I'm not able to write out these scenarios now but I will do this in the next day. Thanks for making last call!
Chelsea
On 27 Jan (09:04:51), chelsea komlo wrote:
Hello!
I have some more thoughts on versioning, specifically in regards to the possibility of not including the version in the onion address and using only the version field in the descriptor.
I'm not able to write out these scenarios now but I will do this in the next day. Thanks for making last call!
Here is some extra pressure for you ;).
The HSDir fetch/post URL has gone in 0.3.0 (feature freeze today in theory ;) with the version in it:
--> /tor/hs/<version>/publish
So, a few things. First, if we don't have the version in the onion address, the client needs to try fetching the descriptor for multiple versions, starting at the highest it knows and going down as each attempt fails. I'm really not keen on this: it's unneeded load on the network.
Second, HSDirs might not all support the same version by the time we roll out prop224, thus the importance of having it in 0.3.0 (a version *before* the next gen release). Even with that, it is going to be an interesting experiment to have a set of HSDirs supporting v3 and a set not supporting it, because we kind of have this requirement of using the 3 nearest relays for a replica, but what if one of them doesn't support v3?
Third, we could fix this with a single descriptor supporting multiple versions, but that has implications outside the onion address discussion and is unfortunately 0.3.0 material again (which freezes today).
So I'm eager to hear your idea on this! But it's important to keep in mind that 0.3.0 already has some building blocks with some version restrictions :S. Changing those would mean delaying adoption by six months (and it could be OK!).
Thanks! David
Hey!
Here is some extra pressure for you ;).
:) thanks, I will try!
Before starting, someone today very kindly pointed me to Prop 271, the naming system API for Tor Onion services. Overall, my larger concern is whether adding the version in the onion address makes both using and distributing onion addresses harder. If the long-term plan is for onion addresses to not be used directly, then having the version in the onion address is completely fine as this wouldn't present a barrier to entry for end users.
The HSDir fetch/post URL has gone in 0.3.0 (feature freeze today in theory ;) with the version in it:
--> /tor/hs/<version>/publish
So, a few things. First, if we don't have the version in the onion address, the client needs to try fetching the descriptor for multiple versions, starting at the highest it knows and going down as each attempt fails. I'm really not keen on this: it's unneeded load on the network.
Yep, fair. So the idea of "fetch multiple descriptors, where a descriptor is for a single version," isn't viable for performance reasons.
Second, HSDirs might not all support the same version by the time we roll out prop224, thus the importance of having it in 0.3.0 (a version *before* the next gen release). Even with that, it is going to be an interesting experiment to have a set of HSDirs supporting v3 and a set not supporting it, because we kind of have this requirement of using the 3 nearest relays for a replica, but what if one of them doesn't support v3?
Yeah, that is hard. Although I'm not entirely sure how this complexity is correlated with how the client consumes the HS version...
Third, we could fix this with a single descriptor supporting multiple versions, but that has implications outside the onion address discussion and is unfortunately 0.3.0 material again (which freezes today).
So I'm eager to hear your idea on this! But it's important to keep in mind that 0.3.0 already has some building blocks with some version restrictions :S. Changing those would mean delaying adoption by six months (and it could be OK!).
Yeah! So if the plan is that onion addresses will not be used directly by end users and there is an abstraction layer that hides things like version upgrade from end users, then going ahead with the current plan sounds good.
However, if there is a chance that end users will consume onion addresses directly, then having this discussion seems like a good idea. The scenario that worries me is something like this:
1) Facebook creates a hidden service and distributes this address
2) A new hidden service version is created
3) Facebook is reluctant to upgrade because this would mean re-distributing a new onion address to a _lot_ of people. Also, there are problems of securely distributing and verifying new onion addresses- malicious parties could use this opportunity to distribute lookalikes, for example.
When we upgrade key primitives (such as when we move to a PQ scheme), then it will definitely be necessary for HS operators to re-distribute addresses. However, minimizing the need for addresses to change will lower the barrier to use/operate hidden services.
If you think it is worth pursuing this discussion, I can start a new thread to discuss this further. One idea that seems viable is for descriptors to specify multiple supported HS versions (taking into account the points you and George have already made). In short, the scheme could be something like this:
1) An onion address is represented by base32(pub_key || checksum)
2) A descriptor specifies a list of versions supported by the HS with that address (a descriptor can represent only one address/public key but multiple versions)
3) The client selects the highest available version supported
The proposed change to section 2.2.6 in prop 224 (URLs for anonymous uploading and downloading) would be for the publish URL to be HTTP POST to /tor/hs/publish, and HTTP GET to /tor/hs/<z>, where <z> is a base64 encoding of the hidden service's blinded public key. This would also mean that HSDir code won't need to change when new versions are added.
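Step 3 of the scheme above (the client selects the highest available version) could look roughly like the following Python sketch. The function name and the idea that both sides expose plain integer lists are assumptions made for illustration; the thread does not define a descriptor field for this.

```python
from typing import Iterable, Optional


def select_version(descriptor_versions: Iterable[int],
                   client_versions: Iterable[int]) -> Optional[int]:
    """Pick the highest HS version listed in the descriptor that the
    client also supports, or None if there is no overlap.

    Both arguments are hypothetical: a list parsed from the descriptor
    and a list of versions the client implements.
    """
    common = set(descriptor_versions) & set(client_versions)
    return max(common) if common else None
```

With this shape, the HSDir never needs to know about versions; the decision happens only after the descriptor has been fetched, which is exactly the property debated in the replies.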
But again, this change probably isn't necessary if onion addresses will live below an abstraction layer!
I apologize if this isn't good timing with feature freezes- I'll follow your lead with this! Chelsea
Hi,
chelsea komlo:
So if the plan is that onion addresses will not be used directly by end users and there is an abstraction layer that hides things like version upgrade from end users, then going ahead with the current plan sounds good.
However, if there is a chance that end users will consume onion addresses directly, then having this discussion seems like a good idea. The scenario that worries me is something like this:
- Facebook creates a hidden service and distributes this address
- A new hidden service version is created
- Facebook is reluctant to upgrade because this would mean re-distributing a new onion address to a _lot_ of people. Also, there are problems of securely distributing and verifying new onion addresses- malicious parties could use this opportunity to distribute lookalikes, for example.
When we upgrade key primitives (such as when we move to a PQ scheme), then it will definitely be necessary for HS operators to re-distribute addresses. However, minimizing the need for addresses to change will lower the barrier to use/operate hidden services.
I share your concerns here. I think we could work around this by pulling the version byte out of the base32 encoding, like this:
onion_address = base32(pubkey + checksum) + "-" + version
This would result in onion addresses like this:
tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4g-1.onion
If the HS version changes to version 2, the onion address would only change in the version char:
tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4g-2.onion
This way onion service operators can keep their onion address prefix and users can verify that the new address uses the same public key as the address of the previous version.
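A rough Python sketch of this hyphenated variant follows. It assumes SHA3-256 for the checksum and, importantly, that the checksum does not cover the version (otherwise the prefix would change between versions); neither detail is stated in the message, and the function names are made up for illustration.

```python
import base64
import hashlib


def hyphenated_address(pubkey: bytes, version: int) -> str:
    """base32(pubkey + checksum) + "-" + version, plus the ".onion" suffix.

    Assumption: the two-byte checksum covers only the pubkey, so the
    base32 prefix stays identical across versions.
    """
    if len(pubkey) != 32:
        raise ValueError("pubkey must be 32 bytes")
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey).digest()[:2]
    # 34 bytes do not fill a whole base32 block, so strip the "=" padding.
    prefix = base64.b32encode(pubkey + checksum).decode("ascii")
    prefix = prefix.rstrip("=").lower()
    return f"{prefix}-{version}.onion"


def same_key(addr_a: str, addr_b: str) -> bool:
    """True if two hyphenated addresses share the same base32 prefix."""
    return addr_a.rsplit("-", 1)[0] == addr_b.rsplit("-", 1)[0]
```

The prefix comes out at 55 characters, matching the length of the example addresses above.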
What do you think about this?
Cheers
Hey,
On 01/28/2017 10:16 AM, segfault wrote:
I share your concerns here. I think we could work around this by pulling the version byte out of the base32 encoding, like this:
onion_address = base32(pubkey + checksum) + "-" + version
This would result in onion addresses like this:
tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4g-1.onion
If the HS version changes to version 2, the onion address would only change in the version char:
tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4g-2.onion
This way onion service operators can keep their onion address prefix and users can verify that the new address uses the same public key as the address of the previous version.
What do you think about this?
I think this could be an option if the HS version is needed in the client request when fetching the descriptor from the HSDir. It is my understanding that the HS version is already encoded in the descriptor, so theoretically the version can be removed from the client request entirely.
The scheme you describe above still may have distribution problems, and possibly still puts responsibility on the end user. For example, say the following happens:
1. Tor releases HS version 5 (a new version)
2. The New York Times HS has not yet upgraded, and remains on version 4 for a period of time.
As a Tor client, how do I know whether to use
tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4g-4.onion or tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4g-5.onion
? Either my Tor client can try version 5 by default (the highest version supported) and then fall back to version 4, or it can be the responsibility of the end user to check whether the New York Times has upgraded. Both of these scenarios are prone to error, and the second is difficult to scale. If we can leave out the HS version from the client request and publish HS versions only in the descriptor, it seems like these issues could be mitigated.
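The first option (try the highest version, then fall back) can be sketched in Python as below. The `fetch` callable is a stand-in for whatever actually retrieves a descriptor; it is hypothetical, as is the function name. The sketch returns the number of attempts to make the extra network load visible.

```python
from typing import Callable, Iterable, Optional, Tuple


def fetch_with_fallback(
    prefix: str,
    versions: Iterable[int],
    fetch: Callable[[str], Optional[bytes]],
) -> Tuple[Optional[int], Optional[bytes], int]:
    """Try each version from highest to lowest until a fetch succeeds.

    `fetch` is a hypothetical callable that returns descriptor bytes,
    or None when nothing is published for that address.
    Returns (version, descriptor, attempts).
    """
    attempts = 0
    for version in sorted(versions, reverse=True):
        attempts += 1
        descriptor = fetch(f"{prefix}-{version}.onion")
        if descriptor is not None:
            return version, descriptor, attempts
    return None, None, attempts
```

In the New York Times example, a client that knows versions 4 and 5 makes two fetches instead of one for as long as the service stays on version 4.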
Let me know what you think, or if the above scenarios can be mitigated in other ways. Chelsea
Skimming thread...
Version or not is fine, provided if you want versions you know you must store the bits somewhere, or ensure regex parser rules to recognize and match an intrinsic version represented by entire address format specification itself.
Note onion search spiders rely on such address recognition and parsing. So it's not all just about the browser brain urlbar.
GPU capacity hasn't hit 16 chars yet, mnemonic brain memory has, but that's only happened based on address luck and/or GPU prefixing. We're more or less at the limits; new random bits past 16 won't matter and shouldn't be considered much of an argument for brain relevance. Some other brain layer will come along, and if not, there's always search.
If the version goes in the address, I'd be wary of putting it last. A lot of things naturally sort, route, and default based on higher-order bits appearing as a prefix on the left: IPv4, IPv6, Bitcoin, PTR, DHT, filesystems, Unix tools... the list goes on. A single leading character is not a problem and gives plenty of bits of version capacity regardless of encoding. A trailing version just plain feels shaky to rely on or to advocate to the world as a new standard. Certainly not without consultation with other anonymous overlay projects as to their future needs and direction as well, or to develop such an interop standard.
At least until bumping against DNS length limitations, all lower case should be obvious, without symbolics, without nonprintables, etc. Try to stick to most common compatible [a-z0-9] or less unless forced otherwise.
Don't try to create new parsing headaches for application authors / porters to work around who might already be using rather basic charsets and routines with existing protocols.
Whatever works.
grarpamp grarpamp@gmail.com writes:
If the version goes in the address, I'd be wary of putting it last. A lot of things naturally sort, route, and default based on higher-order bits appearing as a prefix on the left: IPv4, IPv6, Bitcoin, PTR, DHT, filesystems, Unix tools... the list goes on. A single leading character is not a problem and gives plenty of bits of version capacity regardless of encoding. A trailing version just plain feels shaky to rely on or to advocate to the world as a new standard. Certainly not without consultation with other anonymous overlay projects as to their future needs and direction as well, or to develop such an interop standard.
Hm, can you please expand on this? I think I understood none of your arguments.
What's the problem with the version field being at the end and tools sorting addresses based on higher-order bits? Also, why does the version field being at the end make it shaky to rely on?
Hello,
George Kadianakis wrote:
grarpamp grarpamp@gmail.com writes:
If the version goes in the address, I'd be wary of putting it last. A lot of things naturally sort, route, and default based on higher-order bits appearing as a prefix on the left: IPv4, IPv6, Bitcoin, PTR, DHT, filesystems, Unix tools... the list goes on. A single leading character is not a problem and gives plenty of bits of version capacity regardless of encoding. A trailing version just plain feels shaky to rely on or to advocate to the world as a new standard. Certainly not without consultation with other anonymous overlay projects as to their future needs and direction as well, or to develop such an interop standard.
Hm, can you please expand on this? I think I understood none of your arguments.
What's the problem with the version field being at the end and tools sorting addresses based on higher-order bits? Also, why does the version field being at the end make it shaky to rely on?
None of the arguments make any sense to me either. It doesn't matter if the version is prefixed or trailed, it can be interpreted the same.
What does Tor using a version at the end of the address have to do with advocating to the world as a new standard? New standard for what? What good would consulting with other anonymous overlay projects be? Which projects? These questions are not meant to be answered; let's not turn this thread counter-productive, out of respect for all the very busy people reading.
For me, this looks very good: https://gitweb.torproject.org/user/asn/torspec.git/commit/?h=prop224-onion-a...
Chelsea's comments make a good point, but we are pretty sure that a new version will mean entirely different crypto and different public keys, thus different addresses anyway. Moving the version to descriptors entirely and exclusively won't help, since it could only represent one public key (address); if not, it could create either a chicken-and-egg problem or a false sense of security (equal to the security of the public key / version that you use to query the HSDir). So if we're in a PQ era and have PQ crypto as a V4 onion service, but do the rendezvous dance starting with old, vulnerable crypto, we will be doing it wrong. Otherwise, the operator needs to re-distribute a new, different address for the new version, so the encoded version won't matter/help. I dislike and don't see the point of pulling the version byte out of the address. That is exactly what these lengthy hostnames were missing...
To be frank, the version is not so super important, because:
- prop224 can work perfectly fine even without a version encoded in the address
- we are using it so clients can take informed action before arriving at HSDirs. You cannot confuse a V2 address with a V3 one, and this should stick for the future from my point of view. Otherwise, why not start prop 224 with version 0 or 1 encoded into addresses?
- a version change will surely change the entire crypto, thus making address re-use for different versions impossible. It's an anti-censorship, self-authenticated, uncensored system, so key material under the exclusive control of the user has to be used. At this moment, and this is unlikely to change, we can accomplish this only by using whole public keys or at least hash-sums of public keys. With a proper name system the human-memorable name can be updated with the new version address.
chelsea komlo me@chelseakomlo.com writes:
Yeah! So if the plan is that onion addresses will not be used directly by end users and there is an abstraction layer that hides things like version upgrade from end users, then going ahead with the current plan sounds good.
However, if there is a chance that end users will consume onion addresses directly, then having this discussion seems like a good idea. The scenario that worries me is something like this:
- Facebook creates a hidden service and distributes this address
- A new hidden service version is created
- Facebook is reluctant to upgrade because this would mean re-distributing a new onion address to a _lot_ of people. Also, there are problems of securely distributing and verifying new onion addresses- malicious parties could use this opportunity to distribute lookalikes, for example.
Hmm, on the above scenario, why would Tor change the version of the onion address if the pubkey and checksum algorithm do not change? The way I see it, the main scenario where we bump the onion address version is if we upgrade the cryptosystem of the identity key or the checksum algorithm. In that case, Facebook will have to migrate to another address anyhow, so moving the version field to the HSDir layer does not really help.
Furthermore, as David said, HS descriptors do have a version field anyway, so we can always take version-specific decisions on the HSDir layer without changing the onion address.
Finally, keeping a version field on the onion address, lets clients take version-specific decisions _before_ contacting HSDirs, which is not possible right now. The use of this is not obvious to me at this point, but I'm sure that onion service applications can find some use. Or it can be used by hidden services that want their clients to use an alternative HSDir hash ring logic (e.g. increase or decrease the default number of responsible HSDirs) by encoding this info in the version field.
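As an illustration of deciding on the version before contacting any HSDir, here is a small Python sketch. It assumes the layout base32(pubkey || checksum || version) from the microproposal and treats a 16-character label as a legacy v2 address; the function name is hypothetical and the checksum is deliberately not verified here.

```python
import base64


def address_version(hostname: str) -> int:
    """Return the onion service version implied by a hostname.

    Assumptions: 16-character labels are legacy v2 addresses, and
    56-character labels carry the version in their final decoded byte.
    No checksum verification is done in this sketch.
    """
    label = hostname.lower()
    if label.endswith(".onion"):
        label = label[: -len(".onion")]
    if len(label) == 16:
        return 2
    if len(label) == 56:
        raw = base64.b32decode(label.upper())  # 35 bytes, no padding
        return raw[-1]
    raise ValueError("unrecognized onion address length")
```

A client could branch on this value to choose, for example, which HSDir URL or hash ring logic to use before any network request is made.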
On 28 Jan (00:25:04), chelsea komlo wrote:
If you think it is worth pursuing this discussion, I can start a new thread to discuss this further. One idea that seems viable is for descriptors to specify multiple supported HS versions (taking into account the points you and George have already made). In short, the scheme could be something like this:
- An onion address is represented by base32(pub_key || checksum)
- A descriptor specifies a list of versions supported by the HS with that address (a descriptor can represent only one address/public key but multiple versions)
- The client selects the highest available version supported
The proposed change to section 2.2.6 in prop 224 (URLs for anonymous uploading and downloading) would be for the publish URL to be HTTP POST to /tor/hs/publish, and HTTP GET to /tor/hs/<z>, where <z> is a base64 encoding of the hidden service's blinded public key. This would also mean that HSDir code won't need to change when new versions are added.
Quick follow-up after George's response. This scheme doesn't work if the protocol is changed with new crypto. Today we use ed25519 blinded keys, but tomorrow we could be in a PQ world, so we are kind of putting ourselves in a bad position with this URL scheme and would have to change more things on the HSDir side at the next version.
I do like the idea of "if we version++, how can we provide a way for the onion address not to change", but imo the day we move to v4, it will _most_ likely be for new crypto, which, as George mentioned, changes the address.
Finally, a client *not* knowing the version also prevents us from acting *before* any fetch is done client side (again, like George said). HSDir fetch is one thing, but if we ever implement the Name Transport Plugin idea, for instance, it will be extremely valuable imo that we can extract the protocol version from the onion address there. That's one of the few things that can happen pre-fetch. It could be that at version X, the client needs to do some extra steps before fetching the descriptor, for instance. Etc...
Thanks for the feedback! David
chelsea komlo:
Before starting, someone today very kindly pointed me to Prop 271, the naming system API for Tor Onion services. Overall, my larger concern is whether adding the version in the onion address makes both using and distributing onion addresses harder. If the long-term plan is for onion addresses to not be used directly, then having the version in the onion address is completely fine as this wouldn't present a barrier to entry for end users.
[snip]
Yeah! So if the plan is that onion addresses will not be used directly by end users and there is an abstraction layer that hides things like version upgrade from end users, then going ahead with the current plan sounds good.
However, if there is a chance that end users will consume onion addresses directly, then having this discussion seems like a good idea.
Naming systems like Namecoin and OnioNS have better usability due to being human-meaningful, but they achieve this by sacrificing the straightforward cryptographic proofs that make .onion names secure. This doesn't imply that Namecoin and OnioNS are worse for security overall (I think for a lot of use cases they're more secure than .onion once you factor in attacks on human psychology), but there are some use cases where users will want to use .onion directly without a naming layer. (I also suspect that this tradeoff is unavoidable to some extent; Dan Kaminsky and Aaron Swartz made some compelling arguments that Greg Maxwell's proof of impossibility of decentralized consensus algorithms also applies to Zooko's Triangle.)
Cheers, -Jeremy Rand
On Jan 24, 2017, at 4:27 AM, George Kadianakis desnacked@riseup.net wrote:
onion_address = base32(pubkey || checksum || version)
checksum = SHA3(".onion checksum" || pubkey || version)
Any reason not to have the order of, pubkey || checksum || version be the same in both?
On 26 Jan 2017, at 09:59, Arlo Breault arlo@torproject.org wrote:
On Jan 24, 2017, at 4:27 AM, George Kadianakis desnacked@riseup.net wrote:
onion_address = base32(pubkey || checksum || version)
checksum = SHA3(".onion checksum" || pubkey || version)
Any reason not to have the order of, pubkey || checksum || version be the same in both?
Yes: ".onion checksum" is not the same as checksum.
checksum = SHA3(".onion checksum" || pubkey || version)
Is the standard H(UNIQUE_PREFIX || DATA) construct that resists hash reuse and rainbow table attacks. ".onion checksum" represents the bytes from an ASCII-encoded literal string.
Putting those bytes later in the hash opens us up to hash reuse attacks where the key or version bytes are made to match the prefix from another hash.
(Every time we do SHA3(... || pubkey || ...) in the hidden service protocol, we want a prefix that is static and unique, so people can't use hashes from one part of the protocol to spoof hashes in another part of the protocol.)
onion_address = base32(pubkey || checksum || version)
Is the order in which the address is encoded once the checksum is calculated. checksum represents (the first two bytes of) the result of the SHA3 hash.
We put pubkey first so that humans can distinguish addresses. (We could put checksum first, but that's non-standard.)
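A minimal Python sketch of the two formulas above, for illustration only. It assumes H is SHA3-256, that the checksum is the first two bytes of the digest, and that the version byte is '\x03' (the prop224 default from the microproposal); the function names are made up for this example.

```python
import base64
import hashlib

CHECKSUM_PREFIX = b".onion checksum"  # ASCII bytes of the static, unique prefix


def onion_checksum(pubkey: bytes, version: bytes) -> bytes:
    # H(UNIQUE_PREFIX || DATA): the prefix goes first so this hash cannot be
    # confused with a hash computed in another part of the protocol.
    # Assumption: SHA3-256, truncated to the first two bytes.
    return hashlib.sha3_256(CHECKSUM_PREFIX + pubkey + version).digest()[:2]


def onion_address(pubkey: bytes, version: bytes = b"\x03") -> str:
    # The encoding order differs from the hashing order:
    # pubkey || checksum || version, so the pubkey leads the address.
    if len(pubkey) != 32 or len(version) != 1:
        raise ValueError("expected a 32-byte pubkey and a 1-byte version")
    raw = pubkey + onion_checksum(pubkey, version) + version
    # 35 bytes encode to exactly 56 base32 characters, with no padding.
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"
```

Note that the hash input and the encoded address use different orders on purpose: the prefix only matters inside the hash, while the address order is chosen for readability.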
T
-- Tim Wilson-Brown (teor)
teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org
On 26 Jan 2017, at 10:19, teor teor2345@gmail.com wrote:
onion_address = base32(pubkey || checksum || version)
Is the order in which the address is encoded once the checksum is calculated. checksum represents (the first two bytes of) the result of the SHA3 hash.
We put pubkey first so that humans can distinguish addresses. (We could put checksum first, but that's non-standard.)
I just talked with some people who run a large onion site.
They asked if we can put the checksum at the front of the encoded address.
This makes phishing with different bit(s) in the tail of the address much harder. (That is, searching for a matching prefix for an existing address is much harder if the checksum changes the first two characters unpredictably. People ignore the checksum if it's at the end.)
T -- Tim Wilson-Brown (teor)
On Sun, Mar 26, 2017 at 09:27:37PM +1100, teor wrote:
On 26 Jan 2017, at 10:19, teor teor2345@gmail.com wrote:
onion_address = base32(pubkey || checksum || version)
Is the order in which the address is encoded once the checksum is calculated. checksum represents (the first two bytes of) the result of the SHA3 hash.
We put pubkey first so that humans can distinguish addresses. (We could put checksum first, but that's non-standard.)
I just talked with some people who run a large onion site.
They asked if we can put the checksum at the front of the encoded address.
This makes phishing with different bit(s) in the tail of the address much harder. (That is, searching for a matching prefix for an existing address is much harder if the checksum changes the first two characters unpredictably. People ignore the checksum if it's at the end.)
Wait; why is searching for a matching checksum at the beginning harder than searching for a matching pubkey? When trying to collide an onion address, the pubkey is essentially random, as is the checksum.
On 26 Mar 2017, at 21:41, Ian Goldberg iang@cs.uwaterloo.ca wrote:
On Sun, Mar 26, 2017 at 09:27:37PM +1100, teor wrote:
On 26 Jan 2017, at 10:19, teor teor2345@gmail.com wrote:
onion_address = base32(pubkey || checksum || version)
Is the order in which the address is encoded once the checksum is calculated. checksum represents (the first two bytes of) the result of the SHA3 hash.
We put pubkey first so that humans can distinguish addresses. (We could put checksum first, but that's non-standard.)
I just talked with some people who run a large onion site.
They asked if we can put the checksum at the front of the encoded address.
This makes phishing with different bit(s) in the tail of the address much harder. (That is, searching for a matching prefix for an existing address is much harder if the checksum changes the first two characters unpredictably. People ignore the checksum if it's at the end.)
Wait; why is searching for a matching checksum at the beginning harder than searching for a matching pubkey? When trying to collide an onion address, the pubkey is essentially random, as is the checksum.
You're right - it only matters if the checksum is hard to compute. (We could make it an scrypt or something, if we wanted to. But if we don't, there's no need to make this change.)
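A rough back-of-the-envelope sketch of this point, assuming each base32 character carries 5 effectively random bits and that computing the checksum is cheap compared to generating a key (the function name is illustrative):

```python
def expected_attempts(matching_chars: int) -> int:
    # Each base32 character encodes 5 bits, so matching n chosen characters of
    # an address takes about 2**(5*n) candidate keys on average. This holds
    # whether those characters come from the pubkey or from a cheap checksum.
    return 2 ** (5 * matching_chars)


for n in (2, 4, 8):
    print(f"{n} characters: about {expected_attempts(n)} candidate keys")
```

Under these assumptions, moving the checksum to the front only changes which bits the attacker has to match, not how many.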
T -- Tim Wilson-Brown (teor)
On Sun, Mar 26, 2017 at 09:27:37PM +1100, teor wrote:
On 26 Jan 2017, at 10:19, teor teor2345@gmail.com wrote:
onion_address = base32(pubkey || checksum || version)
Is the order in which the address is encoded once the checksum is calculated. checksum represents (the first two bytes of) the result of the SHA3 hash.
We put pubkey first so that humans can distinguish addresses. (We could put checksum first, but that's non-standard.)
I just talked with some people who run a large onion site.
They asked if we can put the checksum at the front of the encoded address.
This makes phishing with different bit(s) in the tail of the address much harder. (That is, searching for a matching prefix for an existing address is much harder if the checksum changes the first two characters unpredictably. People ignore the checksum if it's at the end.)
The issue extends to vanity domains, which may do more harm than good as they condition people to recognise onion domains by their prefix, but I'm not aware of any research to back up that assumption. See also: https://moderncrypto.org/mail-archive/messaging/2015/001928.html
Hi,
George Kadianakis wrote:
I made a torspec branch that alters prop224 accordingly: https://gitweb.torproject.org/user/asn/torspec.git/commit/?h=prop224-onion-a...
It seems that the SHA3 digest length is missing for onion address generation. I guess (?) that it is supposed to be SHA3-256, but it definitely should be specified here. I think it is just a typo, since there is a definition of H() above.
Thanks, -- Ivan Markin
Ivan Markin twim@riseup.net writes:
Hi,
George Kadianakis wrote:
I made a torspec branch that alters prop224 accordingly: https://gitweb.torproject.org/user/asn/torspec.git/commit/?h=prop224-onion-a...
It seems that the SHA3 digest length is missing for onion address generation. I guess (?) that it is supposed to be SHA3-256, but it definitely should be specified here. I think it is just a typo, since there is a definition of H() above.
OK guys, thanks for all the great feedback!
I merged my prop224 onion encoding patch to torspec just now, after fixing the bug that Ivan mentioned above.
Hope this works for you :)
On Tue, Jan 31, 2017 at 02:54:50PM +0200, George Kadianakis wrote:
I merged my prop224 onion encoding patch to torspec just now, after fixing the bug that Ivan mentioned above.
Thanks!
btw it's not clear how H() output should be truncated to form a checksum. Should it be the first 2 bytes or the last 2 bytes? It should be specified in the definition of CHECKSUM (because length of digest obviously is not 2 bytes):
- CHECKSUM = H(".onion checksum" || PUBKEY || VERSION)
+ CHECKSUM = H(".onion checksum" || PUBKEY || VERSION)[:2]
Also, it would be worthwhile to include examples with correct checksums.
-- Ivan Markin
On 31 Jan (14:26:52), Ivan Markin wrote:
On Tue, Jan 31, 2017 at 02:54:50PM +0200, George Kadianakis wrote:
I merged my prop224 onion encoding patch to torspec just now, after fixing the bug that Ivan mentioned above.
Thanks!
btw it's not clear how H() output should be truncated to form a checksum. Should it be the first 2 bytes or the last 2 bytes? It should be specified in the definition of CHECKSUM (because length of digest obviously is not 2 bytes):
- CHECKSUM = H(".onion checksum" || PUBKEY || VERSION)
+ CHECKSUM = H(".onion checksum" || PUBKEY || VERSION)[:2]
Good point. Current implementation assumes *first two* :). Easy to change but let's make it clear!
Thanks! David
Also, it would be worthwhile to include examples with correct checksums.
-- Ivan Markin
On Tue, Jan 24, 2017 at 02:27:43PM +0200, George Kadianakis wrote:
And given the above, here is the new microproposal:
onion_address = base32(pubkey || checksum || version)
checksum = SHA3(".onion checksum" || pubkey || version)

where:
  pubkey is 32 bytes ed25519 pubkey
  version is one byte (default value for prop224: '\x03')
  checksum hash is truncated to two bytes
Here are a few example addresses (with broken checksum):
l5satjgud6gucryazcyvyvhuxhr74u6ygigiuyixe3a6ysis67ororad.onion
btojiu7nu5y5iwut64eufevogqdw4wmqzugnoluw232r4t3ecsfv37ad.onion
vckjr6bpchiahzhmtzslnl477hdfvwhzw7dmymz3s5lp64mwf6wfeqad.onion
Checksum strength: The checksum has a false negative rate of 1/65536.
Address handling: Clients handling onion addresses first parse the version field, then extract pubkey, then verify checksum.
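The handling order described above could be sketched in Python roughly as follows. This is only an illustration, assuming SHA3-256 for the hash, a checksum made of the first two digest bytes, and a single supported version of 3; parse_onion_address is a hypothetical name, not part of any Tor API.

```python
import base64
import hashlib

SUPPORTED_VERSION = 3  # assumption: the prop224 default of '\x03'


def parse_onion_address(address: str) -> bytes:
    """Return the 32-byte pubkey, or raise ValueError if the address is invalid."""
    label = address.lower()
    if label.endswith(".onion"):
        label = label[: -len(".onion")]
    if len(label) != 56:
        raise ValueError("expected 56 base32 characters")
    try:
        raw = base64.b32decode(label.upper())  # 35 bytes
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("not valid base32") from exc
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    # 1. Parse the version field first, so unknown versions are rejected early.
    if version[0] != SUPPORTED_VERSION:
        raise ValueError("unsupported version")
    # 2. The pubkey is extracted above; 3. recompute and compare the checksum.
    expected = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
    if checksum != expected:
        raise ValueError("checksum mismatch")
    return pubkey
```

With a two-byte checksum, a randomly corrupted address still passes this check with probability 1/65536, which is the rate quoted above.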
Let me know how you feel about this one. If people like it I will transcribe it to prop224.
FYI, I've implemented derivation and verification of v3 onion addresses (https://github.com/nogoegst/onionutil/blob/master/address.go). Some test vectors I got:
private key:   33a7e5c16e0308a3e6a0e7f4a621b3caad9ed1acdb3f78369b1377c5e605027879bcc625184b05194975c28b66b66b0469f7f6556fb1ac3189a79b40dda32f1f
onion address: pg6mmjiyjmcrsslvykfwnntlaru7p5svn6y2ymmju6nubxndf4pscryd

private key:   62a70904f219a788f3c3c46b64c7bc6e800fed54079f2bb88c4fe3800fe2264593f6ad7b54b6391d2b78147a0b2e808e143780de07f1bda6ee7f052d2e9da67b
onion address: sp3k262uwy4r2k3ycr5awluarykdpag6a7y33jxop4cs2lu5uz5sseqd

private key:   8d31e643f3693944817172030bab236a818d4a1d1ecbd7b8ce3ccb005dfb15fbb8391d2003bb3bd285b035ac8eb30c80c4e2a29bb7a2f0ce0df8743c37ec3593
onion address: xa4r2iadxm55fbnqgwwi5mymqdcofiu3w6rpbtqn7b2dyn7mgwj64jyd

private key:   a7f82fdf8f93a299e947f302313971b6759b8140d86468ead9cc960474c274b5f2ba31b35974d6a5214360cc3098fc69cf0a51d9944672a8904c97cba06c3945
onion address: 6k5ddm2zotlkkikdmdgdbgh4nhhquuozsrdhfkeqjsl4xidmhfc6ntqd

private key:   ba85d39f1e45ca1627a4d5e28fb891fa810669feec96a146551c87109376f01b07ec065de1daa2b12da5fc2d8b8ae516b23d4a2cbe00edc11c87636c2f3d2129
onion address: a7wamxpb3krlclnf7qwyxcxfc2zd2srmxyao3qi4q5rwylz5eeu35xqd

-- Ivan Markin
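As a quick sanity check of how these vectors are laid out, the following Python sketch decodes the first one and splits it into its fields. It assumes the 64-byte private key is stored as seed || pubkey (which is what the decoded bytes suggest); it does not recompute the checksum.

```python
import base64

# First test vector from the list above.
PRIVATE_KEY_HEX = (
    "33a7e5c16e0308a3e6a0e7f4a621b3caad9ed1acdb3f78369b1377c5e6050278"
    "79bcc625184b05194975c28b66b66b0469f7f6556fb1ac3189a79b40dda32f1f"
)
ONION_ADDRESS = "pg6mmjiyjmcrsslvykfwnntlaru7p5svn6y2ymmju6nubxndf4pscryd"

# 56 base32 characters decode to 35 bytes: pubkey || checksum || version.
raw = base64.b32decode(ONION_ADDRESS.upper())
pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]

# Assumption: the second half of the 64-byte private key is the public key.
assert pubkey == bytes.fromhex(PRIVATE_KEY_HEX)[32:]

print("pubkey:  ", pubkey.hex())
print("checksum:", checksum.hex())
print("version: ", version.hex())
```

The trailing "d" shared by all five addresses is simply the low five bits of the '\x03' version byte.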
Hey George,
Thanks for sending this and summarizing everything!
[D1] How to use version field:
The version field is one byte long. If we use it as an integer we can encode 256 values in it; if we use it as a bitmap we could encode properties and such. My suggestion is to simply use it as an integer like Bitcoin does. So we can assign value \x01 to normal onion services, and in the future we can assign more version tags if we need to. For example, we can give a different version field to onion services in the testnet. We can also reserve a range of values for application-specific purposes.
Will hidden service addresses only encode a single version?
If yes to the above, only allowing a limited number of versions on the network at a single time might be a good idea. Otherwise we run into the dilemma where hidden service operators need to maintain and distribute multiple addresses, and users need to understand what version their Tor client supports (and potentially their friend's as well, if they want to share a HS link).
As s7r said, Bitcoin addresses are single user/single use [1], whereas HS addresses are multiple user/multiple use. Because of this difference in purpose/use, I would argue we'll need to consider circumstances such as version incompatibility, upgrade path, longevity, etc more strongly for HS addresses than for Bitcoin addresses.
The idea of supporting multiple versions in a HS address was discussed earlier- is this still a viable scheme, or did the cons eventually outweigh the pros for this?
[D1.1] Default version value:
The next question is what version value to assign to normal onion services. In the above scheme where: onion_address = base32(version + pubkey + checksum)
It would be good to understand what the process of upgrading default versions looks like, from both a client and hidden service operator perspective.
Thanks, great work! Chelsea
[1] https://en.bitcoin.it/wiki/Address#A_Bitcoin_address_is_a_single-use_token
chelsea komlo me@chelseakomlo.com writes:
Hey George,
Thanks for sending this and summarizing everything!
[D1] How to use version field:
The version field is one byte long. If we use it as an integer we can encode 256 values in it; if we use it as a bitmap we could encode properties and such. My suggestion is to simply use it as an integer like Bitcoin does. So we can assign value \x01 to normal onion services, and in the future we can assign more version tags if we need to. For example, we can give a different version field to onion services in the testnet. We can also reserve a range of values for application-specific purposes.
Will hidden service addresses only encode a single version?
If yes to the above, only allowing a limited number of versions on the network at a single time might be a good idea. Otherwise we run into the dilemma where hidden service operators need to maintain and distribute multiple addresses, and users need to understand what version their Tor client supports (and potentially their friend's as well, if they want to share a HS link).
As s7r said, Bitcoin addresses are single user/single use [1], whereas HS addresses are multiple user/multiple use. Because of this difference in purpose/use, I would argue we'll need to consider circumstances such as version incompatibility, upgrade path, longevity, etc more strongly for HS addresses than for Bitcoin addresses.
The idea of supporting multiple versions in a HS address was discussed earlier- is this still a viable scheme, or did the cons eventually outweigh the pros for this?
Hey Chelsea,
while writing the proposal, I felt like supporting multiple versions in the version field would be more trouble than it's worth. I also dislike the fact that multiple addresses could then represent the same public key.
Also, I doubt we will ever reach the point where we have multiple HS versions existing simultaneously in our network that can also share the same onion address. This time, we will have two versions, the legacy onion services and prop224; and their addresses are definitely incompatible.
Worst case, if the need for this becomes apparent at some point, we can abuse the integer valued version field to encode such information (hey, we have 256 values after all). Or perhaps this is the best case since it means that the hidden service protocol has evolved a lot...
Cheers!
Hi George,
George Kadianakis: [...]
[D3] Do we like base32???
In this proposal I suggest we keep the base32 encoding since we've been using it for a while; but this is the perfect time to switch if we feel the need to. For example, Bitcoin is using base58 which is much more compact than base32, and also has much better UX properties than base64: https://en.bitcoin.it/wiki/Base58Check_encoding#Background If we wanted to get a more compact encoding, we could adopt base58 or make our own adaptation of it. In this proposal I'm using base32 for everything, but I could be persuaded that now is the time to use a better encoding.
While the addresses are definitely too long to be fun to type, there are still use cases where the addresses will be typed. So I think we should consider everything which would make them easier to type and compare. Like others stated, base58 will not make the addresses easier to type, because they would be case sensitive.
But maybe it would help to separate them into groups of 4 characters, separated maybe by a dash, which would make them look like this:
tbi5-tdxb-osio-tpha-wjyu-7f5p-w5tl-nvbv-fjrj-7mes-kbsn-wr2b-qbu2-t4gg.onion
Cheers
Date: Tue, 24 Jan 2017 12:40:00 +0000 From: segfault segfault@riseup.net
But maybe it would help to separate them into groups of 4 characters, separated maybe by a dash, which would make them look like this:
tbi5-tdxb-osio-tpha-wjyu-7f5p-w5tl-nvbv-fjrj-7mes-kbsn-wr2b-qbu2-t4gg.onion
This exceeds the maximum length of a DNS label, 63 octets[1]. One could use larger groups to avoid that, of course -- e.g., with eight octets per group you get down to 62:
tbi5tdxb-osiotpha-wjyu7f5p-w5tlnvbv-fjrj7mes-kbsnwr2b-qbu2t4gg.onion
[1] P. Mockapetris, `Domain Names - Concepts and Facilities', RFC 1034, IETF, 1987, Sec. 3.1 `Name space specifications and terminology', p. 7. https://www.ietf.org/rfc/rfc1034.txt
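The length arithmetic above can be checked with a few lines of Python. This is only a sketch: the grouped helper is hypothetical, and the label is the first example address from the original proposal.

```python
MAX_DNS_LABEL = 63  # octets, per RFC 1034


def grouped(label: str, size: int) -> str:
    # Split the label into fixed-size chunks joined by dashes.
    return "-".join(label[i:i + size] for i in range(0, len(label), size))


label = "tbi5tdxbosiotphawjyu7f5pw5tlnvbvfjrj7meskbsnwr2bqbu2t4gg"
for size in (4, 8):
    g = grouped(label, size)
    print(size, len(g), "fits" if len(g) <= MAX_DNS_LABEL else "too long")
```

Groups of four add 13 dashes to the 56 characters (69 octets), while groups of eight add only 6 (62 octets).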
On Tue, Jan 24, 2017 at 12:40:00PM +0000, segfault wrote:
But maybe it would help to separate them into groups of 4 characters, separated maybe by a dash, which would make them look like this:
tbi5-tdxb-osio-tpha-wjyu-7f5p-w5tl-nvbv-fjrj-7mes-kbsn-wr2b-qbu2-t4gg.onion
Check out https://trac.torproject.org/projects/tor/ticket/15622 for more discussion of this idea.
I think the main problem with it is that we'd have to pick, and stick to, a particular format. The "let people add hyphens wherever they think it improves usability for them" option has all sorts of unexpected side effects when client-side apps accidentally reveal the hyphenation choice.
--Roger
On Tue, 24 Jan 2017 12:40:00 +0000, segfault wrote: ...
While the addresses are definitely too long to be fun to type, there are still use cases where the addresses will be typed.
For those cases you could print them with half-spaces or similar. You can even type the spaces, but you need to remove them before actually pressing enter; otherwise side effects ensue, like the address being searched for.
Andreas