Hello all,
I've been closely following the other Proposal 224 threads regarding the next generation of onion services. I'm glad to see that we have a timeline and plan for migrating the network. One unresolved point is what to do with the remaining 4 bits in the longer addresses. Section 1.2 in the 224 document states "Note that since master keys are 32 bytes long, and 52 bytes of base 32 encoding can hold 260 bits of information, we have four unused bits in each of these names." It seems a waste for these to be zeroed out. The four bits could also be used to hold client-side flags, but I'm not aware of any applicable client settings that could be used here. I suggest that we use them as a checksum. (Wasn't this proposed in Arlington?)
Since speed isn't a priority, besides Adler-32 or some CRC function, we could also hash the 32-byte key and use the first four bits of the hash. I think a checksum is best because it helps ensure data integrity when the address is shared online, copy/pasted, or physically written down. Bitcoin addresses contain a checksum as well for exactly this reason. Their checksum is the first four bytes of a double SHA-256 hash; RIPEMD-160 is used for the key hash that the checksum covers. Source: 1) https://en.bitcoin.it/wiki/Technical_background_of_version_1_Bitcoin_address... 2) https://bitcoin.stackexchange.com/questions/32353/
What do we think about a checksum, or do we have other plans here? I ask because once we nail down the address format, I can add support for 224 into my Onion Name System.
Jesse V kernelcorn@torproject.org writes:
> What do we think about a checksum, or do we have other plans here? I ask because once we nail down the address format, I can add support for 224 into my Onion Name System.
Thanks for bringing up this topic Jesse.
I'd be interested in both a version field and a checksum to be part of the encoding of the onion address. I also don't mind extending the encoding by a character or two if that will make it more useful (there is little difference between 54 and 56 characters).
WRT version field, should it be a single value/bitmap or should it be able to denote support for multiple versions?
WRT checksum, how much of the address can we protect using a checksum of 1 or 2 bytes? My error-correcting-codes fu is a bit rusty, and I'm not sure how many errors we can detect/correct. Can anyone do some digging here and prepare a table, so that we can make an informed decision? Perhaps the bitcoin community has already done this for us.
On 12/06/2016 11:24 AM, George Kadianakis wrote:
> I'd be interested in both a version field and a checksum to be part of the encoding of the onion address. I also don't mind extending the encoding by a character or two if that will make it more useful (there is little difference between 54 and 56 characters).
Sure, I don't see a problem with this. It's unlikely that we will need a full byte for a version number. What if addresses are 53 characters long, instead of 52? Now you have nine bits to work with. We could use four of them for a version number and five for checksum. Alternatively, if addresses are 54 characters, then we have 14 bits. Then we could use four or five for versions and 9 or 10 for checksum.
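The bit budget above follows from base32 carrying 5 bits per character, against a 256-bit (32-byte) key. As a rough sketch of that arithmetic (the function name `spare_bits` is only illustrative, not anything from the proposal):

```python
KEY_BITS = 256       # 32-byte master key, per section 1.2 of the 224 document
BITS_PER_CHAR = 5    # each base32 character encodes 5 bits


def spare_bits(address_chars: int) -> int:
    """Bits left over for version/checksum in an address of the given length."""
    spare = address_chars * BITS_PER_CHAR - KEY_BITS
    if spare < 0:
        raise ValueError("address too short to hold the key")
    return spare


if __name__ == "__main__":
    # Lengths mentioned in this thread: 52, 53, 54 and 56 characters.
    for chars in (52, 53, 54, 56):
        print(f"{chars} chars -> {spare_bits(chars)} spare bits")
```

At 56 characters the spare space is exactly three bytes, which would fit, for example, a one-byte version plus a two-byte checksum without any bit-level packing.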
> WRT version field, should it be a single value/bitmap or should it be able to denote support for multiple versions?
I'm not entirely sure what you mean by a bitmap here. It's unlikely that we will need to denote new versions, but if we have some room, it could be useful in case we need to.
> WRT checksum, how much of the address can we protect using a checksum of 1 or 2 bytes? My error-correcting-codes fu is a bit rusty, and I'm not sure how many errors we can detect/correct. Can anyone do some digging here and prepare a table, so that we can make an informed decision? Perhaps the bitcoin community has already done this for us.
We don't have enough room for an error-correcting Hamming code or something like that. If we do want to use a simple checksum, my only concern is ensuring that the address is correct. A simple code won't protect against intentional modification, but it can help with accidental typos to some degree. If we do want to use a checksum, I suggest that we base it on a cryptographic hash function, such as SHA-2 or SHA-3.
If we use 54-character addresses, then we have room for up to 16 versions, and 10 bits of checksum, which gives us a 1/1024 chance of NOT identifying a typo, right? Maybe the birthday paradox comes into play here.
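One way to sanity-check the 1/1024 figure is a small simulation. The sketch below assumes the checksum is the top 10 bits of SHA-256 over the key, and models a typo as one corrupted key byte; both are simplifications chosen for illustration, not anything decided in this thread. For a single mistyped address compared against its own checksum, a well-mixed hash should let roughly 2^-10 of typos through. The birthday bound concerns finding any colliding pair within a large set, so it should not affect this per-typo rate.

```python
import hashlib
import random

CHECKSUM_BITS = 10  # the 54-character layout discussed above


def checksum(key: bytes, bits: int = CHECKSUM_BITS) -> int:
    """Top `bits` bits (at most 16) of SHA-256 over the key."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:2], "big") >> (16 - bits)


def undetected_typo_rate(trials: int, seed: int = 0) -> float:
    """Fraction of simulated one-byte typos that the checksum fails to catch."""
    rng = random.Random(seed)
    missed = 0
    for _ in range(trials):
        key = bytearray(rng.getrandbits(8) for _ in range(32))
        expected = checksum(bytes(key))
        position = rng.randrange(32)
        key[position] ^= rng.randrange(1, 256)  # XOR with non-zero: byte always changes
        if checksum(bytes(key)) == expected:
            missed += 1
    return missed / trials


if __name__ == "__main__":
    print("theoretical miss rate:", 1 / 2 ** CHECKSUM_BITS)
    print("simulated miss rate:  ", undetected_typo_rate(20000))
```

The simulated value fluctuates around the theoretical one; more trials tighten the estimate.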
On 06 Dec (07:05:47), Jesse V wrote:
> Hello all,
> I've been closely following the other Proposal 224 threads regarding the next generation of onion services. I'm glad to see that we have a timeline and plan for migrating the network. One unresolved point is what to do with the remaining 4 bits in the longer addresses. Section 1.2 in the 224 document states "Note that since master keys are 32 bytes long, and 52 bytes of base 32 encoding can hold 260 bits of information, we have four unused bits in each of these names." It seems a waste for these to be zeroed out. The four bits could also be used to hold client-side flags, but I'm not aware of any applicable client settings that could be used here. I suggest that we use them as a checksum. (Wasn't this proposed in Arlington?)
Fun fact, that discussion was part of the "other tor-dev@ thread" I was planning to do after torrc discussion ;).
> What do we think about a checksum, or do we have other plans here? I ask because once we nail down the address format, I can add support for 224 into my Onion Name System.
We had little discussion, but some of us agree for sure on having bits for the version number. That will tell a tor client to fetch the right descriptor instead of trying all versions that have the same type of public key (.onion address). We currently have, I believe, 4 bits left, which is only 16 values, so we could extend by one more byte here to have more room.
The second possibility, as you stated above, is a checksum. Unfortunately, I haven't thought much about this, nor do I know the "state of the art of small checksums", but it is definitely something to dig into! Jesse, if you feel like it, I welcome any analysis you can do on checksums here and a proposal about it. (Only if you want to :).
Thanks! David
-- Jesse
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On 12/06/2016 11:27 AM, David Goulet wrote:
> We had little discussion, but some of us agree for sure on having bits for the version number. That will tell a tor client to fetch the right descriptor instead of trying all versions that have the same type of public key (.onion address). We currently have, I believe, 4 bits left, which is only 16 values, so we could extend by one more byte here to have more room.
I'm curious if we ever ran into this issue with the current HS protocol. What type of changes would warrant a new address that could not be solved with a patch to the tor binary? We also need to weigh the difficulty of distributing a one-character-different address against the difficulty of transitioning the network to the new descriptors. People get very entrenched in their onion addresses, bookmark them, and some even issue SSL certs for them.
Let's say we added another character, so that we have 9 bits free. What would be the consequence of using all 9 bits for a checksum? We could solve the version/descriptor issue using a naming system and simply point the name to a newer onion address. It's something to consider.
> The second possibility, as you stated above, is a checksum. Unfortunately, I haven't thought much about this, nor do I know the "state of the art of small checksums", but it is definitely something to dig into! Jesse, if you feel like it, I welcome any analysis you can do on checksums here and a proposal about it. (Only if you want to :).
I'm not fluent in the arts of small checksums, but it seems to me that we do have some benefit of using the first N bits of SHA2(version + edDSA_address) as the checksum. I may not have time to write a full proposal, but even with a small number of bits we do have a decent chance of catching typos, which is the whole point. Obviously, this chance will get better as you add more bytes, but prop224 addresses are already fairly long and we should weigh the usability impact against the probability of typos.
On 7 Dec. 2016, at 09:47, Jesse V kernelcorn@torproject.org wrote:
> I'm not fluent in the arts of small checksums, but it seems to me that we do have some benefit of using the first N bits of SHA2(version + edDSA_address) as the checksum. I may not have time to write a full proposal, but even with a small number of bits we do have a decent chance of catching typos, which is the whole point. Obviously, this chance will get better as you add more bytes, but prop224 addresses are already fairly long and we should weigh the usability impact against the probability of typos.
A more appropriate construction here is:
  H(prefix + version + edDSA_address)
where prefix is a static string identifying the purpose of the hash.
That way, hash re-use becomes difficult - tables must be re-built for every different prefix.
T
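For concreteness, the prefixed construction could look roughly like the sketch below. It assumes SHA-256 as H, a one-byte version, and a made-up prefix string; none of these were settled in this thread, and `address_checksum` is a hypothetical helper name.

```python
import hashlib

# Hypothetical domain-separation prefix, chosen only for this example.
PREFIX = b"onion-address-checksum-example"


def address_checksum(pubkey: bytes, version: int, bits: int) -> int:
    """First `bits` bits of H(prefix + version + pubkey), with H = SHA-256 (assumed)."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte public key")
    if not 0 <= version <= 255:
        raise ValueError("version must fit in one byte")
    if not 1 <= bits <= 256:
        raise ValueError("bits must be between 1 and 256")
    digest = hashlib.sha256(PREFIX + bytes([version]) + pubkey).digest()
    # Treat the digest as a big-endian integer and keep its top `bits` bits.
    return int.from_bytes(digest, "big") >> (256 - bits)


if __name__ == "__main__":
    example_key = bytes(32)  # all-zero placeholder, not a real key
    print(address_checksum(example_key, version=3, bits=10))
```

Because the prefix is hashed in, a digest computed for another purpose over the same key and version would not be reusable as this checksum. Truncation is just a matter of how many leading bits are kept, so a shorter checksum is always a prefix of a longer one.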
On 06 Dec (17:47:01), Jesse V wrote:
> On 12/06/2016 11:27 AM, David Goulet wrote:
> > We had little discussion, but some of us agree for sure on having bits for the version number. That will tell a tor client to fetch the right descriptor instead of trying all versions that have the same type of public key (.onion address). We currently have, I believe, 4 bits left, which is only 16 values, so we could extend by one more byte here to have more room.
> I'm curious if we ever ran into this issue with the current HS protocol. What type of changes would warrant a new address that could not be solved with a patch to the tor binary? We also need to weigh the difficulty of distributing a one-character-different address against the difficulty of transitioning the network to the new descriptors. People get very entrenched in their onion addresses, bookmark them, and some even issue SSL certs for them.
Descriptors have a version, which is basically the HS protocol version; IPs (introduction points) have a subprotocol version, RPs (rendezvous points) have a subprotocol version, and HSDirs have a subprotocol version. There are really lots of values :).
IMO, a version in the address would be the version of the HS protocol, because the address (for prop224 it is your public key) is literally cryptographically tied to the descriptor. Imagine that 6 months after v3 is out we go to v4 because we need to add a field to the descriptor for X reasons. v4 addresses will be generated with a version "4" in them, so a client can fetch a v4 descriptor (yes, the version is in the URL of the request to the HSDir) instead of being confused and trying v3 and then v4. This assumes, of course, that v3 and v4 addresses have the same length. Different lengths make it "easier", but we still shouldn't rely on that for any versioning scheme.
That being said, you are right that people get very entrenched in their .onion, *especially* with EV certificates nowadays, or bookmarks, but encoding a checksum and/or version will indeed make part of it different for some feature gain... not an easy problem user-wise... :S
> Let's say we added another character, so that we have 9 bits free. What would be the consequence of using all 9 bits for a checksum? We could solve the version/descriptor issue using a naming system and simply point the name to a newer onion address. It's something to consider.
Yes, _ideally_, naming transport should be our way forward, or else we'll lose this battle of security vs usability. Onion addresses _have_ to get bigger, unfortunately. Proposal 274 (which is stuck on tor-dev@, I realize) is imo a really good way forward (A Name System API for Tor Onion Services).
> > The second possibility, as you stated above, is a checksum. Unfortunately, I haven't thought much about this, nor do I know the "state of the art of small checksums", but it is definitely something to dig into! Jesse, if you feel like it, I welcome any analysis you can do on checksums here and a proposal about it. (Only if you want to :).
> I'm not fluent in the arts of small checksums, but it seems to me that we do have some benefit of using the first N bits of SHA2(version + edDSA_address) as the checksum. I may not have time to write a full proposal, but even with a small number of bits we do have a decent chance of catching typos, which is the whole point. Obviously, this chance will get better as you add more bytes, but prop224 addresses are already fairly long and we should weigh the usability impact against the probability of typos.
teor's reply seems reasonable so far about that :). I just wonder how much we truncate.
David
Jesse V kernelcorn@torproject.org writes:
> I'm not fluent in the arts of small checksums, but it seems to me that we do have some benefit of using the first N bits of SHA2(version + edDSA_address) as the checksum. I may not have time to write a full proposal, but even with a small number of bits we do have a decent chance of catching typos, which is the whole point. Obviously, this chance will get better as you add more bytes, but prop224 addresses are already fairly long and we should weigh the usability impact against the probability of typos.
Hello people and happy new year :)
I think at this point the best way forward would be for someone to take initiative and write a Tor proposal on how onion addresses should be encoded/represented. This way we will have something concrete that we can discuss and work with.
Anyone interested? If not, I will get to it in a few months.
peace