Re: [tor-dev] Can we stop sanitizing nicknames in bridge descriptors?

4 May 2012


On 5/4/12 2:21 AM, Ondrej Mikle wrote:
On 05/03/2012 01:32 PM, Karsten Loesing wrote:
On 5/2/12 9:35 PM, Sebastian G. <bastik.tor> wrote:
[...]
"We don't need it, so better remove it." I really like that.
I think we're really conservative with giving out bridge data, and
that's good.
At the same time there's a value in giving out information about
bridges, so that "remove everything" is not a good answer.  For example,
I think if we give bridge operators better feedback on how their bridge is
doing, we'll suddenly have a lot more bridges.  Making it easy for
bridge operators to use Atlas would be a good step in that direction.
The same applies to funders who realize from our statistics how
successful the Tor Cloud project is and who then want to fund it more to
make it more usable, support more cloud providers, etc.
I would suggest looking at homomorphic hashes [1] and Shamir's discrete logarithm
hash function [2]. (Those also work well with linear network coding [3], but I'm
not sure whether that would be useful here.)
For example, encoding FQDN, IP or nick can be done by splitting the
argument-to-encode by fields or characters. The parts can then be used as input
to the hash function.
The function allows checking whether a nick/FQDN/IP has a specific part, or
whether two share an identical part, but does not disclose the "plaintext" of the part.
Obviously, there are statistical attacks possible: e.g. for FQDNs, the attacker
could guess which component maps to 'com', as it is the most common TLD.
Similarly, splitting up into characters can be attacked by using frequency
tables. There are other things that could apply here (thinking about attacks on
"plaintext RSA" without padding).
Nevertheless, I think it's still better than publishing plaintext data if we are
not sure what they might give away. Implementation using gmp/gmpy/numpy should
be fairly easy.
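
A rough sketch of the per-part hashing idea in plain Python, for illustration only. The modulus P, the base G and the helper names hash_part, hash_parts and has_prefix are assumptions made up for this example; they are toy values, not the parameters or constructions from [1] or [2].

```python
# Toy parameters, assumed for illustration only. A real deployment would
# need a vetted construction and carefully chosen parameters.
P = 2**127 - 1   # a Mersenne prime, picked here purely for convenience
G = 3            # hypothetical base


def hash_part(part: str) -> int:
    """Map one part to G^x mod P, where x is the part read as an integer."""
    x = int.from_bytes(part.encode("utf-8"), "big")
    return pow(G, x, P)


def hash_parts(value: str, sep: str = "") -> list:
    """Split a nick/FQDN/IP by `sep` (or into characters) and hash each part."""
    parts = value.split(sep) if sep else list(value)
    return [hash_part(p) for p in parts]


def has_prefix(hashed: list, prefix: str, sep: str = "") -> bool:
    """Check whether a hashed value starts with the parts of `prefix`."""
    wanted = hash_parts(prefix, sep)
    return hashed[:len(wanted)] == wanted


published = hash_parts("ec2bridgeb268f2ae6")
print(has_prefix(published, "ec2bridge"))        # prefix check on hashed parts
print(has_prefix(published, "rackspacebridge"))

# Two FQDNs with the same TLD share their last hashed component.
a = hash_parts("a.example.com", ".")
b = hash_parts("b.example.com", ".")
print(a[-1] == b[-1])
```

Because each part is hashed deterministically and on its own, equal parts always produce equal outputs. That is what makes the prefix check work, and it is also exactly what the frequency-table attack mentioned above exploits.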
Interesting approach.  So, the idea would be to split a nickname like
"ec2bridgeb268f2ae6" into its characters (or groups of two or more
characters?), run them through the hash function, and then be able to
check if the nickname starts with "ec2bridge"?  Plus, the approach would
still work if we later decide we want to find all bridges with nicknames
starting with "rackspacebridge"?
My first concern is that there's not enough entropy in nicknames for the
hash function to provide sufficient protection.  I could imagine it's
not hard to throw variants of all known relay nicknames into that hash
function and learn 50%, if not 75%, of all used bridge nicknames.
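
To make that concern concrete, here is a minimal sketch of such a guessing attack. Everything in it is an assumption for illustration: sanitize() uses SHA-256 as a stand-in for whatever deterministic, unkeyed hash would be applied to whole nicknames, the variant rules are invented, and the nicknames are made up.

```python
import hashlib


def sanitize(nickname: str) -> str:
    # Stand-in for any deterministic, unkeyed nickname hash (an assumption).
    return hashlib.sha256(nickname.encode("utf-8")).hexdigest()


def variants(nickname: str):
    """Yield simple guesses derived from one known relay nickname."""
    for base in (nickname, nickname.lower()):
        yield base
        for suffix in ("1", "2", "bridge"):   # hypothetical variant rules
            yield base + suffix


def recover(published_hashes, known_relay_nicknames):
    """Return {hash: nickname} for every published hash we can guess."""
    table = {
        sanitize(guess): guess
        for name in known_relay_nicknames
        for guess in variants(name)
    }
    return {h: table[h] for h in published_hashes if h in table}


# Made-up data: one bridge nickname derived from a relay nickname, one not.
published = [sanitize("examplerelay2"), sanitize("q7x9unrelated")]
found = recover(published, ["ExampleRelay"])
print(len(found), "of", len(published), "nicknames recovered")
```

The attacker never inverts the hash; they only need the candidate set to cover the nicknames that bridge operators actually choose.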
My second concern is that this approach would only solve the problem of
counting EC2 bridges, but wouldn't make sites like Atlas more usable for
bridge operators.
Best,
Karsten
