On 05/03/2012 01:32 PM, Karsten Loesing wrote:
On 5/2/12 9:35 PM, Sebastian G. <bastik.tor> wrote:
[...] "We don't need it, so better remove it." I really like that.
I think we're really conservative with giving out bridge data, and that's good.
At the same time, there's value in giving out information about bridges, so "remove everything" is not a good answer. For example, I think if we give bridge operators better feedback on how their bridge is doing, we'll suddenly have a lot more bridges. Making it easy for bridge operators to use Atlas would be a good step in that direction. The same applies to funders who realize from our statistics how successful the Tor Cloud project is and who then want to fund it more to make it more usable, support more cloud providers, etc.
I would suggest looking at homomorphic hashing [1] and Shamir's discrete logarithm hash function [2]. (Those also work well with linear network coding [3], but I'm not sure whether that would be useful here.)
For example, an FQDN, IP or nick can be encoded by splitting the value into fields or characters. The parts can then be used as input to the hash function.
The function allows checking whether a nick/FQDN/IP contains a specific part, or whether two values share an identical part, without disclosing the "plaintext" of that part.
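To make the idea concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions: the modulus P, the base G and all function names are made up for the example, and the parameters are toy values that are not secure (Shamir's function [2] uses a composite modulus with secret factors, which this sketch does not model).

```python
# Toy parameters, assumed for illustration only. Not secure.
P = 2**127 - 1   # a Mersenne prime used as modulus
G = 3            # base


def part_to_int(part):
    """Encode one field (e.g. a DNS label) as a non-negative integer."""
    return int.from_bytes(part.encode("utf-8"), "big")


def hash_part(part):
    """Discrete-log style hash of a single part: G^x mod P."""
    return pow(G, part_to_int(part), P)


def hash_fields(value, sep="."):
    """Split a value (FQDN, IP, ...) into fields and hash each field."""
    return [hash_part(field) for field in value.lower().split(sep)]


def shares_part(hashes_a, hashes_b):
    """True if two hashed values have at least one identical part."""
    return bool(set(hashes_a) & set(hashes_b))


def combine(h1, h2):
    """Homomorphic property: H(x) * H(y) mod P equals H(x + y)."""
    return (h1 * h2) % P


a = hash_fields("a.example.com")
b = hash_fields("b.example.org")
print(shares_part(a, b))   # both contain the label "example"
print(a[2] == b[2])        # "com" vs. "org"
```

Only the hashed lists would be published. Anyone can compare them position by position or as sets, but the labels themselves are not in the output.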
Obviously, there are statistical attacks possible: e.g. for FQDNs, the attacker could guess which component maps to 'com', as it is the most common TLD. Similarly, splitting up into characters can be attacked by using frequency tables. Other attacks could apply here as well (I'm thinking of attacks on "plaintext RSA" without padding).
Nevertheless, I think it's still better than publishing plaintext data if we are not sure what it might give away. An implementation using gmp/gmpy/numpy should be fairly easy.
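The frequency attack is easy to demonstrate. The following sketch uses hypothetical host names and the same kind of toy parameters (P, G and hash_part are my own illustrative names and values, not a real scheme). It ranks the published TLD hashes by frequency and then checks the guess 'com'. The final check by recomputing the hash only works if the parameters are public; with secret parameters the frequency ranking alone is what the attacker gets.

```python
from collections import Counter

# Toy parameters, assumed for illustration only. Not secure.
P = 2**127 - 1
G = 3


def hash_part(part):
    """Deterministic discrete-log style hash of one field."""
    return pow(G, int.from_bytes(part.encode("utf-8"), "big"), P)


# Hypothetical data set: only the hashed TLD of each name is published.
names = [
    "a.example.com",
    "b.example.org",
    "relay.example.com",
    "node.example.net",
    "x.example.com",
]
published = [hash_part(name.rsplit(".", 1)[1]) for name in names]

# Attacker's view: the most frequent hash is probably the most common TLD.
most_common_hash, count = Counter(published).most_common(1)[0]
guess_confirmed = hash_part("com") == most_common_hash
print(count, guess_confirmed)
```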
Ondrej
[1] On-the-Fly Verification of Rateless Erasure Codes for Efficient Content Distribution, http://pdos.csail.mit.edu/papers/otfvec/paper.pdf (see section IV)
[2] http://www.senderek.com/SDLH/
[3] https://en.wikipedia.org/wiki/Network_coding