On Tue, Nov 1, 2011 at 5:50 AM, Watson Ladd watsonbladd@gmail.com wrote:
Turns out that almost everything you said about SHA3 vs SHA256 performance is wrong:
I should have specified that from the page I referenced -- http://bench.cr.yp.to/results-hash.html -- I was looking at the first x86_64 machine: amd64 Sandy Bridge (206a7); 2011 Intel Core i7-2600K; 4 x 3400MHz; "sandy0" and the first ARM machine: armeabi (v7-A, Cortex A8); 2009 Freescale i.MX515; 1 x 800MHz; "gcc33". Given that I was using the numbers from those two machines, I believe what I wrote was correct.
Now, perhaps those weren't the right machines to use for examples. My reasoning was that they are the newest and most efficient of their respective instruction sets, so they're likely to be more representative of their successors. I don't know why SHA-256 was slower on the ARMv6 that you mentioned.
Furthermore, crypto efficiency is less likely to be a bottleneck on a client then a node: server architectures matter much more because we do a lot more crypto on them. (This isn't true for each connection but servers handle more connections then clients.)
My letter, to which you replied is a quantitative argument to the contrary. The bottom line quantities that I came up with were that if you used 100% of one Sandy Bridge core, then you could hash at least 190 Mbytes/s, even with the less efficient of the two hash functions (SHA-256), and if you used 10% of a Cortex-A8, then you could hash at least 2.5 Mbytes/s even with the less efficient of the two (Blake-256).
As far as I understand, the case where a server would be asked to handle more than 190 Mbytes/s will be rarer than the case where a client will be asked to handle more than 2.5 Mbytes/s, during the lifetime of the next version of the Tor protocol. I don't know much about Tor performance so I could be all wrong about that.
Now, maybe the assumption that you can afford to dedicate 100% of a server core but only 10% of the embedded CPU is wrong. Even if you change the assumption to 50% of one server core vs. 50% of your embedded CPU, the server will probably still be network-limited but the embedded chip may be CPU-limited.
Secondly SHA256 is already weaker then an ideal hash function.
…
and SHA3 is designed to be a better approximation to a random oracle then SHA2.
I'm aware of these considerations, and I don't disagree! Thank you for bringing them up. But they don't prove that SHA-256 will prove vulnerable to any real attack. Likewise, there could be some unforeseen critical weakness in a SHA-3 candidate. We don't have any proof one way or the other about these questions. My personal, under-informed guess is that none of SHA-256, Blake, Skein will prove vulnerable to any real attack during the lifespan of the next version of the Tor protocol.
I would be interested in other people's personal guesses about that question, especially if you've studied the actual algorithms and attacks closely.
Regards,
Zooko