Hi,
In WPES 2012 [2], Aniket Kate and I introduced Ace, an alternative to ntor, Tor's current handshake protocol (Ace was also briefly discussed on this list [4]). At the time, no implementation of the double scalar multiplication operation (a * b + c * d) on Curve25519 was available. Recently, however, Robert Ransom implemented a highly efficient double scalar multiplication in his celator library [1], which is suitable for our handshake protocol Ace.
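For readers unfamiliar with the operation, here is a minimal sketch of why a combined double scalar multiplication can be cheaper than two separate scalar multiplications followed by an addition. It uses the textbook interleaving technique (often called Shamir's trick), which shares one doubling chain between both scalars. Everything in it is an assumption made for illustration: the group is a toy one (integers modulo the prime 1000000007 under multiplication) rather than Curve25519, the function names are hypothetical, the code is not constant-time, and it is not meant to describe how celator implements the operation.

```c
#include <stdint.h>

/* Toy group for illustration only: nonzero integers modulo a small prime,
 * with modular multiplication playing the role of point addition. */
static const uint64_t TOY_P = 1000000007ULL;

/* Group operation ("addition" of two group elements, both < TOY_P). */
static uint64_t group_op(uint64_t x, uint64_t y)
{
    return (x * y) % TOY_P; /* operands < 2^30, so the product fits in 64 bits */
}

/* Single scalar multiplication k * P by left-to-right double-and-add. */
static uint64_t scalar_mult(uint64_t k, uint64_t P)
{
    uint64_t acc = 1; /* identity element of the toy group */
    for (int i = 63; i >= 0; i--) {
        acc = group_op(acc, acc);          /* doubling step */
        if ((k >> i) & 1)
            acc = group_op(acc, P);        /* addition step */
    }
    return acc;
}

/* Double scalar multiplication a * P + b * Q with one shared doubling chain.
 * Per bit position: one doubling and at most one addition, instead of the
 * two doublings needed by two independent scalar multiplications. */
static uint64_t double_scalar_mult(uint64_t a, uint64_t P,
                                   uint64_t b, uint64_t Q)
{
    uint64_t PQ = group_op(P, Q); /* precomputed P + Q */
    uint64_t acc = 1;
    for (int i = 63; i >= 0; i--) {
        unsigned bit_a = (unsigned)((a >> i) & 1);
        unsigned bit_b = (unsigned)((b >> i) & 1);
        acc = group_op(acc, acc);
        if (bit_a && bit_b)
            acc = group_op(acc, PQ);
        else if (bit_a)
            acc = group_op(acc, P);
        else if (bit_b)
            acc = group_op(acc, Q);
    }
    return acc;
}
```

In this sketch, double_scalar_mult(a, P, b, Q) returns the same group element as group_op(scalar_mult(a, P), scalar_mult(b, Q)), but performs roughly half as many doubling steps.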
During an internship with us, Shivanker Goel implemented the Ace protocol using Robert's celator library [1] and ran comparative benchmarks against the current ntor implementation. Here are the results (average cycles per handshake):
64-bit architecture: Ace: 401,129 cycles; ntor: 472,772 cycles
32-bit architecture: Ace: 2,797,303 cycles; ntor: 5,179,069 cycles
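To put these figures in relative terms, the following small sketch computes the ratio of the two cycle counts and the fraction of cycles saved. The helper names are hypothetical and only serve this illustration; the inputs are the cycle counts listed above.

```c
#include <stdint.h>

/* How many times more cycles the baseline needs than the candidate. */
static double cycle_ratio(uint64_t baseline_cycles, uint64_t candidate_cycles)
{
    return (double)baseline_cycles / (double)candidate_cycles;
}

/* Fraction of the baseline's cycles that the candidate saves. */
static double cycles_saved(uint64_t baseline_cycles, uint64_t candidate_cycles)
{
    return 1.0 - (double)candidate_cycles / (double)baseline_cycles;
}
```

With ntor as the baseline and Ace as the candidate, cycle_ratio(472772, 401129) is about 1.18 and cycle_ratio(5179069, 2797303) is about 1.85. In other words, Ace needs roughly 15% fewer cycles than ntor on the 64-bit architecture and roughly 46% fewer on the 32-bit architecture.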
About the benchmarks:
* To warm up the cache, the benchmark runs the handshake 1000 times before measurement starts. We then used a tsc_read() function to count cycles, repeated the handshake 5000 times, and computed the average cycle count.
* To avoid importing too many functions from the Tor library, the benchmarks do not use the function dimap_search, which looks up the secret key for a given public key.
* We assume that ephemeral keys are precomputed, e.g., in idle cycles. In the benchmarks, these ephemeral keys are passed in as input.
* For the benchmarks, hyper-threading and overclocking were disabled, and the code was compiled with gcc-4.8 using "-O2" optimizations.
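The measurement procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the benchmark archive: the body of tsc_read() shown here (the __rdtsc() intrinsic on x86, with a nanosecond clock as a stand-in elsewhere), the handshake_fn callback type, average_cycles(), and dummy_handshake() are all hypothetical, and timing each run individually rather than the loop as a whole is also an assumption.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Assumed implementation: read the CPU time-stamp counter. */
static uint64_t tsc_read(void)
{
    return __rdtsc();
}
#else
/* Stand-in for non-x86 machines: nanoseconds from a monotonic clock,
 * which is NOT a cycle count. */
static uint64_t tsc_read(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
#endif

/* Hypothetical callback type: one complete handshake on prepared inputs. */
typedef void (*handshake_fn)(void *ctx);

/* Run `warmup` unmeasured handshakes, then `runs` measured ones,
 * and return the average counter difference per measured handshake. */
static double average_cycles(handshake_fn fn, void *ctx, int warmup, int runs)
{
    assert(fn != NULL && warmup >= 0 && runs > 0);

    for (int i = 0; i < warmup; i++)
        fn(ctx);                        /* warm-up runs, not measured */

    uint64_t total = 0;
    for (int i = 0; i < runs; i++) {
        uint64_t start = tsc_read();
        fn(ctx);
        total += tsc_read() - start;    /* accumulate elapsed counter ticks */
    }
    return (double)total / (double)runs;
}

/* Placeholder "handshake" that only counts how often it was called. */
static void dummy_handshake(void *ctx)
{
    int *calls = ctx;
    (*calls)++;
}
```

Calling average_cycles(dummy_handshake, &calls, 1000, 5000) with an int counter mirrors the 1000 warm-up runs and 5000 measured runs mentioned above. In a real benchmark, the callback would invoke the Ace or ntor handshake with the precomputed ephemeral keys as input.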
The code is available online [2]. We have also drafted a Tor-style proposal, which is available online as well [3].
Any feedback, comments, or suggestions are welcome.
- Esfandiar
[1] http://www.infsec.cs.uni-saarland.de/~mohammadi/ace/celator.tar.gz
[2] http://www.infsec.cs.uni-saarland.de/~mohammadi/ace/ace-benchmarks.tar.gz
[3] http://www.infsec.cs.uni-saarland.de/~mohammadi/ace/ace-handshake.txt
[4] It was discussed in the thread "Another key exchange algorithm for extending circuits: alternative to ntor?" from August 2012.