I’m interested in helping out with this, mostly because we’ll want it for Pond : https://pond.imperialviolet.org/
I’ve read the alpha-mixing paper, but not the others, so I’ll check em’ out.
Jeff
On 9 Dec 2014, at 16:40, Michael Rogers michael@briarproject.org wrote:
Signed PGP part On 25/11/14 12:45, George Kadianakis wrote:
Yes, integrating low-latency with high-latency anonymity is a very interesting probleml. Unfortunately, I haven't had any time to think about it.
For people who want to think about it there is the "Blending different latency traffic with alpha-mixing" paper. Roger mentioned that one of the big challenges of making the paper usable with Tor, is switching from the message-based approach to stream-based.
Other potential papers are "Stop-and-Go-MIX" by Kesdogan et al. and "Garbled Routing (GR): A generic framework towards unification of anonymous communication systems" by Madani et al. But I haven't looked into them at all...
Two of these papers were also mentioned in the guardian-dev thread, so I guess we're thinking along similar lines.
Alpha mixes and stop-and-go mixes are message-oriented, which as you said raises the question of how to integrate them into Tor. Judging by the abstract of the garbled routing paper (paywalled), it's a hybrid design combining message-oriented and circuit-oriented features. I think there might also be scope for circuit-oriented designs with higher latency than Tor currently provides, which might fit more easily into the Tor architecture than message-oriented or hybrid designs.
A circuit-oriented design would aim to prevent an observer from matching the circuits entering a relay with the circuits leaving the relay. In other words it would prevent traffic confirmation at each hop, and thus also end-to-end.
At least four characteristics can be used to match circuits entering and leaving a relay: start time, end time, total traffic volume and traffic timing. The design would need to provide ways to mix a circuit with other circuits with respect to each characteristic.
The current architecture allows start and end times to be mixed by pausing at each hop while building or tearing down a circuit. However, each hop of a given circuit must start earlier and end later than the next hop.
Traffic volumes can also be mixed by discarding padding at each hop, but each hop must carry at least as much traffic as the next hop (or vice versa for traffic travelling back towards the initiator). This is analogous to the problem of messages shrinking at each hop of a cypherpunk mix network, as padding is removed but not added.
There's currently no way to conceal traffic timing - each relay forwards cells as soon as it can.
Here's a crude sketch of a design that allows all four characteristics to be mixed, with fewer constraints than the current architecture. Each hop of a circuit must start earlier than the next hop, but it can end earlier or later, carry more or less traffic, and have different traffic timing.
The basic idea is that the initiator chooses a traffic pattern for each direction of each hop. The traffic pattern is described by a distribution of inter-cell delays. Each relay sends the specified traffic pattern regardless of whether it has any data to send, and regardless of what happens at other hops.
Whenever a relay forwards a data cell along a circuit, it picks a delay from the specified distribution, adds it to the current time, and writes the result on the circuit's queue. When the scheduler round-robins over circuits, it skips any circuits with future times written on them. If a circuit's time has come, the relay sends the first queued data cell if there is one; if not, it sends a single-hop padding cell.
Flow control works end-to-end in the same way as any other Tor circuit: single-hop padding cells aren't included in the circuit's flow control window.
When tearing down the circuit, the initiator tells each relay how long to continue sending the specified traffic pattern in each direction. Thus each hop may stop sending traffic before or after the next hop.
Even this crude design has multiple parameters, so its anonymity properties may not be easy to reason about. Even if we restrict traffic patterns to a single-parameter distribution such as the exponential, we also have to consider the pause time at each hop while building circuits and the 'hangover time' at each hop while tearing them down. But I think we can mine the mix literature for some ideas to apply - and probably some attacks against this first attempt at a design as well.
Cheers, Michael
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev