-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Resurrecting a thread from last year...
On 11/12/14 16:05, grarpamp wrote:
> On Thu, Dec 11, 2014 at 8:26 AM, Michael Rogers <michael@briarproject.org> wrote:
>> - Which links should carry chaff?
> First you need it to cover the communicating endpoints' entry links at the edges. But if your endpoints aren't generating enough traffic to saturate the core, or even worse, if there aren't enough talking clients to multiplex each other through shared entry points densely enough, that's bad. So all links that any node has established to any other nodes seem to need to be chaffed.
Are you proposing that chaff would be sent end-to-end along circuits? (That's what "generating enough traffic to saturate the core" seems to imply.) If so, that would raise a number of problems:
1. Chaff would start and end at the same time for all hops of a given circuit.
2. Each hop of a given circuit would carry at least as much traffic away from the initiator as the next hop, and at most as much traffic towards the initiator as the next hop (where traffic = wheat + chaff in this context).
3. A delay introduced at one point in a circuit (e.g. by inducing congestion) would be visible along the rest of the circuit, potentially revealing the path taken by the circuit.
The mechanism I proposed doesn't suffer from these problems.
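To illustrate problem 1 above, here's a toy model (the relay names and timings are invented for illustration, not from the thread): with end-to-end chaff, every hop of a circuit carries traffic for exactly the circuit's lifetime, so an observer watching any two links can match them by their identical on/off times.

```python
# Toy model of problem 1: end-to-end chaff starts and ends with the
# circuit, so every hop shows the same on/off pattern to an observer.

# Hypothetical circuit: three links, chaffed for the circuit's lifetime.
circuit_hops = ["A-B", "B-C", "C-D"]
open_at, close_at = 100, 250  # seconds; identical on every hop

# What a link-level observer records: when each link went active/idle.
activity = {hop: (open_at, close_at) for hop in circuit_hops}

# Any two consecutive hops of the circuit are trivially linkable by timing.
for h1, h2 in zip(circuit_hops, circuit_hops[1:]):
    assert activity[h1] == activity[h2]
```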
>> - How much chaff should we send on each link?
> Today, all nodes have an idea that the most bw you're ever going to get out of the system anyways is up to your pipe capacity, whether you let it free run, or you set a limit... all within your personal or purchased limits. So you just decide how much you can bear, set your committed rate, and fill it up.
That tells you how much chaff to send in total, but not how much to send on each link.
>> At present, relays don't divide their bandwidth between links ahead of time - they allocate bandwidth where it's needed. The bandwidth allocated to each link could be a smoothed function of the load
> This sounds like picking some chaff ratio (even a ratio function) and scaling up the overall link bandwidth as needed to carry enough overall wheat within that. Not sure if that works to cover the 'first, entries, GPA watching there' above. Seems too user-session-driven bursty at those places, or the ratio/scale function is too slow to accept fast wheat demand. So you need a constant flow of CAR (committed access rate) to hide all styles of wheat demand in. I scrap the ethernet thinking and recall clocked ATM. You interleave your wheat in place of the base fixed-rate chaff of the link as needed. You negotiate the bw at the switch (node).
Here it sounds like you're proposing hop-by-hop chaff, not end-to-end?
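As I read it, the clocked scheme would look something like this sketch (the tick granularity, function names, and cell labels are my own assumptions, not from the thread): the link emits exactly one cell per tick, substituting a queued wheat cell for chaff whenever one is available, so the observed cell rate never varies with wheat demand.

```python
# Sketch of a clocked, constant-rate link: one cell leaves every tick,
# wheat taking the slot when queued, chaff otherwise. The wire rate is
# constant regardless of demand.

from collections import deque

def run_link(wheat_arrivals, ticks):
    """wheat_arrivals: dict mapping tick -> list of wheat cells arriving then."""
    queue = deque()
    sent = []
    for t in range(ticks):
        queue.extend(wheat_arrivals.get(t, []))
        # Exactly one cell per tick, always: wheat if available, else chaff.
        sent.append(queue.popleft() if queue else "chaff")
    return sent

# Bursty wheat demand at tick 2...
sent = run_link({2: ["w1", "w2", "w3"]}, ticks=6)
# ...but the wire always shows one cell per tick:
# ['chaff', 'chaff', 'w1', 'w2', 'w3', 'chaff']
```

The point of the sketch is that an observer counting cells on the wire learns nothing from the rate; only the (encrypted) cell contents distinguish wheat from chaff.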
>> but then we need to make sure that instantaneous changes in the function's output don't leak any information about instantaneous changes in its input.
> This is the point of filling the links fulltime, you don't see any such ripples. (Maybe instantaneous pressure gets translated into a new domain of some nominal random masking jitter below. Which may still be a bit ethernet-ish.)
A relay can't send chaff to every other relay, so you can't fill all the links fulltime. The amount of wheat+chaff on each link must change in response to the amount of wheat.
The question is, do those changes only occur when circuits are opened and closed - in which case the endpoint must ask each relay to allocate some bandwidth for the lifetime of the circuit, as in my proposal - or do they occur in response to changes in the amount of wheat? In the latter case we'd need to find a function that allocates bandwidth to each link in response to the amount of wheat, without leaking too much information about the amount of wheat.
That isn't trivial.
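To make the difficulty concrete, here's one candidate for the second option (purely illustrative - the smoothing constant, chaff floor, and function shape are my assumptions): let each link's wheat+chaff allocation follow an exponential moving average of its wheat rate, with a minimum chaff floor. It shows exactly the leak in question: a wheat burst still appears in the allocation, just smoothed.

```python
# Illustrative smoothed allocation function: link bandwidth tracks an
# exponential moving average of the wheat rate, never dropping below a
# chaff floor. Smoothing delays the leak but does not eliminate it.

def smoothed_allocation(wheat_rates, alpha=0.1, floor=10.0):
    """Return per-interval allocated bandwidth (wheat+chaff) for one link."""
    alloc = floor
    out = []
    for w in wheat_rates:
        # Move a fraction alpha of the way toward the current wheat rate.
        alloc = max(floor, (1 - alpha) * alloc + alpha * w)
        out.append(alloc)
    return out

# Idle link, then a sudden wheat burst of 100 units/interval:
allocs = smoothed_allocation([0] * 5 + [100] * 5)
# allocs[4] == 10.0 (flat at the floor), allocs[5] == 19.0 (the burst
# leaks into the allocation immediately, just attenuated).
```

Any observer correlating allocation changes across links could still trace the burst; making that correlation useless is what isn't trivial.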
> What needs work is the bw negotiation protocol between nodes. You've set the CAR on your box, now how to divide it among established links? Does it reference other CARs in the consensus, does it accept subtractive link requests until full, does it span just one hop or the full path, is it part of each node's first-hop link extension relative to itself as a circuit builds its way through, are there PVCs, SVCs, then the recalc as nodes and paths come and go, how far in does the wheat/chaff recognition go, etc? Do you need to drop any node that doesn't keep up RX/TX at the negotiated rate as then doing something nefarious?
It seems like there are a lot of unanswered questions here. I'm not saying the idea I proposed is perfect, but it does avoid all these questions.
>> - Is it acceptable to add any delay at all to low-latency traffic? My assumption is no - people already complain about Tor being slow for interactive protocols.
> No fixed delays, because yes it's annoyingly additive, and you probably can't clock packets onto the wire from userland Tor precisely enough for [1]. Recall the model... a serial bucket stream. Random jitter within an upper bound is different from fixed delay. Good question. [1: There was something I was trying to mask or prevent with jitter, I'll try to remember.]
If I understand right, you were using jitter to disguise the fine-grained timing of packets, so that packets arriving at a relay couldn't easily be matched with packets leaving it. Packet departures would be triggered by a clock rather than by packet arrivals.
My suggestion was similar in that respect: packet departure times would be drawn from a random distribution.
Either way, decoupling the arrival and departure times necessarily involves delaying packets, which isn't acceptable for Tor's existing low-latency use cases. So we're looking at two classes of traffic here: one with lower latency, and the other with better protection against traffic confirmation.
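A minimal sketch of that decoupling (the uniform distribution and delay bound are my assumptions, chosen just to make the point): each cell's departure time is its arrival time plus a random draw, so departure timing no longer mirrors arrival timing - and every cell pays added latency, which is why this can't apply to the low-latency class.

```python
# Sketch: decouple departure times from arrival times by delaying each
# cell by a random draw. Fine-grained arrival timing is destroyed at the
# cost of unavoidable added latency.

import random

random.seed(1)  # deterministic for the example

def departures(arrival_times, max_delay=50.0):
    """Delay each cell by a uniform random draw, then emit in time order."""
    return sorted(t + random.uniform(0.0, max_delay) for t in arrival_times)

# Tightly clustered arrivals...
arrivals = [0.0, 1.0, 2.0, 3.0]
deps = departures(arrivals)

# ...leave with their spacing randomized, and always later than they
# arrived (the latency cost that rules out the low-latency class).
assert all(d >= a for d, a in zip(deps, arrivals))
```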
So we'd still need to mark certain circuits as high-latency, and only those circuits would benefit from chaff.
> High latency feels more like a class of service... not tied to selective chaffing on/off. I'll try to reread these parts from you.
Thanks, class of service is a better way to put it.
Cheers,
Michael