George Kadianakis:
Mike Perry mikeperry@torproject.org writes:
Mike Perry:
teor:
On 25 Apr 2018, at 18:30, Mike Perry mikeperry@torproject.org wrote:
1. Hidden service use can't push you over to an unused guard (at all).
2. Hidden service use can't influence your choice of guard (at all).
3. Exits and websites can't push you over to an unused guard (at all).
4. DoS/Guard node downtime signals are rare (absent)
5. Nodes are not reused for Guard and Exit positions ("any" positions)
6. Information about the guard(s) does not leak to the website/RP (at all).
7. Relays in the same family can't be forced to correlate Exit traffic.
I think this list is missing some important user-visible properties, or it's not clear which property above corresponds to these properties:
- Is Tor reliable and responsive when guards go down, or when I move networks, or when I have lost and regained service?
I think this is implicitly provided by #4. Downtime is a security issue. If any of a client's guards is down, and the adversary can detect this from client behavior, that is a side-channel signal that provides information about the guard. So by satisfying #4, we also satisfy the weaker conditions of general reliability and responsiveness.
I also think it's missing an implicit property, which we should make explicit:
- Can Tor users be fingerprinted by their set of guards or directory guards?
Perhaps this property is out of scope.
I think it is worth considering. We should add it if we need to do another round of evaluation.
Alright, for the sake of argument, let's call this Property #8:
8. Less information from guard fingerprinting (the least information)
I argue that this #8 is also equivalent to a #9 that Roger would ask for:
9. Fewer points of observation into the network (the fewest points).
If we are actually aiming for #8 and #9, we need to do something about the numdirguard=3 situation; otherwise we still have a huge guard fingerprint and we still expose ourselves to more of the network even if we keep one guard.
Yeah. Hrmm. I suppose this is a way that property #8 differs from property #9... The dirguard usage increases fingerprinting, but if observation for #9 means "observation of relayed application traffic", then not setting the dirguards to 1 costs us #8, but not #9.
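One rough way to see why the number of directory guards matters for #8 is to count how many distinct guard sets a client could end up with. The sketch below is an illustration under loud assumptions: it treats every set as equally likely (real guard selection is bandwidth-weighted, so this is only an upper bound), and both the function name and the network size of 2000 guards are hypothetical.

```python
from math import comb, log2


def guard_set_bits(guards_in_network: int, guards_per_client: int) -> float:
    """Upper bound (in bits) on how much a client's guard set can reveal.

    Assumes every set of this size is equally likely, which overstates
    the real figure because Tor weights guard selection by bandwidth.
    """
    return log2(comb(guards_in_network, guards_per_client))


if __name__ == "__main__":
    N = 2000  # hypothetical number of Guard-flagged relays
    for k in (1, 2, 3):
        print(f"{k} guard(s): at most {guard_set_bits(N, k):.1f} bits")
```

Under these assumptions, going from one guard to three roughly triples the number of bits an observer can learn from the set, which is the sense in which keeping numdirguard=3 leaves a large fingerprint even with a single primary guard.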
To avoid TL;DR, that argument is an exercise to the reader ;).
Here is a proposal that beats my previous proposal on Property #8 and #9, while trying to preserve as many of the other properties as possible:
- Set "num primary guards"=1 and "num primary guards to use"=1
- Set "num directory guards"=1 and "num directory guards to use"=1
- Don't give Exit nodes the Guard flag.
- Allow "same node, same /16, same family" between guard and last hop, but only for HS circuits (which are at least 4 hops).
- Allow same /16 and same family for HS circuits.
Is this for all hops? So all service-side HS circ hops can share the same family? I guess that's OK since we don't know what's happening on the other side of the HS circuit anyhow? Or what?
Yeah, that was my reasoning for defining property #7 in terms of Exit traffic only. There may be alterations of this that prevent the same family from being in every position of one end of the circuit, but since we can't prevent the case where the same family is on both entry points across the entire HS connection to correlate the entire circuit, I am not sure how to define this property.
Maybe there is a difference if the same family is allowed to be the IP and HSDIR, though, since that could allow forced correlation to deanonymize the HS itself... We could consider preventing that. With one guard, it definitely will leak information about the choice of IPs over time, though, which is worse (and is the case today :/). With two guards chosen from different families and /16s, it should be fine with respect to chosen IPs and used HSDIRs, except in the event that one guard's downtime happens at the same time as an IP or HSDIR is chosen from the same family as the still-up guard. This is a much rarer and less risky event than the similar situation with an RP, though (since the RP cycles frequently and can be adversary controlled).
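The "same node, same /16, same family" restriction and the HS-only relaxation from the proposal can be sketched as below. This is a minimal illustration, not Tor's implementation: the `Relay` type, its fields, and every function name are hypothetical, only IPv4 is handled, and family membership is assumed to require mutual declaration.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relay:
    # Hypothetical minimal relay descriptor for this sketch.
    nickname: str
    ipv4: str
    family: frozenset = field(default_factory=frozenset)  # declared family nicknames


def same_slash16(a: Relay, b: Relay) -> bool:
    """True if both relays sit in the same IPv4 /16."""
    net_a = ipaddress.ip_network(f"{a.ipv4}/16", strict=False)
    return ipaddress.ip_address(b.ipv4) in net_a


def same_family(a: Relay, b: Relay) -> bool:
    """True only if each relay lists the other (assumed mutual declaration)."""
    return a.nickname in b.family and b.nickname in a.family


def conflicts(a: Relay, b: Relay) -> bool:
    """Same node, same /16, or same family."""
    return a.nickname == b.nickname or same_slash16(a, b) or same_family(a, b)


def may_use_as_last_hop(guard: Relay, candidate: Relay, is_hs_circuit: bool) -> bool:
    """Proposal sketch: the restriction is waived for HS circuits only."""
    return is_hs_circuit or not conflicts(guard, candidate)
```

With two guards, the same `conflicts` check could be used at guard-selection time to ensure the pair comes from different families and /16s.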
- When a primary guard leaves the consensus, pick a new one.
- When a primary guard fails circuits, do $MAGIC_FAILURE_HEURISTIC.
What is the $MAGIC_FAILURE_HEURISTIC supposed to do? Also, I doubt we can do anything magic here; we even have trouble doing very naive stuff when it comes to network-uptime response.
In order to preserve property #8 (and #9), this failure heuristic has to try really hard not to quickly switch over to the second guard as soon as there is a RESOURCELIMIT or other failure. It needs to be "sure" that the guard is really down. This means waiting for some number of RESOURCELIMITs or other failures to happen before the switch to the second guard, which necessarily introduces some level of downtime signal, which costs us property #4. (We already have decided in https://trac.torproject.org/projects/tor/ticket/25347 that it is preferable to accept large amounts of RESOURCELIMITs before switching guards.)
That was the point of this proposal -- I wanted to demonstrate that with only one guard, we basically have to accept either a louder downtime signal, or we have to accept cases where we use two guards more often.
I still believe that two always-on guards is the better choice (and gives us more flexibility with alternate ways to handle things like family restrictions above), but I also wanted to compare apples to apples in terms of one guard vs two guard proposals.
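The one-guard trade-off described above can be made concrete with a toy failure counter. This is only a sketch of one possible $MAGIC_FAILURE_HEURISTIC, not anything Tor implements: the class, the `threshold` parameter, and the idea of counting consecutive failed circuit attempts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GuardFailureTracker:
    # Hypothetical: consecutive failures tolerated before switching guards.
    threshold: int
    consecutive_failures: int = 0
    using_backup: bool = False

    def record_success(self) -> None:
        self.consecutive_failures = 0

    def record_failure(self) -> bool:
        """Count a failed circuit; return True once we switch to the second guard."""
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold:
            self.using_backup = True
        return self.using_backup


def simulate_outage(outage_attempts: int, threshold: int):
    """Return (failed circuits the user sees, whether a second guard was exposed)."""
    tracker = GuardFailureTracker(threshold)
    failed = 0
    for _ in range(outage_attempts):
        failed += 1
        if tracker.record_failure():
            break
    return failed, tracker.using_backup
```

A high threshold keeps the client on one guard through short outages (good for #8 and #9) but makes every outage visible as failed circuits (bad for #4); a low threshold does the reverse. For example, `simulate_outage(5, 20)` rides out the outage on one guard with five visible failures, while `simulate_outage(5, 1)` switches after a single failure.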
This proposal gets strong:
1. Hidden service use can't push you over to an unused guard (at all).
2. Hidden service use can't influence your choice of guard (at all).
3. Exits and websites can't push you over to an unused guard (at all).
8. Less information from guard fingerprinting (the least information)
It loses #4 (and your reliability point above), because if we transition to a second guard too quickly when the first one starts failing, then we lose the winning fingerprinting property we want to keep. So we must tolerate failure and RESOURCELIMIT issues and suffer through connectivity issues during DoS:
4. DoS/Guard node downtime signals are rare (absent)
It then gets us regular:
5. Nodes are not reused for Guard and Exit positions ("any" positions)
6. Information about the guard(s) does not leak to the website/RP (at all).
7. Relays in the same family can't be forced to correlate Exit traffic.
And again, we could get strong #6 if we allow the guard node for both RP and the node before the RP:
6. Information about the guard(s) does not leak to the website/RP (at all).
So the key thing (in this property list) that forcing one guard causes us to lose is reliability under DoS, which is a guard discovery vector (and probably a source of other side channels, too).