George Kadianakis:
Mike Perry mikeperry@torproject.org writes:
While discussing proposal 247 with George yesterday, we realized that we still get security benefit from additional ephemeral hops beyond the vanguards themselves.
Recall the high-level 247 path design is:
C - L - M - S -- S - M - L - H
Where:
C = Client
L = Long lived Layer1 guard
M = Medium lifespan Layer2 guard
S = Short lifespan Layer3 guard
H = Hidden Service (or hsdir)
E = Ephemeral Hop (used below)
I = Intro Point (used below)
R = Rend Point (used below)
Hm, that's not entirely true. Currently prop247 only changes hidden service path selection. It does _not_ change hidden service _client_ path selection as described above. This is a new thing.
For the record, my understanding all along was that it would be opt-in for a good while, on either client or service side.
From talking to Mike, he suggests that clients should use vanguards as well to defend against client guard discovery attacks from HSDirs or intro points. Also he believes that having symmetry on circuits (both sides using vanguards) is something we want.
I have mixed feelings about this.
- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them,
I agree this is worthwhile, if only to better understand the design space. However, I think we're going to find that most applications we envision can be induced into violating many of the ad-hoc mitigations we try to bake in.
before complicating path selection even more.
I feel like you're actually going to end up complicating the implementation more with this position. If we have to have separate path selection modes for service side and client side, we then have to maintain three different path selection mechanisms in Tor: normal exit, onion services, and onion clients.
If we gave the same options for both hidden services and clients, we are at least down to two systems (exit vs non-exit), with some minor options for each.
- Also, I like symmetry myself, but I wouldn't change path selection and security just for that _if I can help it_.
The benefit to just one more hop is easiest to see in the Introduction Point case, where George and I reasoned that it probably is a good idea to pick an intro point that is not the same as the Layer3 (S) set, otherwise the hidden service is effectively publishing its Layer3 guards in its descriptor, and using those same nodes to connect to rendezvous points. Clients probably also do not want the multi-visit linkability of using their layer3 vanguards to directly connect to an HS intro point. This means the intro circuit becomes:
C - L - M - S - E -- I - S - M - L - H
Similarly, in the rend case, hidden services probably do not want to expose their Layer3 (S) guards quite so easily to a client's chosen Rendezvous Point, and again, the client probably does not want to use their Layer3 (S) guards as its Rendezvous point, to avoid visit linkability. This means we again have 8 hops for rends:
C - L - M - S - R -- E - S - M - L - H
Unfortunately, this is starting to get ridiculous. While there are clear security benefits here, I think 8 hops is definitely at the point where we can forget about voice and other interactive traffic behaving reasonably. So what could we cut, if we wanted to?
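The hop counts above can be checked mechanically. The following is only a small sketch, assuming that "hops" means relays between the two endpoints (C and H are not counted) and that "--" simply marks where the two half-circuits join. The helper name relay_hops is made up for illustration and is not part of Tor.

```python
def relay_hops(path: str) -> int:
    """Count relays in a path written in this thread's notation.

    Assumption: the endpoints C and H are not relays, and "--" (the point
    where client-side and service-side circuits meet) is treated like "-".
    """
    nodes = [n.strip() for n in path.replace("--", "-").split("-")]
    nodes = [n for n in nodes if n]  # drop empty fragments, if any
    return sum(1 for n in nodes if n not in ("C", "H"))


# Paths copied from the discussion above.
base = "C - L - M - S -- S - M - L - H"
intro = "C - L - M - S - E -- I - S - M - L - H"
rend = "C - L - M - S - R -- E - S - M - L - H"

for name, path in [("base", base), ("intro", intro), ("rend", rend)]:
    print(name, relay_hops(path))
```

Under those assumptions the base 247 layout comes out at 6 relays, and both the intro and rend layouts with the extra hop come out at 8.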
Well, going back to the Prop247 threat model, we want the adversary to perform at least two attacks: a Sybil and one or more node compromise attacks. So maybe we can (in some cases, or optionally?) eliminate the M nodes from the path. Since the linkability risks may be acceptable for some applications, maybe we can also optionally allow clients or servers to omit the ephemeral hop. This basically gives us three options for path lengths. Let's consider each path type:
Hsdir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H
Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H (* IP Intersection attack!)
Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H
What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every variation of Vanguards).
I don't see any such problems with R&S though. Since R is not associated with any publicly viewable information, I don't think it is as big of a problem. At most it's a linkability risk for the client. But maybe I missed something.
Looking at these, we can see that we sacrifice the middle guards in the second option, which will come at the cost of one less compromise attack (but still the need to compromise the long-lived guard). We also lose the unlinkability in the third option, and this actually bites us in Intro 3: the hidden service L guard can perform a long-term intersection attack, watching for published intro points and matching that to the circuits that H makes to them. So that path length probably should not be used.
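To make the Intro 3 intersection attack concrete, here is a toy sketch. It assumes the L guard can log, per time period, which relays each of its clients' circuits extend to directly, and that the intro points for each period can be read from published descriptors. The function name, the client labels, and all of the data are invented for illustration.

```python
def intersect_candidates(published_intro_points, observed_next_hops):
    """Return the guard's clients that contacted every published intro
    point in every period.

    published_intro_points: list of sets, one per period (from descriptors).
    observed_next_hops: dict mapping client -> list of sets, one per period,
        of relays the client's circuits extended to directly from the guard.
    """
    candidates = set(observed_next_hops)
    for period, intro_points in enumerate(published_intro_points):
        # Keep only clients whose next hops cover all intro points this period.
        candidates = {
            c for c in candidates
            if intro_points <= observed_next_hops[c][period]
        }
    return candidates


# Hypothetical data: three periods of published intro points ...
published = [{"ipA", "ipB"}, {"ipC", "ipD"}, {"ipE", "ipF"}]

# ... and what a hypothetical L guard saw from three of its clients.
observed = {
    "client1": [{"ipA", "ipB", "r1"}, {"ipC", "ipD"}, {"ipE", "ipF", "r9"}],
    "client2": [{"ipA", "r2"}, {"ipC", "ipD", "r3"}, {"r4"}],
    "client3": [{"ipA", "ipB"}, {"r5", "r6"}, {"ipE", "ipF"}],
}

print(intersect_candidates(published, observed))
```

In this made-up data only client1 matches the descriptor in every period, so the candidate set shrinks to a single client after three periods. With an extra hop between L and the intro point, the guard would not see the intro points as direct next hops, and this simple matching would not apply.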
Hm, not sure what should be done here.
If we are actually worrying about HSDir/IP guard discovery attacks, my intuition tells me to take the most conservative approach on HSDir/IP circuits (so maybe do (1) and use vanguards and an extra ephemeral node), but leave the rendezvous circuits as they are now (so that they remain 7 hops). Maybe like this: "C - L - S - R -- S - L - H" or maybe without any vanguards at all :/
I think an extra hop on HSDir and intro circuits is not that terrible, but extra hops on rendezvous circuits _might_ make performance visibly worse.
I agree that the Rend case is the one to target with optimizations. The other two probably aren't even worth bothering with, if it saves implementation complexity.
However, I still have mixed feelings about changing client path selection as part of proposal 247:
- My main issue is that I think figuring out the right client path selection will require a _heavy_ amount of security analysis that will delay prop247 even more. I was hoping that we could treat the client-side as an orthogonal problem and tackle it in the future separately. But maybe I'm totally wrong and should be more patient and these two problems should be handled together.
I think patience is best, because if we don't understand this problem really well, we're liable to miss something. Or cement ourselves off from a potential future of interactive HS voice+video. Neither one is a great failure mode.
I think for many applications (esp the browser and ricochet), we're going to find that we need to protect the client just as much as the server.
- If the above changes only happen to HS circuits, we make it harder to make HS circuits indistinguishable from normal circuits in the face of traffic analysis. But maybe we have already lost this game.
We already lost that game until we have multihop padding. Proposal 247 already outlines how to use it in section 4.1 to help conceal vanguard usage.
It is also worth pointing out that if we fail to conceal the HS vanguard fingerprint entirely with padding, it will be especially valuable to have more than just 30k service-side instances with the vanguard fingerprint. Far better to have all the clients in that anonymity set, too, I think.
- Also, not sure how the load balancing will work here. It's one thing having 30k hidden services change their path selection, and another thing having 1 million clients change it. If we make it opt-in for clients, who is going to enable it? Probably only very few paranoid people, or maybe only Ricochet users.
The application should decide what mode it wants to be in, not the end-user, I think. It was my understanding that Prop247 specified optional behavior in section 5.4.
Plus, if you were thinking that enabling it by default only for service-side means "only 30k hidden services and not 1M clients", that's kinda wrong. Those 1M clients still have to use the services :). Adding default client-side vanguards only means that there's twice the traffic going over them.
But what about the others? Especially that Rend case? I really like the security properties of the full 8 hop paths, but it seems to me that for highly-interactive applications, we can provide the option for users to give up some of the unlinkability in exchange for that 4 hop circuit, which might actually allow for e2e hidden service voice and video to have a shot at working. Are there any risks with paths this short for that case?
Does it make sense to provide users with these different path length and latency options? I'm thinking that the service could list its preferred path length in its hsdesc, and the client could override that as it chooses (either for more or less security). Is that dangerous? We were already considering letting users choose their guard set sizes. Why not path lengths also (or instead)?
4-hop rendezvous circuits might be acceptable for some use cases. Maybe even for most use cases, I'm not really sure.
There is definitely some difference between requiring two compromise attacks and requiring a single compromise attack to deanonymize a target, but I'm not sure how big the difference is.
Also, I'm concerned about all the various linkability and intersection attacks that appear when we reduce the path length.
In general, I'm not sure how to make this security difference clear to users who enable the opt-in 4-hop rendezvous circuit feature. But maybe there is a way.
Again, I think this will be the application's choice more than the end-user's. For example, I could see Ricochet using full security until you want to make a voice call, then it spins up a new single-use low security address for just that call, sends it in-band over the high security one, and then disposes of it after the call. No long term risk, there.
We do have 3 bits or so in the top of the hidden service to communicate its security level, even :)
All in all, more thinking is required here :)
For sure.