While discussing proposal 247 with George yesterday, we realized that we still get security benefit from additional ephemeral hops beyond the vanguards themselves.
Recall the high-level 247 path design is:
C - L - M - S -- S - M - L - H
Where:
C = Client
L = Long lived Layer1 guard
M = Medium lifespan Layer2 guard
S = Short lifespan Layer3 guard
H = Hidden Service (or hsdir)
E = Ephemeral Hop (used below)
I = Intro Point (used below)
R = Rend Point (used below)
The benefit to just one more hop is easiest to see in the Introduction Point case, where George and I reasoned that it probably is a good idea to pick an intro point that is not the same as the Layer3 (S) set, otherwise the hidden service is effectively publishing its Layer3 guards in its descriptor, and using those same nodes to connect to rendezvous points. Clients probably also do not want the multi-visit linkability of using their layer3 vanguards to directly connect to an HS intro point. This means the intro circuit becomes:
C - L - M - S - E -- I - S - M - L - H
Similarly, in the rend case, hidden services probably do not want to expose their Layer3 (S) guards quite so easily to a client's chosen Rendezvous Point, and again, the client probably does not want to use their Layer3 (S) guards as its Rendezvous point, to avoid visit linkability. This means we again have 8 hops for rends:
C - L - M - S - R -- E - S - M - L - H
Unfortunately, this is starting to get ridiculous. While there are clear security benefits here, I think 8 hops is definitely at the point where we can forget about voice and other interactive traffic behaving reasonably. So what could we cut, if we wanted to?
Well, going back to the Prop247 threat model, we want the adversary to have to perform at least two attacks: a Sybil and one or more node compromise attacks. So maybe we can (in some cases, or optionally?) eliminate the M nodes from the path. Since the linkability risks may be acceptable for some applications, maybe we can also optionally allow clients or servers to omit the ephemeral hop. This basically gives us three options for path lengths. Let's consider each path type:
Hsdir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H
Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H (* IP Intersection attack!)
Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H
Looking at these, we can see that we sacrifice the middle guards in the second option, which will come at the cost of one less compromise attack (but still the need to compromise the long-lived guard). We also lose the unlinkability in the third option, and this actually bites us in Intro 3: the hidden service L guard can perform a long-term intersection attack, watching for published intro points and matching that to the circuits that H makes to them. So that path length probably should not be used.
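To make the length tradeoff concrete, here is a minimal sketch that counts the relays in each Intro and Rend option listed above. The helper name `relay_hops` and the counting convention (every node strictly between C and H counts as one hop, with I&S and R&S counted once) are assumptions made for illustration, not anything taken from proposal 247.

```python
# Hypothetical helper: count relays strictly between the C and H endpoints.
def relay_hops(path: str) -> int:
    # Treat the "--" join between the two halves like any other separator.
    nodes = [n.strip() for n in path.replace("--", "-").split("-")]
    nodes = [n for n in nodes if n]
    return len(nodes) - 2  # drop the C and H endpoints

# The three options per circuit type, copied from the lists above.
OPTIONS = {
    "Intro": [
        "C - L - M - S - E -- I - S - M - L - H",
        "C - L - S - E -- I - S - L - H",
        "C - L - S -- I&S - L - H",
    ],
    "Rend": [
        "C - L - M - S - R -- E - S - M - L - H",
        "C - L - S - R -- E - S - L - H",
        "C - L - R&S -- S - L - H",
    ],
}

hop_counts = {
    kind: [relay_hops(p) for p in paths] for kind, paths in OPTIONS.items()
}

if __name__ == "__main__":
    for kind, counts in hop_counts.items():
        for option, count in enumerate(counts, start=1):
            print(f"{kind} option {option}: {count} hops")
```

Under that convention, option 1 gives the 8-hop paths and option 3 gives the 4-hop paths mentioned in this thread.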
But what about the others? Especially that Rend case? I really like the security properties of the full 8 hop paths, but it seems to me that for highly-interactive applications, we can provide the option for users to give up some of the unlinkability in exchange for that 4 hop circuit, which might actually allow for e2e hidden service voice and video to have a shot at working. Are there any risks with paths this short for that case?
Does it make sense to provide users with these different path length and latency options? I'm thinking that the service could list its preferred path length in its hsdesc, and the client could override that as it chooses (either for more or less security). Is that dangerous? We were already considering letting users choose their guard set sizes. Why not path lengths also (or instead)?
Mike Perry mikeperry@torproject.org writes:
While discussing proposal 247 with George yesterday, we realized that we still get security benefit from additional ephemeral hops beyond the vanguards themselves.
Recall the high-level 247 path design is:
C - L - M - S -- S - M - L - H
Where:
C = Client
L = Long lived Layer1 guard
M = Medium lifespan Layer2 guard
S = Short lifespan Layer3 guard
H = Hidden Service (or hsdir)
E = Ephemeral Hop (used below)
I = Intro Point (used below)
R = Rend Point (used below)
Hm, that's not entirely true. Currently prop247 only changes hidden service path selection. It does _not_ change hidden service _client_ path selection as described above. This is a new thing.
From talking to Mike, he suggests that clients should use vanguards as well to defend against client guard discovery attacks from HSDirs or intro points. Also he believes that having symmetry on circuits (both sides using vanguards) is something we want.
I have mixed feelings about this.
- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them, before complicating path selection even more.
- Also, I like symmetry myself, but I wouldn't change path selection and security just for that _if I can help it_.
The benefit to just one more hop is easiest to see in the Introduction Point case, where George and I reasoned that it probably is a good idea to pick an intro point that is not the same as the Layer3 (S) set, otherwise the hidden service is effectively publishing its Layer3 guards in its descriptor, and using those same nodes to connect to rendezvous points. Clients probably also do not want the multi-visit linkability of using their layer3 vanguards to directly connect to an HS intro point. This means the intro circuit becomes:
C - L - M - S - E -- I - S - M - L - H
Similarly, in the rend case, hidden services probably do not want to expose their Layer3 (S) guards quite so easily to a client's chosen Rendezvous Point, and again, the client probably does not want to use their Layer3 (S) guards as its Rendezvous point, to avoid visit linkability. This means we again have 8 hops for rends:
C - L - M - S - R -- E - S - M - L - H
Unfortunately, this is starting to get ridiculous. While there are clear security benefits here, I think 8 hops is definitely at the point where we can forget about voice and other interactive traffic behaving reasonably. So what could we cut, if we wanted to?
Well, going back to the Prop247 threat model, we want the adversary to have to perform at least two attacks: a Sybil and one or more node compromise attacks. So maybe we can (in some cases, or optionally?) eliminate the M nodes from the path. Since the linkability risks may be acceptable for some applications, maybe we can also optionally allow clients or servers to omit the ephemeral hop. This basically gives us three options for path lengths. Let's consider each path type:
Hsdir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H
Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H (* IP Intersection attack!)
Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H
What is R&S here? Clients use static short-lifespan rendezvous points?
Looking at these, we can see that we sacrifice the middle guards in the second option, which will come at the cost of one less compromise attack (but still the need to compromise the long-lived guard). We also lose the unlinkability in the third option, and this actually bites us in Intro 3: the hidden service L guard can perform a long-term intersection attack, watching for published intro points and matching that to the circuits that H makes to them. So that path length probably should not be used.
Hm, not sure what should be done here.
If we are actually worrying about HSDir/IP guard discovery attacks, my intuition tells me to take the most conservative approach on HSDir/IP circuits (so maybe do (1) and use vanguards and an extra ephemeral node), but leave the rendezvous circuits as they are now (so that they remain 7 hops). Maybe like this: "C - L - S - R -- S - L - H" or maybe without any vanguards at all :/
I think an extra hop on HSDir and intro circuits is not that terrible, but extra hops on rendezvous circuits _might_ make performance visibly worse.
However, I still have mixed feelings about changing client path selection as part of proposal 247:
- My main issue is that I think figuring out the right client path selection will require a _heavy_ amount of security analysis that will delay prop247 even more. I was hoping that we could treat the client-side as an orthogonal problem and tackle it in the future separately. But maybe I'm totally wrong and should be more patient and these two problems should be handled together.
- If the above changes only happen to HS circuits, we make it harder to make HS circuits indistinguishable from normal circuits on the face of traffic analysis. But maybe we have already lost this game.
- Also, not sure how the load balancing will work here. It's one thing having 30k hidden services change their path selection, and another thing having 1 million clients change it. If we make it opt-in for clients, who is going to enable it? Probably only very few paranoid people, or maybe only Ricochet users.
But what about the others? Especially that Rend case? I really like the security properties of the full 8 hop paths, but it seems to me that for highly-interactive applications, we can provide the option for users to give up some of the unlinkability in exchange for that 4 hop circuit, which might actually allow for e2e hidden service voice and video to have a shot at working. Are there any risks with paths this short for that case?
Does it make sense to provide users with these different path length and latency options? I'm thinking that the service could list its preferred path length in its hsdesc, and the client could override that as it chooses (either for more or less security). Is that dangerous? We were already considering letting users choose their guard set sizes. Why not path lengths also (or instead)?
4-hop rendezvous circuits might be acceptable for some use cases. Maybe even for most use cases, I'm not really sure.
There is definitely some difference between requiring two compromise attacks and requiring a single compromise attack to deanonymize a target, but I'm not sure how big the difference is.
Also, I'm concerned about all the various linkability and intersection attacks that appear when we reduce the path length.
In general, I'm not sure how to make this security difference clear to users who enable the opt-in 4-hop rendezvous circuit feature. But maybe there is a way.
All in all, more thinking is required here :)
George Kadianakis:
Mike Perry mikeperry@torproject.org writes:
While discussing proposal 247 with George yesterday, we realized that we still get security benefit from additional ephemeral hops beyond the vanguards themselves.
Recall the high-level 247 path design is:
C - L - M - S -- S - M - L - H
Where:
C = Client
L = Long lived Layer1 guard
M = Medium lifespan Layer2 guard
S = Short lifespan Layer3 guard
H = Hidden Service (or hsdir)
E = Ephemeral Hop (used below)
I = Intro Point (used below)
R = Rend Point (used below)
Hm, that's not entirely true. Currently prop247 only changes hidden service path selection. It does _not_ change hidden service _client_ path selection as described above. This is a new thing.
For the record, my understanding all along was that it would be opt-in for a good while, on either client or service side.
From talking to Mike, he suggests that clients should use vanguards as well to defend against client guard discovery attacks from HSDirs or intro points. Also he believes that having symmetry on circuits (both sides using vanguards) is something we want.
I have mixed feelings about this.
- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them,
I agree this is worthwhile, if only to better understand the design space. However, I think we're going to find that most applications we envision can be induced into violating many of the ad-hoc mitigations we try to bake in.
before complicating path selection even more.
I feel like you're actually going to end up complicating the implementation more with this position. If we have to have separate path selection modes for service side and client side, we then have to maintain three different path selection mechanisms in Tor: normal exit, onion services, and onion clients.
If we gave the same options for both hidden services and clients, we are at least down to two systems (exit vs non-exit), with some minor options for each.
- Also, I like symmetry myself, but I wouldn't change path selection and security just for that _if I can help it_.
The benefit to just one more hop is easiest to see in the Introduction Point case, where George and I reasoned that it probably is a good idea to pick an intro point that is not the same as the Layer3 (S) set, otherwise the hidden service is effectively publishing its Layer3 guards in its descriptor, and using those same nodes to connect to rendezvous points. Clients probably also do not want the multi-visit linkability of using their layer3 vanguards to directly connect to an HS intro point. This means the intro circuit becomes:
C - L - M - S - E -- I - S - M - L - H
Similarly, in the rend case, hidden services probably do not want to expose their Layer3 (S) guards quite so easily to a client's chosen Rendezvous Point, and again, the client probably does not want to use their Layer3 (S) guards as its Rendezvous point, to avoid visit linkability. This means we again have 8 hops for rends:
C - L - M - S - R -- E - S - M - L - H
Unfortunately, this is starting to get ridiculous. While there are clear security benefits here, I think 8 hops is definitely at the point where we can forget about voice and other interactive traffic behaving reasonably. So what could we cut, if we wanted to?
Well, going back to the Prop247 threat model, we want the adversary to have to perform at least two attacks: a Sybil and one or more node compromise attacks. So maybe we can (in some cases, or optionally?) eliminate the M nodes from the path. Since the linkability risks may be acceptable for some applications, maybe we can also optionally allow clients or servers to omit the ephemeral hop. This basically gives us three options for path lengths. Let's consider each path type:
Hsdir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H
Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H (* IP Intersection attack!)
Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H
What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every variation of Vanguards).
I don't see any such problems with R&S, though. Since R is not associated with any publicly viewable information, I don't think it is as big of a problem. At most it's a linkability risk for the client. But maybe I missed something.
Looking at these, we can see that we sacrifice the middle guards in the second option, which will come at the cost of one less compromise attack (but still the need to compromise the long-lived guard). We also lose the unlinkability in the third option, and this actually bites us in Intro 3: the hidden service L guard can perform a long-term intersection attack, watching for published intro points and matching that to the circuits that H makes to them. So that path length probably should not be used.
Hm, not sure what should be done here.
If we are actually worrying about HSDir/IP guard discovery attacks, my intuition tells me to take the most conservative approach on HSDir/IP circuits (so maybe do (1) and use vanguards and an extra ephemeral node), but leave the rendezvous circuits as they are now (so that they remain 7 hops). Maybe like this: "C - L - S - R -- S - L - H" or maybe without any vanguards at all :/
I think an extra hop on HSDir and intro circuits is not that terrible, but extra hops on rendezvous circuits _might_ make performance visibly worse.
I agree that the Rend case is the one to target with optimizations. The other two probably aren't even worth bothering with, if it saves implementation complexity.
However, I still have mixed feelings about changing client path selection as part of proposal 247:
- My main issue is that I think figuring out the right client path selection will require a _heavy_ amount of security analysis that will delay prop247 even more. I was hoping that we could treat the client-side as an orthogonal problem and tackle it in the future separately. But maybe I'm totally wrong and should be more patient and these two problems should be handled together.
I think patience is best, because if we don't understand this problem really well, we're liable to miss something. Or cement ourselves off from a potential future of interactive HS voice+video. Neither one is a great failure mode.
I think for many applications (esp the browser and ricochet), we're going to find that we need to protect the client just as much as the server.
- If the above changes only happen to HS circuits, we make it harder to make HS circuits indistinguishable from normal circuits on the face of traffic analysis. But maybe we have already lost this game.
We already lost that game until we have multihop padding. Proposal 247 already outlines how to use it in section 4.1 to help conceal vanguard usage.
It is also worth pointing out that if we fail to conceal the HS vanguard fingerprint entirely with padding, it will be especially valuable to have more than just 30k service-side instances with the vanguard fingerprint. Far better to have all the clients in that anonymity set, too, I think.
- Also, not sure how the load balancing will work here. It's one thing having 30k hidden services change their path selection, and another thing having 1 million clients change it. If we make it opt-in for clients, who is going to enable it? Probably only very few paranoid people, or maybe only Ricochet users.
The application should decide what mode it wants to be in, not the end-user, I think. It was my understanding that Prop247 specified optional behavior in section 5.4.
Plus, if you were thinking that enabling it by default only for service-side means "only 30k hidden services and not 1M clients", that's kinda wrong. Those 1M clients still have to use the services :). Adding default client-side vanguards only means that there's twice the traffic going over them.
But what about the others? Especially that Rend case? I really like the security properties of the full 8 hop paths, but it seems to me that for highly-interactive applications, we can provide the option for users to give up some of the unlinkability in exchange for that 4 hop circuit, which might actually allow for e2e hidden service voice and video to have a shot at working. Are there any risks with paths this short for that case?
Does it make sense to provide users with these different path length and latency options? I'm thinking that the service could list its preferred path length in its hsdesc, and the client could override that as it chooses (either for more or less security). Is that dangerous? We were already considering letting users choose their guard set sizes. Why not path lengths also (or instead)?
4-hop rendezvous circuits might be acceptable for some use cases. Maybe even for most use cases, I'm not really sure.
There is definitely some difference between requiring two compromise attacks and requiring a single compromise attack to deanonymize a target, but I'm not sure how big the difference is.
Also, I'm concerned about all the various linkability and intersection attacks that appear when we reduce the path length.
In general, I'm not sure how to make this security difference clear to users who enable the opt-in 4-hop rendezvous circuit feature. But maybe there is a way.
Again, I think this will be the application's choice more than the end-user's. For example, I could see Ricochet using full security until you want to make a voice call, then it spins up a new single-use low security address for just that call, sends it in-band over the high security one, and then disposes of it after the call. No long term risk, there.
We do have 3 bits or so in the top of the hidden service to communicate its security level, even :)
All in all, more thinking is required here :)
For sure.
Mike Perry mikeperry@torproject.org writes:
George Kadianakis:
Mike Perry <mikeperry at torproject.org> writes:
<snip>
I have mixed feelings about this.
- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them,
I agree this is worthwhile, if only to better understand the design space. However, I think we're going to find that most applications we envision can be induced into violating many of the ad-hoc mitigations we try to bake in.
OK. Let's see. I feel that these guard discovery attacks can be blocked with:
a) If an IP listed on an HS descriptor tells you that it doesn't know the HS, then ignore it for this hidden service today.
b) If an HSDir that should have an HS descriptor tells you that it doesn't have it, then don't ask it again this hour.
I think we do both checks right now in the Tor codebase and we also have caches so that we don't retry the same nodes. If we are serious, we could even write those caches on disk.
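As a rough illustration of checks (a) and (b), here is a minimal sketch of a failure cache with an expiry time. This is not how the Tor codebase implements its caches. The class name `FailureCache`, the injectable clock, and the choice to read "today" as 24 hours and "this hour" as 60 minutes from the failure are all assumptions made for the example.

```python
import time

DAY = 24 * 60 * 60   # assumed reading of "today" in check (a)
HOUR = 60 * 60       # assumed reading of "this hour" in check (b)


class FailureCache:
    """Remember (service, relay) pairs that failed, for a limited time."""

    def __init__(self, ttl_seconds, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry can be tested
        self._failed = {}            # (service, relay) -> time of failure

    def note_failure(self, service, relay):
        self._failed[(service, relay)] = self._clock()

    def should_skip(self, service, relay):
        failed_at = self._failed.get((service, relay))
        if failed_at is None:
            return False
        if self._clock() - failed_at >= self._ttl:
            # The entry has expired, so the relay may be tried again.
            del self._failed[(service, relay)]
            return False
        return True


# (a) intro points that claimed not to know the service
intro_failures = FailureCache(DAY)
# (b) HSDirs that claimed not to have the descriptor
hsdir_failures = FailureCache(HOUR)
```

Note that the entries are keyed per service, so a failure for one onion address does not blacklist the relay for others. Persisting the dictionary to disk would be the "if we are serious" variant mentioned above.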
I feel that if an application restarts Tor or flushes those caches because a hidden service does not work, then the application is doing it wrong.
Also even with client vanguards I think the checks above will still have to be implemented. I could imagine an application that flushes all the DataDirectory if the hidden service stops working, and then even vanguards won't save them.
In general, I'm not sure how much sanity we can assume from third-party applications.
before complicating path selection even more.
I feel like you're actually going to end up complicating the implementation more with this position. If we have to have separate path selection modes for service side and client side, we then have to maintain three different path selection mechanisms in Tor: normal exit, onion services, and onion clients.
If we gave the same options for both hidden services and clients, we are at least down to two systems (exit vs non-exit), with some minor options for each.
Hmmm maybe. But onion clients would look very much like normal exit, but they would connect to RPs/IPs, instead of exits. Just like the code is now.
Also, with vanguards if we end up doing something like:
HSDir: C - L - S - E - HSDir
IP: C - L - S - E - IP
Rend: C - L - M - RP -- S - M - L - HS
we have three different path types here. We would need to write very beautiful interfaces if we want this to be done by the same code.
- Also, I like symmetry myself, but I wouldn't change path selection and security just for that _if I can help it_.
<snip>
Hsdir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H
Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H (* IP Intersection attack!)
Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H
What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every variation of Vanguards).
I don't see any such problems with R&S, though. Since R is not associated with any publicly viewable information, I don't think it is as big of a problem. At most it's a linkability risk for the client. But maybe I missed something.
Hmm, the only problem I can see here is that the R&S can link clients based on the L node. So for example, in the crazy edge case where only one client connects to hidden services through R&S over L, then R&S could count "Ah this client has done 42 rendezvous through me in the past 5 hours". And if that's a ricochet client with 42 contacts maybe it's a selector. But I think this is a pretty far-fetched example...
Another _big_ gotcha here is that let's say we end up doing:
HSDir: C - L - M - S - E - HSDir
IP: C - L - M - S - E - IP
Rend: C - L - S - RP -- S - M - L - HS
and all the 'S' nodes are taken from the same pool, then the 'L' node will be able to learn 'M' by looking at the IP circuits, and learn 'S' by looking at the rend circuit. So it will basically be able to derive the full circuit.
We need to be very careful about which paths we pick, and which "guardsets" we get the nodes from.
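To spell out that gotcha, here is a small sketch that lists which layer the client's L node sees as its direct next hop on each circuit type in the example above. The helper `next_hop_after` and the dictionary layout are hypothetical. The sketch only looks at the client half of each path, and it assumes, as in the example, that all 'S' nodes come from one shared pool.

```python
# Hypothetical helper: which node directly follows `node` on this path?
def next_hop_after(path: str, node: str) -> str:
    hops = [h.strip() for h in path.split("-")]
    return hops[hops.index(node) + 1]

# Client halves of the three circuit types from the example above.
CLIENT_PATHS = {
    "HSDir": "C - L - M - S - E - HSDir",
    "IP": "C - L - M - S - E - IP",
    "Rend": "C - L - S - RP",
}

# Layers that L connects to directly, per circuit type.
seen_by_L = {kind: next_hop_after(p, "L") for kind, p in CLIENT_PATHS.items()}
layers_exposed_to_L = set(seen_by_L.values())

if __name__ == "__main__":
    print(seen_by_L)
    print(sorted(layers_exposed_to_L))
```

With mixed path shapes, L is directly adjacent to both the M and the S layer, which is the situation the paragraph above warns about. If every circuit type used the same shape, the set would contain a single layer.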
Looking at these, we can see that we sacrifice the middle guards in the second option, which will come at the cost of one less compromise attack (but still the need to compromise the long-lived guard). We also lose the unlinkability in the third option, and this actually bites us in Intro 3: the hidden service L guard can perform a long-term intersection attack, watching for published intro points and matching that to the circuits that H makes to them. So that path length probably should not be used.
<snip>
However, I still have mixed feelings about changing client path selection as part of proposal 247:
- My main issue is that I think figuring out the right client path selection will require a _heavy_ amount of security analysis that will delay prop247 even more. I was hoping that we could treat the client-side as an orthogonal problem and tackle it in the future separately. But maybe I'm totally wrong and should be more patient and these two problems should be handled together.
I think patience is best, because if we don't understand this problem really well, we're liable to miss something. Or cement ourselves off from a potential future of interactive HS voice+video. Neither one is a great failure mode.
Agreed.
I think for many applications (esp the browser and ricochet), we're going to find that we need to protect the client just as much as the server.
- If the above changes only happen to HS circuits, we make it harder to make HS circuits indistinguishable from normal circuits on the face of traffic analysis. But maybe we have already lost this game.
We already lost that game until we have multihop padding. Proposal 247 already outlines how to use it in section 4.1 to help conceal vanguard usage.
It is also worth pointing out that if we fail to conceal the HS vanguard fingerprint entirely with padding, it will be especially valuable to have more than just 30k service-side instances with the vanguard fingerprint. Far better to have all the clients in that anonymity set, too, I think.
Yes that's true. This seems to be the main argument for doing client vanguards right now for me.
However, to actually achieve any sort of confusion here, we need to ensure that the paths between clients and HSes are symmetric. So for example if we end up doing:
C - L - S - E -- IP - S - M - L - H
then the L guard could distinguish clients from HSes by looking at whether the second hop is short lived ('S') or medium lived ('M').
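A quick way to check a candidate path for this kind of asymmetry is to compare the layer that sits right after L on the client half with the one right after L on the service half. The sketch below does just that. The function name `second_hops` is made up for illustration, and it assumes the path string uses "--" to separate the two halves, as in this thread.

```python
# Hypothetical check: what does each side's L guard see as its second hop?
def second_hops(path: str) -> tuple:
    client_half, service_half = path.split("--")
    client = [h.strip() for h in client_half.split("-")]
    service = [h.strip() for h in service_half.split("-")]
    # Client half reads C, L, <second hop>, ...
    # Service half reads ..., <second hop>, L, H
    return client[2], service[-3]

def is_distinguishable(path: str) -> bool:
    client_second, service_second = second_hops(path)
    return client_second != service_second

if __name__ == "__main__":
    asymmetric = "C - L - S - E -- IP - S - M - L - H"
    symmetric = "C - L - M - S - E -- I - S - M - L - H"
    print(second_hops(asymmetric), is_distinguishable(asymmetric))
    print(second_hops(symmetric), is_distinguishable(symmetric))
```

This only compares layer labels, so it says nothing about other distinguishers such as timing or traffic volume.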
Woohoo! Anonymity!
George Kadianakis:
Mike Perry mikeperry@torproject.org writes:
George Kadianakis:
I have mixed feelings about this.
- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them,
I agree this is worthwhile, if only to better understand the design space. However, I think we're going to find that most applications we envision can be induced into violating many of the ad-hoc mitigations we try to bake in.
OK. Let's see. I feel that these guard discovery attacks can be blocked with:
a) If an IP listed on an HS descriptor tells you that it doesn't know the HS, then ignore it for this hidden service today.
b) If an HSDir that should have an HS descriptor tells you that it doesn't have it, then don't ask it again this hour.
I think we do both checks right now in the Tor codebase and we also have caches so that we don't retry the same nodes. If we are serious, we could even write those caches on disk.
I feel that if an application restarts Tor or flushes those caches because a hidden service does not work, then the application is doing it wrong.
Ok, well consider the browser. All that has to happen for Guard discovery is a bunch of nested iframes for many different hidden services, perhaps injected by the exit. We protect against this to some degree for non-HS traffic by using SOCKS u+p isolation in combination with keeping circuits open as long as they are used (#15482). But for HS traffic, new circuits will be built for each new HS address that is accessed, so we don't have the same ability to limit circuit creation.
For other things, like Ricochet, subtler failure modes can be introduced to cause circuit churn without repeated hsdir/IP activity, once you bring the full application layer into scope. Say I'm a compromised/malicious Ricochet user looking to track down activists, marginalized folks, whistleblowers, etc. I could rig my Ricochet to fail the rend circuit periodically, waiting for them to reconnect to me as an HS client over and over, until my malicious middle was chosen next to the target's guard. Ricochet (indeed, many P2P protocols) will keep reconnecting in this case.
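To get a feel for how quickly the forced-reconnect attack could succeed, here is a back-of-the-envelope sketch. It assumes each new circuit picks its middle hop independently and that the adversary controls a fraction `f` of the middle selection weight, so the chance of being picked at least once in `n` circuits is 1 - (1 - f)^n. Both function names and the example fraction are illustrative assumptions, not measurements, and the model ignores vanguards entirely.

```python
import math

def p_adversary_chosen(fraction: float, circuits: int) -> float:
    """Chance the adversary's middle is picked at least once in `circuits` tries."""
    return 1.0 - (1.0 - fraction) ** circuits

def circuits_needed(fraction: float, target: float) -> int:
    """Smallest number of forced reconnects to reach probability `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - fraction))

if __name__ == "__main__":
    f = 0.05  # assumed adversary share of middle selection weight
    for target in (0.5, 0.9, 0.99):
        n = circuits_needed(f, target)
        print(f"target {target}: {n} circuits, p = {p_adversary_chosen(f, n):.3f}")
```

The point of the exercise is that the number of reconnects grows only logarithmically with the desired confidence, so an application that retries forever hands the attacker as many draws as they need.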
Maybe this means that Ricochet made a mistake in using HS circuits in "full duplex" mode, where the application is agnostic wrt who initiates the connection, and both sides keep retrying. However, I suspect that all P2P protocols are going to make this mistake. If we manage to get HS endpoints working as WebRTC endpoints, then WebRTC calls/connections will also naturally end up with this problem as well. Probably just about anything designed for symmetric P2P Internet connections will also make this mistake.
Also even with client vanguards I think the checks above will still have to be implemented. I could imagine an application that flushes all the DataDirectory if the hidden service stops working, and then even vanguards won't save them.
In general, I'm not sure how much sanity we can assume from third-party applications.
I think even our own applications are going to surprise us. One of the things I had to repeatedly argue years ago was "Kill .exit notation: Path selection must not be capable of being influenced by untrusted content from the application layer." People whined and cried and whined and cried when .exit finally vanished from TBB, but it was really necessary to prevent all sorts of path manipulation+capture attacks.
Any time where the application can be induced into making new paths through the Tor network, that is vulnerability surface. For some applications, they actually *must* be allowed to make new circuits based on untrusted/semitrusted input, so the only thing we can do at the Tor network layer is to restrict the paths of those circuits to limit exposure.
My current thinking is that long-term, I still like "virtual circuits" for client exit traffic (https://trac.torproject.org/projects/tor/ticket/15458). Maybe that can be used for HS clients, too, but it kinda gets messy in that we'll want to keep re-using HS paths for different HS addrs with the same SOCKS u+p, which may have other problems. I could be talked into it instead of client vanguards, though.
before complicating path selection even more.
I feel like you're actually going to end up complicating the implementation more with this position. If we have to have separate path selection modes for service side and client side, we then have to maintain three different path selection mechanisms in Tor: normal exit, onion services, and onion clients.
If we gave the same options for both hidden services and clients, we are at least down to two systems (exit vs non-exit), with some minor options for each.
Hmmm maybe. But onion clients would look very much like normal exit, but they would connect to RPs/IPs, instead of exits. Just like the code is now.
Also, with vanguards if we end up doing something like:
HSDir: C - L - S - E - HSDir
IP: C - L - S - E - IP
Rend: C - L - M - RP -- S - M - L - HS
we have three different path types here. We would need to write very beautiful interfaces if we want this to be done by the same code.
- Also, I like symmetry myself, but I wouldn't change path selection and security just for that _if I can help it_.
<snip>
Hsdir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H
Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H (* IP Intersection attack!)
Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H
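To make the latency side of the tradeoff concrete, here is a minimal sketch that counts the relays in each path option. The templates are copied from the list above; the `PATHS` dictionary, its `Name#N` keys, and the `relay_hops` helper are illustrative names, not anything from Tor's code. It counts every node other than the two endpoints, so for the HSDir paths the HSDir itself (written H) is not counted.

```python
# Path templates copied from the list above. "--" marks the point where the
# client-side and service-side halves of a circuit meet.
PATHS = {
    "HSDir#1": "C - L - M - S - E - H",
    "HSDir#2": "C - L - S - E - H",
    "HSDir#3": "C - L - S - H",
    "Intro#1": "C - L - M - S - E -- I - S - M - L - H",
    "Intro#2": "C - L - S - E -- I - S - L - H",
    "Intro#3": "C - L - S -- I&S - L - H",
    "Rend#1": "C - L - M - S - R -- E - S - M - L - H",
    "Rend#2": "C - L - S - R -- E - S - L - H",
    "Rend#3": "C - L - R&S -- S - L - H",
}

# C and H are the two ends of every template and are not counted as relays.
ENDPOINTS = {"C", "H"}


def relay_hops(path: str) -> int:
    """Count the nodes between the two endpoints of a path template."""
    nodes = [n.strip() for n in path.replace("--", "-").split("-")]
    return sum(1 for n in nodes if n and n not in ENDPOINTS)


if __name__ == "__main__":
    for name, path in PATHS.items():
        print(f"{name}: {relay_hops(path)} relays")
```

Under this counting, option 1 of Intro and Rend is the 8-hop case discussed earlier, and a merged node such as R&S or I&S counts once.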
What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every variation of Vanguards).
I don't see any such problems with R&S, though. Since R is not associated with any publicly viewable information, I don't think it is as big of a problem. At best it's a linkability risk for the client. But maybe I missed something.
Hmm, the only problem I can see here is that the R&S can link clients based on the L node. So for example, in the crazy edge case where only one client connects to hidden services through R&S over L, then R&S could count "Ah this client has done 42 rendezvous through me in the past 5 hours". And if that's a ricochet client with 42 contacts maybe it's a selector. But I think this is a pretty far-fetched example...
Another _big_ gotcha here is that let's say we end up doing:
HSDir: C - L - M - S - E - HSDir
IP: C - L - M - S - E - IP
Rend: C - L - S - RP -- S - M - L - HS
and all the 'S' nodes are taken from the same pool, then the 'L' node will be able to learn 'M' by looking at the IP circuits, and learn 'S' by looking at the rend circuit. So it will basically be able to derive the full circuit.
We need to be very careful about which paths we pick, and which "guardsets" we get the nodes from.
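A rough way to check a candidate set of paths for this kind of leak is to list which layers ever sit directly behind L. The sketch below does that for the client-side halves of the three circuits in the example; `CLIENT_PATHS` and `roles_adjacent_to` are made-up names for illustration, and the model assumes (as in the example) that nodes labelled 'S' on different circuit types come from the same pool.

```python
# Client-side halves of the example circuits above, as lists of roles.
CLIENT_PATHS = {
    "hsdir": ["C", "L", "M", "S", "E", "HSDir"],
    "intro": ["C", "L", "M", "S", "E", "IP"],
    "rend": ["C", "L", "S", "RP"],
}


def roles_adjacent_to(paths, observer="L"):
    """Return the set of roles that appear directly after `observer`."""
    seen = set()
    for hops in paths.values():
        for here, nxt in zip(hops, hops[1:]):
            if here == observer:
                seen.add(nxt)
    return seen


if __name__ == "__main__":
    # L is directly connected to M on HSDir/intro circuits and to S on rend
    # circuits, so with a shared S pool it sees both vanguard layers.
    print(sorted(roles_adjacent_to(CLIENT_PATHS)))
```

If every circuit type used the same prefix (for example always C - L - M - S), the same check would return only M, which is one way to state the "be careful about guardsets" point.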
Looking at these, we can see that we sacrifice the middle guards in the second option, which will come at the cost of one less compromise attack (but still the need to compromise the long-lived guard). We also lose the unlinkability in the third option, and this actually bites us in Intro 3: the hidden service L guard can perform a long-term intersection attack, watching for published intro points and matching that to the circuits that H makes to them. So that path length probably should not be used.
<snip>
However, I still have mixed feelings about changing client path selection as part of proposal 247:
- My main issue is that I think figuring out the right client path selection will require a _heavy_ amount of security analysis that will delay prop247 even more. I was hoping that we could treat the client-side as an orthogonal problem and tackle it in the future separately. But maybe I'm totally wrong and should be more patient and these two problems should be handled together.
I think patience is best, because if we don't understand this problem really well, we're liable to miss something. Or cement ourselves off from a potential future of interactive HS voice+video. Neither one is a great failure mode.
Agreed.
I think for many applications (esp the browser and ricochet), we're going to find that we need to protect the client just as much as the server.
- If the above changes only happen to HS circuits, we make it harder to make HS circuits indistinguishable from normal circuits in the face of traffic analysis. But maybe we have already lost this game.
We already lost that game until we have multihop padding. Proposal 247 already outlines how to use it in section 4.1 to help conceal vanguard usage.
It is also worth pointing out that if we fail to conceal the HS vanguard fingerprint entirely with padding, it will be especially valuable to have more than just 30k service-side instances with the vanguard fingerprint. Far better to have all the clients in that anonymity set, too, I think.
Yes that's true. This seems to be the main argument for doing client vanguards right now for me.
However, to actually achieve any sort of confusion here, we need to ensure that the paths between clients and HSes are symmetric. So for example if we end up doing:
C - L - S - E -- IP - S - M - L - H
then the L guard could distinguish clients from HSes by looking at whether the second hop is short lived ('S') or medium lived ('M').
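As a quick sanity check for this kind of asymmetry, one can compare which layer sits directly behind the L guard on each side of a path template. This is only a sketch over the notation used in this thread; `guard_views` is a hypothetical helper and says nothing about how Tor would implement path selection.

```python
def guard_views(path: str):
    """Return (client_second_hop, service_second_hop) for a path template.

    The template uses the thread's notation, e.g.
    "C - L - S - E -- IP - S - M - L - H", where "--" separates the
    client-side half from the service-side half.
    """
    client_side, service_side = path.split("--")
    client = [n.strip() for n in client_side.split("-")]
    # Reverse the service side so that it also reads endpoint, L, next hop, ...
    service = [n.strip() for n in service_side.split("-")][::-1]
    return client[2], service[2]


if __name__ == "__main__":
    mixed = "C - L - S - E -- IP - S - M - L - H"
    symmetric = "C - L - M - S - E -- I - S - M - L - H"
    print(guard_views(mixed))      # the two L guards see different layers
    print(guard_views(symmetric))  # both L guards see an M node
```

When the two values differ in lifespan, an L guard can use the rotation behaviour of its next hop to tell clients from services.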
Ok, I think this, as well as your complexity argument earlier are great reasons not to mix and match strategy #1 with #2 or #3. If we do provide security vs latency tradeoff options, I'm now convinced that tradeoff should be consistent for all paths that an HS uses for all of its circuits.
If we only offered two security level options, I currently like HSDir#1+IP#1+Rend#1 for high security and HSDir#2+IP#2+Rend#3 for low security.
For the low security case, can we think of any reasons to decouple R&S in Rend#3, or to use Rend#2?
Another issue with Rend#3 is that the hidden service will be able to link client visits (for a Short while) using the client R&S as a selector.
I am inclined to accept this risk, since really as you said it is not a sure shot. You need a lot of connections before the S set is reused enough to indicate it's the same client, and then even with each subsequent individual visit all you have is probability bias, not a sure sign. I'm inclined to think that this partial linkability leak is an acceptable enough risk to say "Well yeah, lower security for better latency. You're getting what you're paying for."
I'm also really expecting these low security addresses to be most useful in P2P, where linkability is already kinda out the window (but can still be maintained if the addresses are ephemeral/short lived).
We should keep thinking about other issues, because unless there are other, additional problems with R&S, I don't think this one kills it.
One thing worth noting is that you definitely want separate vanguard sets for high security and low security services, of course. Prop#247 was already leaning towards separate vanguard sets for each service-side address, in Section 4.2. That seems excessive now for the high security services, due to the addition of ephemeral hops, but keeping the low and high sets separate is necessary, as you pointed out with your other attack.
Here's another application-layer issue: if low security services exist, then the application will need some way to differentiate them before being induced to connect to them, especially repeatedly over time. For example, Ricochet should forbid you from accidentally using a low security onion address for a contact addr, and the browser should probably forbid all low security HS addresses from being used as content elements, unless the url bar is also a low security HS address. Both apps should probably specifically generate low security addrs for WebRTC/voice calls, though.
I think a similar argument could be made for also differentiating RSOS/SOS addresses, post-224.
For these reasons, I'm still really liking the idea of using those spare 224 address bits to indicate HS security level (and to additionally differentiate between RSOS/SOS). That would make it much easier for the application to avoid being tricked into using weaker addresses in bad situations. In fact, the Tor client could even default to forbidding the use of them unless the application specifically turns on support.
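The application-side rules suggested above could look roughly like the following sketch. Everything here is hypothetical: no encoding of the security level into address bits has been specified, so the functions take an already-decoded `SecurityLevel` rather than parsing an onion address, and the names `may_load_element` and `may_connect` are invented for illustration.

```python
from enum import Enum


class SecurityLevel(Enum):
    """Hypothetical security level carried by an onion address."""
    HIGH = "high"
    LOW = "low"


def may_load_element(url_bar: SecurityLevel, element: SecurityLevel) -> bool:
    """Browser rule: a low security address may only be used as a content
    element when the url bar address is itself low security."""
    return element is SecurityLevel.HIGH or url_bar is SecurityLevel.LOW


def may_connect(level: SecurityLevel, app_opted_in: bool = False) -> bool:
    """Tor client default: refuse low security addresses unless the
    application has explicitly turned on support for them."""
    return level is SecurityLevel.HIGH or app_opted_in


if __name__ == "__main__":
    print(may_load_element(SecurityLevel.HIGH, SecurityLevel.LOW))  # False
    print(may_connect(SecurityLevel.LOW, app_opted_in=True))        # True
```

The point of putting the level in the address is that both checks can run before any circuit is built, so untrusted content cannot induce a weaker path first.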
On 21 Jan (10:28:16), Mike Perry wrote:
I am inclined to accept this risk, since really as you said it is not a sure shot. You need a lot of connections before the S set is reused enough to indicate it's the same client, and then even with each subsequent individual visit all you have is probability bias, not a sure sign. I'm inclined to think that this partial linkability leak is an acceptable enough risk to say "Well yeah, lower security for better latency. You're getting what you're paying for."
Here is my two cents on Rend#3 (hopefully if I understand correctly the R&S concept).
Currently, a client does reuse a RP circuit for the same .onion. So, let's say TBB opens blah.onion and has 25 HTTP GETs to do for blah.onion: one single RP circuit will be used. Then I _think_ that circuit is closed by timeout after a while if unused. Then the user goes back to blah.onion after an hour and reopens a circuit with a _new_ R&S, which is well below the 12 hour rotation period, and does so several times before the rotation of R&S happens (btw, I do that _ALL_ the time: I open an HS and refresh it like once every 30 minutes, which I think makes me use a new RP node every time.)
This means that if L is malicious, it will see some connections from C always going to the same subset of relays (prop247 defines S as 4*4 nodes). This set is _small_, so it is easy to tell that it's indeed a client connecting to a low security HS. I mean, just inducing circuit failure here at L will be enough, without needing a legitimate use case that also allows the linkability.
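To get a feel for how quickly a 16-node set stands out, the sketch below computes the expected number of distinct second hops L would see over a number of circuits, once for a pool of 4*4 nodes and once for a much larger pool. It assumes uniform selection with replacement, which Tor does not do (real selection is bandwidth-weighted), and the 7000-relay network size and 50-circuit count are placeholder assumptions.

```python
def expected_distinct(pool_size: int, circuits: int) -> float:
    """Expected number of distinct nodes seen after `circuits` uniform,
    independent picks (with replacement) from a pool of `pool_size` nodes."""
    return pool_size * (1.0 - (1.0 - 1.0 / pool_size) ** circuits)


if __name__ == "__main__":
    CIRCUITS = 50            # assumed number of circuits L gets to observe
    S_SET_SIZE = 4 * 4       # prop247 S set size quoted above
    NETWORK_SIZE = 7000      # assumed relay count, for contrast only

    print(expected_distinct(S_SET_SIZE, CIRCUITS))
    print(expected_distinct(NETWORK_SIZE, CIRCUITS))
```

With the small pool the count saturates just below 16, while with the large pool almost every circuit shows a new next hop, so the two behaviours are easy for L to tell apart.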
However, that could be mitigated by an application keeping the RP circuit alive, for instance by sending heartbeats (Ricochet?). Even though the set of R&S rotates after 12 hours, a circuit using an R&S node that is being rotated should NOT be killed under any circumstances, making it "long lived", which makes it much more difficult for L to learn anything. But this is putting lots and lots of responsibility on the application side :S but again it's low security and should only be used by people knowing what the hell they are doing. Is that what we want?
Thus, I'm inclined to say that Rend#3 is fine but in a controlled environment... although the fact that the HS doesn't use an E before connecting to R could be dicey. As an attacker client controlling R, I make your HS fail every circuit until you go through all your S. Chances are I do not control one of your S at first, but then 12h later you rotate, so I can do that for days (<90 days) until you pick an S I control and game over, I learn L.
S size is 4*4 and rotation happens twice a day, so 160 rotations over 80 days (leaving 10 days to raid your L) == potentially 2560 nodes being tested. That is an awfully big chance of you picking my S. There are multiple ways to mitigate that, but it comes down to special cases in the code :S.
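The arithmetic above can be written out as a short sketch, together with a rough estimate of the attacker's chance of landing in the S set at least once. The estimate assumes every S node is an independent draw and that the attacker holds a fixed fraction of the selection probability; the 0.1% figure is a placeholder assumption, not a measurement.

```python
S_SET_SIZE = 4 * 4        # S set size quoted above
ROTATIONS_PER_DAY = 2     # 12 hour rotation period
ATTACK_DAYS = 80          # 90 days minus 10 days left to go after L

rotations = ROTATIONS_PER_DAY * ATTACK_DAYS
nodes_tested = rotations * S_SET_SIZE


def p_at_least_one_hit(adversary_fraction: float, draws: int) -> float:
    """Probability that at least one of `draws` independent picks lands on
    an adversary node, given the adversary's share of selection probability."""
    return 1.0 - (1.0 - adversary_fraction) ** draws


if __name__ == "__main__":
    print(rotations, nodes_tested)
    # Assumed adversary share of 0.1% of S selection probability.
    print(p_at_least_one_hit(0.001, nodes_tested))
```

Even with a small assumed share, the number of draws makes a hit very likely, which is the reason the missing E hop on the service side matters here. Since the same relay can be picked more than once, 2560 is an upper bound on the number of distinct nodes tested.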
What I really like about Rend#2 though is symmetry. Implementation-wise it's much easier (using Mike's argument) and we treat both the client and HS at the same security level. However, the tradeoff is that if anything happens to our design in the future, both sides will be affected, probably meaning deanonymizing C and H with one single technique (currently the case anyway).
For these reasons, I'm still really liking the idea of using those spare 224 address bits to indicate HS security level (and to additionally differentiate between RSOS/SOS). That would make it much easier for the application to avoid being tricked into using weaker addresses in bad situations. In fact, the Tor client could even default to forbidding the use of them unless the application specifically turns on support.
That could be a useful way of using those bits indeed! Versioning was also on the list for this, but that can also be fixed by using one more byte anyway...
David
-- Mike Perry
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
David Goulet:
Here is my two cents on Rend#3 (hopefully if I understand correctly the R&S concept).
Currently, a client does reuse a RP circuit for the same .onion. So, let's say TBB opens blah.onion and has 25 HTTP GETs to do for blah.onion: one single RP circuit will be used. Then I _think_ that circuit is closed by timeout after a while if unused. Then the user goes back to blah.onion after an hour and reopens a circuit with a _new_ R&S, which is well below the 12 hour rotation period, and does so several times before the rotation of R&S happens (btw, I do that _ALL_ the time: I open an HS and refresh it like once every 30 minutes, which I think makes me use a new RP node every time.)
This means that if L is malicious, it will see some connections from C always going to the same subset of relays (prop247 defines S as 4*4 nodes). This set is _small_, so it is easy to tell that it's indeed a client connecting to a low security HS. I mean, just inducing circuit failure here at L will be enough, without needing a legitimate use case that also allows the linkability.
Ah, yes, I think you're right here.
However, that could be mitigated by an application keeping the RP circuit alive, for instance by sending heartbeats (Ricochet?). Even though the set of R&S rotates after 12 hours, a circuit using an R&S node that is being rotated should NOT be killed under any circumstances, making it "long lived", which makes it much more difficult for L to learn anything. But this is putting lots and lots of responsibility on the application side :S but again it's low security and should only be used by people knowing what the hell they are doing. Is that what we want?
I don't know. I think pushing solutions off to the application layer is dangerous. One option that might make it safer is to make these addresses single-use or limited-use only (for ex: for a voice call or file xfer negotiated over a higher security long-term circuit), and enforce that in Tor itself.
Thus, I'm inclined to say that Rend#3 is fine but in a controlled environment... although the fact that the HS doesn't use an E before connecting to R could be dicey. As an attacker client controlling R, I make your HS fail every circuit until you go through all your S. Chances are I do not control one of your S at first, but then 12h later you rotate, so I can do that for days (<90 days) until you pick an S I control and game over, I learn L.
S size is 4*4 and rotation happens twice a day, so 160 rotations over 80 days (leaving 10 days to raid your L) == potentially 2560 nodes being tested. That is an awfully big chance of you picking my S. There are multiple ways to mitigate that, but it comes down to special cases in the code :S.
What I really like about Rend#2 though is symmetry. Implementation-wise it's much easier (using Mike's argument) and we treat both the client and HS at the same security level. However, the tradeoff is that if anything happens to our design in the future, both sides will be affected, probably meaning deanonymizing C and H with one single technique (currently the case anyway).
After thinking about it, I think I am also leaning towards Rend#2 for low-security also for traffic analysis reasons. If we have 3 hops on each side, we get to make use of padding to the middle in the same way for these circuits as we would for everything else.
OTOH, with only 2 hops, a malicious guard and a malicious R&S get to know where you are going with some probability that is a function of the base rate of connections from the service's S to the client's R&S node. This base rate may be actually very low in practice, making it more certain that a particular extend is for a certain low-sec service rend connection. This is compounded by the fact that it is possible to at least know the set of S's that correspond to a given low-sec hidden service using lots of connections/circuit failure, as you said.
If we restrict Rend#3 to single-use, though, we *might* be able to disguise the rend handshake as another circuit extension from the service side, but the client side doesn't have this option.
Is single-use (or limit to N uses) too much of a restriction for low-sec services? Or do we defer it to later, and only implement Rend#1 and Rend#2 for now? I suppose if we do go with single-use, we still have to think about how long to keep the vanguards for it around. That also seems complicated and maybe application-dependent, unless we make the usage limits global or something..
In total, it seems like analysis simplicity, implementation simplicity, and safety are favoring Rend#1 and/or Rend#2 right now. We could implement the Rend#3-yolo option later when we understand a bit more about how to work with vanguards, or just make it available to controllers by exposing HS path construction to their control, should they want it?
FWIW, I am similarly pessimistic about combining s7r's detector based on the current proposal text, since even if full DoS is not possible, it is certainly possible to influence probability of RP selection if that proposal's detector is able to influence routes (and also to evade detection when the discovery attack is actually mounted). It feels a lot like the path bias detector but with worse properties due to adversarial circuit+node selection control... And even the original path bias detector really was only meant as a stopgap until something like https://gitweb.torproject.org/torspec.git/tree/proposals/261-aez-crypto.txt can be implemented.