Hello Tim,
Thank you very much for the comments. Please see my inline answers; I think I did not explain things well enough, and most of the issues are not actually problems.
teor wrote:
...
A rendezvous relay is considered suspicious when the number of successfully established circuits in the last 24 hours with a certain relay as rendezvous point exceeds, by more than a factor of 2, the number of circuits expected to have that relay as rendezvous point.
Why 2x? Is it just a number you picked?
In general, why the particular numbers in this proposal? Are they just guesses (most of our proposal numbers are), or are they evidence-based?
The numbers are not evidence-based. I picked the 2x factor to leave a margin for the rare cases when genuine clients honestly pick a rendezvous relay more often than the probability calculated from its consensus weight would predict. There is a nonzero chance this can (rarely) happen, because the probability calculation is not 100% exact, so the 2x factor was chosen to minimize the impact of such false positives as much as possible while maintaining a reasonable level of protection.
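To make the rule concrete, here is a minimal sketch of the check in Python (illustrative only; the names and the way the counters are obtained are mine, not existing tor code):

# Hypothetical sketch of the suspicion check described above.
SUSPICION_FACTOR = 2  # the 2x margin discussed above

def is_suspicious(circuits_to_relay_24h, total_circuits_24h, middle_probability):
    # Expected number of rend circuits through this relay, based on its
    # consensus-weight middle probability and the total rend circuits the
    # HS successfully established in the last 24 hours.
    expected = middle_probability * total_circuits_24h
    return circuits_to_relay_24h > SUSPICION_FACTOR * expected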
...
When a relay triggers it, instead of banning it and refusing to use it any longer, we just reuse hop 2 and hop 3 from the last circuit to build new rendezvous circuits with this relay as rendezvous point, for a random period between 24 and 72 hours. This mitigates the issue where an attacker DoSes the HS by making all the relays in the consensus suspicious, hitting the limit for every relay one by one.
Here's an attack that's enabled by this proposal:
- Send lots of traffic through various rend points until you trigger the limit on a particular hop2 or hop3 or rend you control.
- Stop sending traffic on that particular rend.
- Observe encrypted client traffic/paths on hop2, hop3 or rend for 24 to 72 hours.
- When hop2 or hop3 rotate, repeat from 1.
This attack can be performed in parallel on multiple rend points for the same service, and only needs to succeed once.
I am not sure I understand the attack. You cannot choose to trigger the limit on a particular hop 2 or hop 3. These are chosen by the hidden service side (where the protection is implemented), and they are chosen randomly. When you trigger the limit on a particular rendezvous relay, the hop 2 and hop 3 of the last circuit to that relay are reused for further new rend circuits only with that RP; circuits to other rendezvous relays that did not hit the limit are unaffected and are chosen randomly, as in normal conditions.
If multiple rendezvous relays hit the limit at the same time, each one will have a different static hop 2 and hop 3 path for the given period. It is exactly like vanguards, except that it is activated only under certain conditions and we have different vanguards for each suspicious rendezvous relay.
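Roughly, the bookkeeping I have in mind looks like this (a sketch in Python with made-up names, not actual tor code; it only illustrates the per-RP pinning with a random 24 to 72 hour expiry):

import random
import time

# rend_relay_id -> (hop2, hop3, expiry_timestamp), one entry per suspicious RP
pinned_hops = {}

def mark_suspicious(rend_relay_id, last_hop2, last_hop3):
    # Reuse hop 2 and hop 3 of the last circuit to this RP for 24 to 72 hours.
    lifetime = random.uniform(24 * 3600, 72 * 3600)
    pinned_hops[rend_relay_id] = (last_hop2, last_hop3, time.time() + lifetime)

def pick_hops(rend_relay_id, pick_random_middle):
    # Return (hop2, hop3) for a new rend circuit to this RP.
    entry = pinned_hops.get(rend_relay_id)
    if entry and time.time() < entry[2]:
        return entry[0], entry[1]            # suspicious RP: reuse pinned hops
    pinned_hops.pop(rend_relay_id, None)     # expired or never pinned
    return pick_random_middle(), pick_random_middle()  # normal random selection

Circuits to every other RP keep going through the normal random selection, which is the point: only the suspicious RP gets a static pair.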
How much effort would it take to bind all the rend points in the consensus to a particular hop2, hop3 for a service? (I think the minimum answer is min(Q) * count(rend), or about 15000-20000 connections.)
I believe this is answered above: you cannot bind all the rend points in the consensus to a particular hop 2 and hop 3 for a service. Each rend point will have its own hop 2 and hop 3, namely the ones used in the last circuit before the limit was reached and the protection triggered.
Why not just use this defence (slow hop2, hop3 rotation) all the time? If we did, that makes this attack pointless, because you can't keep rotating hop2, hop3 fast until you get the ones you want.
Why not also use this defence (slow hop2, hop3 rotation) for clients? In the last thread, you said that clients can't be forced to make circuits. But with features like refresh and JavaScript, this just isn't true.
Clients are not subject to this attack (the HS guard discovery attack), which is why they are not addressed. Clients choose the RP and request the HS to connect to it, as often as they want, without any limit. If you mean a malicious honeypot HS trying to de-anonymize clients or mount guard discovery attacks on clients, the chances of such an attack succeeding are orders of magnitude lower than the other way around, because the hidden service cannot make the client connect to an evil RP (the RP is selected by the client).
In general, how do we know the suspicion thresholds are right?
From my point of view they should be about right, based on a simple argument:
a relay is selected by a client for a path based on its consensus weight fraction and its position probability (guard, middle, exit). So, out of a given number of circuits, a relay should appear in a position roughly in proportion to its position probability (more or less, which is why we use the 2x factor). You can test this in practice: create 10,000 rend circuits using the last consensus and look at the relay chosen most often; you will see it is the one with the highest middle probability. We consider probability theory tested and worth relying on, of course; the protection depends on it, and I think we can safely do so.
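If it helps, this is the kind of quick self-check I mean, sketched in Python with made-up weights (a real test would use the bandwidth weights from the current consensus):

import random
from collections import Counter

relays = ["A", "B", "C", "D"]
weights = [0.40, 0.30, 0.20, 0.10]   # hypothetical middle-probability fractions

counts = Counter(random.choices(relays, weights=weights, k=10000))
for relay, prob in zip(relays, weights):
    observed = counts[relay] / 10000
    print(relay, "expected", prob, "observed", round(observed, 3))

With these example weights, the observed fractions land close to the expected probabilities over 10,000 circuits and essentially never exceed twice the expected value, which is the margin the 2x factor gives us.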
Also, in general, it is harder to test and maintain software that changes its behaviour in rare circumstances. That doesn't mean this is a bad design: just that it costs extra to do right and make sure it's right. How would you test this?
I understand this can be a pain from an engineering point of view; we just have to weigh it and see whether it is worth it, whether it helps us more than vanguards do, eliminates the load balancing problems, and at the same time gains us some other benefits. Honestly, I don't see it as MUCH more complicated than the vanguards proposal: it is just some extra information to keep track of and a single rule to validate before creating rendezvous circuits, on the hidden service server side only (it is not applied to HS client mode).
It is assumed that the protection is not usually triggered, only in exceptional cases (a normal Tor client will just pick rendezvous points randomly based on middle probability, which should not be able to trigger the protection). In the exceptional cases where we reuse hop 2 and hop 3 of the last circuit for a 24 to 72 hour period, the load balancing issues should not be a problem, given that we are talking about isolated cases.
How much would it cost an attacker to *not* make it an isolated case? Could an attacker bring down a relay by making multiple hidden services go through a hop2 or hop3?
No, an attacker cannot do that, because he does not get to choose hop 2 and hop 3, as described above. The most an attacker can do is make all rend points in the consensus have different static paths (hop 2 and hop 3) for a random, short period. This requires some effort, since the attacker needs to take the rend points one by one and trigger the limit on each. It gets even harder when the HS is popular, because more and more circuits are needed to trigger the limit on a particular rendezvous relay. Also, the number of circuits the attacker needs in order to trigger the limit keeps growing as he goes: each limit he hits adds his circuits to the total number of established circuits, which raises the threshold for every remaining RP.
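Back-of-the-envelope, the compounding effect looks like this (Python, all numbers made up for illustration, and assuming the attack circuits all fall inside the same 24-hour window):

baseline_total = 10000     # hypothetical legitimate rend circuits in the last 24h
middle_prob = 0.01         # hypothetical middle probability of each targeted RP

total = baseline_total
for rp in range(1, 6):
    threshold = 2 * middle_prob * total   # circuits needed to trip this RP
    total += threshold                    # the attacker's circuits add to the total
    print("RP", rp, "needs >", int(threshold), "circuits; running total", int(total))

Every limit the attacker trips multiplies the running 24-hour total by roughly (1 + 2 * middle_prob), so each subsequent RP costs more circuits than the previous one.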
Even if we get to that point, having static (but different) hop 2 and hop 3 for each RP in the consensus for a random period of time gives essentially the same result as if we had been using vanguards in the first place, except:
-> we are talking about a different hop 2 and hop 3 pair for each RP, so we have considerably fewer load balancing problems, and we do not need a vanguard flag or any logic for selecting which relays to use as vanguards;
-> the protection is not active all the time for all the hidden services in the network; it is only active under the conditions we consider suspicious, as described above.
It is unclear to me why using the same hop 2 and hop 3 for every rend circuit (with any rend point) for a random period of time, as the vanguards proposal suggests, would be better than this.
For example:
- Perform the attack above until the victim relay is in the hop3 position (with a malicious rend point, the client knows hop3).
- Repeat 1 with a different service and the same malicious rend point.
This method always assumes the rend point is evil. That should be the default assumption anyway, since the rend point is always selected by the client, and when an attacker wants to de-anonymize a hidden service he obviously acts as a client.
In the described attack, say you are the attacker and control rend point X:
HS -> Guard -> random_hop2 -> random_hop3 -> RP (X)
You know hop 3, of course. Now say you also control some hostile consensus weight in the network and try your chances of getting one of your relays picked as hop 3, with the same rend point X, so that you learn hop 2. So you request more and more rend circuits having the same evil RP.
The HS will grant you more circuits:
HS -> Guard -> random_hop2 -> random_hop3 -> RP (X)
HS -> Guard -> random_hop2 -> random_hop3 -> RP (X)
HS -> Guard -> random_hop2 -> random_hop3 -> RP (X)
[...]
Until the limit is triggered for your RP (X), based on its consensus-weight middle probability and the total number of rend circuits established by that hidden service on the server side.
When the limit is triggered, you will have this:
HS -> Guard -> static_hop2 -> static_hop3 -> RP (X)
Hop 2 and hop 3 will be static for a random period with that RP (X), so you get no more chances of having one of your relays chosen for the hop 3 position with it.
Now you have to either wait until the random period for RP (X) runs out, or try a different hostile RP you control. Soon you will hit the limit on the second hostile RP as well.
As you can see, this clearly makes the attack harder and requires more time and resources from the attacker, while not affecting the hidden service in any way.
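For a rough sense of scale (numbers are hypothetical): if RP (X) has a middle probability of 0.5% and the hidden service has successfully established 10,000 rend circuits in the last 24 hours, the limit for RP (X) is 2 * 0.005 * 10,000 = 100 circuits. That is the number of random hop 2 / hop 3 draws the attacker gets through RP (X) before the pair is pinned for 24 to 72 hours and he has to move on to another hostile RP.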
Why is load balancing not a problem here? When an RP is marked as suspicious and 2 relays act as hop 2 and hop 3 only for rend circuits to this particular RP, the only circuits being created here are:
-> the circuits requested by the attacker controlling the suspect RP;
-> the circuits requested by honest clients that randomly pick the suspect RP under normal conditions (since this is based on its middle probability, we are not talking about worrying numbers).
Also, Tor2web with Tor2webRendezvousPoints will always trigger this case, as I said in response to the last proposal: (for "break" read "trigger on")
- This will break some Tor2Web installations, which deliberately choose
rendezvous points on the same server or network for latency reasons. (Forcing Tor2Web installations to choose multiple RPs may be a worthwhile security tradeoff.)
https://lists.torproject.org/pipermail/tor-dev/2016-January/010293.html
I won't repeat the entire thread here, but if this protection will always be triggered when Tor2webRendezvousPoints is on, please document that in the proposal, and talk about the load balancing implications.
(Tor2webRendezvousPoints allows a Tor2web client to choose set rendezvous points for every connection. Please re-read the thread or the tor man page for details.)
Exactly. It allows a Tor2web _client_ to set the rendezvous point for every connection. Nothing will break here. What will happen is that if a Tor2web proxy is popular and many of its users browse a particular hidden service, that hidden service will at some point trigger the limit and use static hop 2 and hop 3 towards the Tor2webRendezvousPoints selected by this particular Tor2web client. They will of course be rotated after some time.
There is absolutely no problem here and the Tor2web clients will not even notice it.
One question: are we creating an additional risk by keeping this additional information (hop 2, hop 3 of every rendezvous circuit) on the hidden service server side? How useful could this historic information be for an attacker who becomes aware of the location of the hidden service?
It provides the entire path to the rendezvous point. This is useful for an attacker that knows the rendezvous point. It is also useful for an attacker whose priority is to locate clients, rather than locate the service.
We already keep this kind of information regarding the Guard. From my point of view this is irrelevant, given that this information only becomes available after the location of the hidden service has already been discovered (which is pretty much maximum damage).
... to the hidden service, not necessarily its clients.
Yes, but from the hidden service server side we only have information about the path to the RP:
HS - Guard - hop2 - hop3 - RP
On the other hand, there's also:
RP <- Hop 2 <- HS client Guard <- HS client
So we don't actually know much. The design assumes anyway that an HS can be malicious, which is why the client selects the RP and builds a 2 hop circuit to it, while the HS server builds a 3 hop circuit to the RP, because the RP was chosen by someone else.
If logging this information (hop 2, hop 3 and all rend points used) on the hidden service server side is risky, does that mean all clients connecting to honeypot HSes are at risk? I doubt it.