Hello,
I'm attaching a proposal draft that should help us defend against guard discovery attacks.
There are a few pieces left unfinished (see the XXXs) but I decided to release early and release often for the sake of moving forward with this. I consider this issue very important and any feedback is greatly appreciated.
Thanks!
----
Filename: 246-hs-guard-discovery.txt Title: Defending Against Guard Discovery Attacks using Vanguards Author: George Kadianakis Created: 2015-07-10 Status: Draft
0. Motivation
A guard discovery attack allow attackers to determine the guard node of a Tor client. The hidden service rendezvous protocol provides an attack vector for a guard discovery attack since anyone can force an HS to construct a 3-hop circuit to a relay (#9001).
Following the guard discovery attack with a compromise or coercion of the guard node can lead to the deanonymization of a hidden service.
1. Overview
This document tries to make the above guard discovery + coersion attack harder to launch. It introduces an optional configuration option which makes the hidden service also pin the second and third hops of its circuits for a longer duration.
With this new path selection, we force the adversary to perform a Sybil attack and two coercion attacks before succeeding. This is an improvement over the current state where the Sybil attack is trivial to pull off, and only a single coercion attack is required.
With this new path selection, an attacker is forced to do a coercion attack before learning the guard node of a hidden service. This increases the uncertainty of the attacker, since he will need to perform a coercion attack before he learns the identity of the guard node and whether he can compromise it. Coercion attacks are costly and potentially detectable, so an attacker will have to think twice before beginning a chain of coercion attacks that he might not be able to complete.
1.1. Visuals
Here is how a hidden service rendezvous circuit currently looks like:
-> middle_1 -> middle_A -> middle_2 -> middle_B -> middle_3 -> middle_C -> middle_4 -> middle_D HS -> guard -> middle_5 -> middle_E -> Rendezvous Point -> middle_6 -> middle_F -> middle_7 -> middle_G -> middle_8 -> middle_H -> ... -> ... -> middle_n -> middle_n
this proposal pins the two middles nodes to a much more restricted set, as follows:
-> guard_3_A -> guard_3_B HS -> guard_1 -> guard_2_A -> guard_3_C -> Rendezvous Point -> guard_2_B -> guard_3_D -> guard_3_E -> guard_3_F
2. Design
This feature requires the HiddenServiceGuardDiscovery torrc option to be enabled.
When a hidden service picks its guard nodes, it also picks two additional sets of guard nodes `second_guard_set` and `third_guard_set` of size NUM_SECOND_GUARDS and NUM_THIRD_GUARDS respectively.
When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from `third_guard_set` as third hops of the circuit.
A hidden service rotates 'second_guard_set' every SECOND_GUARD_ROTATION hours, and 'third_guard_set' every THIRD_GUARD_ROTATION hours.
These extra guard nodes should be picked with the same path selection procedure that is used for regular guard nodes. Care should be taken such that guard sets do not share any common relays. XXX or simply that they are not used in the same circuit?
XXX maybe pick the second and third layer guards from the set of middle nodes but also enforce some kind of uptime requirement? that should greatly help our load balancing.
XXX maybe we should also introduce consensus flags for the extra guard layers? Vanguard?
XXX how should proposal 241 ("Resisting guard-turnover attacks") be applied here?
2.1. Security parameters
We set NUM_SECOND_GUARDS to 2 nodes and NUM_THIRD_GUARDS to 6 nodes. We set SECOND_GUARD_ROTATION to 2 weeks and THIRD_GUARD_ROTATION to 1 day.
See the discussion section for more analysis on these constants.
3. Discussion
3.1 How were these security parameters chosen?
Consider an adversary with the following powers:
- Can launch a Sybil guard discovery attack against any node of a rendezvous circuit. The slower the rotation period of the node, the longer the attack takes.
- Can compromise any node on the network. We assume that the adversary cannot compromise too many nodes, otherwise Tor's security would be breached anyhow.
We now calculate the time it takes for the adversary to launch a Sybil attack with 50% success assuming 5% network control. This depends solely on how frequently the hidden service rotates that node:
+-------------------+-------------------------------+------------------------+----------------------------+ | Rotation period | Sybil attack with 50% success | Sybil attack (2 guards)| Sybil attack (6 guards) | +-------------------+-------------------------------+------------------------+----------------------------+ | 1 hour | 14 hours | 7 hours | 2.5 hours | | 1 day | 14 days | 7 days | 2.5 days | | 1 week | 3.5 months | 6 weeks | 2.5 weeks | | 2 weeks | 7 months | 3.5 months | 5 weeks | | 1 month | 1 year and 2 months | 6 months | 2.5 months | | 3 months | 3.5 years | 1.7 years | 6 months | +-------------------+-------------------------------+------------------------+----------------------------+ Required time for Sybil attack by a 5% adversary
Our security parameters were selected so that the first two layers of guards should be hard to attack using a Sybil guard discovery attack and hence require a coercion attack. On the other hand, the outmost layer of guards should rotate fast enough to _require_ a Sybil attack.
XXX does this security model even make sense? what about a network adversary, or an adversary that can launch congestion attacks etc.????
3.2. Distinguishing new HS circuits from normal HS circuits
By pinning the middle nodes of rendezvous circuits, we make it easier for all hops of the circuit to detect that they are part of a special hidden service circuit. XXX hm how does the outermost guard knows?
Compare this to the current Tor network, where only the guard and the third hop of the HS circuit can trivially distinguish whether it's part of an HS circuit.
3.3. Circuit nodes can now be linked to specific hidden services
With this proposal the hops of hidden service circuits will be static, and hence an adversrary will be able to map them to specific hidden services. For example, a middle node that sees two hidden service circuits with the same guard node in different times, can assume with non-negligible probability that both circuits correspond to the same hidden service.
That said, this is also partially the case for the current Tor network, where the middle node can associate a guard with a specific hidden service.
3.4 Why is the torrc setting disabled by default, if it's so good?
We suggest this torrc option to be optional because it puts additional load on guard nodes and we are not sure if the network will be able to handle it.
However, by having this setting be disabled by default, we make hidden services who use it stand out a lot. For this reason, we should eventually fix our hidden service circuit load balancing so that we can enable this globally.
XXX But hidden services traffic is only 6% of the total network, so maybe it's not that big of a problem and we should just enable this feature by default anyway
3.4.1. How should we load balance to enable this feature globally?
The load balancing issue with this feature is that hidden service traffic that would otherwise be passing through middle nodes, will now be passing through guard nodes.
Furthermore, this additional traffic is not accounted for in our bandwidth weights. This means that a guard node that had 1% probability of being chosen as a guard for normal circuits, will still have the same probability of being chosen as a guard even though the hidden service traffic that it pushes increases.
To improve the load balancing here, we could have each relay report the amount of hidden service traffic it pushes every day (#15254), and have the authorities take this into account when they calculate bandwidth weights. The idea here would be that the dirauths would know that N% of the network is hidden services traffic, hence they would tweak the bandwidth weights such that guards would reserve some N% of their bandwidth for hidden service purposes.
4. Future directions
Here are some more ideas for improvements that should be done sooner or later:
- Maybe we should make the size and rotation period of secondary/third guard sets to be configurable by the user.
- To make it harder for an adversary, a hidden service MAY extend the path length of its circuits by an additional static hop. This forces the adversary to use another coercion attack to walk the chain up to the hidden service.
5. Acknowledgements
Thanks to Aaron Johnson, John Brooks, Mike Perry and everyone else who helped with this idea.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
Hello,
Estimations look good.
I think second_guard_set & third_guard_set should not have the same requirements like how any Tor client chooses the guard (first hop). We can select second guards and third guards by requiring for example just Stable flag?
For 3.4.1 - Load balancing - wouldn't this be much easier if we only allowed relays in the consensus with a bandwidth of > n KB/s? I still see very low bandwidth relays included in the consensus. We should really do some math and only include relays in the consensus which offer at least 2x (maybe even more) what it takes for the network to include them in the consensus document and serve it to all the clients.
Most important problems are at 3.2 and 3.3 How easy would it be to distinguish the new 'special' circuits as opposite to regular ones, and more important, what can we do to eliminate this risk? We know that anonymity has many variables, and a very important one is 'mixing in the crowd'. By looking different from other users, we make one step back.
On 7/10/2015 11:58 PM, George Kadianakis wrote:
Hello,
I'm attaching a proposal draft that should help us defend against guard discovery attacks.
There are a few pieces left unfinished (see the XXXs) but I decided to release early and release often for the sake of moving forward with this. I consider this issue very important and any feedback is greatly appreciated.
Thanks!
Filename: 246-hs-guard-discovery.txt Title: Defending Against Guard Discovery Attacks using Vanguards Author: George Kadianakis Created: 2015-07-10 Status: Draft
- Motivation
A guard discovery attack allow attackers to determine the guard node of a Tor client. The hidden service rendezvous protocol provides an attack vector for a guard discovery attack since anyone can force an HS to construct a 3-hop circuit to a relay (#9001).
Following the guard discovery attack with a compromise or coercion of the guard node can lead to the deanonymization of a hidden service.
- Overview
This document tries to make the above guard discovery + coersion attack harder to launch. It introduces an optional configuration option which makes the hidden service also pin the second and third hops of its circuits for a longer duration.
With this new path selection, we force the adversary to perform a Sybil attack and two coercion attacks before succeeding. This is an improvement over the current state where the Sybil attack is trivial to pull off, and only a single coercion attack is required.
With this new path selection, an attacker is forced to do a coercion attack before learning the guard node of a hidden service. This increases the uncertainty of the attacker, since he will need to perform a coercion attack before he learns the identity of the guard node and whether he can compromise it. Coercion attacks are costly and potentially detectable, so an attacker will have to think twice before beginning a chain of coercion attacks that he might not be able to complete.
1.1. Visuals
Here is how a hidden service rendezvous circuit currently looks like:
-> middle_1 -> middle_A -> middle_2 -> middle_B -> middle_3 -> middle_C -> middle_4 -> middle_D HS -> guard -> middle_5 -> middle_E -> Rendezvous Point -> middle_6 -> middle_F -> middle_7 -> middle_G -> middle_8 -> middle_H -> ... -> ... -> middle_n -> middle_n
this proposal pins the two middles nodes to a much more restricted set, as follows:
-> guard_3_A -> guard_3_B HS -> guard_1 -> guard_2_A -> guard_3_C -> Rendezvous Point -> guard_2_B -> guard_3_D -> guard_3_E -> guard_3_F
- Design
This feature requires the HiddenServiceGuardDiscovery torrc option to be enabled.
When a hidden service picks its guard nodes, it also picks two additional sets of guard nodes `second_guard_set` and `third_guard_set` of size NUM_SECOND_GUARDS and NUM_THIRD_GUARDS respectively.
When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from `third_guard_set` as third hops of the circuit.
A hidden service rotates 'second_guard_set' every SECOND_GUARD_ROTATION hours, and 'third_guard_set' every THIRD_GUARD_ROTATION hours.
These extra guard nodes should be picked with the same path selection procedure that is used for regular guard nodes. Care should be taken such that guard sets do not share any common relays. XXX or simply that they are not used in the same circuit?
XXX maybe pick the second and third layer guards from the set of middle nodes but also enforce some kind of uptime requirement? that should greatly help our load balancing.
XXX maybe we should also introduce consensus flags for the extra guard layers? Vanguard?
XXX how should proposal 241 ("Resisting guard-turnover attacks") be applied here?
2.1. Security parameters
We set NUM_SECOND_GUARDS to 2 nodes and NUM_THIRD_GUARDS to 6 nodes. We set SECOND_GUARD_ROTATION to 2 weeks and THIRD_GUARD_ROTATION to 1 day.
See the discussion section for more analysis on these constants.
- Discussion
3.1 How were these security parameters chosen?
Consider an adversary with the following powers:
- Can launch a Sybil guard discovery attack against any node of a
rendezvous circuit. The slower the rotation period of the node, the longer the attack takes.
- Can compromise any node on the network. We assume that the
adversary cannot compromise too many nodes, otherwise Tor's security would be breached anyhow.
We now calculate the time it takes for the adversary to launch a Sybil attack with 50% success assuming 5% network control. This depends solely on how frequently the hidden service rotates that node:
+-------------------+-------------------------------+-----------------
- -------+----------------------------+
| Rotation period | Sybil attack with 50% success | Sybil attack (2 guards)| Sybil attack (6 guards) |
+-------------------+-------------------------------+-----------------
- -------+----------------------------+
| 1 hour | 14 hours | 7 hours | 2.5 hours |
| 1 day | 14 days | 7 days | 2.5 days | | 1 week | 3.5 months | 6 weeks | 2.5 weeks | | 2 weeks | 7 months | 3.5 months | 5 weeks | | 1 month | 1 year and 2 months | 6 months | 2.5 months | | 3 months | 3.5 years | 1.7 years | 6 months | +-------------------+-------------------------------+-----------------
- -------+----------------------------+
Required time for Sybil attack by a 5% adversary
Our security parameters were selected so that the first two layers of guards should be hard to attack using a Sybil guard discovery attack and hence require a coercion attack. On the other hand, the outmost layer of guards should rotate fast enough to _require_ a Sybil attack.
XXX does this security model even make sense? what about a network adversary, or an adversary that can launch congestion attacks etc.????
3.2. Distinguishing new HS circuits from normal HS circuits
By pinning the middle nodes of rendezvous circuits, we make it easier for all hops of the circuit to detect that they are part of a special hidden service circuit. XXX hm how does the outermost guard knows?
Compare this to the current Tor network, where only the guard and the third hop of the HS circuit can trivially distinguish whether it's part of an HS circuit.
3.3. Circuit nodes can now be linked to specific hidden services
With this proposal the hops of hidden service circuits will be static, and hence an adversrary will be able to map them to specific hidden services. For example, a middle node that sees two hidden service circuits with the same guard node in different times, can assume with non-negligible probability that both circuits correspond to the same hidden service.
That said, this is also partially the case for the current Tor network, where the middle node can associate a guard with a specific hidden service.
3.4 Why is the torrc setting disabled by default, if it's so good?
We suggest this torrc option to be optional because it puts additional load on guard nodes and we are not sure if the network will be able to handle it.
However, by having this setting be disabled by default, we make hidden services who use it stand out a lot. For this reason, we should eventually fix our hidden service circuit load balancing so that we can enable this globally.
XXX But hidden services traffic is only 6% of the total network, so maybe it's not that big of a problem and we should just enable this feature by default anyway
3.4.1. How should we load balance to enable this feature globally?
The load balancing issue with this feature is that hidden service traffic that would otherwise be passing through middle nodes, will now be passing through guard nodes.
Furthermore, this additional traffic is not accounted for in our bandwidth weights. This means that a guard node that had 1% probability of being chosen as a guard for normal circuits, will still have the same probability of being chosen as a guard even though the hidden service traffic that it pushes increases.
To improve the load balancing here, we could have each relay report the amount of hidden service traffic it pushes every day (#15254), and have the authorities take this into account when they calculate bandwidth weights. The idea here would be that the dirauths would know that N% of the network is hidden services traffic, hence they would tweak the bandwidth weights such that guards would reserve some N% of their bandwidth for hidden service purposes.
- Future directions
Here are some more ideas for improvements that should be done sooner or later:
- Maybe we should make the size and rotation period of
secondary/third guard sets to be configurable by the user.
- To make it harder for an adversary, a hidden service MAY extend
the path length of its circuits by an additional static hop. This forces the adversary to use another coercion attack to walk the chain up to the hidden service.
- Acknowledgements
Thanks to Aaron Johnson, John Brooks, Mike Perry and everyone else who helped with this idea. _______________________________________________ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
I find it better to add a new consensus flag called 'Vanguard' which will be assigned to relays with lower requirements than the 'Guard' (less bandwidth, just the Stable flag). We will then select second_guard_set and third_guard_set from relays having 'Vanguard' flag. I know this could theoretically make a Sybil attack cheaper, but by selecting first guard, second_guard_set and third_guard_set just from the 'Guard' relays in the consensus, which are currently:
$ cat /var/lib/tor/cached-microdesc-consensus | grep "Guard" | wc -l
1552
would result in a less quality service for the users and it would hammer the existent Guards way too much.
I don't think it's wise to change how we assign 'Guard' flag to the relays - the requirements we now have give great results for both user performance and load balancing. It would be better to just implement a second/third class of guards called Vanguards. To have a bigger pool of Vanguards so that the network will be better load balanced and also offer more diverse possible paths it is simple - remove the relays which are not helping at all and just eating consensus document space (or at least a big part of them) so the vast majority of relays in the consensus would be Vanguards. A Guard relay can also have the Vanguard flag, but if it's selected as the Guard it should not be taken in consideration for second_guard_set and third_guard_set. This is the easy part.
What requires more research is: 3.2 - Distinguishing new HS circuits from normal HS circuits 3.3 - Circuit nodes can now be linked to specific hidden services.
I see 3.2 as a worse problem than 3.3. I don't see a real fix for 3.3, it is by logic that if a middle node sees two HS circuits with the same Vanguard can assume with high probability that those circuits correspond to the same HS. This is just a tradeoff for this approach, but as I said I don't see 3.3 as a flaw. What I feel little more uncomfortable about is 3.2 which I think we should focus on.
On 7/10/2015 11:58 PM, George Kadianakis wrote:
Hello,
I'm attaching a proposal draft that should help us defend against guard discovery attacks.
There are a few pieces left unfinished (see the XXXs) but I decided to release early and release often for the sake of moving forward with this. I consider this issue very important and any feedback is greatly appreciated.
Thanks!
Filename: 246-hs-guard-discovery.txt Title: Defending Against Guard Discovery Attacks using Vanguards Author: George Kadianakis Created: 2015-07-10 Status: Draft
- Motivation
A guard discovery attack allow attackers to determine the guard node of a Tor client. The hidden service rendezvous protocol provides an attack vector for a guard discovery attack since anyone can force an HS to construct a 3-hop circuit to a relay (#9001).
Following the guard discovery attack with a compromise or coercion of the guard node can lead to the deanonymization of a hidden service.
- Overview
This document tries to make the above guard discovery + coersion attack harder to launch. It introduces an optional configuration option which makes the hidden service also pin the second and third hops of its circuits for a longer duration.
With this new path selection, we force the adversary to perform a Sybil attack and two coercion attacks before succeeding. This is an improvement over the current state where the Sybil attack is trivial to pull off, and only a single coercion attack is required.
With this new path selection, an attacker is forced to do a coercion attack before learning the guard node of a hidden service. This increases the uncertainty of the attacker, since he will need to perform a coercion attack before he learns the identity of the guard node and whether he can compromise it. Coercion attacks are costly and potentially detectable, so an attacker will have to think twice before beginning a chain of coercion attacks that he might not be able to complete.
1.1. Visuals
Here is how a hidden service rendezvous circuit currently looks like:
-> middle_1 -> middle_A -> middle_2 -> middle_B -> middle_3 -> middle_C -> middle_4 -> middle_D HS -> guard -> middle_5 -> middle_E -> Rendezvous Point -> middle_6 -> middle_F -> middle_7 -> middle_G -> middle_8 -> middle_H -> ... -> ... -> middle_n -> middle_n
this proposal pins the two middles nodes to a much more restricted set, as follows:
-> guard_3_A -> guard_3_B HS -> guard_1 -> guard_2_A -> guard_3_C -> Rendezvous Point -> guard_2_B -> guard_3_D -> guard_3_E -> guard_3_F
- Design
This feature requires the HiddenServiceGuardDiscovery torrc option to be enabled.
When a hidden service picks its guard nodes, it also picks two additional sets of guard nodes `second_guard_set` and `third_guard_set` of size NUM_SECOND_GUARDS and NUM_THIRD_GUARDS respectively.
When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from `third_guard_set` as third hops of the circuit.
A hidden service rotates 'second_guard_set' every SECOND_GUARD_ROTATION hours, and 'third_guard_set' every THIRD_GUARD_ROTATION hours.
These extra guard nodes should be picked with the same path selection procedure that is used for regular guard nodes. Care should be taken such that guard sets do not share any common relays. XXX or simply that they are not used in the same circuit?
XXX maybe pick the second and third layer guards from the set of middle nodes but also enforce some kind of uptime requirement? that should greatly help our load balancing.
XXX maybe we should also introduce consensus flags for the extra guard layers? Vanguard?
XXX how should proposal 241 ("Resisting guard-turnover attacks") be applied here?
2.1. Security parameters
We set NUM_SECOND_GUARDS to 2 nodes and NUM_THIRD_GUARDS to 6 nodes. We set SECOND_GUARD_ROTATION to 2 weeks and THIRD_GUARD_ROTATION to 1 day.
See the discussion section for more analysis on these constants.
- Discussion
3.1 How were these security parameters chosen?
Consider an adversary with the following powers:
- Can launch a Sybil guard discovery attack against any node of a
rendezvous circuit. The slower the rotation period of the node, the longer the attack takes.
- Can compromise any node on the network. We assume that the
adversary cannot compromise too many nodes, otherwise Tor's security would be breached anyhow.
We now calculate the time it takes for the adversary to launch a Sybil attack with 50% success assuming 5% network control. This depends solely on how frequently the hidden service rotates that node:
+-------------------+-------------------------------+-----------------
- -------+----------------------------+
| Rotation period | Sybil attack with 50% success | Sybil attack (2 guards)| Sybil attack (6 guards) |
+-------------------+-------------------------------+-----------------
- -------+----------------------------+
| 1 hour | 14 hours | 7 hours | 2.5 hours |
| 1 day | 14 days | 7 days | 2.5 days | | 1 week | 3.5 months | 6 weeks | 2.5 weeks | | 2 weeks | 7 months | 3.5 months | 5 weeks | | 1 month | 1 year and 2 months | 6 months | 2.5 months | | 3 months | 3.5 years | 1.7 years | 6 months | +-------------------+-------------------------------+-----------------
- -------+----------------------------+
Required time for Sybil attack by a 5% adversary
Our security parameters were selected so that the first two layers of guards should be hard to attack using a Sybil guard discovery attack and hence require a coercion attack. On the other hand, the outmost layer of guards should rotate fast enough to _require_ a Sybil attack.
XXX does this security model even make sense? what about a network adversary, or an adversary that can launch congestion attacks etc.????
3.2. Distinguishing new HS circuits from normal HS circuits
By pinning the middle nodes of rendezvous circuits, we make it easier for all hops of the circuit to detect that they are part of a special hidden service circuit. XXX hm how does the outermost guard knows?
Compare this to the current Tor network, where only the guard and the third hop of the HS circuit can trivially distinguish whether it's part of an HS circuit.
3.3. Circuit nodes can now be linked to specific hidden services
With this proposal the hops of hidden service circuits will be static, and hence an adversrary will be able to map them to specific hidden services. For example, a middle node that sees two hidden service circuits with the same guard node in different times, can assume with non-negligible probability that both circuits correspond to the same hidden service.
That said, this is also partially the case for the current Tor network, where the middle node can associate a guard with a specific hidden service.
3.4 Why is the torrc setting disabled by default, if it's so good?
We suggest this torrc option to be optional because it puts additional load on guard nodes and we are not sure if the network will be able to handle it.
However, by having this setting be disabled by default, we make hidden services who use it stand out a lot. For this reason, we should eventually fix our hidden service circuit load balancing so that we can enable this globally.
XXX But hidden services traffic is only 6% of the total network, so maybe it's not that big of a problem and we should just enable this feature by default anyway
3.4.1. How should we load balance to enable this feature globally?
The load balancing issue with this feature is that hidden service traffic that would otherwise be passing through middle nodes, will now be passing through guard nodes.
Furthermore, this additional traffic is not accounted for in our bandwidth weights. This means that a guard node that had 1% probability of being chosen as a guard for normal circuits, will still have the same probability of being chosen as a guard even though the hidden service traffic that it pushes increases.
To improve the load balancing here, we could have each relay report the amount of hidden service traffic it pushes every day (#15254), and have the authorities take this into account when they calculate bandwidth weights. The idea here would be that the dirauths would know that N% of the network is hidden services traffic, hence they would tweak the bandwidth weights such that guards would reserve some N% of their bandwidth for hidden service purposes.
- Future directions
Here are some more ideas for improvements that should be done sooner or later:
- Maybe we should make the size and rotation period of
secondary/third guard sets to be configurable by the user.
- To make it harder for an adversary, a hidden service MAY extend
the path length of its circuits by an additional static hop. This forces the adversary to use another coercion attack to walk the chain up to the hidden service.
- Acknowledgements
Thanks to Aaron Johnson, John Brooks, Mike Perry and everyone else who helped with this idea. _______________________________________________ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On Sat, Jul 11, 2015 at 03:50:16PM +0300, s7r wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
I find it better to add a new consensus flag called 'Vanguard' which will be assigned to relays with lower requirements than the 'Guard' (less bandwidth, just the Stable flag). We will then select second_guard_set and third_guard_set from relays having 'Vanguard' flag. I know this could theoretically make a Sybil attack cheaper, but by selecting first guard, second_guard_set and third_guard_set just from the 'Guard' relays in the consensus, which are currently:
$ cat /var/lib/tor/cached-microdesc-consensus | grep "Guard" | wc -l
1552
would result in a less quality service for the users and it would hammer the existent Guards way too much.
This seems like a good idea, and was also something that was discussed. Something we still must understand is what do we mean by "hammering". If onion service traffic is such a small percentage of the overall network throughput, why does it cause guard failure? When we have a better understanding of why popular onion services are overloading their guards, then we can better measure the tradeoffs between using distinct relay sets.
I don't think it's wise to change how we assign 'Guard' flag to the relays - the requirements we now have give great results for both user performance and load balancing.
Definitely. It's also possible the current criteria are not restrictive enough.
It would be better to just implement a second/third class of guards called Vanguards. To have a bigger pool of Vanguards so that the network will be better load balanced and also offer more diverse possible paths it is simple - remove the relays which are not helping at all and just eating consensus document space (or at least a big part of them) so the vast majority of relays in the consensus would be Vanguards. A Guard relay can also have the Vanguard flag, but if it's selected as the Guard it should not be taken in consideration for second_guard_set and third_guard_set. This is the easy part.
What requires more research is: 3.2 - Distinguishing new HS circuits from normal HS circuits 3.3 - Circuit nodes can now be linked to specific hidden services.
I see 3.2 as a worse problem than 3.3. I don't see a real fix for 3.3, it is by logic that if a middle node sees two HS circuits with the same Vanguard can assume with high probability that those circuits correspond to the same HS. This is just a tradeoff for this approach, but as I said I don't see 3.3 as a flaw. What I feel little more uncomfortable about is 3.2 which I think we should focus on.
But is 3.2 really important? It isn't a primary goal. We really want to make it difficult to locate location-hidden services. Distinguishability between different types of circuits may hurt the overall blending effect and, therefore, may not increase the overall anonymity provided by the network, but pinning any nodes in the circuit result in this fingerprint (unless someone wants to challenge this assumption) - the guard discovery attack and bridge enumeration are examples of this. Basically, pinning the middle hop(s) let an adversary know that it is part of an onion service's circuit, even potentially which one. If we assume that guards are the best way to mitigate certain attacks on onion services, then using layered-guards now seems like an important extension of that. Therefore, if the second hop is pinned, those nodes may be able to determine that they are included in the vanguard set of a high(er) security hidden service. As long as traffic analysis is useful, I don't know if we can have 3.3 without having 3.2 (but please think about design improvements).
On Fri, Jul 10, 2015 at 04:58:28PM -0400, George Kadianakis wrote:
I'm attaching a proposal draft that should help us defend against guard discovery attacks.
Thanks George. I have added this proposal as 247. (Sorry for the numbering confusion -- let that be a lesson to others who try to pick a number for their proposal. :)
--Roger
George Kadianakis:
Hello,
I'm attaching a proposal draft that should help us defend against guard discovery attacks.
There are a few pieces left unfinished (see the XXXs) but I decided to release early and release often for the sake of moving forward with this. I consider this issue very important and any feedback is greatly appreciated.
This looks pretty good, George. Some comments in-line.
Filename: 246-hs-guard-discovery.txt Title: Defending Against Guard Discovery Attacks using Vanguards Author: George Kadianakis Created: 2015-07-10 Status: Draft
- Design
This feature requires the HiddenServiceGuardDiscovery torrc option to be enabled.
When a hidden service picks its guard nodes, it also picks two additional sets of guard nodes `second_guard_set` and `third_guard_set` of size NUM_SECOND_GUARDS and NUM_THIRD_GUARDS respectively.
When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from `third_guard_set` as third hops of the circuit.
A hidden service rotates 'second_guard_set' every SECOND_GUARD_ROTATION hours, and 'third_guard_set' every THIRD_GUARD_ROTATION hours.
These extra guard nodes should be picked with the same path selection procedure that is used for regular guard nodes. Care should be taken such that guard sets do not share any common relays. XXX or simply that they are not used in the same circuit?
XXX maybe pick the second and third layer guards from the set of middle nodes but also enforce some kind of uptime requirement? that should greatly help our load balancing.
I think for path fingerprinting reasons ("WTF, I appear to be a middle node, but I'm a Guard node in between two Guard nodes. This must be an *extra* interesting hidden service!") you should use nodes from the same flag set as normal path selection for the middle and third hop.
XXX maybe we should also introduce consensus flags for the extra guard layers? Vanguard?
XXX how should proposal 241 ("Resisting guard-turnover attacks") be applied here?
2.1. Security parameters
We set NUM_SECOND_GUARDS to 2 nodes and NUM_THIRD_GUARDS to 6 nodes. We set SECOND_GUARD_ROTATION to 2 weeks and THIRD_GUARD_ROTATION to 1 day.
See the discussion section for more analysis on these constants.
I am worried that THIRD_GUARD_ROTATION of 1 day is not fast enough to be sure that a targeted attack on a specific currently-used third-level guard wouldn't work. Can you maybe expand your table from Section 3 to show us datapoints for 4, 8, 12 hour THIRD_GUARD_ROTATION rates?
It would also be interesting to see tables for a 1%, and 10% adversary as well.
In terms of specific implementation, what if the THIRD_GUARD_ROTATION period was randomized between M and N hours (for M~=2, N~=10 for an average of 6 hours, or M~=8, N~=16 for an average of 12), so that the adversary could not be sure that a given third-level guard would remain in place for a predictable time period long enough that they could successfully mount a reliable attack on a specific relay itself?
The overall time-to-sybil-compromise of a randomized guard rotation duration should be the same as a fixed duration, and I think the randomization would remove the certainty of being able to walk right up the whole chain if you timed everything right beforehand.
This might also be a good idea for SECOND_GUARD_ROTATION, too, and IIRC we already to a form of this for the first Guard (or at least we used to).
I think each guard should be rotated independently, for similar reasons, and the same intuition is also making me think we want more than two second-level Guards, as well. If you are not sure when the second-level guard you found might rotate away, you're going to be forced to try to compromise a larger fraction of them if you want more certainty before you have to start over. This will also be noisy.
- Future directions
Here are some more ideas for improvements that should be done sooner or later:
- Maybe we should make the size and rotation period of secondary/third guard sets to be configurable by the user.
What about as a function of the consensus? That third hop could also rotate faster if we had more nodes/more diversity..
Here's another crazy idea that would potentially bring this Vanguards idea closer to "Virtual Circuits": What if you divided your third-level Vanguards into NUM_SECOND_GUARDS isolated buckets, and mapped exactly one these buckets to each of your second-level guards? Then, when making Rend Circs, after you select a second-level hop for a given circuit, you can only extend to the third-level guards that correspond to the bucket for that second-level guard.
That way, if one third-level guard was compromised via a successful Sybil attack, the adversary would learn at most 1 of your second-level guards, instead of learning all of them and then getting to take their pick which one(s) they want to try to compromise.
Does that make sense?
With this change, it seems like we could then make the third-level rotation even faster, because even successful Sybils get you much less info.
Comments inline.
On 7/16/2015 7:26 AM, Mike Perry wrote:
George Kadianakis:
Hello,
I'm attaching a proposal draft that should help us defend against guard discovery attacks.
There are a few pieces left unfinished (see the XXXs) but I decided to release early and release often for the sake of moving forward with this. I consider this issue very important and any feedback is greatly appreciated.
This looks pretty good, George. Some comments in-line.
Filename: 246-hs-guard-discovery.txt Title: Defending Against Guard Discovery Attacks using Vanguards Author: George Kadianakis Created: 2015-07-10 Status: Draft
- Design
This feature requires the HiddenServiceGuardDiscovery torrc option to be enabled.
When a hidden service picks its guard nodes, it also picks two additional sets of guard nodes `second_guard_set` and `third_guard_set` of size NUM_SECOND_GUARDS and NUM_THIRD_GUARDS respectively.
When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from `third_guard_set` as third hops of the circuit.
A hidden service rotates 'second_guard_set' every SECOND_GUARD_ROTATION hours, and 'third_guard_set' every THIRD_GUARD_ROTATION hours.
These extra guard nodes should be picked with the same path selection procedure that is used for regular guard nodes. Care should be taken such that guard sets do not share any common relays. XXX or simply that they are not used in the same circuit?
XXX maybe pick the second and third layer guards from the set of middle nodes but also enforce some kind of uptime requirement? that should greatly help our load balancing.
I think for path fingerprinting reasons ("WTF, I appear to be a middle node, but I'm a Guard node in between two Guard nodes. This must be an *extra* interesting hidden service!") you should use nodes from the same flag set as normal path selection for the middle and third hop.
+1; I have commented on this in my previous mails related to this thread, but I will repeat again in order to highlight your comment. Some things to take into consideration:
- add another flag called 'Vanguard', which will be assigned with lower requirements than the 'Guard' flag, in order to make the vast majority of relays Vanguards and gain more diversity / increase possible paths; - choose first_guard_set from relays with 'Guard' flag, as we currently do, no changes; - choose second_guard_set and third_guard_set from relays with 'Vanguard' flag. A certain Vanguard relay should not be in second_guard_set and third_guard_set at the same time.
XXX maybe we should also introduce consensus flags for the extra guard layers? Vanguard?
XXX how should proposal 241 ("Resisting guard-turnover attacks") be applied here?
2.1. Security parameters
We set NUM_SECOND_GUARDS to 2 nodes and NUM_THIRD_GUARDS to 6 nodes. We set SECOND_GUARD_ROTATION to 2 weeks and THIRD_GUARD_ROTATION to 1 day.
See the discussion section for more analysis on these constants.
I am worried that THIRD_GUARD_ROTATION of 1 day is not fast enough to be sure that a targeted attack on a specific currently-used third-level guard wouldn't work. Can you maybe expand your table from Section 3 to show us datapoints for 4, 8, 12 hour THIRD_GUARD_ROTATION rates?
It would also be interesting to see tables for a 1%, and 10% adversary as well.
In terms of specific implementation, what if the THIRD_GUARD_ROTATION period was randomized between M and N hours (for M~=2, N~=10 for an average of 6 hours, or M~=8, N~=16 for an average of 12), so that the adversary could not be sure that a given third-level guard would remain in place for a predictable time period long enough that they could successfully mount a reliable attack on a specific relay itself?
Randomization of the rotation periods is a must, from my point of view. second_guard_set should have more than 2 Vanguards. Maybe we can look into this numbers:
second_guard_set -> NUM_SECOND_GUARDS = 6 Vanguards SECOND_GUARD_ROTATION = random between M and N (M~=1 month ; N~=3 months)
third_guard_set -> NUM_THIRD_GUARDS 18 Vanguards THIRD_GUARD_ROTATION = random between M and N (M~=6 hours ; N~=10 hours; average ~= 8 hours)
Of course not just any hidden service would need so many Vanguards in second_guard_set and third_guard_set - only the popular ones. The low usage hidden services can select and keep in memory a second_guard_set and third_guard_set but only use 2 relays from the second_guard_set and 6 relays from the third_guard_set unless required to build a large amount of rendezvous circuits. An attacker could force a hidden service to build a large amount of rendezvous circuits, but if the entire second_guard_set and third_guard_set is already selected and kept somewhere with the rotation periods configured, it shouldn't open the door for a new attack which is not possible already.
The overall time-to-sybil-compromise of a randomized guard rotation duration should be the same as a fixed duration, and I think the randomization would remove the certainty of being able to walk right up the whole chain if you timed everything right beforehand.
This might also be a good idea for SECOND_GUARD_ROTATION, too, and IIRC we already to a form of this for the first Guard (or at least we used to).
I think each guard should be rotated independently, for similar reasons, and the same intuition is also making me think we want more than two second-level Guards, as well. If you are not sure when the second-level guard you found might rotate away, you're going to be forced to try to compromise a larger fraction of them if you want more certainty before you have to start over. This will also be noisy.
- Future directions
Here are some more ideas for improvements that should be done sooner or later:
- Maybe we should make the size and rotation period of secondary/third guard sets to be configurable by the user.
What about as a function of the consensus? That third hop could also rotate faster if we had more nodes/more diversity..
A consensus parameter is always better, it means less headache for the users and should give better results, because everyone will be using it regardless if they know about it or not.
Here's another crazy idea that would potentially bring this Vanguards idea closer to "Virtual Circuits": What if you divided your third-level Vanguards into NUM_SECOND_GUARDS isolated buckets, and mapped exactly one these buckets to each of your second-level guards? Then, when making Rend Circs, after you select a second-level hop for a given circuit, you can only extend to the third-level guards that correspond to the bucket for that second-level guard.
That way, if one third-level guard was compromised via a successful Sybil attack, the adversary would learn at most 1 of your second-level guards, instead of learning all of them and then getting to take their pick which one(s) they want to try to compromise.
Does that make sense?
With this change, it seems like we could then make the third-level rotation even faster, because even successful Sybils get you much less info.
Neat idea! Concept is interesting and makes sense, since we need to protect second_guard_set for way longer period of time, and first_guard_set for even longer.
Third_guard_set should always be considered known by an attacker. Given that the client selects the rendezvous relay in a hidden service circuit, we can start with the assumption that the rendezvous relay is evil and all the relays from third_guard_set are known to the attacker. It's quite easy with a few evil clients and evil relays being selected as rendezvous every time, to learn about all the relays (let's assume number is 18) from third_guard_set in < 30 minutes? We need a value for THIRD_GUARD_ROTATION as low as possible and possibly a bigger value for NUM_THIRD_GUARDS.
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Here's another crazy idea that would potentially bring this Vanguards idea closer to "Virtual Circuits": What if you divided your third-level Vanguards into NUM_SECOND_GUARDS isolated buckets, and mapped exactly one these buckets to each of your second-level guards?
...
That way, if one third-level guard was compromised via a successful Sybil attack, the adversary would learn at most 1 of your second-level guards, instead of learning all of them and then getting to take their pick which one(s) they want to try to compromise.
Neat idea! Concept is interesting and makes sense, since we need to protect second_guard_set for way longer period of time, and first_guard_set for even longer.
Not having the third guards be selected by every second guard makes sense when you consider that the adversary may not be able to compromise all relays equally. That was not a consideration I had in mind when I argued for “completely connected” guard sets. The purpose of multiple guards in a given position is (i) to allow onion services that have unusually high load to uses the excess capacity of more than one relay, (ii) to lower the variance of relay performance, and (iii) to recover more quickly from nodes leaving or failing. Note how these are all related to performance. The drawbacks of multiple relays are all related to security, and they are (i) more relays gives the adversary more chances to have his relays selected, (ii) more relays gives the adversary more options and opportunities to compromise relays once they are observed.
The best design for security purposes would be a single relay in each guard "set", with the final relay expiring the fastest. Currently, the suggestion was to keep one relay in the first guard position (or whatever NumEntryGuards is), because the security issue there is very serious (it can observe the onion service directly), and also because guards are already selected to have somewhat higher bandwidth and stability. Hopefully (currently unverified) this means that they have more *excess* capacity as well.
If you don’t allow all second guards to connect to all third guards, then you have to make a performance / security tradeoff. For example, suppose you give each second guard its own set (or “bucket”) of third guards. You shouldn’t give each second guard the same number of guards that we had planned to have in the third guard set, because that only makes things worse. Now Instead of choosing one set (of size, say, six), you have for each second relay a new chance for the adversary to own one of those relays. There is only a benefit if you reduce the number. It makes sense to me to reduce that number to one, because there is no reason to expect the third guard to be a bottleneck before the second guard (unlike the first guards, which it sounds like may be the only ones requiring the Guard flag). You still have some of the benefit of reducing performance variance and improving failure recovery because you have multiple second guards. And you have gained the benefit that was Mike’s original motivation of giving the adversary that compromises a third guard a view of only one of the second guards, with the hope that maybe this one he finds difficult to compromise.
There is an unanswered question of what to do with multiple first guards. The main reason I can see not to have single second guards for each first guard is that the first guard is likely to have more excess capacity and thus not be the bottleneck for high amounts of onion-service traffic. Assuming that's true, then I could imagine having separate second-guard buckets for each first guard as well. However, there is a bad multiplier effect here. For example, with three first guards and three second guards per first guard, there are now *nine* relays in the second-guard position, each of which presents a chance for a malicious relay to be chosen.
My current thought is that the following is a better design than the one currently in the proposal: 1. One first guard 2. Two second guards per first guard (so two unless NumEntryGuards is changed) 3. One third guard per second guard (so two unless NumEntryGuards is changed) This will only perform worse than the design suggested in the proposal (which is one first guard, two second guards, and six third guards, with all guards connected to all following guards). I think it might not much worse, though.
And I agree with the suggestions to randomize the expiration times.
Best, Aaron
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
On 7/18/2015 12:49 AM, A. Johnson wrote:
Not having the third guards be selected by every second guard makes sense when you consider that the adversary may not be able to compromise all relays equally. That was not a consideration I had in mind when I argued for “completely connected” guard sets. The purpose of multiple guards in a given position is (i) to allow onion services that have unusually high load to uses the excess capacity of more than one relay, (ii) to lower the variance of relay performance, and (iii) to recover more quickly from nodes leaving or failing. Note how these are all related to performance. The drawbacks of multiple relays are all related to security, and they are (i) more relays gives the adversary more chances to have his relays selected, (ii) more relays gives the adversary more options and opportunities to compromise relays once they are observed.
The best design for security purposes would be a single relay in each guard "set", with the final relay expiring the fastest. Currently, the suggestion was to keep one relay in the first guard position (or whatever NumEntryGuards is), because the security issue there is very serious (it can observe the onion service directly), and also because guards are already selected to have somewhat higher bandwidth and stability. Hopefully (currently unverified) this means that they have more *excess* capacity as well.
If you don’t allow all second guards to connect to all third guards, then you have to make a performance / security tradeoff. For example, suppose you give each second guard its own set (or “bucket”) of third guards. You shouldn’t give each second guard the same number of guards that we had planned to have in the third guard set, because that only makes things worse. Now Instead of choosing one set (of size, say, six), you have for each second relay a new chance for the adversary to own one of those relays. There is only a benefit if you reduce the number. It makes sense to me to reduce that number to one, because there is no reason to expect the third guard to be a bottleneck before the second guard (unlike the first guards, which it sounds like may be the only ones requiring the Guard flag). You still have some of the benefit of reducing performance variance and improving failure recovery because you have multiple second guards. And you have gained the benefit that was Mike’s original motivation of giving the adversary that compromises a third guard a view of only one of the second guards, with the hope that maybe this one he finds difficult to compromise.
There is an unanswered question of what to do with multiple first guards. The main reason I can see not to have single second guards for each first guard is that the first guard is likely to have more excess capacity and thus not be the bottleneck for high amounts of onion-service traffic. Assuming that's true, then I could imagine having separate second-guard buckets for each first guard as well. However, there is a bad multiplier effect here. For example, with three first guards and three second guards per first guard, there are now *nine* relays in the second-guard position, each of which presents a chance for a malicious relay to be chosen.
My current thought is that the following is a better design than the one currently in the proposal: 1. One first guard 2. Two second guards per first guard (so two unless NumEntryGuards is changed) 3. One third guard per second guard (so two unless NumEntryGuards is changed) This will only perform worse than the design suggested in the proposal (which is one first guard, two second guards, and six third guards, with all guards connected to all following guards). I think it might not much worse, though.
And I agree with the suggestions to randomize the expiration times.
Best, Aaron
Agreed. The probability to run into a compromised relay will decrease dramatically if we use fewer second and third guards. However, this will put a penalty over performance, when talking about a popular hidden service.
I still see the third hop (speaking from hidden service server start point) is the weak part here. An attacker can connect to a hidden service at his malicious relay selected as rendezvous. Before you know it, all relays in third_guard_set are enumerated by the attacker. This is why I think it's better to have a bigger value for NUM_THIRD_GUARDS and a shorter period for THIRD_GUARD_ROTATION.
Here's why I think so:
Let's assume a journalist has published some documents offered by a whistleblower regarding the abuses done by "Malicious-ISP" and made them available via a hidden service to remain anonymous, until authorities react to apply the law and punish "Malicious-ISP". Let's assume "Malicious-ISP" is a multinational telecom company with branches in many countries and good relations with other ISPs around the world, who would cooperate with them (unfortunately this is not exactly science fiction).
Currently a big part of Tor network's bandwidth power is concentrated in DE,NL,FR. Many times, while browsing with Tor Browser, I had circuits with all 3 hops in the same country. Even more times with 2 out of 3 hops in the same country.
Think that an attacker has enumerated all the relays in third_guard_set, which we know it is trivial to do (for the sake of argument, let's pretend we use 2 relays in third_guard_set). Now, it all comes down to how fast can this attacker watch the IX point* of those 2 relays. Note that the attacker does not necessarily need to actually compromise these relays in a special way, just be able to watch their traffic somewhere upstream. By watching the traffic somewhere upstream, the attacker can enumerate the relays in second_guard_set, and so on to eventually first_guard_set.
If the hidden service is unlucky and picked all hops in the same country, its location can be deanon "before dinner". Even if the hops are in different countries, depends from case to case, with a simple phone call made by the attacker, the IX point* of the previous hop can be watched and so on, until the location of the hidden service is discovered. The attacker can establish unlimited connections to the hidden service while watching all these points and, simple, that's it.
*IX point - by IX point I don't necessarily mean the internet exchange points where the ISP of the relay is connected to, more like the master gateway responsible for the traffic of the relay in question.
I know Tor does not try to protect against a Global Passive adversary, but in this scenario there is no need for the adversary to be global - this is the problem. The adversary just needs to watch few points on the internet, which can be trivially done _before_ our guard rotation period expires.
In short, we give the power which we now think a global passive adversary has to a 'local passive adversary' or a 'rather ambitious adversary'. What now we think can be accomplished by watching the entire network, something we assume is _hard_ to do, will be possible just by watching a few parts on the internet. To make it even worse, the attacker will even have time to climb up on circuits (time offered by our guard rotation period).
This would be harder to do and require way more effort if the third_guard_set was bigger/more diverse and rotated much faster - but this increases the chances of a Sybil attack. I think the success chances of a Sybil attack are not tragic anyway, and will continue to decrease as more 'honest' relays are added to the consensus (this is the whole point of Tor anyway, that is why relays can be run by volunteers, if the majority of relays aren't honest, then anonymity is already killed).
By not choosing relays in our circuits so often and being paranoid that our attacker is 1%, 5%, 10% or whatever (I am not saying it can't be) we undermine the concept of onion routing and mixing with as many hops as possible. There is a reason why we keep the 10 minutes lifetime for a circuit - what's the point of it if after 10 minutes we mark a circuit as dirty just to create another one with the very same hops. This is intended to protect clients, I know, but in the context of a hidden service we assume that _the client is the attacker_.
I agree with most of what you said regarding the threat of a targeted observer. What I disagree with is that an adversary running his own relays is less of a threat. Running relays is trivial, and running 5% of the guards is fairly cheap (I estimate ~$3000/month (please ask for details)). If you increase the number of third guards to, say, ten, with average expiration times of, say, 6 hours, then after only 24 hours an adversary with 5% of the bandwidth has an 87% chance of compromising at least one third guard. By reducing the number of third guards to 2 and increasing the expiration time to an average of 24 hours, then it takes 20 days until 87% chance of compromise is reached.
Observing the network traffic of third parties is, it seems to me, actually much more difficult and excludes most adversaries except legal authorities. And legal authorities often do have to follow some legal procedures to observe network traffic, especially if it is being done between different legal jurisdictions.
But ultimately the point of the quickly-rotating third guards is only to *force* the adversary to invest resources into running malicious relays. That is a different type of attack than just traffic observation, and requiring it may stymie some adversaries who don’t have those skills or processes in place. So more and more-quickly rotating third guards would serve that purpose just as well. If the consensus is that an adversary that can observe arbitrary third-party traffic within hours is more of a concern than an adversary that can run 1-5% of the vanguards, then very short-lived third guards would be a reasonable solution, in my opinion. I would not be part of that consensus, but figuring out the speed and size of the adversaries we care about is unfortunately educated guessing at this point.
Best, Aaron
On Jul 17, 2015, at 8:11 PM, s7r s7r@sky-ip.org wrote:
Signed PGP part On 7/18/2015 12:49 AM, A. Johnson wrote:
Not having the third guards be selected by every second guard makes sense when you consider that the adversary may not be able to compromise all relays equally. That was not a consideration I had in mind when I argued for “completely connected” guard sets. The purpose of multiple guards in a given position is (i) to allow onion services that have unusually high load to uses the excess capacity of more than one relay, (ii) to lower the variance of relay performance, and (iii) to recover more quickly from nodes leaving or failing. Note how these are all related to performance. The drawbacks of multiple relays are all related to security, and they are (i) more relays gives the adversary more chances to have his relays selected, (ii) more relays gives the adversary more options and opportunities to compromise relays once they are observed.
The best design for security purposes would be a single relay in each guard "set", with the final relay expiring the fastest. Currently, the suggestion was to keep one relay in the first guard position (or whatever NumEntryGuards is), because the security issue there is very serious (it can observe the onion service directly), and also because guards are already selected to have somewhat higher bandwidth and stability. Hopefully (currently unverified) this means that they have more *excess* capacity as well.
If you don’t allow all second guards to connect to all third guards, then you have to make a performance / security tradeoff. For example, suppose you give each second guard its own set (or “bucket”) of third guards. You shouldn’t give each second guard the same number of guards that we had planned to have in the third guard set, because that only makes things worse. Now Instead of choosing one set (of size, say, six), you have for each second relay a new chance for the adversary to own one of those relays. There is only a benefit if you reduce the number. It makes sense to me to reduce that number to one, because there is no reason to expect the third guard to be a bottleneck before the second guard (unlike the first guards, which it sounds like may be the only ones requiring the Guard flag). You still have some of the benefit of reducing performance variance and improving failure recovery because you have multiple second guards. And you have gained the benefit that was Mike’s original motivation of giving the adversary that compromises a third guard a view of only one of the second guards, with the hope that maybe this one he finds difficult to compromise.
There is an unanswered question of what to do with multiple first guards. The main reason I can see not to have single second guards for each first guard is that the first guard is likely to have more excess capacity and thus not be the bottleneck for high amounts of onion-service traffic. Assuming that's true, then I could imagine having separate second-guard buckets for each first guard as well. However, there is a bad multiplier effect here. For example, with three first guards and three second guards per first guard, there are now *nine* relays in the second-guard position, each of which presents a chance for a malicious relay to be chosen.
My current thought is that the following is a better design than the one currently in the proposal: 1. One first guard 2. Two second guards per first guard (so two unless NumEntryGuards is changed) 3. One third guard per second guard (so two unless NumEntryGuards is changed) This will only perform worse than the design suggested in the proposal (which is one first guard, two second guards, and six third guards, with all guards connected to all following guards). I think it might not much worse, though.
And I agree with the suggestions to randomize the expiration times.
Best, Aaron
Agreed. The probability to run into a compromised relay will decrease dramatically if we use fewer second and third guards. However, this will put a penalty over performance, when talking about a popular hidden service.
I still see the third hop (speaking from hidden service server start point) is the weak part here. An attacker can connect to a hidden service at his malicious relay selected as rendezvous. Before you know it, all relays in third_guard_set are enumerated by the attacker. This is why I think it's better to have a bigger value for NUM_THIRD_GUARDS and a shorter period for THIRD_GUARD_ROTATION.
Here's why I think so:
Let's assume a journalist has published some documents offered by a whistleblower regarding the abuses done by "Malicious-ISP" and made them available via a hidden service to remain anonymous, until authorities react to apply the law and punish "Malicious-ISP". Let's assume "Malicious-ISP" is a multinational telecom company with branches in many countries and good relations with other ISPs around the world, who would cooperate with them (unfortunately this is not exactly science fiction).
Currently a big part of Tor network's bandwidth power is concentrated in DE,NL,FR. Many times, while browsing with Tor Browser, I had circuits with all 3 hops in the same country. Even more times with 2 out of 3 hops in the same country.
Think that an attacker has enumerated all the relays in third_guard_set, which we know it is trivial to do (for the sake of argument, let's pretend we use 2 relays in third_guard_set). Now, it all comes down to how fast can this attacker watch the IX point* of those 2 relays. Note that the attacker does not necessarily need to actually compromise these relays in a special way, just be able to watch their traffic somewhere upstream. By watching the traffic somewhere upstream, the attacker can enumerate the relays in second_guard_set, and so on to eventually first_guard_set.
If the hidden service is unlucky and picked all hops in the same country, its location can be deanon "before dinner". Even if the hops are in different countries, depends from case to case, with a simple phone call made by the attacker, the IX point* of the previous hop can be watched and so on, until the location of the hidden service is discovered. The attacker can establish unlimited connections to the hidden service while watching all these points and, simple, that's it.
*IX point - by IX point I don't necessarily mean the internet exchange points where the ISP of the relay is connected to, more like the master gateway responsible for the traffic of the relay in question.
I know Tor does not try to protect against a Global Passive adversary, but in this scenario there is no need for the adversary to be global - this is the problem. The adversary just needs to watch few points on the internet, which can be trivially done _before_ our guard rotation period expires.
In short, we give the power which we now think a global passive adversary has to a 'local passive adversary' or a 'rather ambitious adversary'. What now we think can be accomplished by watching the entire network, something we assume is _hard_ to do, will be possible just by watching a few parts on the internet. To make it even worse, the attacker will even have time to climb up on circuits (time offered by our guard rotation period).
This would be harder to do and require way more effort if the third_guard_set was bigger/more diverse and rotated much faster - but this increases the chances of a Sybil attack. I think the success chances of a Sybil attack are not tragic anyway, and will continue to decrease as more 'honest' relays are added to the consensus (this is the whole point of Tor anyway, that is why relays can be run by volunteers, if the majority of relays aren't honest, then anonymity is already killed).
By not choosing relays in our circuits so often and being paranoid that our attacker is 1%, 5%, 10% or whatever (I am not saying it can't be) we undermine the concept of onion routing and mixing with as many hops as possible. There is a reason why we keep the 10 minutes lifetime for a circuit - what's the point of it if after 10 minutes we mark a circuit as dirty just to create another one with the very same hops. This is intended to protect clients, I know, but in the context of a hidden service we assume that _the client is the attacker_.
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
Hi,
Your points are correct and I cannot agree more. I don't think that an adversary running his own relays is less of a threat, just that it depends on the adversary and the context. Running his own relays might be the more expensive and time consuming way, as opposite to the one described by me. After all, having the majority of relays 'honest' is key to the network's strength, it's basically a fundamental limitation of the technology itself. We can't go too paranoid about the number of evil relays and only choose 5 relays from a total of ~6600 (even for a limited period of time [guard rotation period]) in the circuits of a hidden service. If too many relays are compromised, this raises the question if it's safe to still use Tor or not.
If we are talking about legal authorities being the targeted observer, they have simplified procedure for information exchange especially within the EU or maybe even US-EU, which means some target IP addresses can be monitored for incoming and outgoing traffic with a simple fax letter. A 6 hours guard rotation period is more than sufficient.
While I see both Sybil attack and targeted observer attack as big threats, I don't think we can say one is less important than the other. While a Sybil attack would require more time and money to run evil relays (costs grow proportional with Tor network's grow), a targeted observer attack would just need proper motivation. Also, forcing an adversary to do a Sybil attack and run his own evil relays will increase the network capacity (making Sybil attacks more expensive and with lower probabilistic rates for future attackers targeting other users) as well as help the non-targeted users by relaying their anonymous traffic. The targeted observer attack does not offer any benefit at all.
We need to put these things in balance somehow and find the correct balance between lowering the chances of a successful Sybil attack and also making an adversary to have to watch as many parts of the internet as possible, to make it harder/ more expensive/ not worth it / maybe even impossible. You are right, probably I am seeing it wrong and the threat of a Sybil attack is more realistic. Let's meet somewhere half way.
That's why I only made reference to third_guard_set number of relays and their rotation period and didn't say to increase the relays in second_guard_set or rotate the second guards faster. We can implement the Sybil defense at second_guard_set level and of course first_guard_set (already implemented). This way, we have a little from both of our goals (win, win). 'Better is the enemy of perfect' doesn't apply in such a complex structure, with so many possible threats and attacks, and if we can have some gains from both points it's a step forward.
third_guard_set is always assumed to be known to the attacker - it can be trivially discovered just by running one evil client and one or two evil relays to act as rendezvous points. If I recall correctly, an attacker won't even need special flags, uptime or any kind of history in the consensus for the relay(s) so enumerating all relays in third_guard_set is very cheap and simple. You can even do it from a laptop with a decent connection in < 30 minutes.
I just don't want to leave an equation too easy to solve for an attacker. If we can have protection for both (Sybil protection at second_guard_set and targeted observer protection and third_guard_set) I would prefer to have them both.
On 7/18/2015 7:19 AM, Aaron Johnson wrote:
I agree with most of what you said regarding the threat of a targeted observer. What I disagree with is that an adversary running his own relays is less of a threat. Running relays is trivial, and running 5% of the guards is fairly cheap (I estimate ~$3000/month (please ask for details)). If you increase the number of third guards to, say, ten, with average expiration times of, say, 6 hours, then after only 24 hours an adversary with 5% of the bandwidth has an 87% chance of compromising at least one third guard. By reducing the number of third guards to 2 and increasing the expiration time to an average of 24 hours, then it takes 20 days until 87% chance of compromise is reached.
Observing the network traffic of third parties is, it seems to me, actually much more difficult and excludes most adversaries except legal authorities. And legal authorities often do have to follow some legal procedures to observe network traffic, especially if it is being done between different legal jurisdictions.
But ultimately the point of the quickly-rotating third guards is only to *force* the adversary to invest resources into running malicious relays. That is a different type of attack than just traffic observation, and requiring it may stymie some adversaries who don’t have those skills or processes in place. So more and more-quickly rotating third guards would serve that purpose just as well. If the consensus is that an adversary that can observe arbitrary third-party traffic within hours is more of a concern than an adversary that can run 1-5% of the vanguards, then very short-lived third guards would be a reasonable solution, in my opinion. I would not be part of that consensus, but figuring out the speed and size of the adversaries we care about is unfortunately educated guessing at this point.
Best, Aaron
On Jul 17, 2015, at 8:11 PM, s7r s7r@sky-ip.org wrote:
Signed PGP part On 7/18/2015 12:49 AM, A. Johnson wrote:
Not having the thir
d guards be selected by every second guard makes
sense when you consider that the adversary may not be able to compromise all relays equally. That was not a consideration I had in mind when I argued for “completely connected” guard sets. The purpose of multiple guards in a given position is (i) to allow onion services that have unusually high load to uses the excess capacity of more than one relay, (ii) to lower the variance of relay performance, and (iii) to recover more quickly from nodes leaving or failing. Note how these are all related to performance. The drawbacks of multiple relays are all related to security, and they are (i) more relays gives the adversary more chances to have his relays selected, (ii) more relays gives the adversary more options and opportunities to compromise relays once they are observed.
The best design for security purposes would be a single relay in each guard "set", with the final relay expiring the fastest. Currently, the suggestion was to keep one relay in the first guard position (or whatever NumEntryGuards is), because the security issue there is very serious (it can observe the onion service directly), and also because guards are already selected to have somewhat higher bandwidth and stability. Hopefully (currently unverified) this means that they have more *excess* capacity as well.
If you don’t allow all second guards to connect to all third guards, then you have to make a performance / security tradeoff. For example, suppose you give each second guard its own set (or “bucket”) of third guards. You shouldn’t give each second guard the same number of guards that we had planned to have in the third guard set, because that only makes things worse. Now Instead of choosing one set (of size, say, six), you have for each second relay a new chance for the adversary to own one of those relays. There is only a benefit if you reduce the number. It makes sense to me to reduce that number to one, because there is no reason to expect the third guard to be a bottleneck before the second guard (unlike the first guards, which it sounds like may be the only ones requiring the Guard flag). You still have some of the benefit of reducing performance variance and improving failure recovery because you have multiple second guards. And you have gained the benefit that was Mike’s original motivation of giving the adversary that compromises a third guard a view of only one of the second guards, with the hope that maybe this one he finds difficult to compromise.
There is an unanswered question of what to do with multiple first guards. The main reason I can see not to have single second guards for each first guard is that the first guard is likely to have more excess capacity and thus not be the bottleneck for high amounts of onion-service traffic. Assuming that's true, then I could imagine having separate second-guard buckets for each first guard as well. However, there is a bad multiplier effect here. For example, with three first guards and three second guards per first guard, there are now *nine* relays in the second-guard position, each of which presents a chance for a malicious relay to be chosen.
My current thought is that the following is a better design than the one currently in the proposal: 1. One first guard 2. Two second guards per first guard (so two unless NumEntryGuards is changed) 3. One third guard per second guard (so two unless NumEntryGuards is changed) This will only perform worse than the design suggested in the proposal (which is one first guard, two second guards, and six third guards, with all guards connected to all following guards). I think it might not much worse, though.
And I agree with the suggestions to randomize the expiration times.
Best, Aaron
Agreed. The probability to run into a compromised relay will decrease dramatically if we use fewer second and third guards. However, this will put a penalty over performance, when talking about a popular hidden service.
I still see the third hop (speaking from hidden service server start point) is the weak part here. An attacker can connect to a hidden service at his malicious relay selected as rendezvous. Before you know it, all relays in third_guard_set are enumerated by the attacker. This is why I think it's better to have a bigger value for NUM_THIRD_GUARDS and a shorter period for THIRD_GUARD_ROTATION.
Here's why I think so:
Let's assume a journalist has published some documents offered by a whistleblower regarding the abuses done by "Malicious-ISP" and made them available via a hidden service to remain anonymous, until authorities react to apply the law and punish "Malicious-ISP". Let's assume "Malicious-ISP" is a multinational telecom company with branches in many countries and good relations with other ISPs around the world, who would cooperate with them (unfortunately this is not exactly science fiction).
Currently a big part of Tor network's bandwidth power is concentrated in DE,NL,FR. Many times, while browsing with Tor Browser, I had circuits with all 3 hops in the same country. Even more times with 2 out of 3 hops in the same country.
Think that an attacker has enumerated all the relays in third_guard_set, which we know it is trivial to do (for the sake of argument, let's pretend we use 2 relays in third_guard_set). Now, it all comes down to how fast can this attacker watch the IX point* of those 2 relays. Note that the attacker does not necessarily need to actually compromise these relays in a special way, just be able to watch their traffic somewhere upstream. By watching the traffic somewhere upstream, the attacker can enumerate the relays in second_guard_set, and so on to eventually first_guard_set.
If the hidden service is unlucky and picked all hops in the same country, its location can be deanon "before dinner". Even if the hops are in different countries, depends from case to case, with a simple phone call made by the attacker, the IX point* of the previous hop can be watched and so on, until the location of the hidden service is discovered. The attacker can establish unlimited connections to the hidden service while watching all these points and, simple, that's it.
*IX point - by IX point I don't necessarily mean the internet exchange points where the ISP of the relay is connected to, more like the master gateway responsible for the traffic of the relay in question.
I know Tor does not try to protect against a Global Passive adversary, but in this scenario there is no need for the adversary to be global - this is the problem. The adversary just needs to watch few points on the internet, which can be trivially done _before_ our guard rotation period expires.
In short, we give the power which we now think a global passive adversary has to a 'local passive adversary' or a 'rather ambitious adversary'. What now we think can be accomplished by watching the entire network, something we assume is _hard_ to do, will be possible just by watching a few parts on the internet. To make it even worse, the attacker will even have time to climb up on circuits (time offered by our guard rotation period).
This would be harder to do and require way more effort if the third_guard_set was bigger/more diverse and rotated much faster - but this increases the chances of a Sybil attack. I think the success chances of a Sybil attack are not tragic anyway, and will continue to decrease as more 'honest' relays are added to the consensus (this is the whole point of Tor anyway, that is why relays can be run by volunteers, if the majority of relays aren't honest, then anonymity is already killed).
By not choosing relays in our circuits so often and being paranoid that our attacker is 1%, 5%, 10% or whatever (I am not saying it can't be) we undermine the concept of onion routing and mixing with as many hops as possible. There is a reason why we keep the 10 minutes lifetime for a circuit - what's the point of it if after 10 minutes we mark a circuit as dirty just to create another one with the very same hops. This is intended to protect clients, I know, but in the context of a hidden service we assume that _the client is the attacker_.
On Sat, Jul 18, 2015 at 03:11:26AM +0300, s7r wrote:
I still see the third hop (speaking from hidden service server start point) is the weak part here. An attacker can connect to a hidden service at his malicious relay selected as rendezvous. Before you know it, all relays in third_guard_set are enumerated by the attacker. This is why I think it's better to have a bigger value for NUM_THIRD_GUARDS and a shorter period for THIRD_GUARD_ROTATION.
I haven't been keeping up with this thread well, so maybe that has already been mentioned, but in case it hasn't: also consider the case where the Tor client runs two hidden services, and so they share the same guard infrastructure. In today's design, they each have one guard, but many Tor clients have that one guard, so you can't conclude much from noticing it. In last year's design, they each have the same three guards, which is a nearly unique fingerprint. In the above design, where we have more third-level guards, we're heading back into the unique fingerprint territory.
So many constraints to consider at once! :)
--Roger
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
On 7/19/2015 9:26 AM, Roger Dingledine wrote:
On Sat, Jul 18, 2015 at 03:11:26AM +0300, s7r wrote:
I still see the third hop (speaking from hidden service server start point) is the weak part here. An attacker can connect to a hidden service at his malicious relay selected as rendezvous. Before you know it, all relays in third_guard_set are enumerated by the attacker. This is why I think it's better to have a bigger value for NUM_THIRD_GUARDS and a shorter period for THIRD_GUARD_ROTATION.
I haven't been keeping up with this thread well, so maybe that has already been mentioned, but in case it hasn't: also consider the case where the Tor client runs two hidden services, and so they share the same guard infrastructure. In today's design, they each have one guard, but many Tor clients have that one guard, so you can't conclude much from noticing it. In last year's design, they each have the same three guards, which is a nearly unique fingerprint. In the above design, where we have more third-level guards, we're heading back into the unique fingerprint territory.
So many constraints to consider at once! :)
--Roger
Yet another very important aspect - thanks for stepping in.
Considering the fingerprint threat which is not less dangerous, I assume that fingerprinting will work for an adversary only if we have a reasonable number of relays in third_guard_set. If third_guard_set is > 75% of all relays in the consensus, fingerprinting becomes impossible since all clients would have similar fingerprints.
This would be a better approach. Sybil defense at second_guard_set and first_guard_set and targeted observer attack defense considering anti-fingerprinting measure at third_guard_set.
I have no problem in keeping second_guard_set as small as possible (2 relays) and for a reasonable rotation period. My concern is only with third_guard_set which is always before an assumed evil rendezvous point. I would like to rely on diversity and paths as random as possible here and have second_guard_set + first_guard_set as Sybil protections.
How would the numbers for a 5% adversary look like if we consider:
first_guard_set NUM_FIRST_GUARDS = 1, with the current rotation period
second_guard_set NUM_SECOND_GUARDS = two guards per guard in first_guard_set (so two, if we don't change NumEntryGuards) SECOND_GUARD_ROTATION = randomized, between M and N (where M~=2 weeks ; N~=1,5 months ; average 1 month)
third_guard_set not defined, choose random every time from all relays in the consensus and keep the normal circuit lifetime per Tor's default circuit design.
Obviously a 5% adversary will have super good chances to be selected as third hop and learn about at least one relay in second_guard_set, but how dangerous is this really? After such an event, the adversary either has to compromise at least one relay in second_guard_set either wait more and use more resources until he is selected as second guard. But then he would need to also be selected as third guard + second guard in the same circuit and learn the first guard (very small chances). Then again try to compromise the relay in first_guard_set or wait more to be picked as first guard + second guard + third hop on a circuit requested by him at his evil rendezvous point - this becomes noisy and with very very little chances of total deanon.
This is the maximum a Sybil adversary can do - wait for the victim to pick his evil relays in a circuit, and it offers in exchange bandwidth to the network, handling the traffic for non-targeted users as well as increasing the costs for other Sybil adversaries targeting other users. It is true that a 5% adversary doesn't necessarily always target just one single Tor client, but the success probability would be calculated effectively per single Tor client.
The targeted observer attack offers nothing in return and it is just as real, it requires only proper motivation (it won't be done just for any boring reason). As stated in my previous email, perfect is impossible currently, with so many threats, but why not make it better and take in consideration all aspects.