Hello everyone,
So that our paper does not fall into the category of "yet another seemingly cool feature that is never going to be discussed because researchers moved on to another topic", here is an attached proposal in which we summarize our work published at the last PETS and how it can be implemented.
And, if you find this interesting, I would be glad to submit a patch :)
Any kind of feedback is more than welcome!
Cheers!
Florentin
Hi Florentin,
I have copied your proposal below, so I can respond to it inline.
On 11 Jan 2018, at 21:00, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hello everyone,
So that our paper does not fall into the category of "yet another seemingly cool feature that is never going to be discussed because researchers moved on to another topic", here is an attached proposal in which we summarize our work published at the last PETS and how it can be implemented.
And, if you find this interesting, I would be glad to submit a patch :)
Any kind of feedback is more than welcome!
Cheers!
Florentin
Filename: waterfilling-balancing-with-max-diversity.txt
Title: Waterfilling
Authors: Florentin Rochet and Olivier Pereira
Reviewed by (thanks!): George Kadianakis, Edouard Cuvelier
Created: Jan 2018
Status: Open
0 Motivation
An adversary who monitors the entry and exit of a Tor communication path is usually able to de-anonymize that communication by traffic correlation. In order to limit the number of users that a single corrupted entry node could attack, users keep using the same entry node, also called a "guard", for long periods of time: since guard rotation is limited, users are less likely to use a corrupted guard at some point in their communication. In the current design, the amount of traffic that a given guard sees is directly proportional to the bandwidth provided by this guard. As a result, the few guards offering the highest amount of bandwidth become very attractive targets for an attacker.
Waterfilling is a new path selection mechanism designed to make guard selection even more effective against such adversaries: if an adversary wants to profile more users, she has to increase her bandwidth _and_ increase the number of relays she injects into, or compromises in, the network.
Waterfilling mitigates the risks of end-to-end traffic correlation by balancing the load as evenly as possible over the endpoints of user circuits. More precisely, Waterfilling modifies the probability distribution according to which users select guard nodes by making that distribution closer to the uniform distribution, without impacting the performance of the Tor network.
1 Overview
The current Tor path selection algorithm is designed to satisfy two important properties:
- Each relay should handle a fraction of connections that is
proportional to its bandwidth.
- The nodes in each circuit position (entry-middle-exit) should
be able to handle the same volume of connections.
What about onion service circuits?
They consist of entry - middle - middle, and for the purposes of this analysis, make up about 4% of the network. (2% of traffic at rend points, going through 2 x 3-hop circuits.)
https://metrics.torproject.org/hidserv-rend-relayed-cells.html
Hence, in addition to selecting paths in a probabilistic manner, the path selection algorithm is responsible for balancing the network, that is, making sure that enough bandwidth is made available in all positions. The current balance of the network is decided by the bandwidth-weights (see dir-spec.txt section 3.8.3 and/or the Waterfilling PETS 2017 paper https://petsymposium.org/2017/papers/issue2/paper05-2017-2-source.pdf). This balance does not achieve maximum diversity in the end positions of user paths: the same network throughput could be achieved by decreasing the use of high-bandwidth relays and increasing the use of lower-bandwidth relays in the guard position, instead of using these relays in a way that is just proportional to their bandwidth.
When you say "bandwidth", it's not clear whether you mean consensus weight (measured bandwidth) or advertised bandwidth (bandwidth capacity). They're not the same.
I'm going to assume consensus weight from now on. Please fix all uses of bandwidth in the rest of the proposal.
Such a change would make top relays less attractive targets to adversaries, and would increase the number of relays that need to be compromised in order to obtain a given probability of mounting a successful correlation attack.
Our proposal only modifies the balance between the relays in a given position in the network. It does not modify, and actually takes as its starting point, any allocation mechanism used to decide the bandwidth that is allocated in guard, middle and exit positions. As a consequence, the changes that we propose are quite minimal in terms of code base and performance and, in particular, they do not interfere with prop 265.
2 Design
Correlation attacks require the control of guard and exit nodes, but the scarcity of exit bandwidth is such that there is no real freedom in the way to use it.
As a result, the Waterfilling proposal focuses on the guard position. However, it could be easily extended to the exit position if, someday, nodes in that position happen not to be exploited to their full capacity.
No, exit bandwidth is only exploited to its full capacity on high-bandwidth exits in the northern EU and North America.
Select "Consensus Weight vs Bandwidth" on this map: https://atlas.torproject.org/#map/flag:Exit All the exits in all the purple countries are probably under-loaded. And some exits elsewhere are under-loaded. (That's why we let Exits be fallback directories.)
So the network might actually benefit the most from a reallocation of Exit bandwidth. But you'd have to use the advertised bandwidth rather than Wee and Wed.
What would it look like if we used waterfilling on the advertised bandwidths of Exits? Is there a way to do this that avoids gaming the system by increasing advertised bandwidth? Does the feedback loop with bandwidth authority measurements mitigate this risk?
_Recall_: Tor currently computes bandwidth-weights in order to balance the bandwidth made available by nodes between the different path positions. In particular the Wgg weight indicates to each guard which proportion of its bandwidth should be used for entry traffic (the rest being normally devoted to the middle position). This proportion is the same for all guards.
Wgg indicates to *clients* how often they should select guards in the guard position in *circuits*.
It doesn't influence relays themselves, and it only indirectly affects bandwidth.
Please fix similar wording in the rest of the proposal.
_Proposal_: We use Tor's bandwidth-weight Wgg as the basis of Waterfilling. This Wgg, combined with the total bandwidth made available by all guards, defines the total bandwidth made available in the guard position. In order to allocate this bandwidth, the Waterfilling proposal proceeds by "raising the water level": it requires all guard relays to devote to their guard role all the bandwidth that they have, until a so-called "water level" is reached. This water level is positioned in such a way that he total
typo: the
bandwidth provided in the guard position is exactly the same as the one that is currently made available in the Tor network.
As a result, guards offering a small amount of bandwidth, below the water level, will fully allocate their bandwidth to guard traffic, while all the guards offering a bandwidth that is higher than the water level will limit their guard bandwidth to the water level, and allocate the rest to the middle traffic (assuming that they are not exit relays).
flagged as Exits (some relays which allow exiting aren't flagged as Exits, because they don't have ports that are useful for web traffic).
Concretely, we compute the weight Wgg_i for each guard-flagged relay_i as follows:
- Sort all the guard relays by bandwidth in decreasing order
(i.e. the i-th guard has more bandwidth than the i+1-th).
a greater or equal consensus weight?
- Let K be the total number of guards, BW_i be the bandwidth of the
i-th ranked guard and G be the total bandwidth that guards make available.
Compute a "pivot" N and the weight Wgg_i assigned to relay_i in such a way that:
(a) Wgg_i * BW_i == Wgg_i+1 * BW_i+1 forall i in [1, N]
subscripts in brackets, otherwise it's ambiguous:
Wgg_(i+1) * BW_(i+1)
(b) Wgg_i == 1 forall i in [N+1, K]
(c) sum_{i=1}^{K} Wgg_i * BW_i == Wgg * G (Wgg is provided by Tor)
These equations are under-specified, because they allow solutions with:
Wgg*G > 0
Wgg_1 == 0
That is, no guard selections for high-bandwidth relays.
From the descriptions, I think the missing condition is: Wgg_N * BW_N >= Wgg_(N+1) * BW_(N+1)
Also, Wgg is provided by the Tor directory authorities based on consensus weights from the bandwidth authorities.
And what happens to any remainder in the calculations? (This is most important for small, low bandwidth test networks.)
For example, if:
G = 10
Wgg = 0.6
BW_i = 6, 2, 2
What are the final weighted bandwidths? 2, 2, 2?
What if:
Wgg = 0.5
Are the bandwidths: 1, 2, 2? 2, 1, 2? 2, 2, 1?
As a result of this process, each guard ranked before the pivot N dedicates the same bandwidth to its guard role (equation (a)) -- we say that these guards achieve the water level, while each guard ranked after the pivot N dedicates its full bandwidth to the guard role (equation (b)) -- they are below the water level. Equation (c) makes sure that the pivot and the water level are positioned in a way that guarantees that the total amount of bandwidth dedicated to the guard position is the same as before. In practice, the value of N can be efficiently computed by dichotomic search on Equation (c),
Is this the same as a binary search? Does it require any division? Because division is slow on some Tor client architectures.
and the value of the Wgg_i then immediately follows from Equations (a) and (b).
Once Wgg_i is computed, we can compute Wmg_i = 1 - Wgg_i, which allocates to the middle position all the bandwidth that is left above the water level in the first N relays. The bigger the node is, the more it contributes to the middle position compared to the others. A visual representation of this process is available in Figure 1 in the Waterfilling paper.
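To make the computation concrete, here is a hedged Python sketch of the waterfilling step described above (illustrative names of our own, floating-point arithmetic, ignoring the integer rounding of real consensus weights; this is not the prototype's code):

```python
def waterfilling(bw, Wgg):
    """Compute the per-guard weights Wgg_i.

    bw:  consensus weights of the guards, sorted in decreasing order.
    Wgg: the network-wide guard weight, i.e. the fraction of the total
         guard weight G that must end up in the guard position.
    """
    G = sum(bw)
    target = Wgg * G  # total weight the guard position must receive

    # Binary search for the water level L such that
    # sum(min(bw_i, L)) == target: guards above the level contribute
    # exactly L, guards below it contribute everything they have.
    lo, hi = 0.0, bw[0]
    for _ in range(100):  # plenty of iterations for float precision
        L = (lo + hi) / 2
        if sum(min(b, L) for b in bw) < target:
            lo = L  # not enough guard weight yet: raise the level
        else:
            hi = L  # too much: lower the level
    L = (lo + hi) / 2

    # Equations (a) and (b): Wgg_i = min(bw_i, L) / bw_i.
    return [min(b, L) / b for b in bw]

# teor's example above: G = 10, Wgg = 0.6, BW_i = 6, 2, 2.
# The water level is 2, so the weighted bandwidths are 2, 2, 2.
weights = waterfilling([6.0, 2.0, 2.0], 0.6)
```

Once the Wgg_i are known, the Wmg_i = 1 - Wgg_i step described next follows immediately.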
2.1 Going further by tweaking original bandwidth-weights computation
As explained above, our Waterfilling equations are based on:
1) the Wgg weight computed by Tor
2) the assumption that the bandwidth available in exit is scarce, i.e., it is lower than the one available for guard (and middle) positions.
The second point is satisfied most of the time in Tor, and we do take it for granted here.
We, however, observe that Waterfilling could be made even more effective by applying a minor change in the way Tor computes the Wgg. For the moment, Tor computes Wgg in such a way that the same bandwidth is allocated to the guard and middle positions. As a result, both positions are in excess compared to the exit position.
The water level could be decreased and, as a result, the uniformity of the guard selection process could be improved, by computing Wgg in a way that allocates the same bandwidth to the guard and exit positions, putting the middle position as the only position in excess.
No, this would slow down Guards, and everything that isn't an exit circuit:
- directory fetches (3% of total bandwidth to guard position)
- onion services (rend is 4% of total bandwidth to guard and middle)
- HSDir is unweighted, and we don't know how much bandwidth it uses
- FallbackDir is unweighted, but mostly Guards, and small
We show in the performance section of the Waterfilling paper that scarcity on two positions does not reduce performance compared to vanilla bandwidth-weights.
What about guards that have low consensus weight due to latency, rather than available bandwidth?
I think this could also cause you huge latency issues as you push more bandwidth away from fast relays. I'm not sure if shadow captures this accurately.
3 Security implications
An analysis of the security impact of the Waterfilling proposal is made in Section 6 of the Waterfilling paper. It studies the expectation of the number of relays that an adversary needs to control in order to mount a successful correlation attack at a given time, as well as an analysis of the evolution of the time until first path compromise, based on TorPS.
Given that the distribution produced by Waterfilling is closer to the uniform distribution, the use of Waterfilling increases the expected number of relays that an adversary needs to compromise in order to mount a successful correlation attack. Concretely, and based on real data from 2015, this expectation increases by approximately 150 relays.
What percentage is this?
Waterfilling also considerably decreases the benefits of compromising a top Tor relay: based on the same data, we computed that around 35 relays
What percentage?
need to be compromised in order to obtain the benefits that would be obtained today by compromising Tor's top guard. On the flip side, the total bandwidth that those 35 relays would need to provide is 38% smaller than that of the top relay, if they are designed to offer a bandwidth that is just at the water level. Moreover, these 35 relays needed to equal the impact of the current top guard are a lower bound. In practice, the adversary needs to predict the water level of all upcoming consensuses in order to stay below it and not waste bandwidth. A safe way to achieve this is to split the resources into many more than 35 relays. At some point, the adversary has to choose between the need to stay off the radar with many machines and the bandwidth wasted if she does not run enough of them.
4 Performance implications
This proposal aims at improving the anonymity provided by the Tor network without impacting its performance negatively.
From a theoretical viewpoint, since Waterfilling does not change the amount of bandwidth dedicated to the guard, middle and exit positions, we should not observe any difference compared to vanilla Tor. The intuition is that, even if the top-bandwidth relays that are currently assigned to the guard position are less likely to be selected as guards, they become more likely to be selected as middle nodes, hence maintaining their contribution to fast Tor circuits.
We confirmed this intuition by running Shadow experiments with a Tor implementation of Waterfilling. Our results give the same CDF for ttfb and ttlb metrics under different network loads.
Please expand acronyms, and explain how similar the distributions are, and how latencies in Shadow compare to the public network.
Of course, these results depend on the accuracy with which the behavior of current relays is measured and reported.
However, an interesting feature of the Waterfilling proposal is that it is fully compatible with vanilla Tor: some Tor clients may run the current Tor path selection algorithm, and others may run Waterfilling without impacting the performance. This makes an experimental deployment fairly easy to operate at a small or medium scale, while maintaining the possibility to fall back to vanilla Tor if an unexpected behavior is detected.
5 Implementation
5.1 Overview
Most of the implementation of Waterfilling is on the directory authority side: only a few changes are needed on the client side and no change is needed on the relay side. A prototype implementation is available at https://github.com/frochet/waterfilling
I'm sorry, I can't read this, there are build files everywhere. I can't even find which files were changed. Can you provide clean commits in a fork, based on the Tor master branch?
Also, I think you will need a new consensus method for this change. We don't use consensus parameters or torrc options to modify the content of the consensus, we use consensus methods.
Here is how it works:
Every hour, directory authorities vote on a new consensus. Once the votes are collected, the dirauths produce a deterministic network consensus document containing information about each relay, including the waterfilling bandwidth-weights produced from the equations described above. e.g.:
...(more relays)
r relayguard34 PPTH75+WkHl1GGw07DRE/S+JNdo 1970-01-01 00:02:37 51.254.112.52 9111 9112
m lp08MLTivsSZPhpZQy88i8NPeBNx10tPKBpHZsM3gYM
s Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=10200
wfbw wgg=8029 wmg=1971
r 4uthority3 PYnzXGQ+67m0WtO64qtJkwsUzME 1970-01-01 00:02:33 11.0.5.71 9111 9112
m d8G2/8UQzAN3a9DixCtmaivhfUFTvlFKAxCAV1xHVKk
s Authority Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=1890
wfbw wgg=10000 wmg=0
...(more relays)
In this example, we see two relays having the Guard flag and their new waterfilling bandwidth allocation given on the lines starting with wfbw.
These are redundant, and could significantly expand the size of the consensus, and consensus diffs. (Consensus weights are one of the largest contributors to consensus diff size.)
Why not calculate wmg on clients? Why not calculate both wgg and wmg on clients?
If you must keep them in the consensus, please put these on the existing "w" lines, which are intended for this purpose:
"Other weighting keywords may be added later. Clients MUST ignore keywords they do not recognize."
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2234
The first relay has a high bandwidth, above the water level, and shares that bandwidth between the guard and the middle positions, as indicated by the wgg and wmg variables. The second relay has a lower bandwidth, below the water level, and fully uses it for guard traffic.
If no wgg or wmg weights are specified for a given relay, the vanilla bandwidth-weights are used, as provided at the bottom of the consensus.
Finally, a modification of the client code is needed in order to parse and use the waterfilling weights. The changes are straightforward: a few lines of code in existing functions.
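As a rough sketch of that client-side change (hypothetical names and data layout, not Tor's actual C functions), the lookup could work as follows, assuming the per-relay wfbw weights are expressed out of 10000 like the consensus bandwidth-weights:

```python
def position_weight(relay, position, consensus_bw_weights):
    """Weight a client applies when considering `relay` for a position.

    relay: a parsed consensus entry, e.g.
        {"bandwidth": 10200, "wfbw": {"wgg": 8029, "wmg": 1971}}
    position: "guard" or "middle".
    consensus_bw_weights: the vanilla weights from the consensus
        footer, e.g. {"Wgg": 5000, "Wmg": 5000}.
    """
    wf = relay.get("wfbw")
    if wf is not None:
        # The relay carries a waterfilling line: use its own weights.
        key = "wgg" if position == "guard" else "wmg"
        return wf[key] / 10000 * relay["bandwidth"]
    # No wfbw line: fall back to the vanilla bandwidth-weights.
    key = "Wgg" if position == "guard" else "Wmg"
    return consensus_bw_weights[key] / 10000 * relay["bandwidth"]
```

With the first example relay above (wgg=8029, Bandwidth=10200), a client would weight it at 8029/10000 * 10200, roughly 8190, when choosing a guard.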
6 Deployment discussion
Deploying a new feature that has a central role in the security and performance of the network can be difficult due to the distributed nature of the network. Fortunately, this proposal does not suffer from such issues. We give here some arguments supporting this claim.
About performance: The balancing equations designed by the current path selection are kept untouched. Hence mixing a set of clients using Waterfilling in the network and another set of clients using the vanilla path selection is not a problem: they will both enforce the same allocation of bandwidth between the different path positions. We confirmed this with experiments in Shadow.
About user security: A co-existence of path selection algorithms may be a threat to anonymity if the transition is not handled carefully. A set of compromised middle relays may distinguish users with a Waterfilling configuration from others. This is a problem if the anonymity set is not large enough. Fortunately, "large enough" can be ensured with a consensus parameter that only enables this feature when enough users have updated their client.
Unanswered questions:
What about the feedback loop between this new allocation system and the bandwidth authorities?
Should bandwidth authority clients use the new system?
How do we report on the new system? How can we determine whether it is better for security in practice? How can we determine if it is faster or slower in practice? How can we work out if someone is trying to game the system?
T
On 2018-01-11 08:47 AM, teor wrote:
What about guards that have low consensus weight due to latency, rather than available bandwidth?
I think this could also cause you huge latency issues as you push more bandwidth away from fast relays. I'm not sure if shadow captures this accurately.
FWIW, Shadow does accurately reflect latencies, assuming you supply a realistic network topology file and use relatively recent Tor metrics.
https://github.com/shadow/shadow/wiki/3.2-Network-Config https://github.com/shadow/shadow-plugin-tor/wiki#generating-a-new-tor-networ...
- Justin
Hello,
Thank you for your helpful review, teor.
I updated the proposal based on most of your comments (see attached .txt) and I respond inline to add some precision on a few of your questions.
Btw, I've mirrored my private repo to GitHub, https://github.com/frochet/Waterfilling, so that you have the proper commit history.
On 2018-01-11 14:47, teor wrote:
Hi Florentin,
I have copied your proposal below, so I can respond to it inline.
On 11 Jan 2018, at 21:00, Florentin Rochet <florentin.rochet@uclouvain.be mailto:florentin.rochet@uclouvain.be> wrote:
<skip>

1 Overview
The current Tor path selection algorithm is designed to satisfy two important properties:
1) Each relay should handle a fraction of connections that is proportional to its bandwidth.
2) The nodes in each circuit position (entry-middle-exit) should be able to handle the same volume of connections.
What about onion service circuits?
They consist of entry - middle - middle, and for the purposes of this analysis, make up about 4% of the network. (2% of traffic at rend points, going through 2 x 3-hop circuits.)
https://metrics.torproject.org/hidserv-rend-relayed-cells.html
That's a good point. Waterfilling uses the current bandwidth-weights logic as a basis, and since those weights don't account for onion service circuits, it also ignores this sort of traffic. Prop 265 tries to address that problem when producing the bandwidth-weights; since our method achieves the same total consensus weight balance between positions as the one produced by the bandwidth-weights, Waterfilling would directly inherit Prop 265's properties if that proposal is merged.
Hence, in addition to select paths in a probabilistic manner, the path selection algorithm is responsible for balancing the network, that is, making sure that enough bandwidth is made available in all the positions. The current balance of the network is decided by the bandwidth-weights (see dir-spec.txt section 3.8.3. or/and the Waterfilling PETS2017 paper https://petsymposium.org/2017/papers/issue2/paper05-2017-2-source.pdf). This balance does not achieve maximum diversity in end-positions of user paths: the same network throughput could be achieved by decreasing the use of high-bandwidth relays and increasing the use of lower-bandwidth relays in the guard position, instead of using these relays in a way that is just proportional to their bandwidth.
When you say "bandwidth", it's not clear whether you mean consensus weight (measured bandwidth) or advertised bandwidth (bandwidth capacity). They're not the same.
I'm going to assume consensus weight from now on. Please fix all uses of bandwidth in the rest of the proposal.
Yes, we mean consensus weight :) I did s/bandwidth/consensus weight within the proposal
<skip>
> 2 Design
>
> Correlation attacks require to control guard and exit nodes, but the
> scarcity of exit bandwidth is such that there is no real freedom in
> the way to use it.
>
> As a result, the Waterfilling proposal focuses
> on the guard position. However, it could be easily extended to the
> exit position if, someday, nodes in that position happen not to be
> exploited to their full capacity.
No, exit bandwidth is only exploited to its full capacity on high-bandwidth exits in the northern EU and North America.
Well, I changed my wording here to avoid ambiguity. I was talking about relays flagged as Exit being used only in the exit position (Wee and Wed have the maximum value), which means that we cannot apply our method to those relays in the current state of the Tor network.
Select "Consensus Weight vs Bandwidth" on this map: https://atlas.torproject.org/#map/flag:Exit All the exits in all the purple countries are probably under-loaded. And some exits elsewhere are under-loaded. (That's why we let Exits be fallback directories.)
So the network might actually benefit the most from a reallocation of Exit bandwidth. But you'd have to use the advertised bandwidth rather than Wee and Wed.
What would it look like if we used waterfilling on the advertised bandwidths of Exits?
The problem you mention is more of a measurement problem that would not exist if the bwauths were perfect, within a perfect network. I believe that research such as PeerFlow [0] is the right path to track down such issues, and it is compatible with our proposal.
[0] www.robgjansen.com/publications/peerflow-popets2017.pdf
Is there a way to do this that avoids gaming the system by increasing advertised bandwidth? Does the feedback loop with bandwidth authority measurements mitigate this risk?
_Recall_: Tor currently computes bandwidth-weights in order to balance the bandwidth made available by nodes between the different path positions. In particular the Wgg weight indicates to each guard which proportion of its bandwidth should be used for entry traffic (the rest being normally devoted to the middle position). This proportion is the same for all guards.
Wgg indicates to *clients* how often they should select guards in the guard position in *circuits*.
It doesn't influence relays themselves, and it only indirectly affects bandwidth.
Please fix similar wording in the rest of the proposal.
Yep, this is basically what we tried to say. In fact, what I wrote was the description of the *consequence* of bandwidth-weights. I tried to re-word with your suggestion, thank you.
<skip>
> Compute a "pivot" N and the weight Wgg_i assigned to
> relay_i in such a way that:
> (a) Wgg_i * BW_i == Wgg_i+1 * BW_i+1 forall i in [1, N]
subscripts in brackets, otherwise it's ambiguous:
Wgg_(i+1) * BW_(i+1)
(b) Wgg_i == 1 forall i in [N+1, K]
(c) sum_{i=1}^{K} Wgg_i * BW_i == Wgg * G (Wgg is provided by Tor)
These equations are under-specified, because they allow solutions with:
Wgg*G > 0
Wgg_1 == 0
That is, no guard selections for high-bandwidth relays.
From the descriptions, I think the missing condition is: Wgg_N * BW_N >= Wgg_(N+1) * BW_(N+1)
Yes, this can be added. But I think that this condition is redundant, since BW_i are sorted in decreasing order.
Also, Wgg is provided by the Tor directory authorities based on consensus weights from the bandwidth authorities.
And what happens to any remainder in the calculations?
This is a very good question. Currently, my implementation ignores the remainder. This is negligible for a large network but can matter for a small one (of a few relays).
A possible solution would be to use floating precision for consensus weights.
(This is most important for small, low bandwidth test networks.)
For example, if:
G = 10
Wgg = 0.6
BW_i = 6, 2, 2
What are the final weighted bandwidths? 2, 2, 2?
From my notes, my current implementation would crash if the water level reaches the smallest relay. Since it was prototype code, I didn't think much about this case and left it that way.
I think that below a certain network size, it does not make sense to use this proposal. In this example, the remainder accounts for a large part of the network capacity and would just be wasted.
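To make that remainder visible, here is a hedged integer sketch (names of our own, not the prototype's code) applied to teor's examples:

```python
def integer_water_level(bw, target):
    """Largest integer water level that does not over-allocate.

    bw:     guard consensus weights, sorted in decreasing order.
    target: integer total weight wanted in the guard position.
    Returns (level, remainder); the remainder is the weight that a
    uniform integer level cannot place and must be distributed
    somehow (or, as in the prototype, ignored).
    """
    lo, hi = 0, max(bw)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if sum(min(b, mid) for b in bw) <= target:
            lo = mid  # still at or under target: the level can rise
        else:
            hi = mid - 1
    remainder = target - sum(min(b, lo) for b in bw)
    return lo, remainder

# BW_i = 6, 2, 2 with Wgg = 0.6 (target 6): level 2, remainder 0,
# i.e. weighted bandwidths 2, 2, 2.
# With Wgg = 0.5 (target 5): level 1 leaves a remainder of 2, which
# is a large fraction of the target in such a tiny network.
```

The second case illustrates teor's 1, 2, 2 question: a single uniform integer level cannot express an asymmetric allocation, so the remainder has to be assigned by some extra rule.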
<skip>
> As a result of this process, each guard ranked before the pivot N
> dedicates the same bandwidth to its guard role (equation (a)) -- we
> say that these guards achieve the water level, while each guard
> ranked after the pivot N dedicates its full bandwidth to the guard
> role (equation (b)) -- they are below the water level. Equation (c)
> makes sure that the pivot and the water level are positioned in a
> way that guarantees that the total amount of bandwidth dedicated to
> the guard position is the same as before. In practice, the value of
> N can be efficiently computed by dichotomic search on Equation (c),
Is this the same as a binary search? Does it require any division? Because division is slow on some Tor client architectures.
Yes, binary search. It does require division. However, Waterfilling is designed to be executed on the authority side and is called only when the consensus document is produced. Moreover, my tests indicate that the computation takes a few milliseconds.
<skip>
> 2.1 Going further by tweaking original bandwidth-weights computation
>
> As explained above, our Waterfilling equations are based on: 1) the
> Wgg weight computed by Tor 2) the assumption that the bandwidth
> available in exit is scarce, i.e., it is lower than the one
> available for guard (and middle) positions.
> The second point is satisfied most of the time in Tor, and we do take
> it for granted here.
> We, however, observe that Waterfilling could be made even more
> effective by applying a minor change in the way Tor computes the
> Wgg. For the moment, Tor computes Wgg in such a way that the same
> bandwidth is allocated to the guard and middle positions. As a
> result, both positions are in excess compared to the exit position.
>
> The water level could be decreased and, as a result, the uniformity
> of the guard selection process could be improved, by computing Wgg
> in a way that allocates the same bandwidth to the guard and exit
> positions, putting the middle position as the only position in
> excess.
No, this would slow down Guards, and everything that isn't an exit circuit:
- directory fetches (3% of total bandwidth to guard position)
- onion services (rend is 4% of total bandwidth to guard and middle)
- HSDir is unweighted, and we don't know how much bandwidth it uses
- FallbackDir is unweighted, but mostly Guards, and small
That's difficult to predict; I cannot be sure whether it is better or worse for that type of traffic, since internal circuits use at least 2 middle relays plus the guard, and sometimes not even the guard. Hence we might also think that pushing a bit more capacity to the middle position could be a good thing. Moreover, middle relays are unstable and often at the edge of the Internet, while guards are stable and mostly within the core of the Internet. Hence, a little more of them within the middle position *could* be a good thing, especially if it makes the entry selection probability more uniform. Anyway, I don't have any proof to assert this, just as I don't have any proof to assert that this optimization could be bad. What I do know is that, for exit circuits, it does not slow down anything.
This optimization is not mandatory, and could also be enabled/disabled at will by the directory auths.
We show in the performance section of the Waterfilling paper that scarcity on two positions does not reduce performance compared to vanilla bandwidth-weights.
What about guards that have low consensus weight due to latency, rather than available bandwidth?
I think this could also cause you huge latency issues as you push more bandwidth away from fast relays. I'm not sure if shadow captures this accurately.
If any bandwidth is pushed away from fast relays in the entry position, making the entry position slower on average, then the middle position becomes faster (because it receives the bandwidth that was pushed away). Since your traffic only depends on the overall latency of the circuit, this will not appear slower or faster, on average. This is exactly what we observe in Shadow, and yes, it captures latency accurately; at least, better than any other simulator.
3 Security implications
An analysis of the security impact of the Waterfilling proposal is made in Section 6 of the Waterfilling paper. It studies the expectation of the number of relays that an adversary needs to control in order to mount a successful correlation attack at a given time, as well as an analysis of the evolution of the time until first path compromise, based on TorPS.
Given that the distribution produced by Waterfilling is closer to the uniform distribution, the use of Waterfilling increases the expected number of relays that an adversary needs to compromise in order to mount a successful correlation attack. Concretely, and based on real data from 2015, this expectation increases by approximately 150 relays.
What percentage is this?
~ 25 %
<skip>
I'm sorry, I can't read this, there are build files everywhere. I can't even find which files were changed. Can you provide clean commits in a fork, based on the Tor master branch?
Also, I think you will need a new consensus method for this change. We don't use consensus parameters or torrc options to modify the content of the consensus, we use consensus methods.
Sorry about this. I've updated the repo with a proper commit history based on my fork. The code gives an idea of how easy it is to plug this method into the current path selection.
<skip>
These are redundant, and could significantly expand the size of the consensus, and consensus diffs. (Consensus weights are one of the largest contributors to consensus diff size.)
True: each time the consensus weights are modified, those waterfilling weights need to be recomputed. It adds one line per guard, which is about 2-3% of the size of the full consensus. Is that a problem?
Why not calculate wmg on clients? Why not calculate both wgg and wmg on clients?
This is again a very good question: for such a critical feature (path selection), it is important that the directory auths have full power over the weight computation. If some change is needed, the Tor network can then be updated in a straightforward way. This is not the case if those weights are computed in client code. In fact, I believe that one of the strengths of this proposal is its design oriented towards the dirauths.
<skip>
Unanswered questions:
What about the feedback loop between this new allocation system and the bandwidth authorities?
I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seem orthogonal to me.
Should bandwidth authority clients use the new system?
How do we report on the new system? How can we determine whether it is better for security in practice? How can we determine if it is faster or slower in practice? How can we work out if someone is trying to game the system?
T
I've added a section within the proposal for all upcoming questions.
Best, Florentin
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On 13 Jan 2018, at 01:17, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hello,
Thank you for your helpful review, teor.
I updated the proposal from most of your comments (see attached .txt) and I respond inline to add some precisions relative to a few of your questions.
Btw, I've mirrored my private repo to github https://github.com/frochet/Waterfilling, such that you have the proper commit history.
Thanks!
I replied below in context.
On 2018-01-11 14:47, teor wrote:
Hi Florentin,
I have copied your proposal below, so I can respond to it inline. ...
Hence, in addition to selecting paths in a probabilistic manner, the path selection algorithm is responsible for balancing the network, that is, making sure that enough bandwidth is made available in all positions. The current balance of the network is decided by the bandwidth-weights (see dir-spec.txt section 3.8.3 and/or the Waterfilling PETS2017 paper https://petsymposium.org/2017/papers/issue2/paper05-2017-2-source.pdf). This balance does not achieve maximum diversity in the end positions of user paths: the same network throughput could be achieved by decreasing the use of high-bandwidth relays and increasing the use of lower-bandwidth relays in the guard position, instead of using these relays in a way that is simply proportional to their bandwidth.
When you say "bandwidth", it's not clear whether you mean consensus weight (measured bandwidth) or advertised bandwidth (bandwidth capacity). They're not the same.
I'm going to assume consensus weight from now on. Please fix all uses of bandwidth in the rest of the proposal.
Yes, we mean consensus weight :) I did s/bandwidth/consensus weight within the proposal
In some cases, you seem to expect the consensus weight to be the available bandwidth (or advertised bandwidth) of the relay. I'll try to find them, but you should check as well.
...
<skip>
> Compute a "pivot" N and the weight Wgg_i assigned to relay_i in such
> a way that:
>
> (a) Wgg_i * BW_i == Wgg_i+1 * BW_i+1 forall i in [1, N]
subscripts in brackets, otherwise it's ambiguous:
Wgg_(i+1) * BW_(i+1)
> (b) Wgg_i == 1 forall i in [N+1, K]
> (c) sum_{i=1}^{K} Wgg_i*BW_i = Wgg*G (Wgg is provided by Tor)
These equations are under-specified, because they allow solutions with: Wgg*G > 0 Wgg_1 == 0 That is, no guard selections for high-bandwidth relays.
This is a really important point:
Your equations allow the largest guard to have zero guard weighting and zero guard bandwidth. They should require it to have at least as much guard bandwidth as the water level.
From the descriptions, I think the missing condition is: Wgg_N * BW_N >= Wgg_(N+1) * BW_(N+1)
Yes, this can be added. But I think that this condition is redundant, since BW_i are sorted in decreasing order.
No, it's not, because it's a condition on the *guard-weighted* consensus weights. The sorting is a condition on the *original* consensus weights.
It might help to provide a procedural algorithm for assigning bandwidth, as well as equations. It would resolve some ambiguity.
Also, Wgg is provided by the Tor directory authorities based on consensus weights from the bandwidth authorities.
And what happens to any remainder in the calculations?
This is a very good question. Currently, my implementation ignores the remainder. This is negligible for a large network but can be weird for a small one (of a few relays).
A possible solution would be to use floating-point precision for consensus weights.
No, this is not a good solution. It simply passes the issue to the floating-point implementation. And the result will depend on the libraries, processor, and compiler.
For this reason, we work very hard to use as little floating point as possible in Tor. We really don't want to use it when generating the consensus, because mismatches will break consensus.
Instead, I suggest you apply this missing condition:
From the descriptions, I think the missing condition is: Wgg_N * BW_N >= Wgg_(N+1) * BW_(N+1)
And then allocate the remainder to the highest-bandwidth relays, perhaps in proportion to their consensus weight. Please allocate bandwidth to equally weighted relays using a stable sort that is the same on all directory authorities. (The existing relay sorting code might already do this for you.)
(This is most important for small, low bandwidth test networks.)
For example, if:
  G = 10
  Wgg = 0.6
  BW_i = 6, 2, 2
What are the final weighted bandwidths? 2, 2, 2?
As noted, my current implementation would crash if the water level reaches the smallest relay. Since it was prototype code, I didn't take the time to think about it, and I left it that way.
We would need a solution for this crash as part of the proposal.
I think that below a certain network size, it does not make sense to use this proposal. In this example, the remainder accounts for a large part of the network capacity and would just be wasted.
We will test this in small networks, and it must work. I suggest you allocate the remainder to high-bandwidth relays.
...
<skip>
> 2.1 Going further by tweaking original bandwidth-weights computation
>
> As explained above, our Waterfilling equations are based on: 1) the
> Wgg weight computed by Tor 2) the assumption that the bandwidth
> available in exit is scarce, i.e., it is lower than the one
> available for guard (and middle) positions.
>
> The second point is satisfied most of the time in Tor, and we do take
> it for granted here.
>
> We, however, observe that Waterfilling could be made even more
> effective by applying a minor change in the way Tor computes the
> Wgg. For the moment, Tor computes Wgg in such a way that the same
> bandwidth is allocated to the guard and middle positions. As a
> result, both positions are in excess compared to the exit position.
>
> The water level could be decreased and, as a result, the uniformity
> of the guard selection process could be improved, by computing Wgg
> in a way that allocates the same bandwidth to the guard and exit
> positions, putting the middle position as the only position in
> excess.
No, this would slow down Guards, and everything that isn't an exit circuit:
- directory fetches (3% of total bandwidth to guard position)
If we reduce guard weights on large relays, does this slow down Tor bootstrap? (Because smaller relays with higher latency get more micro descriptor requests.)
Does it slow down directory fetches in general?
- onion services (rend is 4% of total bandwidth to guard and middle)
- HSDir is unweighted, and we don't know how much bandwidth it uses
- FallbackDir is unweighted, but mostly Guards, and small
That's difficult to predict; I can't be sure whether it is better or worse for that type of traffic, since internal circuits use at least 2 middle relays plus the guard, and sometimes not even the guard. Hence we might also think that pushing a bit more bandwidth to the middle position could be a good thing to do. Moreover, middle relays are unstable and often at the edge of the internet, while guards are stable, and most of them are within the core of the internet. Hence, a little more of them in the middle position *could* be a good thing, especially if it makes the entry selection probability more uniform. Anyway, I don't have any proof that this optimization is good, nor any proof that it is bad. What I do have is that, for exit circuits, it does not slow anything down.
This optimization is not mandatory, and could also be enabled/disabled at will by the directory auths.
We show in the performance section of the Waterfilling paper that scarcity on two positions does not reduce performance compared to vanilla bandwidth-weights.
What about guards that have low consensus weight due to latency, rather than available bandwidth?
I think this could also cause you huge latency issues as you push more bandwidth away from fast relays. I'm not sure if shadow captures this accurately.
If any bandwidth is pushed away from fast relays in the entry position and makes the entry position slower on average, then it will make the middle position faster (because that position receives the bandwidth pushed away). Since the latency of your traffic flow only depends on the overall latency of the circuit, this will not appear slower or faster, on average. This is exactly what we observe in Shadow, and yes, it captures latency accurately. At least, better than any other simulator.
I am also concerned that this proposal may overload small guards and underload large guards. Because it does not consider advertised bandwidth.
For example, some guards are weighted at 10x their advertised bandwidth. Others are weighted at 0.1x their advertised bandwidth.
There is also a geographical bias to guard bandwidth: https://atlas.torproject.org/#map_consensus_weight_to_bandwidth/flag:Guard
Does shadow capture the consensus weight to advertised bandwidth ratio? (That is, the actual load on the relay.) Or the advertised bandwidth? (The approximate actual capacity of the relay.)
I think this proposal will actually be good for the network, because some large consensus weight relays are overloaded. And giving them more middle weight will reduce their overall load. (Middles get less load than guards.)
But I would like others to think through this, too.
3 Security implications
An analysis of the security impact of the Waterfilling proposal is made in Section 6 of the Waterfilling paper. It studies the expectation of the number of relays that an adversary needs to control in order to mount a successful correlation attack at a given time, as well as an analysis of the evolution of the time until first path compromise, based on TorPS.
Given that the distribution produced by Waterfilling is closer to the uniform distribution, the use of Waterfilling increases the expectation of the number of relays that an adversary needs to compromise in order to mount a successful correlation attack. Concretely, based on real data from 2015, this expectation increases by approximately 150 relays.
What percentage is this?
~ 25 %
<skip>
I'm sorry, I can't read this, there are build files everywhere. I can't even find which files were changed. Can you provide clean commits in a fork, based on the Tor master branch?
Also, I think you will need a new consensus method for this change. We don't use consensus parameters or torrc options to modify the content of the consensus, we use consensus methods.
Sorry about this. I've updated the repo with a proper commit history based on my fork. The code gives an idea of how easy it is to plug this method into the current path selection.
I tried reviewing this code, but it's hard. The changes are scattered through a lot of different commits.
Usually, we group changes into commits by topic or function area. And we use the commit title to say what the commit changes.
Then, when we fix a bug, we use "git commit --fixup" so we can see when the bug was fixed.
For example, I have to look at 20 commits to find out if this bug in the array size for commit e81a55ea has been fixed:
+ if (exits) {
+   if (search_pivot_and_compute_wfbw_weights_(exits, bwweights->wee, 0,
+                                              guards->num_used) < 0)
This function is also missing a case for guardsexits.
<skip>
You skipped this, but it's really important for the implementation:
On 12 Jan 2018, at 00:47, teor teor2345@gmail.com wrote:
Also, I think you will need a new consensus method for this change. We don't use consensus parameters or torrc options to modify the content of the consensus, we use consensus methods.
Someone who understands consensus methods will need to revise your proposal and code before it is merged.
Here is how it works:
Every hour, directory authorities vote on a new consensus. Once the votes are collected, the dirauths produce a deterministic network consensus document containing information about each relay, including the waterfilling bandwidth-weights produced from the equations described above. e.g.:
...(more relays)
r relayguard34 PPTH75+WkHl1GGw07DRE/S+JNdo 1970-01-01 00:02:37 51.254.112.52 9111 9112
m lp08MLTivsSZPhpZQy88i8NPeBNx10tPKBpHZsM3gYM
s Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=10200
wfbw wgg=8029 wmg=1971
r 4uthority3 PYnzXGQ+67m0WtO64qtJkwsUzME 1970-01-01 00:02:33 11.0.5.71 9111 9112
m d8G2/8UQzAN3a9DixCtmaivhfUFTvlFKAxCAV1xHVKk
s Authority Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=1890
wfbw wgg=10000 wmg=0
...(more relays)
In this example, we see two relays having the Guard flag and their new waterfilling bandwidth allocation given on the lines starting with wfbw.
These are redundant, and could significantly expand the size of the consensus, and consensus diffs. (Consensus weights are one of the largest contributors to consensus diff size.)
True: each time the consensus weights are modified, those waterfilling weights need to be recomputed. It adds one line per guard, which is about 2-3% of the size of the full consensus. Is that a problem?
Changing bandwidths in each consensus makes consensus diffs *much* bigger.
See this proposal for details and an analysis: https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granul...
The same issue applies to any bandwidth figures. Would it be sufficient to publish a small ratio instead?
For example, you could publish lines like:
w Bandwidth=100000 wfratio=1
or
w Bandwidth=100000 wfratio=3
And have clients calculate wfbw as follows:
wfbw = Bandwidth * 2^-wfratio
This would be much smaller, much more compressible, and would also solve some of your other issues around remainders, and also equal-weight guards and stable sorts.
You would need to use equations something like this:
Compute a "pivot" N and a weight ratio Wggr_i assigned to relay_i in such a way that:
Each high-bandwidth guard has a guard consensus weight within a factor of 2 of water level:
(a) 2 * 2^-Wggr_(N+1) * BW_(N+1) > 2^-Wggr_i * BW_i >= 2^-Wggr_(N+1) * BW_(N+1) forall i in [1, N]
Each low-bandwidth guard is under the water level: (unchanged)
(b) Wggr_i == 0 forall i in [N+1, K]
The total guard weight is within a ratio of the desired guard weight:
(c) 2 * Wgg * G > sum_{i=1}^{K} 2^-Wggr_i * BW_i >= Wgg * G
The upper bound on (c) is actually a worst-case for a single-relay network. We could get closer to Wgg * G by finding the lowest Wggr for the largest relay that satisfies:
(c.1) sum_{i=1}^{K} 2^-Wggr_i * BW_i >= Wgg * G
This might make the largest relay an exception to (a). I'm not sure if we'll need this optimisation or not.
Why not calculate wmg on clients? Why not calculate both wgg and wmg on clients?
This is again a very good question: for such a critical feature (path selection), it is important that the directory auths have full power over the weight computation. If some change is needed, the Tor network can then be updated in a straightforward way. This is not the case if those weights are computed in client code. In fact, I believe that one of the strengths of this proposal is its design oriented towards the dirauths.
Consensus changes on dirauths:
* need to use consensus methods
* require a 6-12 month deployment time
* increase the size of all directory documents
* change the whole network
* if designed carefully, can be turned off using a network-wide parameter
Weight changes on clients:
* require a 3-6 month deployment time
* can be deployed to small numbers of clients
* if designed carefully, can be turned off using a network-wide parameter
I don't know which we'll prefer if we deploy this.
You skipped this, but it's really important for the implementation:
If you must keep them in the consensus, please put these on the existing "w" lines, which are intended for this purpose:
"Other weighting keywords may be added later. Clients MUST ignore keywords they do not recognize."
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2234
Please revise the proposal to conform to the existing directory spec.
Unanswered questions:
What about the feedback loop between this new allocation system and the bandwidth authorities?
I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seem orthogonal to me.
You do not need to add a feedback loop, one already exists:
1. Consensus weights on guards and middles change
2. Client use of guards and middles change
3. Bandwidth authority measurements of guards and middles change
4. Repeat from 1
My question is:
How does this existing feedback loop affect your proposal? Does it increase or reduce the size of the guard and middle weight changes?
Should bandwidth authority clients use the new system?
How do we report on the new system? How can we determine whether it is better for security in practice? How can we determine if it is faster or slower in practice? How can we work out if someone is trying to game the system?
I've added a section within the proposal for all upcoming questions.
Thanks.
If you skip sections with unanswered questions or issues that need to be fixed, please put them in this section as well. Otherwise, it's easy to lose track of them.
Can you store the proposal in a git repository somewhere, so people can see what you've changed in each revision? Otherwise, they have to re-read the whole proposal each time.
T
-- Tim / teor
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
Hello,
Thank you again for your valuable comments and efforts for improving!
I have added the proposal here: https://github.com/frochet/wf_proposal. Btw, the slides from the talk might help anyone who hesitates to dive into the proposal: you can browse them at http://ndouuqkqdsd6v56h.onion/ to get the big picture (especially slides 14.1, 14.2, .3, .4 and .5).
On 15/01/18 03:26, teor wrote:
On 13 Jan 2018, at 01:17, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hello,
Thank you for your helpful review, teor.
I updated the proposal from most of your comments (see attached .txt) and I respond inline to add some precisions relative to a few of your questions.
Btw, I've mirrored my private repo to github https://github.com/frochet/Waterfilling, such that you have the proper commit history.
Thanks!
I replied below in context.
On 2018-01-11 14:47, teor wrote:
Hi Florentin,
I have copied your proposal below, so I can respond to it inline. ...
Hence, in addition to selecting paths in a probabilistic manner, the path selection algorithm is responsible for balancing the network, that is, making sure that enough bandwidth is made available in all positions. The current balance of the network is decided by the bandwidth-weights (see dir-spec.txt section 3.8.3 and/or the Waterfilling PETS2017 paper https://petsymposium.org/2017/papers/issue2/paper05-2017-2-source.pdf). This balance does not achieve maximum diversity in the end positions of user paths: the same network throughput could be achieved by decreasing the use of high-bandwidth relays and increasing the use of lower-bandwidth relays in the guard position, instead of using these relays in a way that is simply proportional to their bandwidth.
When you say "bandwidth", it's not clear whether you mean consensus weight (measured bandwidth) or advertised bandwidth (bandwidth capacity). They're not the same.
I'm going to assume consensus weight from now on. Please fix all uses of bandwidth in the rest of the proposal.
Yes, we mean consensus weight :) I did s/bandwidth/consensus weight within the proposal
In some cases, you seem to expect the consensus weight to be the available bandwidth (or advertised bandwidth) of the relay. I'll try to find them, but you should check as well.
Ok, I will look into it. If you see some, let me know :)
...
<skip>
> Compute a "pivot" N and the weight Wgg_i assigned to relay_i in such
> a way that:
>
> (a) Wgg_i * BW_i == Wgg_i+1 * BW_i+1 forall i in [1, N]

subscripts in brackets, otherwise it's ambiguous:

Wgg_(i+1) * BW_(i+1)

> (b) Wgg_i == 1 forall i in [N+1, K]
> (c) sum_{i=1}^{K} Wgg_i*BW_i = Wgg*G (Wgg is provided by Tor)
These equations are under-specified, because they allow solutions with: Wgg*G > 0 Wgg_1 == 0 That is, no guard selections for high-bandwidth relays.
This is a really important point:
Your equations allow the largest guard to have zero guard weighting and zero guard bandwidth. They should require it to have at least as much guard bandwidth as the water level.
From the descriptions, I think the missing condition is: Wgg_N * BW_N >= Wgg_(N+1) * BW_(N+1)
Yes, this can be added. But I think that this condition is redundant, since BW_i are sorted in decreasing order.
No, it's not, because it's a condition on the *guard-weighted* consensus weights. The sorting is a condition on the *original* consensus weights.
I have added the condition, thanks :)
It might help to provide a procedural algorithm for assigning bandwidth, as well as equations. It would resolve some ambiguity.
I'm not sure what 'bandwidth' signifies here. I can provide a procedural algorithm to compute the waterfilling bandwidth-weights, if that is what you meant?
Also, Wgg is provided by the Tor directory authorities based on consensus weights from the bandwidth authorities.
And what happens to any remainder in the calculations?
This is a very good question. Currently, my implementation ignores the remainder. This is negligible for a large network but can be weird for a small one (of a few relays).
A possible solution would be to use floating-point precision for consensus weights.
No, this is not a good solution. It simply passes the issue to the floating-point implementation. And the result will depend on the libraries, processor, and compiler.
For this reason, we work very hard to use as little floating point as possible in Tor. We really don't want to use it when generating the consensus, because mismatches will break consensus.
Instead, I suggest you apply this missing condition:
From the descriptions, I think the missing condition is: Wgg_N * BW_N >= Wgg_(N+1) * BW_(N+1)
And then allocate the remainder to the highest-bandwidth relays, perhaps in proportion to their consensus weight. Please allocate bandwidth to equally weighted relays using a stable sort that is the same on all directory authorities. (The existing relay sorting code might already do this for you.)
I've added the condition and also added an implementation-notes section with your comments. Thanks for pointing this out!
(This is most important for small, low bandwidth test networks.)
For example, if:
  G = 10
  Wgg = 0.6
  BW_i = 6, 2, 2
What are the final weighted bandwidths? 2, 2, 2?
As noted, my current implementation would crash if the water level reaches the smallest relay. Since it was prototype code, I didn't take the time to think about it, and I left it that way.
We would need a solution for this crash as part of the proposal.
Added in implementation note.
I think that below a certain network size, it does not make sense to use this proposal. In this example, the remainder accounts for a large part of the network capacity and would just be wasted.
We will test this in small networks, and it must work. I suggest you allocate the remainder to high-bandwidth relays.
Fine with me :) Added in implementation note.
...
<skip>
> 2.1 Going further by tweaking original bandwidth-weights computation
>
> As explained above, our Waterfilling equations are based on: 1) the
> Wgg weight computed by Tor 2) the assumption that the bandwidth
> available in exit is scarce, i.e., it is lower than the one
> available for guard (and middle) positions.
>
> The second point is satisfied most of the time in Tor, and we do take
> it for granted here.
>
> We, however, observe that Waterfilling could be made even more
> effective by applying a minor change in the way Tor computes the
> Wgg. For the moment, Tor computes Wgg in such a way that the same
> bandwidth is allocated to the guard and middle positions. As a
> result, both positions are in excess compared to the exit position.
>
> The water level could be decreased and, as a result, the uniformity
> of the guard selection process could be improved, by computing Wgg
> in a way that allocates the same bandwidth to the guard and exit
> positions, putting the middle position as the only position in
> excess.

No, this would slow down Guards, and everything that isn't an exit circuit:
- directory fetches (3% of total bandwidth to guard position)
If we reduce guard weights on large relays, does this slow down Tor bootstrap? (Because smaller relays with higher latency get more micro descriptor requests.)
Does it slow down directory fetches in general?
Assuming the smallest guards have a higher latency on average, this could indeed slow down directory fetches, and then maybe it would slow down Tor bootstrapping. I guess that measuring this with Shadow is possible, as well as collecting statistics from a small set of deployed Tor clients.
- onion services (rend is 4% of total bandwidth to guard and middle)
- HSDir is unweighted, and we don't know how much bandwidth it uses
- FallbackDir is unweighted, but mostly Guards, and small
That's difficult to predict; I can't be sure whether it is better or worse for that type of traffic, since internal circuits use at least 2 middle relays plus the guard, and sometimes not even the guard. Hence we might also think that pushing a bit more bandwidth to the middle position could be a good thing to do. Moreover, middle relays are unstable and often at the edge of the internet, while guards are stable, and most of them are within the core of the internet. Hence, a little more of them in the middle position *could* be a good thing, especially if it makes the entry selection probability more uniform. Anyway, I don't have any proof that this optimization is good, nor any proof that it is bad. What I do have is that, for exit circuits, it does not slow anything down.
This optimization is not mandatory, and could also be enabled/disabled at will by the directory auths.
We show in the performance section of the Waterfilling paper that scarcity on two positions does not reduce performance compared to vanilla bandwidth-weights.
What about guards that have low consensus weight due to latency, rather than available bandwidth?
I think this could also cause you huge latency issues as you push more bandwidth away from fast relays. I'm not sure if shadow captures this accurately.
If any bandwidth is pushed away from fast relays in the entry position and makes the entry position slower on average, then it will make the middle position faster (because that position receives the bandwidth pushed away). Since the latency of your traffic flow only depends on the overall latency of the circuit, this will not appear slower or faster, on average. This is exactly what we observe in Shadow, and yes, it captures latency accurately. At least, better than any other simulator.
I am also concerned that this proposal may overload small guards and underload large guards. Because it does not consider advertised bandwidth.
I've added this concern to the 'unanswered questions' section. This proposal assumes relay measurements (consensus weights) are reliable. If this is not the case, then strange things might happen. In our Shadow results, we have seen that the tails of the CDFs for measurements were a little bit "longer", meaning that the few worst circuits (3%) obtainable with the current path selection were a little bit worse with Waterfilling on a loaded network (no change on a lightly loaded network). And this fraction of worse circuits was even reduced (to less than 1%) when we applied Section 2.1 of the proposal (moving more bandwidth to the middle position).
How does the current path selection consider advertised bandwidth? (Apart from the bwauths when measuring relays?)
For example, some guards are weighted at 10x their advertised bandwidth. Others are weighted at 0.1x their advertised bandwidth.
There is also a geographical bias to guard bandwidth: https://atlas.torproject.org/#map_consensus_weight_to_bandwidth/flag:Guard
Yep, thanks for the link. Atlas has become nicer :) For what it's worth, a possible explanation for this bias is how big ISPs route internal traffic within their own AS: most internal links are good 10 Gbit optic fiber, while connections to the outside world mostly go through smaller links. Bwauths can be tricked into believing that some relays have larger capacity, because they have built circuits with relays internally connected over strong 10 Gbit links. I don't know how the bwauths account for such things? But that is another topic of discussion.
Does shadow capture the consensus weight to advertised bandwidth ratio? (That is, the actual load on the relay.) Or the advertised bandwidth? (The approximate actual capacity of the relay.)
Well, Shadow uses both. See the doc of the function https://github.com/shadow/shadow-plugin-tor/blob/master/tools/generate.py#L8... used to generate the topology and set up link speeds from the relay's advertised bandwidth. Then, for path selection, Shadow uses the measured bandwidth from the torflow plugins or from a static v3bw file.
I think this proposal will actually be good for the network, because some large consensus weight relays are overloaded. And giving them more middle weight will reduce their overall load. (Middles get less load than guards.)
But I would like others to think through this, too.
Cool :) And regarding the main point of this proposal, removing the top-guard threat to anonymity and hardening against path compromise: do you think the method presented here is valuable? And how does this weigh against the performance concerns?
3 Security implications
An analysis of the security impact of the Waterfilling proposal is made in Section 6 of the Waterfilling paper. It studies the expectation of the number of relays that an adversary needs to control in order to mount a successful correlation attack at a given time, as well as an analysis of the evolution of the time until first path compromise, based on TorPS.
Given that the distribution produced by Waterfilling is closer to the uniform distribution, the use of Waterfilling increases the expectation of the number of relays that an adversary needs to compromise in order to mount a successful correlation attack. Concretely, based on real data from 2015, this expectation increases by approximately 150 relays.
What percentage is this?
~ 25 %
<skip>
I'm sorry, I can't read this, there are build files everywhere. I can't even find which files were changed. Can you provide clean commits in a fork, based on the Tor master branch?
Also, I think you will need a new consensus method for this change. We don't use consensus parameters or torrc options to modify the content of the consensus, we use consensus methods.
Sorry about this. I've updated the repo with a proper commit history based on my fork. The code gives an idea of how easily this method plugs into the current path selection.
I tried reviewing this code, but it's hard. The changes are scattered through a lot of different commits.
Usually, we group changes into commits by topic or function area. And we use the commit title to say what the commit changes.
Then, when we fix a bug, we use "git commit --fixup" so we can see when the bug was fixed.
I will pay attention to that, thanks for the advice.
For example, I have to look at 20 commits to find out if this bug in the array size for commit e81a55ea has been fixed:
- if (exits) {
if (search_pivot_and_compute_wfbw_weights_(exits, bwweights->wee, 0,
guards->num_used) < 0)
This function is also missing a case for guardsexits.
It appears afterwards, in another commit. Yep, sorry for the mess.
<skip>
You skipped this, but it's really important for the implementation:
On 12 Jan 2018, at 00:47, teor teor2345@gmail.com wrote:
Also, I think you will need a new consensus method for this change. We don't use consensus parameters or torrc options to modify the content of the consensus, we use consensus methods.
Someone who understands consensus methods will need to revise your proposal and code before it is merged.
Added in the section "implementation note".
Here is how it works:
Every hour, directory authorities vote on a new consensus. Once the votes are collected, the dirauths produce a deterministic network consensus document containing information about each relay, including the waterfilling bandwidth-weights produced from the equations described above. e.g.:
...(more relays)
r relayguard34 PPTH75+WkHl1GGw07DRE/S+JNdo 1970-01-01 00:02:37 51.254.112.52 9111 9112
m lp08MLTivsSZPhpZQy88i8NPeBNx10tPKBpHZsM3gYM
s Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=10200
wfbw wgg=8029 wmg=1971
r 4uthority3 PYnzXGQ+67m0WtO64qtJkwsUzME 1970-01-01 00:02:33 11.0.5.71 9111 9112
m d8G2/8UQzAN3a9DixCtmaivhfUFTvlFKAxCAV1xHVKk
s Authority Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=1890
wfbw wgg=10000 wmg=0
...(more relays)
In this example, we see two relays with the Guard flag; their new waterfilling bandwidth allocation is given on the lines starting with wfbw.
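To make the wfbw lines above concrete, here is a toy sketch of how the water level and the per-guard wgg/wmg split could be computed from consensus weights. The function names, the rounding, and the exact search are mine, not the proposal's reference code; weights are expressed out of 10000, as in the consensus.

```python
def water_level(guard_bw, target):
    """Find the water level L and pivot N such that guards above the pivot
    contribute exactly L to the guard position, guards below contribute all
    of their weight, and the guard-position total equals `target`
    (i.e. Wgg * G in bandwidth units). Toy sketch, not Tor code."""
    bw = sorted(guard_bw, reverse=True)
    rest = sum(bw)                      # weight of guards at or below the pivot
    for n in range(1, len(bw) + 1):
        rest -= bw[n - 1]
        level = (target - rest) / n     # level if the top n guards are clipped
        below = bw[n] if n < len(bw) else 0.0
        if bw[n - 1] >= level >= below:
            return level, n
    raise ValueError("target exceeds total guard weight")

def wf_weights(guard_bw, level):
    """Split each guard's weight into (wgg, wmg) out of 10000: guards above
    the water level keep only `level` in the guard position, and the
    remainder is reallocated to the middle position."""
    out = []
    for bw in guard_bw:
        wgg = 10000 if bw <= level else round(10000 * level / bw)
        out.append((wgg, 10000 - wgg))
    return out
```

For example, with guard weights [10, 5, 3] and a guard-position target of 12, the water level is 4.5 with pivot 2: the two largest guards are clipped to 4.5 each, the smallest keeps its full weight, and 4.5 + 4.5 + 3 = 12.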
These are redundant, and could significantly expand the size of the consensus, and consensus diffs. (Consensus weights are one of the largest contributors to consensus diff size.)
True: each time the consensus weights are modified, those waterfilling weights need to be recomputed. This adds one line per guard, which is about 2-3% of the size of the full consensus. Is that a problem?
Changing bandwidths in each consensus makes consensus diffs *much* bigger.
See this proposal for details and an analysis: https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granul...
OK. If I understand correctly, any bandwidth change would require a full re-calculation and a broadcast of the new weights (one per guard) in the next consensus diff. I didn't imagine that this could be a problem. But if consensus diff size matters, then moving the weight calculation to the client code would fix it.
The same issue applies to any bandwidth figures. Would it be sufficient to publish a small ratio instead?
For example, you could publish lines like:
w Bandwidth=100000 wfratio=1
or
w Bandwidth=100000 wfratio=3
And have clients calculate wfbw as follows:
wfbw = Bandwidth * 2^-wfratio
This would be much smaller, much more compressible, and would also solve some of your other issues around remainders, and also equal-weight guards and stable sorts.
You would need to use equations something like this:
Compute a "pivot" N and a weight ratio Wggr_i assigned to relay_i in such a way that:
Each high-bandwidth guard has a guard consensus weight within a factor of 2 of water level:
(a) 2 * 2^-Wggr_(N+1) * BW_(N+1) > 2^-Wggr_i * BW_i >= 2^-Wggr_(N+1) * BW_(N+1) forall i in [1, N]
Each low-bandwidth guard is under the water level: (unchanged)
(b) Wggr_i == 0 forall i in [N+1, K]
The total guard weight is within a ratio of the desired guard weight:
(c) 2 * Wgg * G > sum_{i=1}^{K} 2^-Wggr_i * BW_i >= Wgg * G
The upper bound on (c) is actually a worst-case for a single-relay network. We could get closer to Wgg * G by finding the lowest Wggr for the largest relay that satisfies:
(c.1) sum_{i=1}^{K} 2^-Wggr_i * BW_i >= Wgg * G
This might make the largest relay an exception to (a). I'm not sure if we'll need this optimisation or not.
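A sketch of these power-of-two equations, under my own reading of (a)-(c): the pivot's weight BW_(N+1) acts as the water level, each guard above it gets the ratio that drops it within a factor of two of that level, and the pivot is pushed as deep as (c)'s lower bound allows. This is an illustration of the idea only, not vetted code.

```python
import math

def wf_ratios(guard_bw, target):
    """Return per-guard ratios r_i so that clients can compute
    wfbw_i = BW_i * 2**-r_i. Guards at or below the pivot keep r_i = 0;
    guards above it get r_i = floor(log2(BW_i / BW_pivot)), which puts
    2**-r_i * BW_i in [BW_pivot, 2 * BW_pivot), as required by (a)."""
    bw = sorted(guard_bw, reverse=True)
    k = len(bw)
    best = [0] * k                       # no scaling at all
    for n in range(1, k):                # candidate pivot: water level = bw[n]
        ratios = [int(math.log2(bw[i] / bw[n])) for i in range(n)] + [0] * (k - n)
        total = sum(b * 2.0 ** -r for b, r in zip(bw, ratios))
        if total < target:               # (c) lower bound violated: stop
            break
        best = ratios                    # deeper pivot moves the total closer to Wgg * G
    return best
```

For guard weights [16, 5, 3] and target 12, this yields ratios [2, 0, 0]: the big guard is scaled to 16 * 2^-2 = 4, within a factor of two of the water level 3, and the total 4 + 5 + 3 = 12 meets (c)'s lower bound.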
Wow, so first of all, thanks for suggesting new ways of resolving issues. I really appreciate your consideration for this work.
This suggestion indeed reduces the diff problem, and I find it elegant. But, I have the following concerns:
- The upper bound in (a) is huge, and would be welcome to an adversary running relays. The adversary could set up relays with almost twice the consensus weight of the water level and still be used at 100% in the entry position. This would greatly reduce the benefits of this proposal, right?
- The analysis in our paper holds for the equations we showed. Even if these ones are very similar, I am not sure we can say it would be as good without re-doing the research.
With your explanations below (weight change on clients), and given that consensus diff size is a concern, I am leaning towards believing that the weight calculation should be done on clients. Anyway, I have added a remark about this possibility to the proposal.
Why not calculate wmg on clients? Why not calculate both wgg and wmg on clients?
This is again a very good question: for such a critical feature (path selection), it is important that the directory auths have full power over the weight computation. If some change is needed, the Tor network can then be updated in a straightforward way. This is not the case if those weights are computed in client code. In fact, I believe that one of the strengths of this proposal is its design oriented towards the dirauths.
Consensus changes on dirauths:
- need to use consensus methods
- require a 6-12 month deployment time
- increase the size of all directory documents
- change the whole network
- if designed carefully, can be turned off using a network-wide parameter
Weight changes on clients:
- require a 3-6 month deployment time
- can be deployed to small numbers of clients
- if designed carefully, can be turned off using a network-wide parameter
I don't know which we'll prefer if we deploy this.
I have added a subsection in the proposal with those lines. Looks like we have to agree on this topic first before going into the details of the implementation section?
You skipped this, but it's really important for the implementation:
If you must keep them in the consensus, please put these on the existing "w" lines, which are intended for this purpose:
"Other weighting keywords may be added later. Clients MUST ignore keywords they do not recognize."
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2234
Please revise the proposal to conform to the existing directory spec.
OK, I have modified the example and opened discussion for other directions within the implementation (client side and your suggestion above).
Unanswered questions:
What about the feedback loop between this new allocation system and the bandwidth authorities?
I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seems orthogonal to me.
You do not need to add a feedback loop, one already exists:
- Consensus weights on guards and middles change
- Client use of guards and middles change
- Bandwidth authority measurements of guards and middles change
- Repeat from 1
My question is:
How does this existing feedback loop affect your proposal? Does it increase or reduce the size of the guard and middle weight changes?
I have added those questions to the proposal. This looks difficult to know.
Should bandwidth authority clients use the new system?
How do we report on the new system? How can we determine whether it is better for security in practice? How can we determine if it is faster or slower in practice? How can we work out if someone is trying to game the system?
I've added a section within the proposal for all upcoming questions.
Thanks.
If you skip sections with unanswered questions or issues that need to be fixed, please put them in this section as well. Otherwise, it's easy to lose track of them.
I should have added everything now. Thanks!
Best, Florentin
Can you store the proposal in a git repository somewhere, so people can see what you've changed in each revision? Otherwise, they have to re-read the whole proposal each time.
T
-- Tim / teor
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B ricochet:ekmygaiu4rzgsk6n
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On 18 Jan 2018, at 08:10, Florentin Rochet florentin.rochet@uclouvain.be wrote: ...
It might help to provide a procedural algorithm for assigning bandwidth, as well as equations. It would resolve some ambiguity.
I'm not sure I understand what 'bandwidth' means here. I can provide a procedural algorithm to compute the waterfilling bandwidth-weights, if that is what you meant?
Yes.
...
...
<skip>

> 2.1 Going further by tweaking original bandwidth-weights computation
>
> As explained above, our Waterfilling equations are based on: 1) the
> Wgg weight computed by Tor 2) the assumption that the bandwidth
> available in exit is scarce, i.e., it is lower than the one
> available for guard (and middle) positions.
>
> The second point is satisfied most of the time in Tor, and we do take
> it for granted here.
>
> We, however, observe that Waterfilling could be made even more
> effective by applying a minor change in the way Tor computes the
> Wgg. For the moment, Tor computes Wgg in such a way that the same
> bandwidth is allocated to the guard and middle positions. As a
> result, both positions are in excess compared to the exit position.
>
> The water level could be decreased and, as a result, the uniformity
> of the guard selection process could be improved, by computing Wgg
> in a way that allocates the same bandwidth to the guard and exit
> positions, putting the middle position as the only position in
> excess.

No, this would slow down Guards, and everything that isn't an exit circuit:
- directory fetches (3% of total bandwidth to guard position)
If we reduce guard weights on large relays, does this slow down Tor bootstrap? (Because smaller relays with higher latency get more micro descriptor requests.)
Does it slow down directory fetches in general?
Assuming the smallest guards have a higher latency on average, this could indeed slow down directory fetches, and then maybe Tor bootstrapping. I guess measuring this with Shadow is possible, as is collecting statistics on a small set of deployed Tor clients.
This would be helpful. Although the overall effect might not be measurable. If it is, we could use the unmodified guard weights for directory fetches.
- onion services (rend is 4% of total bandwidth to guard and middle)
- HSDir is unweighted, and we don't know how much bandwidth it uses
- FallbackDir is unweighted, but mostly Guards, and small
That's difficult to predict. I cannot be sure whether it is better or worse for that type of traffic, since internal circuits use at least 2 middle relays plus the guard, and sometimes not even the guard. Hence we might also think that pushing a bit more bandwidth to the middle position could be a good thing to do. Moreover, middle relays are unstable and often at the edge of the internet, while guards are stable and mostly within the core of the internet. Hence, a few more of them in the middle position *could* be a good thing, especially if it makes the entry selection probability more uniform. Anyway, I have no proof that this optimization is good, just as I have no proof that it is bad. What I do know is that, for exit circuits, it does not slow anything down.
This optimization is not mandatory, and could also be enabled/disabled at will by the directory auths.
We show in the performance section of the Waterfilling paper that scarcity on two positions does not reduce performance compared to vanilla bandwidth-weights.
What about guards that have low consensus weight due to latency, rather than available bandwidth?
I think this could also cause you huge latency issues as you push more bandwidth away from fast relays. I'm not sure if shadow captures this accurately.
If any bandwidth is pushed away from fast relays in the entry position, making the entry position slower on average, then it will make the middle position faster (because that position receives the bandwidth pushed away). Since your traffic flow only cares about the overall latency of the circuit, this will not appear slower or faster on average. This is exactly what we observe in Shadow, and yes, it captures latency accurately. At least, better than any other simulator.
I am also concerned that this proposal may overload small guards and underload large guards. Because it does not consider advertised bandwidth.
I've added this concern to the 'unanswered questions' section. This proposal assumes relay measurements are reliable (consensus weight).
How reliable?
Current variance is 30% - 40% between identical bandwidth authorities, and 30% - 60% between all bandwidth authorities.
Sources: https://tomrittervg.github.io/bwauth-tools/#apples-to-apples-comparison https://tomrittervg.github.io/bwauth-tools/#updated-01
Is this sufficient?
If this is not the case, then strange things might happen. In our Shadow results, we have seen that the tails of the CDFs for measurements were a little bit "longer", meaning that the few worst circuits (3%) that we obtain with the current path selection were a little bit worse with Waterfilling on a loaded network (no change on a light network).
This might not be great. UX studies show that users notice the worst delays, and the variance between the best and worst delays.
And this fraction of worse circuits was reduced further (to less than 1%) when we applied Section 2.1 of the proposal (moving more bandwidth to the middle position).
Maybe we should do this then.
How does the current path selection consider advertised bandwidth? (Apart from the bwauths using it when measuring relays?)
It does not, because advertised bandwidth is easily modified by each relay.
For example, some guards are weighted at 10x their advertised bandwidth. Others are weighted at 0.1x their advertised bandwidth.
There is also a geographical bias to guard bandwidth: https://atlas.torproject.org/#map_consensus_weight_to_bandwidth/flag:Guard
Yep, thanks for the link. Atlas keeps getting nicer :) For what it's worth, a possible explanation for this bias is how big ISPs route internal traffic within their own AS: most internal links are good 10 Gbit optic fibre, while connections to the outside world mostly go through smaller links. Bwauths can be tricked into believing that some relays have larger capacity, because those relays build circuits with relays internally connected over strong 10 Gbit links. I don't know how the bwauths account for such a thing? But that is another topic of discussion.
They do not.
Does shadow capture the consensus weight to advertised bandwidth ratio? (That is, the actual load on the relay.) Or the advertised bandwidth? (The approximate actual capacity of the relay.)
Well, Shadow uses both. See the doc of the function https://github.com/shadow/shadow-plugin-tor/blob/master/tools/generate.py#L8... used to generate the topology and set link speeds from each relay's advertised bandwidth. Then, for path selection, Shadow uses the bandwidth measured by the torflow plugins, or a static v3bw file.
Ok, this is pretty close to the public network.
I think this proposal will actually be good for the network, because some large consensus weight relays are overloaded. And giving them more middle weight will reduce their overall load. (Middles get less load than guards.)
But I would like others to think through this, too.
Cool :) And what about the main point of this proposal: removing the top-guard threat to anonymity and hardening against path compromise? Do you think that the method presented here is valuable?
You assume that many smaller relays are harder to run/compromise than fewer large relays. I do not know if this is true. But it seems that limited IPv4 address space makes it partly true.
Maybe it will make malicious operators split large relays into smaller relays. This is bad for client anonymity in general. But so are malicious operators.
And how does this 'weigh' against performance concerns?
It seems ok to me, if we can limit the increase in the slow tail.
...
Here is how it works:
Every hour, directory authorities vote on a new consensus. Once the votes are collected, the dirauths produce a deterministic network consensus document containing information about each relay, including the waterfilling bandwidth-weights produced from the equations described above. e.g.:
...(more relays)
r relayguard34 PPTH75+WkHl1GGw07DRE/S+JNdo 1970-01-01 00:02:37 51.254.112.52 9111 9112
m lp08MLTivsSZPhpZQy88i8NPeBNx10tPKBpHZsM3gYM
s Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=10200
wfbw wgg=8029 wmg=1971
r 4uthority3 PYnzXGQ+67m0WtO64qtJkwsUzME 1970-01-01 00:02:33 11.0.5.71 9111 9112
m d8G2/8UQzAN3a9DixCtmaivhfUFTvlFKAxCAV1xHVKk
s Authority Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.6.7
w Bandwidth=1890
wfbw wgg=10000 wmg=0
...(more relays)
In this example, we see two relays with the Guard flag; their new waterfilling bandwidth allocation is given on the lines starting with wfbw.
These are redundant, and could significantly expand the size of the consensus, and consensus diffs. (Consensus weights are one of the largest contributors to consensus diff size.)
True: each time the consensus weights are modified, those waterfilling weights need to be recomputed. This adds one line per guard, which is about 2-3% of the size of the full consensus. Is that a problem?
Changing bandwidths in each consensus makes consensus diffs *much* bigger.
See this proposal for details and an analysis: https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granul...
OK. If I understand correctly, any bandwidth change would require a full re-calculation and a broadcast of the new weights (one per guard) in the next consensus diff. I didn't imagine that this could be a problem. But if consensus diff size matters, then moving the weight calculation to the client code would fix it.
Consensus diff size does matter, particularly on mobile and slow connections.
The same issue applies to any bandwidth figures. Would it be sufficient to publish a small ratio instead?
For example, you could publish lines like:
w Bandwidth=100000 wfratio=1
or
w Bandwidth=100000 wfratio=3
And have clients calculate wfbw as follows:
wfbw = Bandwidth * 2^-wfratio
This would be much smaller, much more compressible, and would also solve some of your other issues around remainders, and also equal-weight guards and stable sorts.
You would need to use equations something like this:
Compute a "pivot" N and a weight ratio Wggr_i assigned to relay_i in such a way that:
Each high-bandwidth guard has a guard consensus weight within a factor of 2 of water level:
(a) 2 * 2^-Wggr_(N+1) * BW_(N+1) > 2^-Wggr_i * BW_i >= 2^-Wggr_(N+1) * BW_(N+1) forall i in [1, N]
Each low-bandwidth guard is under the water level: (unchanged)
(b) Wggr_i == 0 forall i in [N+1, K]
The total guard weight is within a ratio of the desired guard weight:
(c) 2 * Wgg * G > sum_{i=1}^{K} 2^-Wggr_i * BW_i >= Wgg * G
The upper bound on (c) is actually a worst-case for a single-relay network. We could get closer to Wgg * G by finding the lowest Wggr for the largest relay that satisfies:
(c.1) sum_{i=1}^{K} 2^-Wggr_i * BW_i >= Wgg * G
This might make the largest relay an exception to (a). I'm not sure if we'll need this optimisation or not.
Wow, so first of all, thanks for suggesting new ways of resolving issues. I really appreciate your consideration for this work.
This suggestion indeed reduces the diff problem, and I find it elegant. But, I have the following concerns:
- The upper bound in (a) is huge, and would be welcome to an adversary running relays. The adversary could set up relays with almost twice the consensus weight of the water level and still be used at 100% in the entry position. This would greatly reduce the benefits of this proposal, right?
I do not know how much the benefits of the proposal depend on the exact water level, and how close relays are to the water level.
I chose the exponent "2" as an example that is easy to calculate. Smaller exponents are possible.
How much variance will your proposal tolerate? Because current variance is 30% - 60% anyway (see above). (I think using the median reduces this, but I can't see it going below 40%, because that's what identical authorities get.)
- The analysis in our paper holds for the equations we showed. Even if these ones are very similar, I am not sure we can say it would be as good without re-doing the research.
I do not know enough to answer this.
With your explanations below (weight change on clients), and given that consensus diff size is a concern, I am leaning towards believing that the weight calculation should be done on clients. Anyway, I have added a remark about this possibility to the proposal.
Another alternative is to apply proposal 276 weight rounding to these weights as well.
https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granul...
I think this may be our best option. Because running all these divisions on some mobile clients will be very slow and cost a lot of power.
Why not calculate wmg on clients? Why not calculate both wgg and wmg on clients?
This is again a very good question: for such a critical feature (path selection), it is important that the directory auths have full power over the weight computation. If some change is needed, the Tor network can then be updated in a straightforward way. This is not the case if those weights are computed in client code. In fact, I believe that one of the strengths of this proposal is its design oriented towards the dirauths.
Consensus changes on dirauths:
- need to use consensus methods
- require a 6-12 month deployment time
- increase the size of all directory documents
- change the whole network
- if designed carefully, can be turned off using a network-wide parameter
Weight changes on clients:
- require a 3-6 month deployment time
- can be deployed to small numbers of clients
- if designed carefully, can be turned off using a network-wide parameter
I don't know which we'll prefer if we deploy this.
I have added a subsection in the proposal with those lines. Looks like we have to agree on this topic first before going into the details of the implementation section?
It is ok to present two alternatives with pros and cons in a proposal. The implementation section can provide an algorithm for assigning waterfilling weights. And then we can decide later where to run it.
...
Unanswered questions:
What about the feedback loop between this new allocation system and the bandwidth authorities?
I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seems orthogonal to me.
You do not need to add a feedback loop, one already exists:
- Consensus weights on guards and middles change
- Client use of guards and middles change
- Bandwidth authority measurements of guards and middles change
- Repeat from 1
My question is:
How does this existing feedback loop affect your proposal? Does it increase or reduce the size of the guard and middle weight changes?
I have added those questions to the proposal. This looks difficult to know.
Can shadow simulate this?
-- Tim / teor
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B ricochet:ekmygaiu4rzgsk6n
Hi,
I have some more questions:
On 18 Jan 2018, at 11:03, teor teor2345@gmail.com wrote:
Unanswered questions:
The Tor network has been experiencing excessive load on guards and middles since December 2017.
Does the waterfilling proposal make excessive load on guards worse, by allocating more guard weight to lower capacity relays? Is the extra security worth the increased risk of failure?
Does the waterfilling proposal make excessive load on middles better, by allocating more middle weight to higher capacity relays? Is there a cascading failure mode, where excess middle weight overwhelms our top relays one by one? (It seems unlikely.)
I also have another practical question:
We struggle to have time to maintain the current bandwidth authority system.
Is it a good idea to make it more complicated? Who will maintain the new code we add to Tor to implement waterfilling? Who will build the analysis tools to show that waterfilling benefits the network?
Do the benefits of waterfilling justify this extra effort?
And even if they do, should we focus on getting the bandwidth authorities in a maintainable state, before adding new features? (I just gave similar advice to another developer who has some great ideas about improving bandwidth measurement.)
What about the feedback loop between this new allocation system and the bandwidth authorities?
I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seems orthogonal to me.
You do not need to add a feedback loop, one already exists:
- Consensus weights on guards and middles change
- Client use of guards and middles change
- Bandwidth authority measurements of guards and middles change
- Repeat from 1
My question is:
How does this existing feedback loop affect your proposal? Does it increase or reduce the size of the guard and middle weight changes?
I have added those questions to the proposal. This looks difficult to know.
Can shadow simulate this?
I am still interested in this feedback loop. If it fails to converge, the system will break down.
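To get a feel for whether this loop can converge at all, here is a deliberately crude toy model (my own construction, not Shadow and not torflow): clients load relays in proportion to consensus weight, and a torflow-like scanner rescales each weight by its per-stream throughput relative to the network average. In this idealised model the weights snap to the true capacities in one round and stay there; the real dynamics (waterfilling position shifts, measurement noise, partial scans) would be much messier.

```python
def feedback_step(weights, caps, client_load=1000.0):
    """One round of the toy loop: load is proportional to consensus weight,
    a relay's per-stream throughput is its capacity divided by its load, and
    each weight is rescaled by throughput relative to the network average."""
    total_w = sum(weights)
    loads = [client_load * w / total_w for w in weights]
    stream_bw = [c / max(l, 1e-9) for c, l in zip(caps, loads)]
    avg = sum(stream_bw) / len(stream_bw)
    new = [w * s / avg for w, s in zip(weights, stream_bw)]
    scale = sum(caps) / sum(new)        # keep total weight pinned to total capacity
    return [n * scale for n in new]
```

Starting from equal weights [100, 100] with capacities [300, 100], one step lands on [300, 100], which is then a fixed point. A waterfilling variant would change how `loads` is computed (guard load clipped at the water level, the excess moved to the middle position), and the open question raised here is whether convergence survives that change.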
T
Hello,
On 28/01/18 11:52, teor wrote:
Hi,
I have some more questions:
Nice, thanks! I still have to answer your previous email and push an update to the proposal. I should do it this week; sorry for the late answers :)
See inline a few answers to your questions:
On 18 Jan 2018, at 11:03, teor teor2345@gmail.com wrote:
Unanswered questions:
The Tor network has been experiencing excessive load on guards and middles since December 2017.
If I have correctly followed what was happening: around 1M tor2web clients appeared at OVH and started to overload the network with circuit creation requests using the old and costly TAP handshake. Tor2web clients make direct connections to the intro point and to the rendezvous point, right? And, looking into the code right now, it does not look like Tor2web clients make any distinction based on flags. So, basically, the Tor2web load is weighted only by consensus weight (bandwidth-weights have no impact) across the whole network (exits too). Guess: shouldn't that be the reason why all exit logs are flooded with the message "[warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)"? Those messages would be caused by tor2web clients picking exit relays as rendezvous nodes :/ I started seeing them increase more and more since August 2017.
So basically, I *think* we can drop the questions below, because bandwidth-weights do not play any role in the excessive load that those tor2web clients put on the network.
Does the waterfilling proposal make excessive load on guards worse, by allocating more guard weight to lower capacity relays? Is the extra security worth the increased risk of failure?
Does the waterfilling proposal make excessive load on middles better, by allocating more middle weight to higher capacity relays? Is there a cascading failure mode, where excess middle weight overwhelms our top relays one by one? (It seems unlikely.)
I also have another practical question:
We struggle to have time to maintain the current bandwidth authority system.
Is it a good idea to make it more complicated?
Hm, I don't see how Waterfilling plays any role in torflow or bwscanner? I mean, there is still this feedback loop thing, but it has no impact on the design of the current torflow or bwscanner? Could you be more specific about your concerns regarding the bandwidth authorities and this proposal?
Who will maintain the new code we add to Tor to implement waterfilling?
I would volunteer for that.
Who will build the analysis tools to show that waterfilling benefits the network?
Volunteers or master students. I can definitely suggest this topic in my university.
Do the benefits of waterfilling justify this extra effort?
Question for the other Tor devs :) I am definitely biased towards 'yes'.
And even if they do, should we focus on getting the bandwidth authorities in a maintainable state, before adding new features? (I just gave similar advice to another developer who has some great ideas about improving bandwidth measurement.)
Bandwidth-weights and measurements (consensus weights) are two different things that solve two different problems. So we can work independently on improving measurements (like what is currently being done with bwscanner) and on improving Tor's balancing (bandwidth-weights) with this proposal.
What about the feedback loop between this new allocation system and the bandwidth authorities?
I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seems orthogonal to me.
You do not need to add a feedback loop, one already exists:
- Consensus weights on guards and middles change
- Client use of guards and middles change
- Bandwidth authority measurements of guards and middles change
- Repeat from 1
My question is:
How does this existing feedback loop affect your proposal? Does it increase or reduce the size of the guard and middle weight changes?
I have added those questions to the proposal. This looks difficult to know.
Can shadow simulate this?
I am still interested in this feedback loop. If it fails to converge, the system will break down.
Yup. Going to answer this in my reply to your previous email.
Best, Florentin
T
On 29 Jan 2018, at 08:00, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hello,
On 28/01/18 11:52, teor wrote: Hi,
I have some more questions:
Nice, thanks! I still have to answer your previous email and push an update to the proposal. I should do it this week, sorry for late answers :)
See inline a few answers to your questions:
On 18 Jan 2018, at 11:03, teor teor2345@gmail.com wrote:
> Unanswered questions:
The Tor network has been experiencing excessive load on guards and middles since December 2017.
If I have correctly followed what was happening: around 1M tor2web clients appeared at OVH
Not just OVH, at least 3 different providers.
And not just Tor2web, either. There are onion services which are overloading the network as well, probably in response to these clients. The onion services are mostly overloading guard-weighted nodes.
and started to overload the network with circuit creation requests using the old and costly TAP handshake.
Not just TAP. The sheer number of entry connections, extend requests, and destroy cells is also creating overloads on some relays.
Tor2web clients make direct connections to the intro point and to the rendezvous point, right?
Yes.
And, looking into the code right now, it does not look like Tor2web makes any distinction between flags. So, basically, the Tor2web load is only weighted by consensus weight (bandwidth-weights have no impact) on the overall network (exits too).
This only applies if Tor2webRendezvousPoints is set. Otherwise, the nodes are middle-weighted.
Guess: shouldn't that be the reason why all exit logs are flooded with the message "[warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)"? Those messages would be caused by tor2web clients picking exit relays as rendezvous nodes :/ I started to see them increasing more and more since August 2017.
No, this is a different issue. Exit relays are allowed as rendezvous nodes.
So basically, I *think* we can drop the questions below, because bandwidth-weights do not play any role in the excessive load the network is handling from those tor2web clients.
Guard weights are used by overloading onion services, and middle weights are used by overloading Tor2web clients.
Does the waterfilling proposal make excessive load on guards worse, by allocating more guard weight to lower capacity relays? Is the extra security worth the increased risk of failure?
We want to design a network that can handle different kinds of extra load. So these questions are important, even if they don't apply right now.
Does the waterfilling proposal make excessive load on middles better, by allocating more middle weight to higher capacity relays? Is there a cascading failure mode, where excess middle weight overwhelms our top relays one by one? (It seems unlikely.)
I'm going to re-ask these questions, in light of the extra middle load from Tor2web clients:
Does the waterfilling proposal make excessive load on middles worse, by allocating more middle weight to higher capacity relays?
In particular, connections are limited by file descriptors, and file descriptor limits typically don't scale with the bandwidth of the relay. As far as I can tell, waterfilling would have directed additional Tor2web traffic to large guards. It would have brought down my guards faster, and made it much harder for me to keep them up.
If we had implemented waterfilling before this attack, would it have led to cascading failures on our top guards? They would have been carrying significantly more middle load, and mine barely managed to cope.
Can you redesign the proposal so there is some limit on the extra middle load assigned to a guard? Or does this ruin the security properties?
Is there a compelling argument for security over network robustness?
I also have another practical question:
We struggle to have time to maintain the current bandwidth authority system.
Is it a good idea to make it more complicated?
Hm, I don't see how Waterfilling plays any role with torflow or bwscanner. I mean, there is still this feedback loop thing, but it has no impact on the design of the current torflow or bwscanner, right?
I can't really say. I look forward to your explanation of the feedback loop.
Could you be more specific about your concerns with the bandwidth authorities and this proposal?
It takes time and effort from Tor people to integrate and maintain the code and monitoring for a new proposal like this one.
We will need to take extra time on this proposal, because we already need more monitoring for the current bandwidth authority system. And only then would we have time to build monitoring specific to this proposal.
Also, when we change bandwidth measurement or allocation, we need to change one thing at a time, and then monitor the change. So depending on our priorities, this proposal may need to wait until after we implement and monitor other urgent fixes.
Who will maintain the new code we add to Tor to implement waterfilling?
I would volunteer for that.
Typically, experienced Core Tor team members review and maintain code.
And there's still a lot of development and testing work to be done before the code is ready to merge. Are you able to do this development?
How much help will you need to write a new consensus method? How much help will you need to write unit tests? (This help will come from existing team members.)
Does your current code pass:
* make check
* make test-network-all
  * in particular, any new consensus method must pass the "mixed" network, with an unpatched Tor version in your path as "tor-stable"
Who will build the analysis tools to show that waterfilling benefits the network?
Volunteers or master students. I can definitely suggest this topic in my university.
Typically, experienced Tor Metrics team members write, review, and maintain monitoring systems. And they don't have a lot of extra capacity right now.
Even if students do this task, they would need help from existing team members.
Do the benefits of waterfilling justify this extra effort?
Question for the other Tor devs :) I am definitely biased towards the "yes"
It seems plausible, but I don't feel I have seen a compelling enough argument to prioritise it above fixing bandwidth authorities.
At the moment, reasonably fast guards in Eastern North America and Western Europe are overloaded with client traffic. And guards in the rest of the world are under-loaded. Reducing this bias is something we need to do.
And this proposal gets us better security if we fix this geographical bias first. Otherwise, adversaries can simply pick a location that massively increases their consensus weight, and get lots of client traffic.
And even if they do, should we focus on getting the bandwidth authorities in a maintainable state, before adding new features? (I just gave similar advice to another developer who has some great ideas about improving bandwidth measurement.)
Bandwidth-weights and measurements (consensus weights) are two different things that solve 2 different problems. So, we can work independently on improving measurements (like what is currently done with bwscanner) and improving Tor's balancing (bandwidth-weights) with this proposal.
I don't think this is realistic. There is always contention for shared resources.
Integrating and testing new code, and monitoring its effects, will take effort from the teams I mentioned above. This takes away from the urgent work of fixing the bandwidth authority system. Which also takes effort from the Core Tor and Metrics teams.
> What about the feedback loop between this new allocation system
> and the bandwidth authorities?
>
> I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seems orthogonal to me.
You do not need to add a feedback loop, one already exists:
1. Consensus weights on guards and middles change
2. Client use of guards and middles change
3. Bandwidth authority measurements of guards and middles change
4. Repeat from 1
My question is:
How does this existing feedback loop affect your proposal? Does it increase or reduce the size of the guard and middle weight changes?
I have added those questions to the proposal. This looks difficult to know.
Can shadow simulate this?
I am still interested in this feedback loop. If it fails to converge, the system will break down.
Yup. Going to answer this on your previous email.
T
Hi teor,
Sorry about the huge delay :)
I've added your following idea to the proposal (seems we came up with the right way to do it :-)):
Why not list the waterfilling level on a single line in the consensus?
That way:
- authorities do the expensive calculation
- clients can re-weight relays using a simple calculation:
if it is less than or equal to the waterfilling level:
    use the relay's weight as its guard weight
    use 0 as its middle weight
otherwise:
    use the waterfilling level as the relay's guard weight
    use the relay's weight minus the waterfilling level as its middle weight
This is O(n) and requires one comparison and one subtraction in the worst case.
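As a concrete illustration (a sketch, not Tor code; the function name and the dict-of-weights representation are my own assumptions), the client-side re-weighting could look like:

```python
def reweight(relays, water_level):
    """Split each relay's consensus weight into a guard part and a middle
    part, given the waterfilling level published in the consensus.
    `relays` maps relay fingerprint -> consensus weight (hypothetical shape).
    """
    guard_w, middle_w = {}, {}
    for fp, w in relays.items():
        if w <= water_level:
            guard_w[fp] = w      # fully below the water level: all guard
            middle_w[fp] = 0
        else:
            guard_w[fp] = water_level        # capped at the water level
            middle_w[fp] = w - water_level   # excess goes to the middle position
    return guard_w, middle_w
```

One pass over the consensus, with one comparison and at most one subtraction per relay, matching the O(n) claim above.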
I have a few questions about convergence/divergence of weights, but maybe we could take advantage of the meeting in Rome to discuss this avenue?
On 28/01/18 23:40, teor wrote:
<skip> I'm going to re-ask these questions, in light of the extra middle load from Tor2web clients:
Does the waterfilling proposal make excessive load on middles worse, by allocating more middle weight to higher capacity relays?
If bwauths overestimate top relays, or if we reach some soft limit, yes. But the reverse would be true too: if we have excessive load on guards, then this proposal will make things better.
In particular, connections are limited by file descriptors, and file descriptor limits typically don't scale with the bandwidth of the relay. As far as I can tell, waterfilling would have directed additional Tor2web traffic to large guards. It would have brought down my guards faster, and made it much harder for me to keep them up.
If we had implemented waterfilling before this attack, would it have lead to cascading failures on our top guards? They would have been carrying significantly more middle load, and mine barely managed to cope.
Probably yes, but they would also be carrying less load at the guard position from normal Tor users. Under normal conditions, that should balance out. In a DDoS situation, I would say we face difficulties no matter what we do.
Can you redesign the proposal so there is some limit on the extra middle load assigned to a guard? Or does this ruin the security properties?
Is there a compelling argument for security over network robustness?
Questions added to the proposal :)
<skip>
> Could you be more specific about your concerns with the bandwidth
> authorities and this proposal?
It takes time and effort from Tor people to integrate and maintain the code and monitoring for a new proposal like this one.
We will need to take extra time on this proposal, because we already need more monitoring for the current bandwidth authority system. And only then would we have time to build monitoring specific to this proposal.
Also, when we change bandwidth measurement or allocation, we need to change one thing at a time, and then monitor the change. So depending on our priorities, this proposal may need to wait until after we implement and monitor other urgent fixes.
Yes, this proposal can wait, it's totally up to you. I agree that fixing the current bwauthorities should be somewhere at the top of the priorities. But nothing prevents us from further discussing this proposal and making plans.
Who will maintain the new code we add to Tor to implement waterfilling?
I would volunteer for that.
Typically, experienced Core Tor team members review and maintain code.
And there's still a lot of development and testing work to be done before the code is ready to merge. Are you able to do this development?
If I can make this as a research project, that can be very fast. If I have to do this on my spare time, that's going to be a bit slower.
How much help will you need to write a new consensus method? How much help will you need to write unit tests? (This help will come from existing team members.)
I should be fine with both. Adding a new consensus method does not seem too difficult at first glance, but removing one looks a bit more challenging. Hopefully, I just need to add one :)
Does your current code pass:
- make check
- make test-network-all
* in particular, any new consensus method must pass the "mixed" network, with an unpatched Tor version in your path as "tor-stable"
As far as I recall, make check passed, as well as chutney networks. I don't remember which tests I did back then, though. If we go for an up-to-date implementation, I will make sure everything's ok :)
<skip>
> Bandwidth-weights and measurements (consensus weights) are two different
> things that solve 2 different problems. So, we can work independently on
> improving measurements (like what is currently done with bwscanner) and
> improving Tor's balancing (bandwidth-weights) with this proposal.
I don't think this is realistic. There is always contention for shared resources.
Integrating and testing new code, and monitoring its effects, will take effort from the teams I mentioned above. This takes away from the urgent work of fixing the bandwidth authority system. Which also takes effort from the Core Tor and Metrics teams.
Right, I totally agree that the first focus should be on bwauths. Could we try to make a plan, or at least move this proposal onto the todo list? Looking forward to meeting you in Rome :)
Best, Florentin
Hi,
On 4 Mar 2018, at 02:37, Florentin Rochet florentin.rochet@uclouvain.be wrote:
I have a few questions about convergence/divergence of weights, but maybe we could take advantage of the meeting in Rome to discuss this avenue?
I was wrong. The current network doesn't attempt to converge on a stable set of weights, because the feedback loop is too weak. So I think we can disregard this question.
On 28/01/18 23:40, teor wrote:
<skip> I'm going to re-ask these questions, in light of the extra middle load from Tor2web clients:
Does the waterfilling proposal make excessive load on middles worse, by allocating more middle weight to higher capacity relays?
If bwauths overestimate top relays, or if we reach some soft limit, yes. But the reverse would be true too: if we have excessive load on guards, then this proposal will make things better.
I'm not sure that this statement is true.
Rather: If we have excessive load on middles, then top relays will suffer more. If we have excessive load on guards, then lower relays will suffer more.
You can actually lose both ways when you unbalance network load. It depends on the type of load.
In particular, connections are limited by file descriptors, and file descriptor limits typically don't scale with the bandwidth of the relay. As far as I can tell, waterfilling would have directed additional Tor2web traffic to large guards. It would have brought down my guards faster, and made it much harder for me to keep them up.
If we had implemented waterfilling before this attack, would it have led to cascading failures on our top guards? They would have been carrying significantly more middle load, and mine barely managed to cope.
Probably yes, but they would also be carrying less load at the guard position from normal Tor users. Under normal conditions, that should balance out. In a DDoS situation, I would say we face difficulties no matter what we do.
I think that unbalanced networks tend to suffer more during a DDoS.
…
<skip>
> Bandwidth-weights and measurements (consensus weights) are two different
> things that solve 2 different problems. So, we can work independently on
> improving measurements (like what is currently done with bwscanner) and
> improving Tor's balancing (bandwidth-weights) with this proposal.
I don't think this is realistic. There is always contention for shared resources.
Integrating and testing new code, and monitoring its effects, will take effort from the teams I mentioned above. This takes away from the urgent work of fixing the bandwidth authority system. Which also takes effort from the Core Tor and Metrics teams.
Right, I totally agree that the first focus should be on bwauths. Could we try to make a plan, or at least move this proposal onto the todo list?
While we're talking priorities, I would also prioritise fixing the guardfraction code over this proposal. When we work out how to test guardfraction, we can use similar tests for this proposal.
At the moment, this proposal is "OPEN", because it is still under discussion: https://gitweb.torproject.org/torspec.git/tree/proposals/001-process.txt#n14...
If you want to move it to "ACCEPTED":
* answer as many open questions as you can
* reply to this thread with a link to the final version of the proposal
* we will open a ticket for the proposal
* we will schedule a proposal meeting on IRC to discuss the proposal
Please be aware:
* It sometimes takes years for research to make it into Tor.
* Some research is good research, but it is not a good fit for the public Tor network.
T
Hi,
I updated the proposal with some more of your advice, questions, and concerns.
On 18/01/18 01:03, teor wrote:
I've added this concern within the 'unanswered questions' section. This proposal assumes relay measurements are reliable (consensus weights).
How reliable?
Current variance is 30% - 40% between identical bandwidth authorities, and 30% - 60% between all bandwidth authorities.
Sources: https://tomrittervg.github.io/bwauth-tools/#apples-to-apples-comparison https://tomrittervg.github.io/bwauth-tools/#updated-01
Is this sufficient?
My apologies, I was not specific enough: we assume reliable bandwidth measurements as a hypothesis to make the claim that Waterfilling is not going to reduce or improve performance. If these measurements are not reliable enough, then Waterfilling might make things better, worse, or both, compared to the current bandwidth-weights, in some unpredictable way. All of this depends on the bandwidth authorities. Anyway, I willingly agree that we need some kind of tools able to report on such modifications. Besides, those tools could be reused for any new proposal impacting the path selection, such as research protecting against network adversaries, or even some of the changes you already plan to do (such as Prop 276).
<skip>
> Wow, so first of all, thanks for suggesting new ways of resolving
> issues. I really appreciate your consideration for this work.
>
> This suggestion indeed reduces the diff problem, and I find it elegant.
> But, I have the following concerns:
>
> - The upper bound in (a) is huge, and would be appreciated for an
> adversary running relays. The adversary could manage to set relays with
> almost 2 times the consensus weight of the water level, and still being
> used at 100% in the entry position. This would reduce a lot the benefits
> of this proposal, right?

I do not know how much the benefits of the proposal depend on the exact water level, and how close relays are to the water level.
I chose the exponent "2" as an example that is easy to calculate. Smaller exponents are possible.
How much variance will your proposal tolerate? Because current variance is 30% - 60% anyway (see above).
The variance is not a problem if the water level is adapted (re-computed) at each consensus.
(I think using the median reduces this, but I can't see it going below 40%, because that's what identical authorities get.)
- The analysis in our paper works for the equations we showed. Even if these ones are very similar, I am not sure we can say that it would be good without re-doing the research.
I do not know enough to answer this.
With your explanations below (weight change on clients), and given that the consensus diff size is a thing, I am leaning towards believing that the weight calculation should be done on clients. Anyway, I have added a remark about this possibility to the proposal.
Another alternative is to apply proposal 276 weight rounding to these weights as well.
https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granul...
I think this may be our best option. Because running all these divisions on some mobile clients will be very slow and cost a lot of power.
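For reference, prop 276 style rounding keeps only a couple of significant figures of each weight. A rough sketch (the proposal's exact rules may differ; this is an illustration only):

```python
import math

def round_sig(n, digits=2):
    """Round n to `digits` significant decimal figures, roughly in the
    spirit of prop 276's bandwidth granularity reduction (the exact rules
    are in the proposal; this is only an approximation)."""
    if n == 0:
        return 0
    exp = math.floor(math.log10(abs(n))) - (digits - 1)
    factor = 10 ** exp
    return round(n / factor) * factor
```

Applied to waterfilling guard/middle weights, this would let consecutive consensuses diff much better, at the cost of a few percent of discretization error.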
Added this to the proposal. We might also "divide" the algorithm: what about computing the weights on dirauths but broadcasting only the pivot (the index of the relay at the water level)? Clients can then resume the computation and produce the weights themselves at a reduced cost.
Strength:
- The weight calculation would be O(n) on clients (n being the size of the guard set) instead of O(n*log(n))
- No impact on the consensus diff (well, except 1 line, the pivot value).
Weakness:
- We still have O(n) divisions on the client, each time we download a new consensus.
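For illustration, here is a sketch of the dirauth-side part of that computation. The inputs and names are my own assumptions (not from the proposal text): given the relays' consensus weights and a target total guard capacity, pick the water level L so that sum(min(w, L)) equals the target.

```python
def water_level(weights, target_guard_bw):
    """Find the level L such that sum(min(w, L) for w in weights) equals
    target_guard_bw. The sort dominates, so this is O(n*log(n)) -- the
    part that would stay on the dirauths. Returns None if the target
    exceeds total capacity. (Hypothetical sketch; names are my own.)"""
    ws = sorted(weights, reverse=True)
    tail = sum(ws)
    for k, w in enumerate(ws, start=1):
        tail -= w                        # sum of the weights below pivot k
        level = (target_guard_bw - tail) / k
        nxt = ws[k] if k < len(ws) else 0.0
        if nxt <= level <= w:            # level sits between pivots k and k+1
            return level
    return None
```

The broadcast-the-pivot variant above would then ship either this level (or the pivot index k) in the consensus, leaving only the O(n) re-weighting to clients.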
<skip>
>>>> Unanswered questions:
>>>>
>>>> What about the feedback loop between this new allocation system
>>>> and the bandwidth authorities?
>>>>
>>> I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seems orthogonal to me.
>> You do not need to add a feedback loop, one already exists:
>> 1. Consensus weights on guards and middles change
>> 2. Client use of guards and middles change
>> 3. Bandwidth authority measurements of guards and middles change
>> 4. Repeat from 1
>>
>> My question is:
>>
>> How does this existing feedback loop affect your proposal?
>> Does it increase or reduce the size of the guard and middle weight changes?
> I have added those questions to the proposal. This looks difficult to know.

Can shadow simulate this?
I am not fully sure about this. Shadow's topology is static, meaning that a change in advertised bandwidth does not change the uplink and downlink bandwidth configured from the initial advertised bandwidth. But Shadow can use a Torflow plugin and continuously measure the virtual network, which would produce the feedback loop you describe. However, we would have to run the simulation long enough to observe this effect, which was not done in our research. We ran 1 virtual hour for each simulation during our experiments (1 week of real time for each...).
Best, Florentin
-- Tim / teor
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B ricochet:ekmygaiu4rzgsk6n
On 1 Feb 2018, at 07:15, Florentin Rochet florentin.rochet@uclouvain.be wrote:
On 18/01/18 01:03, teor wrote:
I've added this concern within the 'unanswered questions' section. This proposal assumes relay measurements are reliable (consensus weights).
How reliable?
Current variance is 30% - 40% between identical bandwidth authorities, and 30% - 60% between all bandwidth authorities.
Sources: https://tomrittervg.github.io/bwauth-tools/#apples-to-apples-comparison https://tomrittervg.github.io/bwauth-tools/#updated-01
Is this sufficient?
My apologies, I was not specific enough: we assume reliable bandwidth measurements as a hypothesis to make the claim that Waterfilling is not going to reduce or improve performance. If these measurements are not reliable enough, then Waterfilling might make things better, worse, or both, compared to the current bandwidth-weights, in some unpredictable way.
This variance is measurement error. In this case, discretization error is less than 1%.
We need to know whether measurement inaccuracy makes the network weights converge or diverge under your scheme.
It looks like they converge on the current network with the current bandwidth authorities. This is an essential property we need to keep.
All of this depends on the bandwidth authorities. Anyway, I willingly agree that we need some kind of tools able to report on such modifications. Besides, those tools could be reused for any new proposal impacting the path selection, such as research protecting against network adversaries, or even some of the changes you already plan to do (such as Prop 276).
Yes, we are hoping to introduce better tools over time.
<skip>
> …
>
> - The upper bound in (a) is huge, and would be appreciated for an
> adversary running relays. The adversary could manage to set relays with
> almost 2 times the consensus weight of the water level, and still being
> used at 100% in the entry position. This would reduce a lot the benefits
> of this proposal, right?

I do not know how much the benefits of the proposal depend on the exact water level, and how close relays are to the water level.
…
How much variance will your proposal tolerate? Because current variance is 30% - 60% anyway (see above).
The variance is not a problem if the water level is adapted (re-computed) at each consensus.
I'm not sure we're talking about the same thing here. The variance I am talking about here is measurement error and discretization error. Re-computation doesn't change the error. (And going from relay measurement to consensus bandwidth can take hours.)
See my comment above about convergence: we need to converge in the presence of discretization error, too.
…
With your explanations below (weight change on clients), and given that the consensus diff size is a thing, I am leaning towards believing that the weight calculation should be done on clients. Anyway, I have added a remark about this possibility to the proposal.
Another alternative is to apply proposal 276 weight rounding to these weights as well.
https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granul...
I think this may be our best option. Because running all these divisions on some mobile clients will be very slow and cost a lot of power.
Added this to the proposal. We might also "divide" the algorithm: what about computing the weights on dirauths but broadcasting only the pivot (the index of the relay at the water level). Clients can then resume the computation and produce the weights themselves with a reduced cost. Strength:
- The weight calculation would be O(n) on clients (n being the size of
the guard set) instead of O(n*log(n))
- No impact on the consensus diff (well, except 1 line, the pivot value).
Weakness:
- We still have O(n) divisions on the client, each time we download a
new consensus.
Why not list the waterfilling level on a single line in the consensus?
That way:
* authorities do the expensive calculation
* clients can re-weight relays using a simple calculation:
if it is less than or equal to the waterfilling level:
    use the relay's weight as its guard weight
    use 0 as its middle weight
otherwise:
    use the waterfilling level as the relay's guard weight
    use the relay's weight minus the waterfilling level as its middle weight
This is O(n) and requires one comparison and one subtraction in the worst case.
T
Hello,
I recently took the time to read the waterfilling paper. I’m not sure its a good idea even for the goal of increasing the cost of traffic correlation attacks. It depends on whether it is easier for an adversary to run many small relays of total weight x or a few large relays of total weight y, where x = y*c with c the fraction of a Guard-flagged relay used in the guard position (I believe that c=2/3 currently, as Wgg=7268 and Wmg=2732). Just to emphasize it: waterfilling requires *less bandwidth* to achieve a given guard probability as is needed in Tor currently.
Based on prices I’ve seen (~$2/IP/month vs. ~$500/Gbps/month), it’s significantly cheaper to add a new relay than it is to add bandwidth commensurate with the highest-bandwidth relays. If an adversary finds it easier to compromise machines, then waterfilling might help, as it lowers the guard probability of high-bandwidth relays. However, for adversaries with the resources to possess zero-day vulnerabilities against the well-run high-bandwidth relays, it seems to me that those adversaries would easily also have the resources to run relays instead, and in fact it would probably be cheaper for them to run relays, as zero-days are expensive. Adversaries with botnets, which have many IPs but generally low bandwidth, would benefit from waterfilling, as it would increase the number of clients choosing them as guards that they can then attack. Waterfilling doesn’t clearly make things better or worse against network-level adversaries.
Thus, it doesn’t seem to me that waterfilling protects Tor’s users against their likely adversaries, and in fact is likely to make things less secure in a few important cases.
Best, Aaron
On Jan 31, 2018, at 5:01 PM, teor teor2345@gmail.com wrote:
On 1 Feb 2018, at 07:15, Florentin Rochet florentin.rochet@uclouvain.be wrote:
On 18/01/18 01:03, teor wrote:
I've added this concern within the 'unanswered questions' section. This proposal assumes relay measurement are reliable (consensus weight).
How reliable?
Current variance is 30% - 40% between identical bandwidth authorities, and 30% - 60% between all bandwidth authorities.
Sources: https://tomrittervg.github.io/bwauth-tools/#apples-to-apples-comparison https://tomrittervg.github.io/bwauth-tools/#updated-01
Is this sufficient?
My apologies, I was not enough specific: we assume bandwidth measurements reliable as an hypothesis to make the claim that Waterfilling is not going to reduce or improve the performance. If these measurements are not reliable enough, then Waterfilling might make things better, worse or both compared to the current bandwidth-weights is some unpredictable way.
This variance is measurement error. In this case, discretization error is less than 1%.
We need to know whether measurement inaccuracy makes the network weights converge or diverge under your scheme.
It looks like they converge on the current network with the current bandwidth authorities. This is an essential property we need to keep.
All of this depends on the bandwidth authority. Anyway, I willingly agree that we need some kind of tools able to report on such modification. Besides, those tools could be reused for any new proposal impacting the path selection, such as research protecting against network adversaries or even some of the changes you already plan to do (such as Prop 276).
Yes, we are hoping to introduce better tools over time.
<skip> > … > > - The upper bound in (a) is huge, and would be appreciated for an > adversary running relays. The adversary could manage to set relays with > almost 2 times the consensus weight of the water level, and still being > used at 100% in the entry position. This would reduce a lot the benefits > of this proposal, right? I do not know how much the benefits of the proposal depend on the exact water level, and how close relays are to the water level.
…
How much variance will your proposal tolerate? Because current variance is 30% - 60% anyway (see above).
The variance is not a problem if the water level is adapted (re-computed) at each consensus.
I'm not sure we're talking about the same thing here. The variance I am talking about here is measurement error and discretization error. Re-computation doesn't change the error. (And going from relay measurement to consensus bandwidth can take hours.)
See my comment above about convergence: we need to converge in the presence of discretization error, too.
…
With your explanations below (weight change on clients), and given that the consensus diff size is a thing, I am leaning to believe that the weight calculation should be done on clients. Anyway, I have added a remark about this possibility within the proposal.
Another alternative is to apply proposal 276 weight rounding to these weights as well.
https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granul...
I think this may be our best option. Because running all these divisions on some mobile clients will be very slow and cost a lot of power.
Added this to the proposal. We might also "divide" the algorithm: what about computing the weights on dirauths but broadcasting only the pivot (the index of the relay at the water level). Clients can then resume the computation and produce the weights themselves with a reduced cost. Strength:
- The weight calculation would be O(n) on clients (n being the size of
the guard set) instead of O(n*log(n))
- No impact on the consensus diff (well, except 1 line, the pivot value).
Weakness:
- We still have O(n) divisions on the client, each time we download a
new consensus.
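For concreteness, here is a minimal sketch (in Python, with illustrative names; not the proposal's exact pseudocode) of how a dirauth might compute the water level and the pivot in O(n*log(n)), after which clients could derive per-relay weights in O(n):

```python
def water_level(weights, guard_total):
    """Find (level, pivot) such that sum(min(w, level)) == guard_total.

    `pivot` is the number of relays capped at the water level; broadcasting
    only the pivot (or the level itself) would spare clients the sort.
    Illustrative sketch only, assuming guard_total <= sum(weights).
    """
    ws = sorted(weights, reverse=True)    # O(n log n), done on dirauths
    tail = sum(ws)
    for k in range(1, len(ws) + 1):
        tail -= ws[k - 1]                 # tail = sum(ws[k:])
        level = (guard_total - tail) / k  # solve k*level + tail == guard_total
        # valid if capping the k largest relays at `level` respects the order
        if ws[k - 1] >= level and (k == len(ws) or level >= ws[k]):
            return level, k
    raise ValueError("guard_total exceeds the total guard weight")

# Example: capping the two largest relays at 45 spreads 120 units of
# guard weight across the whole set.
level, pivot = water_level([100, 50, 20, 10], guard_total=120)
```

With these example weights the function returns a level of 45 and a pivot of 2: min(100, 45) + min(50, 45) + 20 + 10 = 120.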
Why not list the waterfilling level on a single line in the consensus?
That way:
- authorities do the expensive calculation
- clients can re-weight relays using a simple calculation:
if the relay's weight is less than or equal to the waterfilling level:
    use the relay's weight as its guard weight
    use 0 as its middle weight
otherwise:
    use the waterfilling level as the relay's guard weight
    use the relay's weight minus the waterfilling level as its middle weight
This is O(n) and requires one comparison and one subtraction in the worst case.
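A sketch of that per-relay rule (illustrative Python, assuming the waterfilling level is read off a single consensus line):

```python
def reweight(consensus_weight, level):
    """Split a relay's consensus weight into (guard, middle) weights,
    given the waterfilling level published in the consensus.
    One comparison and at most one subtraction per relay."""
    if consensus_weight <= level:
        return consensus_weight, 0               # fully used in the guard position
    return level, consensus_weight - level       # excess shifts to the middle
```

So a relay below the level keeps its full guard weight, e.g. reweight(5, 10) gives (5, 0), while reweight(25, 10) gives (10, 15).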
T _______________________________________________ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Hi Aaron,
Thanks for your comments, you are definitely touching interesting aspects.
Here are thoughts regarding your objections:
1) The cost of IPs vs. bandwidth is definitely a function of market offers. Your $500/Gbps/month seems quite expensive compared to what can be found on OVH (which is hosting a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10Mbps, this means that, if you want to have 1Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
We do not believe that this is conclusive, as the market changes, and there certainly are dozens of other providers.
The same applies for 0-day attacks: if you need to buy them just for attacking Tor, then they are expensive. If you are an organization in the business of handling 0-day attacks for various other reasons, then the costs are very different. And it may be hard to determine whether it is easier/cheaper to compromise 1 top relay or 20 mid-level relays.
And we are not sure that the picture is so clear about botnets either: bots that can become guards need to have high availability (in order to pass the guard stability requirements), and such high availability bots are also likely to have a bandwidth that is higher than the water level (abandoned machines in university networks, ...). As a result, waterfilling would increase the number of high availability bots that are needed, which is likely to be hard.
2) Waterfilling makes it necessary for an adversary to run a larger number of relays. Apart from the costs of service providers, this large number of relays needs to be managed in an apparently independent way, otherwise they would become suspicious to community members, like nusenu who is doing a great job spotting all anomalies. It seems plausible that running 100 relays in such a way that they look independent is at least as difficult as doing that with 10 relays.
3) The question of the protection from relays, ASes or IXPs is puzzling, and we do not have a strong opinion about it. We focused on relays because they are what is available to any attacker, compared to ASes or IXPs which are more specific adversaries. But, if there is a consensus that ASes or IXPs should rather be considered as the main target, it is easy to implement waterfilling at the AS or IXP level rather than at the IP level: just aggregate the bandwidth relayed per AS or IXP, and apply the waterfilling level computation method to them. Or we could mix the weights obtained for all these adversaries, in order to get some improvement against all of them instead of an improvement against only one and being agnostic about the others.
4) More fundamentally, since the core idea of Tor is to mix traffic through a large number of relays, it seems to be a sound design principle to make the choice of the critical relays as uniform as possible, as Waterfilling aims to do. A casual Tor user may be concerned to see that his traffic is very likely to be routed through a very small number of top relays, and this effect is likely to increase as soon as a multi-core-capable implementation of Tor arrives (the Rust development effort). Current top relays, which suffer from the main CPU bottleneck, will probably be free to relay even more bandwidth than they already do, and gain an even more disproportionate consensus weight. Waterfilling might prevent that, and keep those useful relays doing their job at the middle position of paths.
We hope those thoughts can help, and thanks again for sharing yours.
Best,
Florentin and Olivier
On 2018-03-05 23:30, Aaron Johnson wrote:
Hello,
I recently took the time to read the waterfilling paper. I’m not sure it's a good idea even for the goal of increasing the cost of traffic correlation attacks. It depends on whether it is easier for an adversary to run many small relays of total weight x or a few large relays of total weight y, where x = y*c with c the fraction of a Guard-flagged relay used in the guard position (I believe that c=2/3 currently, as Wgg=7268 and Wmg=2732). Just to emphasize it: waterfilling requires *less bandwidth* to achieve a given guard probability than is needed in Tor currently.
Based on prices I’ve seen (~$2/IP/month vs. ~$500/Gbps/month), it's significantly cheaper to add a new relay than it is to add bandwidth commensurate with the highest-bandwidth relays. If an adversary finds it easier to compromise machines, then waterfilling might help, as it lowers the guard probability of high-bandwidth relays. However, adversaries with the resources to possess zero-day vulnerabilities against the well-run high-bandwidth relays would easily also have the resources to run relays instead, and in fact it would probably be cheaper for them to run relays, as zero-days are expensive. Adversaries with botnets, which have many IPs but generally low bandwidth, would benefit from waterfilling, as it would increase the number of clients choosing them as guards that they can then attack. Waterfilling doesn’t clearly make things better or worse against network-level adversaries.
Thus, it doesn’t seem to me that waterfilling protects Tor’s users against their likely adversaries, and in fact is likely to make things less secure in a few important cases.
Best, Aaron
On Jan 31, 2018, at 5:01 PM, teor teor2345@gmail.com wrote:
On 1 Feb 2018, at 07:15, Florentin Rochet florentin.rochet@uclouvain.be wrote:
On 18/01/18 01:03, teor wrote:
I've added this concern within the 'unanswered questions' section. This proposal assumes relay measurements (consensus weights) are reliable.
How reliable?
Current variance is 30% - 40% between identical bandwidth authorities, and 30% - 60% between all bandwidth authorities.
Sources: https://tomrittervg.github.io/bwauth-tools/#apples-to-apples-comparison https://tomrittervg.github.io/bwauth-tools/#updated-01
Is this sufficient?
My apologies, I was not specific enough: we assume that bandwidth measurements are reliable as a hypothesis for the claim that Waterfilling will neither reduce nor improve performance. If these measurements are not reliable enough, then Waterfilling might make things better, worse, or both compared to the current bandwidth-weights, in some unpredictable way.
This variance is measurement error. In this case, discretization error is less than 1%.
We need to know whether measurement inaccuracy makes the network weights converge or diverge under your scheme.
It looks like they converge on the current network with the current bandwidth authorities. This is an essential property we need to keep.
All of this depends on the bandwidth authority. Anyway, I willingly agree that we need some kind of tooling able to report on such modifications. Besides, those tools could be reused for any new proposal impacting path selection, such as research protecting against network adversaries or even some of the changes you already plan to make (such as Prop 276).
Hello friends,
- The cost of IPs vs. bandwidth is definitely a function of market offers. Your $500/Gbps/month seems quite expensive compared to what can be found on OVH (which is hosting a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10Mbps, this means that, if you want to have 1Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
The question of what the cheapest attack is can indeed be estimated by looking at market prices for the required resources. Your cost estimate of 3.72 USD/Gbps/month for bandwidth seems off by two orders of magnitude.
The numbers I gave ($2/IP/month and $500/Gbps/month) are the amounts currently charged by my US hosting provider. At the time that I shopped around (which was in 2015), it was by far the best bandwidth cost that I was able to find, and those costs haven’t changed much since then.
Currently on OVH the best hosting offer I could find just now was $93.02/month for 250Mbps unlimited (https://www.ovh.co.uk/dedicated_servers/hosting/1801host01.xml). This yields $372.08/Gbps/month. I am far from certain that this is the best price one could find - please do point me to better pricing if you have it!
I also just looked at Hetzner - another major Tor-friendly hosting provider. The best I could find was a 1Gbps link capped at 100TB/month for $310.49 (https://wiki.hetzner.de/index.php/Traffic/en). 1Gbps sustained upload is 334.8 terabytes (i.e. 1e12 bytes each) for a 31-day month. If you exceed that limit, you can arrange to pay $1.24/TB. Therefore I would estimate the cost to be $601.64/Gbps/month. Again, I may be missing an option more tailored to a high-traffic server, and I would be happy to be pointed to it :-)
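The arithmetic behind that estimate can be checked with a quick script (using the figures as quoted in the thread):

```python
# Cost of 1 Gbps sustained for a 31-day month on the quoted Hetzner plan:
# $310.49 base including 100 TB, then $1.24 per extra TB.
seconds = 31 * 86400
tb_sent = 1e9 / 8 * seconds / 1e12         # bits/s -> bytes -> TB (1e12 bytes)
total = 310.49 + (tb_sent - 100) * 1.24
print(round(tb_sent, 1), round(total, 2))  # 334.8 TB, $601.64/month
```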
Moreover, European bandwidth costs are among the lowest in the world. Other locations are likely to have even higher bandwidth costs (Australia, for example, has notoriously high bandwidth costs).
We do not believe that this is conclusive, as the market changes, and there certainly are dozens of other providers.
I do agree that the market changes, and in fact I expect the cost of IPs to plummet as the shift to IPv6 becomes pervasive. The current high cost of IPv4 addresses is due to their recent scarcity. In any case, a good question to ask would be how Tor should adjust to changes in market pricing over time.
The same applies for 0-day attacks: if you need to buy them just for attacking Tor, then they are expensive. If you are an organization in the business of handling 0-day attacks for various other reasons, then the costs are very different. And it may be unclear to determine if it is easier/cheaper to compromise 1 top relay or 20 mid-level relays.
I agree that the cost of compromising machines is unclear. However, we should guess, and the business of 0-days has provided some signals of their value in terms of their price. 0-days for the Tor software stack are expensive, as, for security reasons, (well-run) Tor relays run few services other than the tor process. I haven’t seen great data on Linux zero-days, but recently a Windows zero-day (Windows being the second most common Tor relay OS) appeared to cost $90K (https://www.csoonline.com/article/3077447/security/cost-of-a-windows-zero-day-exploit-this-one-goes-for-90000.html). Deploying a zero-day does impose a cost, as it increases the chance of that exploit being discovered and its value lost. Therefore, such exploits are likely to be deployed only on high-value targets. I would argue that Tor relays are unlikely to be such a target because it is so much cheaper to simply run your own relays. An exception could be a specific targeted investigation in which some suspect is behind a known relay (say, a hidden service behind a guard), because running other relays doesn’t help dislodge the target from behind its existing guard.
And we are not sure that the picture is so clear about botnets either: bots that can become guards need to have high availability (in order to pass the guard stability requirements), and such high availability bots are also likely to have a bandwidth that is higher than the water level (abandoned machines in university networks, ...). As a result, waterfilling would increase the number of high availability bots that are needed, which is likely to be hard.
This doesn’t seem like a good argument to me: “bots that become guards must have high availability, and thus they likely have high bandwidth”. How many bots would become guards in the first place? And why would availability (by which I understand you to mean uptime) imply bandwidth? The economics matter here, and I don’t know too much about botnet economics, but my impression is that they generally include many thousands of machines and that each bot is generally quickly shut down by its service provider once it starts spewing traffic (i.e. acting as a high-bandwidth Tor relay). Thus waterfilling could benefit botnets by giving them more clients to attack while they provide a small amount of bandwidth that falls below the radar of their ISP. This is a speculative argument, I admit, but it seems to me somewhat more logical than the argument you outlined.
- Waterfilling makes it necessary for an adversary to run a larger number of relays. Apart from the costs of service providers, this large number of relays need to be managed in an apparently independent way, otherwise they would become suspicious to community members, like nusenu who is doing a great job spotting all anomalies. It seems plausible that running 100 relays in such a way that they look independent is at least as difficult as doing that with 10 relays.
Why is running a large number of relays more noticeable than running a high-bandwidth relay? Actually, it seems, if anything, *less* noticeable. An attacker could even indicate that all the relays are in the same family, and there is no Tor policy that would kick them out of the network for being “too large” of a family. If Tor wants to limit the size of single entities, then they would have to kick out some large existing families (Team Cymru, torservers.net, and the Chaos Communication Congress come to mind), and moreover such a policy could apply equally well to total amounts of bandwidth as to total number of relays.
- The question of the protection from relays, ASes or IXPs is puzzling, and we do not have a strong opinion about it. We focused on relays because they are what is available to any attacker, compared to ASes or IXPs which are more specific adversaries. But, if there is a consensus that ASes or IXPs should rather be considered as the main target, it is easy to implement waterfilling at the AS or IXP level rather than at the IP level: just aggregate the bandwidth relayed per AS or IXP, and apply the waterfilling level computation method to them. Or we could mix the weights obtained for all these adversaries, in order to get some improvement against all of them instead of an improvement against only one and being agnostic about the others.
This suggestion of applying waterfilling to individual ASes is intriguing, but would require a more developed design and argument. Would the attacker model be one that has a fixed cost to compromise/observe a given AS?
- More fundamentally, since the fundamental idea of Tor is to mix traffic through a large number of relays, it seems to be a sound design principle to make the choice of the critical relays as uniform as possible, as Waterfilling aims to do. A casual Tor user may be concerned to see that his traffic is very likely to be routed through a very small number of top relays, and this effect is likely to increase as soon as a multi-cores compliant implementation of Tor rises (rust dev). Current top relays which suffer from the main CPU bottleneck will probably be free to relay even more bandwidth than they already do, and gain an even more disproportionate consensus weight. Waterfilling might prevent that, and keep those useful relays doing their job at the middle position of paths.
I disagree that uniform relay selection is a sound design principle. Instead, one should consider various likely attackers and consider what design maximizes the attack cost (or maybe maximizes the minimum attack cost among likely attackers). In the absence of detailed attacker information, a good design principle might be for clients to choose “diverse” relays, where diversity should take into account country, operator, operating system, AS, IXP connectivity, among other things.
Best, Aaron
I see that I misread your cost calculation, and that you estimated $37.20/Gbps/month instead of $3.72/Gbps/month. This still seems low by an order of magnitude. Thus, my argument stands: waterfilling would appear to decrease the cost to an adversary of getting guard probability compared to Tor’s current weighting scheme.
Best, Aaron
Aaron Johnson wrote:
Currently on OVH the best I could find for hosting just now was $93.02/per month for 250Mbps unlimited (https://www.ovh.co.uk/dedicated_servers/hosting/1801host01.xml).
https://www.ovh.com/world/discover/poland.xml
$49.99/per month, 500 Mbps bandwidth (burst 1 Gbps ) unlimited
My new relay runs there http://rougmnvswfsmd4dq.onion/rs.html#details/2235E316DF8E737081A365A1386F36...
but I'm not very happy with the setup because this particular offer doesn't come with an IPMI console, and I had to install Proxmox Linux and run NetBSD in a virtual machine. The clock isn't very stable even with ntpd, so I set up a cronjob to hard-reset it periodically.
Alex
OVH and OVH resellers do seem to have some insane prices.
On the other end, the waterfilling assumption we were working off of was a water level of 10Mbps. A server that can sustain that seems quite cheap. In fact, a quick Google search for “cheap vps” yielded this offer of a VPS with one IPv4 address and a 1Gbps port capped at 2TB/month for $0.63/month: <https://my.hiformance.com/cart.php?a=confproduct&i=0>. 2TB/month is about 6Mbps sustained, which falls below the supposed water level and thus gets fully allocated to guard probability. Thus to achieve in waterfilling a total guard probability equal to that of a 1Gbps relay in today’s Tor (taking into account the 1/3 loss of bandwidth to the middle position in today’s Tor), one could run 1000*(2/3)/6 ~= 112 of these at $71/month. This would be cheaper than the price below of $100/month for 1Gbps.
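Checking that arithmetic (figures as quoted; the $0.63 VPS and the 10 Mbps water level are the thread's assumptions):

```python
import math

mbps = 2e12 * 8 / (31 * 86400) / 1e6    # 2 TB over a 31-day month, ~6 Mbps sustained
relays = math.ceil(1000 * (2 / 3) / 6)  # match a 1 Gbps relay's guard share today
cost = relays * 0.63
print(relays, round(cost, 2))           # 112 relays, $70.56/month
```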
Can we get even lower attacking either system…? :-)
Best, Aaron
Sorry, that link should have been <https://my.hiformance.com/cart.php?a=add&pid=165>.
Best, Aaron
Hello,
On 2018-03-07 14:31, Aaron Johnson wrote:
Hello friends,
- The cost of IPs vs. bandwidth is definitely a function of market
offers. Your $500/Gbps/month seems quite expensive compared to what can be found on OVH (which is hosting a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10Mbps, this means that, if you want to have 1Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
The question of what the cheapest attack is can indeed be estimated by looking at market prices for the required resources. Your cost estimate of 3.72 USD/Gbps/month for bandwidth seems off by two orders of magnitude.
Let me merge your second answer here:
I see that I misread your cost calculation, and that you estimated $37.20/Gbps/month instead of $3.72/Gbps/month. This still seems low by an order of magnitude. Thus, my argument stands: waterfilling would appear to decrease the cost to an adversary of getting guard probability compared to Tor’s current weighting scheme.
There is still something wrong. Let's assume the adversary wants to run 1 Gbps of real guard bandwidth.
With vanilla Tor, the cheapest (considering only OVH) is:
VPS SSD 1 (https://www.ovh.com/fr/vps/vps-ssd.xml): You need 10 of them to reach 1Gbps of bandwidth, but you need 15 of them to actually relay 1 Gbps in the guard position (due to wgg = 2/3 roughly). This is our calculation above: 3*10*3/2 = 45 euros/month (or a few more dollars).
With Waterfilling, we assume above a water level of 10 Mbits, so we need:
100 VPS SSD 1 relaying 1Gbps at the guard position, which turns out to cost 3*100 = 300 euros/month.
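The comparison can be summarized in a few lines (illustrative Python; the 3 euro / 100 Mbps VPS and the 10 Mbps water level are the figures assumed above):

```python
VPS_EUR, VPS_MBPS = 3, 100    # OVH "VPS SSD 1", as quoted in the thread
TARGET_MBPS = 1000            # 1 Gbps actually relayed in the guard position
WGG, LEVEL = 2 / 3, 10        # guard weight fraction; assumed water level in Mbps

# Vanilla Tor: only wgg of a guard's weight serves the guard position,
# so 15 VPS are needed. Waterfilling: each VPS contributes at most the
# water level to the guard position, so 100 VPS are needed.
vanilla = TARGET_MBPS / (VPS_MBPS * WGG) * VPS_EUR
waterfilling = TARGET_MBPS / LEVEL * VPS_EUR
print(round(vanilla), round(waterfilling))   # 45 and 300 euros/month
```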
The numbers I gave ($2/IP/month and $500/Gbps/month) are the amounts currently charged by my US hosting provider. At the time that I shopped around (which was in 2015), it was by far the best bandwidth cost that I was able to find, and those costs haven’t changed much since then.
Currently on OVH the best I could find for hosting just now was $93.02/per month for 250Mbps unlimited (https://www.ovh.co.uk/dedicated_servers/hosting/1801host01.xml). This yields $372.08/Gbps/month. I am far from certain that this is the best price that one could find - please do point me to better pricing if you have it!
I also just looked at Hetzner - another major Tor-friendly hosting provider. The best I could find was a 1Gbps link capped at 100TB/month for $310.49 (https://wiki.hetzner.de/index.php/Traffic/en). 1Gbps sustained upload is 334.8 terabytes (i.e. 1e12 bytes each) for a 31-day month. If you exceed that limit, you can arrange to pay $1.24/TB. Therefore I would estimate the cost to be $601.64/Gbps/month. Again, I may be missing an option more tailored to a high-traffic server, and I would be happy to be pointed to it :-)
Moreover, European bandwidth costs are among the lowest in the world. Other locations are likely to have even higher bandwidth costs (Australia, for example, has notoriously high bandwidth costs).
We do not believe that this is conclusive, as the market changes, and there certainly are dozens of other providers.
I do agree that the market changes, and in fact I expect the cost of IPs to plummet as the shift to IPv6 becomes pervasive. The current high cost of IPv4 addresses is due to their recent scarcity. In any case, a good question to ask would be how Tor should adjust to changes in market pricing over time.
The same applies for 0-day attacks: if you need to buy them just for attacking Tor, then they are expensive. If you are an organization in the business of handling 0-day attacks for various other reasons, then the costs are very different. And it may be unclear to determine if it is easier/cheaper to compromise 1 top relay or 20 mid-level relays.
I agree that the cost of compromising machines is unclear. However, we should guess, and the business of 0-days has provided some signals of their value in terms of their price. 0-days for the Tor software stack are expensive, as, for security reasons, (well-run) Tor relays run few services other than the tor process. I haven’t seen great data on Linux zero-days, but recently a Windows zero-day (Windows being the second most-common Tor relays OS) appeared to cost $90K (https://www.csoonline.com/article/3077447/security/cost-of-a-windows-zero-da...). Deploying a zero-day does impose a cost, as it increases the chance of that exploit being discovered and its value lost. Therefore, such exploits are likely to be deployed only on high-value targets. I would argue that Tor relays are unlikely to be such a target because it is so much cheaper to simply run your own relays. An exception could be a specific targeted investigation in which some suspect is behind a known relay (say, a hidden service behind a guard), because running other relays doesn’t help dislodge the target from behind its existing guard.
And we are not sure that the picture is so clear about botnets either: bots that can become guards need to have high availability (in order to pass the guard stability requirements), and such high availability bots are also likely to have a bandwidth that is higher than the water level (abandoned machines in university networks, ...). As a result, waterfilling would increase the number of high availability bots that are needed, which is likely to be hard.
This doesn’t seem like a good argument to me: “bots that become guards must have high availability, and thus they likely have high bandwidth”. How many bots would become guards in the first place? And why would availability (by which I understand you to mean uptime) imply bandwidth?
Our argument is speculative, but here it is: many computers in botnets have a diurnal behavior (home computers), and could not get the guard flag. Computers with good uptime tend to be located in more or less large companies or universities, because people are too lazy to turn them off or the internal sysadmins don't want them turned off. Those structures also have good connectivity, and their machines would probably be *above* the water level.
Anyway, I think our two opinions about botnets are just speculative, and we might both be wrong, right, or a bit of both. I suggest we debate the other points :)
The economics matter here, and I don’t know too much about botnet economics, but my impression is that they generally include many thousands of machines and that each bot is generally shut down quickly by its service provider once it starts spewing traffic (i.e., acting as a high-bandwidth Tor relay). Thus waterfilling could benefit botnets by giving them more clients to attack while providing a small amount of bandwidth per bot that falls below the radar of the ISP. This is a speculative argument, I admit, but it seems to me somewhat more logical than the argument you outlined.
- Waterfilling makes it necessary for an adversary to run a larger number of relays. Apart from the cost charged by service providers, this large number of relays needs to be managed in an apparently independent way; otherwise they would become suspicious to community members like nusenu, who is doing a great job spotting all anomalies. It seems plausible that running 100 relays in such a way that they look independent is at least as difficult as doing so with 10 relays.
Why is running a large number of relays more noticeable than running a high-bandwidth relay? Actually, it seems, if anything, *less* noticeable. An attacker could even indicate that all the relays are in the same family, and there is no Tor policy that would kick them out of the network for being “too large” of a family. If Tor wants to limit the size of single entities, then it would have to kick out some large existing families (Team Cymru, torservers.net, and the Chaos Communication Congress come to mind), and moreover such a policy could apply equally well to total amounts of bandwidth as to total numbers of relays.
That depends on the kind of policy the Tor network puts in place. If we decide that large families become a threat in end positions, we may just aggregate all the bandwidth of the family and apply Waterfilling to it. That would not kick them off, but would create a kind of 'quarantine'. This is the same kind of suggestion as the one just below.
- The question of protection from relays vs. ASes or IXPs is puzzling, and we do not have a strong opinion about it. We focused on relays because they are available to any attacker, while ASes or IXPs correspond to more specific adversaries. But, if there is a consensus that ASes or IXPs should rather be considered the main threat, it is easy to implement waterfilling at the AS or IXP level rather than at the IP level: just aggregate the bandwidth relayed per AS or IXP, and apply the waterfilling level computation to those aggregates. Or we could mix the weights obtained for all these adversaries, in order to get some improvement against all of them instead of an improvement against only one while being agnostic about the others.
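To make the "level computation method" concrete, here is a minimal sketch (my own illustration, not the exact algorithm from the paper): given per-entity guard bandwidths, where an "entity" can be an individual relay or the aggregate of an AS/IXP as suggested above, it bisects for the cap L such that clamping every entity's guard weight at L still supplies the required total guard capacity.

```python
# Minimal sketch of a waterfilling level computation (illustrative,
# not the paper's exact algorithm). Entities are relays, or bandwidth
# aggregated per AS/IXP as suggested above.

def water_level(bandwidths, required_guard_bw):
    """Find the cap L such that sum(min(b, L)) == required_guard_bw."""
    assert sum(bandwidths) >= required_guard_bw, "not enough capacity"
    lo, hi = 0.0, float(max(bandwidths))
    for _ in range(100):  # plain bisection; 100 steps is plenty
        mid = (lo + hi) / 2
        if sum(min(b, mid) for b in bandwidths) >= required_guard_bw:
            hi = mid  # capping at mid still meets demand: lower the level
        else:
            lo = mid  # demand unmet: the level must rise
    return hi

# Example: 3 big relays and 50 small ones; only the big ones get capped.
bws = [500.0, 400.0, 300.0] + [10.0] * 50
level = water_level(bws, 900.0)  # 3 * level + 50 * 10 = 900, so level = 400/3
```

The guard weight of an entity is then min(bandwidth, level), with the excess pushed to the middle position.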
This suggestion of applying waterfilling to individual ASes is intriguing, but it would require a more developed design and argument. Would the attacker model be one in which there is a fixed cost to compromise/observe a given AS?
Yes, we agree that this specific point requires more research. We just laid out those possibilities in our paper but did not explore this avenue in detail yet. So, to answer your question, I *think* the attacker model should be rather different from some budget, as it is not clear to me what budget is needed to take control of an AS. But, if you have a different opinion, I would be interested to hear it :) In our paper, we suggest using the guessing entropy to evaluate the number of ASes needed to compromise a path. That's still something, but probably not sound enough by itself.
- More fundamentally, since the fundamental idea of Tor is to mix traffic through a large number of relays, it seems to be a sound design principle to make the choice of the critical relays as uniform as possible, as Waterfilling aims to do. A casual Tor user may be concerned to see that his traffic is very likely to be routed through a very small number of top relays, and this effect is likely to increase as soon as a multi-core implementation of Tor arrives (the Rust development effort). Current top relays, which suffer from the main CPU bottleneck, will probably be free to relay even more bandwidth than they already do, and gain an even more disproportionate consensus weight. Waterfilling might prevent that, and keep those useful relays doing their job in the middle position of paths.
I disagree that uniform relay selection is a sound design principle. Instead, one should consider various likely attackers and consider what design maximizes the attack cost (or maybe maximizes the minimum attack cost among likely attackers). In the absence of detailed attacker information, a good design principle might be for clients to choose “diverse” relays, where diversity should take into account country, operator, operating system, AS, and IXP connectivity, among other things.
This is what Waterfilling does: increase the cost of a well-defined attacker and let clients choose from a more "diverse" network.
Thanks again for all your opinions and arguments,
Florentin
Best, Aaron
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Hello
Florentin Rochet wrote:
Hello,
On 2018-03-07 14:31, Aaron Johnson wrote:
Hello friends,
- The cost of IPs vs. bandwidth is definitely a function of market
offers. Your $500/Gbps/month seems quite expensive compared to what can be found on OVH (which is hosting a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10Mbps, this means that, if you want to have 1Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
The question of what the cheapest attack is can indeed be estimated by looking at market prices for the required resources. Your cost estimate of 3.72 USD/Gbps/month for bandwidth seems off by two orders of magnitude.
Let me merge your second answer here:
I see that I misread your cost calculation, and that you estimated $37.20/Gbps/month instead of $3.72/Gbps/month. This still seems low by an order of magnitude. Thus, my argument stands: waterfilling would appear to decrease the cost to an adversary of getting guard probability compared to Tor’s current weighting scheme.
There is still something wrong. Let's assume the adversary wants to run 1 Gbps of real guard bandwidth.
With vanilla Tor, the cheapest (considering only OVH) is:
VPS SSD 1 (https://www.ovh.com/fr/vps/vps-ssd.xml): you need 10 of them to reach 1 Gbps of bandwidth, but you need 15 of them to actually relay 1 Gbps in the guard position (due to wgg = 2/3, roughly). This is our calculation above: 3*10*3/2 = 45 euros/month (or a few more dollars).
With Waterfilling, we assumed above a water level of 10 Mbps, so we need:
100 VPS SSD 1 instances relaying 1 Gbps at the guard position, so the cost turns out to be 3*100 = 300 euros/month.
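Both figures follow mechanically from the stated assumptions (3 euros/month per 100 Mbps VPS, wgg = 2/3, a 10 Mbps water level); a throwaway sketch with those thread numbers hard-coded:

```python
# Back-of-envelope comparison using the thread's assumptions:
# a 3 euro/month VPS with 100 Mbps, wgg = 2/3, and a 10 Mbps water level.
VPS_COST = 3.0       # euros/month per OVH "VPS SSD 1"
VPS_BW = 100.0       # Mbps per VPS
WGG = 2.0 / 3.0      # fraction of a guard's weight used in the guard position
WATER_LEVEL = 10.0   # Mbps of guard weight credited per IP under waterfilling
TARGET = 1000.0      # Mbps of *guard* bandwidth the adversary wants

# Vanilla Tor: guard probability scales with bandwidth, so the adversary
# needs TARGET / WGG Mbps of raw bandwidth, i.e. 15 VPSes.
vanilla_cost = (TARGET / WGG / VPS_BW) * VPS_COST   # 45 euros/month

# Waterfilling: guard weight per IP is capped at the water level, so the
# adversary needs TARGET / WATER_LEVEL = 100 separate IPs.
wf_cost = (TARGET / WATER_LEVEL) * VPS_COST         # 300 euros/month
```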
[....]
A VPS is a shared-resource environment. All VPSes on a single physical server share the same NIC(s). While providers do advertise a port speed (like unlimited traffic at 100 Mbps, 250 Mbps, 1 Gbps, etc.), they actually refer to the theoretical physical NIC speed. Absolutely all of them have something like a 'fair usage policy', which means that if you use more than n% of your port's theoretical max speed during m% of the time, they will either:
a) throttle your VPS to something they find reasonable, like 5 Mbps or 10 Mbps maximum (it could be far less);
b) suspend your service and force you to get dedicated hardware + dedicated switch port and bandwidth.
I can guarantee that you will never ever _ever_ run 1 Gbps of total real effective bandwidth at the guard position at a cost of 45 euros/month anywhere in the world (it doesn't matter if it's Europe, the US, or wherever). Try getting a 3 euro VPS and you'll see that you won't be able to saturate its port for long.
On Mar 7, 2018, at 5:12 PM, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hello,
On 2018-03-07 14:31, Aaron Johnson wrote:
Hello friends,
- The cost of IPs vs. bandwidth is definitely a function of market
offers. Your $500/Gbps/month seems quite expensive compared to what can be found on OVH (which is hosting a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10Mbps, this means that, if you want to have 1Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
The question of what the cheapest attack is can indeed be estimated by looking at market prices for the required resources. Your cost estimate of 3.72 USD/Gbps/month for bandwidth seems off by two orders of magnitude.
Let me merge your second answer here:
I see that I misread your cost calculation, and that you estimated $37.20/Gbps/month instead of $3.72/Gbps/month. This still seems low by an order of magnitude. Thus, my argument stands: waterfilling would appear to decrease the cost to an adversary of getting guard probability compared to Tor’s current weighting scheme.
There is still something wrong.
What’s wrong? $37.20/Gbps/month = 30 euros/Gbps/month, which is what you are claiming. This would be the lowest price for sustained Gbps transfer by a significant margin among all of the deals that have appeared on this thread. The next lowest was from Alex, who found $100/Gbps/month. I somewhat doubt that you could actually achieve 1 Gbps sustained for 30 euros/month on a shared VPS, or that OVH would actually tolerate using this much bandwidth at this little cost. It would at least be a notable new record for the cheapest possible Tor bandwidth, as far as I can tell.
With Waterfilling, we assumed above a water level of 10 Mbps, so we need:
100 VPS SSD 1 instances relaying 1 Gbps at the guard position, so the cost turns out to be 3*100 = 300 euros/month.
This calculation is much too kind to waterfilling :-) Why pay for a full 100 Mbps with only 1 IPv4 address when you only need 10 Mbps/IP to achieve the same guard probability? Earlier I showed an example of a cheaper VPS (https://my.hiformance.com/cart.php?a=add&pid=165) that appears to provide, for just $0.63/month, a VPS with an IPv4 address that is capped at 6 Mbps sustained throughput. This would be a more economical way (3.5x cheaper) to attack waterfilling. Alternatively, I bet you could get bulk IPv4 addresses for much cheaper than the $3/month that OVH charges for its SSD VPS, which you could then potentially apply to your 100 Mbps (or larger) server to get 10 Mbps per IP and more cheaply attack waterfilling. For example, OVH provides 256 IP addresses for its dedicated servers at no monthly cost (https://www.ovh.co.uk/dedicated_servers/details-servers-range-GAME-id-MC-64-OC.xml). These servers can be had for as little as 55 euros/month, which provides 500 Mbps unlimited. With two of those, you could achieve the above attack on waterfilling for 110 euros = $136.36/month instead of 300 euros = $371.92/month. Once we’re talking about trying to achieve a large fraction of the Tor network, which requires many Gbps in vanilla Tor, the fixed cost of a server becomes a smaller fraction of the total cost and the savings from the free extra IPs become greater.
That depends on the kind of policy the Tor network puts in place. If we decide that large families become a threat in end positions, we may just aggregate all the bandwidth of the family and apply Waterfilling to it. That would not kick them off, but would create a kind of 'quarantine'. This is the same kind of suggestion as the one just below.
This seems to be a different argument than you were making, which was that the many servers must appear to be run independently, which I still disagree with.
This is what Waterfilling does: increase the cost of a well-defined attacker and let clients choose from a more "diverse" network.
Sorry, I still don’t agree. It increases the cost in terms of number of IP addresses required and causes clients to spread out more across guards with different IP addresses. This is a narrow notion of diversity and not one that I think is useful as a design principle.
Best, Aaron
Hi all :)
On 2018-03-08 00:31, A. Johnson wrote:
On Mar 7, 2018, at 5:12 PM, Florentin Rochet <florentin.rochet@uclouvain.be> wrote:
Hello,
On 2018-03-07 14:31, Aaron Johnson wrote:
Hello friends,
- The cost of IPs vs. bandwidth is definitely a function of market
offers. Your $500/Gbps/month seems quite expensive compared to what can be found on OVH (which is hosting a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10Mbps, this means that, if you want to have 1Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
The question of what the cheapest attack is can indeed be estimated by looking at market prices for the required resources. Your cost estimate of 3.72 USD/Gbps/month for bandwidth seems off by two orders of magnitude.
Let me merge your second answer here:
I see that I misread your cost calculation, and that you estimated $37.20/Gbps/month instead of $3.72/Gbps/month. This still seems low by an order of magnitude. Thus, my argument stands: waterfilling would appear to decrease the cost to an adversary of getting guard probability compared to Tor’s current weighting scheme.
There is still something wrong.
What’s wrong? $37.20Gbps/month = 30 Euros/Gbps/month, which is what you are claiming.
I am sorry, "wrong" was a badly chosen word. It is just that we are not comparing the same bandwidth. What is written above is 1 Gbps of *guard* bandwidth, which means 1.5 Gbps of raw bandwidth due to the 2/3 ratio in vanilla Tor. Either one is fine, but since we started with 1 Gbps of *guard* bandwidth, let's keep using that baseline so we don't get confused :)
This would be the lowest price for a sustained Gbps transfer by a significant margin among all of the deals that have appeared on this thread. The other lowest was from Alex, who found $100/Gbps/month. I somewhat doubt that you could actually achieve 1Gbps sustained for 30 Euros/month on a shared VPS or that OVH would actually tolerate using this much bandwidth at this little cost.
Rob and s7r also raised the same argument, so let me share my complete experience on this topic:
Some time ago I decided to invest $500 in running relays. I did some research to look for the cheapest offers, and also tried to set up my relays in different ASes where possible. I found some interesting deals in different countries with different providers, and I made a list to try them all. All of the deals were quite similar: 100 Mbps unlimited, at an insanely low price. So insane that I was as suspicious as you all are. I started my relays and had a few bad experiences, which I can list here:
- One of the deals was 50€/year for an unlimited 100 Mbps in Sweden. After 3 or 4 weeks, my access simply got revoked with no warning or message. I contacted the support and got some clumsy arguments about the fact that I was running a hacking tool. Needless to say, the probable reason was my bandwidth consumption.
- Another one was an unlimited 100 Mbps in the UK for 4 pounds/month. The first few days were nice, relaying ~70 Mbps. Then I got throttled to 8 Mbps until the end of the month.
- Another one was a reseller. I managed to run 200 Mbps of exit bandwidth on 1 machine during a few days, for less than 8€/month. Then my access was revoked due to an external complaint. The funny thing was that I had asked beforehand if I could run a Tor exit relay, and the support answered that they had no problems with Tor relays.
The list could go on; I had the same kinds of problems with other providers. All of them have something in common: they are all small companies using, as Rob said, "unlimited bandwidth" as a marketing term.
Fortunately, I have had some good experiences too (all of them exit relays):
- I run a few relays at OVH (France, Poland), 100 Mbps for 3€/month like the offer linked in this thread, in a different datacenter each. No complaint from the provider, and the relays have been in use for months.
- I have run one unlimited 100 Mbps relay in Moldova for months.
- I have run one unlimited 100 Mbps relay in Canada for months.
Now, if we take the /16 prefixes of the IPs I got for my 3 OVH European relays: "54.37", "137.74", "145.239", and if we do some Atlas relay searches:
https://metrics.torproject.org/rs.html#search/137.74
https://metrics.torproject.org/rs.html#search/54.37
https://metrics.torproject.org/rs.html#search/145.239
All relays appearing to advertise around 10~12 MiB/s are *probably* using the offer I linked in this thread. These relays even have a huge consensus weight :(.
Moreover, there are people running more than 1 Gbps with this method, such as this relay operator: https://metrics.torproject.org/rs.html#details/117B99D5CE22174DEA7F1AD3BE25E... and he is doing it at the price I gave above :)
So why is it working? I came to the following conclusion: OVH is a big enough company not to lie about "unlimited, unmetered 100 Mbps". I did not try other big providers, but the result would likely be the same.
Conclusion: we can run many Gbps of bandwidth with the price I gave above, for now.
It would at least be a notable new record for the cheapest possible Tor bandwidth, as far as I can tell.
With Waterfilling, we assumed above a water level of 10 Mbps, so we need:
100 VPS SSD 1 instances relaying 1 Gbps at the guard position, so the cost turns out to be 3*100 = 300 euros/month.
This calculation is much too kind to waterfilling :-) Why pay for a full 100 Mbps with only 1 IPv4 address when you only need 10 Mbps/IP to achieve the same guard probability? Earlier I showed an example of a cheaper VPS (https://my.hiformance.com/cart.php?a=add&pid=165) that appears to provide, for just $0.63/month, a VPS with an IPv4 address that is capped at 6 Mbps sustained throughput. This would be a more economical way (3.5x cheaper) to attack waterfilling.
Yes, you are right. That is an insane price, and theoretically stronger against Waterfilling. But let me count the number of relays needed to reach, let's say, 10% of the network's bandwidth with that provider, and let's suppose 10% is 15 Gbps (https://metrics.torproject.org/bandwidth-flags.html). Waterfilling reduces the bandwidth that the adversary needs by (currently) a 2/3 ratio. So, the adversary needs 10 Gbps:
10000/6 = 1666 relays.
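As a check of that figure, here is a sketch using the numbers assumed in this exchange (a 15 Gbps target, the ~2/3 bandwidth reduction from waterfilling, and 6 Mbps sustained per cheap VPS):

```python
# Check of the relay count above, with this exchange's assumed numbers.
total_target_mbps = 15_000                    # "10% of the network" = 15 Gbps
effective_mbps = total_target_mbps * (2 / 3)  # waterfilling's 2/3 reduction
relays_needed = int(effective_mbps / 6)       # 6 Mbps sustained per cheap VPS
# relays_needed is 1666, against fewer than 2000 guards in the whole network
```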
From this number, I wonder about the following:
Can an adversary put 1666 guard relays in the network without this community noticing that something strange is happening? Given the fact that we don't even have 2000 guards right now.
Does the provider have enough IPv4 addresses? Are they in the same /16?
Would it be as tolerant as OVH?
Given those numbers, is it a good idea to reason about security in terms of money only?
Alternatively, I bet you could get bulk IPv4 addresses for much cheaper than the $3/month that OVH charges for its SSD VPS, which you could then potentially apply to your 100Mbps (or larger) server to get 10Mbps and more cheaply attack waterfilling. For example, OVH provides 256 IP addresses for its dedicated servers at no monthly cost (https://www.ovh.co.uk/dedicated_servers/details-servers-range-GAME-id-MC-64-...). These servers can be had for at least 55 euros/month, which provides 500Mbps unlimited. With two of those, you could achieve the above attack on waterfilling for 110 euros = $136.36/month instead of 300 euros/month = $371.92/month.
You're right. But then you also have the same /24 for all the relays running on this machine. Some easy rule on the directory authorities could prevent this from happening, for example limiting the number of relays within the same /24.
Once we’re talking about trying to achieve a large fraction of the Tor network, which requires many Gbps in vanilla Tor, the fixed cost of a server becomes a smaller fraction of the total cost and the savings from the free extra IPs become greater.
That depends on the kind of policy the Tor network puts in place. If we decide that large families become a threat in end positions, we may just aggregate all the bandwidth of the family and apply Waterfilling to it. That would not kick them off, but would create a kind of 'quarantine'. This is the same kind of suggestion as the one just below.
This seems to be a different argument than you were making, which was that the many servers must appear to be run independently, which I still disagree with.
This is what Waterfilling does: increase the cost of a well-defined attacker and let clients choose from a more "diverse" network.
Sorry, I still don’t agree. It increases the cost in terms of number of IP addresses required and causes clients to spread out more across guards with different IP addresses. This is a narrow notion of diversity and not one that I think is useful as a design principle.
I agree that this is a narrow notion of diversity. Waterfilling is currently applied over IPs, but this is not a *mandatory* part of the design. What Tor does now is an attacker-agnostic balancing of bandwidth. Waterfilling should be seen as a technique that allows taking an attacker into account when balancing the network. It can be applied with a wider notion of diversity and security, as we already outlined.
I hope it helps and many thanks for your comments :)
Best, Florentin
Best, Aaron
Hi Florentin,
Thanks for the thoughtful response!
So why is it working? I come up the following conclusion: OVH is a big enough company not to lie with "unlimited, unmetered 100Mbits". I did not try other big providers, but that would be likely the same result.
Conclusion: we can run many Gbps of bandwidth with the price I gave above, for now.
I wonder how confident we can be about this situation. If we are most worried about an attacker trying to get, say, 10% of the network, would the provider be as oblivious/generous? Your numbers below (10% = 15 Gbps) would require running 15 * (3/2) / 0.1 = 225 relays at 3 euros/month each. Would OVH still ignore 225 cheap VPSes at 100% bandwidth utilization? Would they still be able to provide 100 Mbps to each at that scale?
Yes, you are right. That is an insane price, and theoretically stronger against Waterfilling. But let me count the number of relays needed to reach, let's say, 10% of the network's bandwidth with that provider, and let's suppose 10% is 15 Gbps (https://metrics.torproject.org/bandwidth-flags.html). Waterfilling reduces the bandwidth that the adversary needs by (currently) a 2/3 ratio. So, the adversary needs 10 Gbps:
10000/6 = 1666 relays.
From this number, I wonder the following things:
Can an adversary put 1666 guard relays in the network without this community noticing that something strange is happening? Given the fact that we don't even have 2000 guards right now.
Again, I don’t see how this would be more noticeable or alarming than a single entity providing 10% of the guard bandwidth. Moreover, the security argument that “someone will surely notice and do something” doesn’t have a good track record. Absent a specific plan of how to notice it and respond automatically, I wouldn’t want to rely on it.
Does the provider have enough IPv4? Are they the same /16?
Are you sure there aren’t many providers with such cheap deals? (I fairly easily found another with $9/year for an IP and a 2TB cap: https://www.woothosting.com/pulse/cart.php?gid=16). Being in the same /16 won’t make a difference in their guard probability.
Alternatively, I bet you could get bulk IPv4 addresses for much cheaper than the $3/month that OVH charges for its SSD VPS, which you could then potentially apply to your 100 Mbps (or larger) server to get 10 Mbps per IP and more cheaply attack waterfilling. For example, OVH provides 256 IP addresses for its dedicated servers at no monthly cost (https://www.ovh.co.uk/dedicated_servers/details-servers-range-GAME-id-MC-64-OC.xml). These servers can be had for as little as 55 euros/month, which provides 500 Mbps unlimited. With two of those, you could achieve the above attack on waterfilling for 110 euros = $136.36/month instead of 300 euros = $371.92/month.
You're right. But then you also have the same /24 for all the relays running on this machine. Some easy rule on the directory authorities could prevent this from happening, for example limiting the number of relays within the same /24.
Incorporating IP prefix diversity in Tor’s path selection does seem like a good idea in general. It sounds like you are suggesting that waterfilling should include a fixed limit on the number of relays in a /24. This is now a new scheme whose security would need to be analyzed. A few things that come to mind:
1. Would there be limits for larger prefixes than an adversary might obtain (e.g. /16)? If not, the limit is only effective against adversaries without the resources to obtain a larger prefix.
2. Wouldn’t this allow an adversary to squat on a prefix? For example, he could run a bunch of cheap relays on prefixes owned by Tor-friendly ISPs and keep anybody else from contributing more resources through those ISPs.
3. If resource limits are a reasonable strategy, instead of waterfilling, why not apply such limits to bandwidth (e.g. no more than 10 Gbps per /24)? It seems simpler. It is also not susceptible to an attack on waterfilling in which the water level is raised by contributing to both guard and exit bandwidth.
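Aaron's third point (a bandwidth cap per prefix) could be prototyped roughly as follows. This is a hypothetical sketch: the function name, the /16 aggregation, and the 10 Gbps cap are illustrative choices, not anything from a real directory authority.

```python
# Hypothetical sketch of a per-prefix bandwidth cap: aggregate relay
# bandwidth per IP prefix, then scale down every relay in a prefix whose
# total exceeds the cap. Names and values are illustrative only.
from collections import defaultdict
from ipaddress import ip_network

def cap_by_prefix(relays, prefix_len=16, cap_mbps=10_000.0):
    """relays: list of (ip, bandwidth_mbps) pairs."""
    totals = defaultdict(float)
    for ip, bw in relays:
        totals[ip_network(f"{ip}/{prefix_len}", strict=False)] += bw
    capped = []
    for ip, bw in relays:
        net = ip_network(f"{ip}/{prefix_len}", strict=False)
        scale = min(1.0, cap_mbps / totals[net]) if totals[net] > 0 else 1.0
        capped.append((ip, bw * scale))
    return capped

# Example: the two relays in 1.2.0.0/16 jointly exceed the cap and get
# scaled down proportionally; the relay in 5.6.0.0/16 is untouched.
relays = [("1.2.3.4", 8000.0), ("1.2.9.9", 4000.0), ("5.6.7.8", 3000.0)]
capped = cap_by_prefix(relays)
```

Proportional scaling keeps the relative weights inside a prefix, which avoids the squatting concern in point 2 to some degree: an adversary's cheap relays dilute everyone in the prefix, including themselves.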
Best, Aaron
Hi Florentin,
I've added some comments below.
Overall, I think a useful discussion for the community to have is to discuss whether or not we think Waterfilling is even a good idea in the first place, before you go ahead and do a bunch of work writing and fixing a proposal that may just end up in the pile of old grad student research ideas. (Maybe I'm too late, or maybe you want a proposal out there in any case.)
On Mar 7, 2018, at 3:28 AM, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hi Aaron,
Thanks for your comments, you are definitely touching interesting aspects.
Here are thoughts regarding your objections:
- The cost of IPs vs. bandwidth is definitely a function of market offers. Your $500/Gbps/month seems quite expensive compared to what can be found on OVH (which is hosting a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10Mbps, this means that, if you want to have 1Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
We do not believe that this is conclusive, as the market changes, and there certainly are dozens of other providers.
Have you purchased service from OVH and run relays yourself? Have you talked to anyone who has? I strongly believe that you will not find a provider that legitimately offers you continuous 100 Mbit/s over a long period of time for 3 euros. Providers tend to use "unmetered" and "unlimited" bandwidth as marketing terms, but they don't actually mean what you think unlimited means. What they mean is that you have a 100 MBit/s network card, and they allow you to burst up to the full 100 MBit/s. However, they usually have a total bandwidth cap on such service, or become angry and threaten to disconnect your service if you don't cut down your usage (this has happened to me).
It is far more expensive to obtain *continuous*, i.e., *sustained* bandwidth usage over time. Generally, it's cheaper to buy in bulk. In the US, the cheapest bandwidth service we found (that also allows us to run Tor relays) was one that offers sustained 1 Gbit/s for an average of $500/month (including service fees).
The same applies to 0-day attacks: if you need to buy them just for attacking Tor, then they are expensive. If you are an organization in the business of handling 0-day attacks for various other reasons, then the costs are very different. And it may be hard to determine whether it is easier/cheaper to compromise 1 top relay or 20 mid-level relays.
It's hard to reason about this, since I'm not in the business. However, if you already have a zero-day, why would you want to waste it on a Tor relay? You would risk being discovered while accessing the machine of a likely security-conscious relay operator, and you could just run your own relays instead. Running your own relays does have some cost, but it is far easier to manage and more reliable, since you don't have to worry about being discovered or losing access because the software is patched.
And we are not sure that the picture is so clear about botnets either: bots that can become guards need to have high availability (in order to pass the guard stability requirements), and such high availability bots are also likely to have a bandwidth that is higher than the water level (abandoned machines in university networks, ...). As a result, waterfilling would increase the number of high availability bots that are needed, which is likely to be hard.
I think it's much more likely that bots are running on my parents' Windows machines than on high-bandwidth university machines. Sure, there might be some machines with outdated OSes out there on university networks, but they are also monitored pretty heavily for suspicious activity by the university IT folks, who regularly check in with the machine owners when anything suspicious occurs on the network.
On Mar 7, 2018, at 12:34 PM, Rob Jansen rob.g.jansen@nrl.navy.mil wrote:
Hi Florentin,
I've added some comments below.
(I just found out that Aaron responded to your reply this morning, but I didn't get that email (it probably got stuck somewhere in the NRL email filters). Sorry if I made any points that were already made, but they were made independently.)
Best, Rob
On 7 Mar 2018, at 18:34, Rob Jansen rob.g.jansen@nrl.navy.mil wrote:
It is far more expensive to obtain *continuous*, i.e., *sustained* bandwidth usage over time. Generally, it's cheaper to buy in bulk. In the US, the cheapest bandwidth service we found (that also allows us to run Tor relays) was one that offers sustained 1 Gbit/s for an average of $500/month (including service fees).
In 2016, OVH was offering a discount for 1 Gbps servers in Europe for USD 200/month. The ordinary price was USD 300/month.
We can also purchase a 250 Mbps server in Canada for USD 60/month, but it doesn't get as much Guard bandwidth, because of its location or connectivity.
Hi Rob,
Thank you for your comments!
On 07/03/18 18:34, Rob Jansen wrote:
Hi Florentin,
I've added some comments below.
Overall, I think a useful discussion for the community to have is whether or not we think Waterfilling is even a good idea in the first place, before you go ahead and do a bunch of work writing and fixing a proposal that may just end up in the pile of old grad student research ideas. (Maybe I'm too late, or maybe you want a proposal out there in any case.)
Agree. Most of the work is already done, though. Details remain, but let's try to reach a consensus first :)
On Mar 7, 2018, at 3:28 AM, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hi Aaron,
Thanks for your comments, you are definitely touching interesting aspects.
Here are thoughts regarding your objections:
- The cost of IPs vs. bandwidth is definitely a function of market offers. Your $500/Gbps/month seems quite expensive compared to what can be found at OVH (which hosts a large number of relays): they ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we assume that wgg = 2/3 and a water level at 10 Mbps, this means that, if you want to have 1 Gbps of guard bandwidth,
- the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
- the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
We do not believe that this is conclusive, as the market changes, and there certainly are dozens of other providers.
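To make the arithmetic above concrete, here is a small sketch of the two cost computations, using the figures quoted above (3 euros per 100 Mbps IP, wgg = 2/3, a 10 Mbps water level). The numbers and variable names are purely illustrative, a back-of-the-envelope model rather than anything normative:

```python
# Illustrative cost comparison for an adversary who wants 1 Gbps of guard
# bandwidth, under the prices and parameters quoted in the thread.

PRICE_PER_IP = 3        # euros/month for one IP with 100 Mbps unmetered traffic
IP_BANDWIDTH = 100      # Mbps usable per IP
WGG = 2 / 3             # fraction of a Guard's bandwidth allocated to the guard position
WATER_LEVEL = 10        # Mbps of guard bandwidth one relay can carry under Waterfilling
TARGET = 1000           # Mbps of guard bandwidth the adversary wants

# Current Tor: guard bandwidth is proportional to relay bandwidth scaled by
# wgg, so the adversary needs TARGET / WGG Mbps of raw bandwidth.
vanilla_ips = (TARGET / WGG) / IP_BANDWIDTH
vanilla_cost = vanilla_ips * PRICE_PER_IP          # 15 IPs * 3 = 45 euros/month

# Waterfilling: each relay contributes at most WATER_LEVEL Mbps in the guard
# position, so the adversary needs TARGET / WATER_LEVEL separate relays/IPs.
waterfilling_ips = TARGET / WATER_LEVEL
waterfilling_cost = waterfilling_ips * PRICE_PER_IP  # 100 IPs * 3 = 300 euros/month

print(vanilla_cost, waterfilling_cost)  # → 45.0 300.0
```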
Have you purchased service from OVH and run relays yourself? Have you talked to anyone who has? I strongly believe that you will not find a provider that legitimately offers you continuous 100 Mbit/s over a long period of time for 3 euros. Providers tend to use "unmetered" and "unlimited" bandwidth as marketing terms, but they don't actually mean what you think unlimited means.
I do run several 100 Mbit/s relays, all used at the full 100 Mbit/s with unmetered and unlimited traffic, for 3€/month each. One of them has been online for almost a year, with no problem from the provider. I must have relayed thousands of TB, and it is still there.
I agree with your arguments; some providers are indeed using that as a marketing term: the small ones. I don't think OVH or other big providers mind that you fully use what they *must* give you, and my relays' uptime confirms that. Probably because they have enough clients paying for VPSes that do nothing. Basically, we're running cheap Tor relays thanks to all the other people buying OVH services and not really using them.
What they mean is that you have a 100 Mbit/s network card, and they allow you to burst up to the full 100 Mbit/s. However, they usually have a total bandwidth cap on such service, or become angry and threaten to disconnect your service if you don't cut down your usage (this has happened to me).
It is far more expensive to obtain *continuous*, i.e., *sustained* bandwidth usage over time. Generally, it's cheaper to buy in bulk. In the US, the cheapest bandwidth service we found (that also allows us to run Tor relays) was one that offers sustained 1 Gbit/s for an average of $500/month (including service fees).
The same applies to 0-day attacks: if you need to buy them just for attacking Tor, then they are expensive. If you are an organization that handles 0-day attacks for various other reasons, then the costs are very different. And it may be hard to determine whether it is easier/cheaper to compromise 1 top relay or 20 mid-level relays.
It's hard to reason about this, since I'm not in the business. However, if you already have a zero-day, why would you want to waste it on a Tor relay? You would risk being discovered accessing the machine of a likely security-conscious relay operator, and you could just run your own relays. Running your own relays does have some cost, but it is far easier to manage and more reliable, since you don't have to worry about being discovered or losing access because the software got patched.
And we are not sure that the picture is so clear about botnets either: bots that can become guards need to have high availability (in order to pass the Guard stability requirements), and such high-availability bots are also likely to have a bandwidth that is higher than the water level (abandoned machines in university networks, ...). As a result, Waterfilling would increase the number of high-availability bots that are needed, which is likely to be hard.
I think it's much more likely that bots are running on my parents' Windows machines than on high-bandwidth university machines.
Sure, but all those Windows computers have no chance of getting the Guard flag, due to their likely diurnal behavior.
Sure, there might be some machines with outdated OSes out there on university networks, but they are also monitored pretty heavily for suspicious activity by the university IT folks, who regularly check in with the machine owners when anything suspicious occurs on the network.
- Waterfilling makes it necessary for an adversary to run a larger number of relays. Apart from the costs of service providers, this large number of relays need to be managed in an apparently independent way, otherwise they would become suspicious to community members, like nusenu who is doing a great job spotting all anomalies. It seems plausible that running 100 relays in such a way that they look independent is at least as difficult as doing that with 10 relays.
But not much more difficult, and not difficult enough that an intern could not whip up a managed deployment in a few weeks. There are various tools out there that can automate software installation and configuration. Ansible, Chef, and Puppet are popular ones, but here is a longer list:
https://en.wikipedia.org/wiki/Comparison_of_open-source_configuration_manage...
I would be surprised if at least TorServers.net didn't already use something like this, since they manage a large number of relays.
The question of whether to protect against relays, ASes, or IXPs is a puzzling one, and we do not have a strong opinion about it. We focused on relays because they are what is available to any attacker, whereas ASes or IXPs are more specific adversaries. But, if there is a consensus that ASes or IXPs should rather be considered the main target, it is easy to implement Waterfilling at the AS or IXP level rather than at the IP level: just aggregate the bandwidth relayed per AS or IXP, and apply the water-level computation to those aggregates. Or we could mix the weights obtained for all these adversaries, in order to get some improvement against all of them, instead of an improvement against only one while remaining agnostic about the others.
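As a sketch of what "aggregate per AS and apply the water-level computation" could look like: assume the water level is the value L at which the per-AS contributions, each capped at L, sum to a chosen guard-bandwidth target. The function names, the toy data, and the bisection approach are illustrative assumptions, not the proposal's exact algorithm:

```python
from collections import defaultdict

def aggregate_by_as(relays):
    """relays: iterable of (as_number, bandwidth_mbps) pairs.
    Returns the total bandwidth relayed per AS."""
    per_as = defaultdict(float)
    for asn, bw in relays:
        per_as[asn] += bw
    return per_as

def water_level(capacities, target):
    """Find L such that sum(min(c, L) for c in capacities) == target,
    assuming 0 < target <= sum(capacities). Bisection on L."""
    lo, hi = 0.0, max(capacities)
    for _ in range(100):  # plenty of iterations for float precision
        mid = (lo + hi) / 2
        if sum(min(c, mid) for c in capacities) < target:
            lo = mid
        else:
            hi = mid
    return hi

# Toy example: two relays in AS1, one each in AS2-AS4 (bandwidth in Mbps).
relays = [("AS1", 500), ("AS1", 300), ("AS2", 200), ("AS3", 100), ("AS4", 50)]
per_as = aggregate_by_as(relays)  # AS1: 800, AS2: 200, AS3: 100, AS4: 50
level = water_level(list(per_as.values()), target=500)
# With these numbers the level is 175: AS3 and AS4 contribute fully
# (100 + 50), and AS1 and AS2 are each capped at 175 (175 + 175 = 350).
```

The same `water_level` function applies unchanged at the relay level; only the aggregation step differs, which is what makes mixing the weights for several adversary models straightforward.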
More fundamentally, since the fundamental idea of Tor is to mix traffic through a large number of relays, it seems to be a sound design principle to make the choice of the critical relays as uniform as possible, as Waterfilling aims to do.
I think this is the crux of my disagreement. We should base relay choice on security and/or performance, whether or not that means uniform choice. In a world where it is more costly to start up new relays than it is to run high bandwidth relays, the waterfilling approach may improve security. But, in my opinion, there are too many open questions and speculations going on here to be convinced that that's the world we live in.
Yes, going toward uniformity just because it is uniformity (or because some entropy metric says it's better) is not the right approach. That is not our point here, though. Our "as uniform as possible" takes into account security as defined by several models, and it takes performance into account as well. If making the choice more uniform reduces security or performance, we stop there.
Funny how we all want the same thing, but find it difficult to agree on some principles.
Best, Florentin
A casual Tor user may be concerned to see that his traffic is very likely to be routed through a very small number of top relays, and this effect is likely to increase as soon as a multi-core-capable implementation of Tor arrives (the Rust development effort). Current top relays, which suffer from the main CPU bottleneck, will probably be free to relay even more bandwidth than they already do, and gain an even more disproportionate consensus weight. Waterfilling might prevent that, and keep those useful relays doing their job at the middle position of paths.
I run several high bandwidth relays. I can say that the only thing that eliminating the CPU bottleneck would do for me is allow me to run fewer relays in order to consume the bandwidth available on my machine; I still control the same amount of bandwidth overall. The fact that I have to run 4 relays to consume my bandwidth vs. 1 relay does not have any impact on my decision of whether or not I will run more; the main criterion in that decision is cost.
We hope those thoughts can help, and thanks again for sharing yours.
I hope my perspective on things is useful in some way!
Best, Rob