A main theme in the recent Tor development meeting was guard node security, as discussed in Roger's blog post and in the paper by Tariq et al. [0].
Over the course of the meeting we discussed various guard-related subjects. Here are some of them:
a) Reducing the number of guards to 1 or 2 (#9273).
b) Increasing the guard rotation period (to 9 months or so) (#8240).
c) The fact that your set of guard nodes can act as a network fingerprint even if you switch to different networks (#10969).
d) The fact that authorities assign flags based on knowledge they acquired while they were up. They don't use historical data to assign flags, which means that an authority that has been up for 1 month only knows 1 month's worth of information about each relay (#10968).
e) We discussed introducing a weight parameter that makes guards that have been guards for a long time more likely to be used as guards.
f) We discussed how guards and circuit isolation should work together. Maybe each isolation profile should have a different set of guards. But what if we have hundreds of isolation profiles (we are the gateway of a network)?
g) Should we refuse to add new guards after a certain number of circuits have been killed (maybe it's an attack). But won't that drive our users to simply reinstall Tor because they think it's broken?
h) We discussed chaining consensus documents together as in a block chain. This could help against targeted attacks where a set of bad authorities gives a poisoned consensus to a user. This design has many problems though (how much consensus data do we have to keep forever? what happens during periods when no consensuses were published?).
i) If we restrict the number of guards to 1, what happens to the unlucky users that pick a slow guard? What's the probability of being an unlucky user? Should we bump up the bandwidth threshold for being a guard node? How does that change the diversity of our guard selection process?
j) How do the above influence the security of Hidden Services? Guard security is essential for the well-being of Hidden Services, but how does increasing the guard rotation period combine with guard enumeration attacks (#9001)?
We discussed more topics too. You can find Nick's raw notes at: https://trac.torproject.org/projects/tor/wiki/org/meetings/2014WinterDevMeet...
To move forward, we decided that proposals should be written for (a) and (b). We also decided that we should first evaluate whether doing (a) and (b) are good ideas at all, especially with regards to (i).
[0]: https://blog.torproject.org/blog/improving-tors-anonymity-changing-guard-par... http://freehaven.net/anonbib/#wpes12-cogs
George Kadianakis desnacked@riseup.net writes:
A main theme in the recent Tor development meeting was guard node security as discussed in Roger's blog post and in Tariq's et al. paper [0].
Over the course of the meeting we discussed various guard-related subjects. Here are some of them:
a) Reducing the number of guards to 1 or 2 (#9273).
b) Increasing the guard rotation period (to 9 months or so) (#8240).
<snip>
i) If we restrict the number of guards to 1, what happens to the unlucky users that pick a slow guard? What's the probability of being an unlucky user? Should we bump up the bandwidth threshold for being a guard node? How does that change the diversity of our guard selection process?
<snip>
To move forward, we decided that proposals should be written for (a) and (b). We also decided that we should first evaluate whether doing (a) and (b) are good ideas at all, especially with regards to (i).
I started working on evaluating whether reducing the number of guards to a single guard will result in horrible performance for users. Also, whether increasing the bandwidth threshold required for being a guard node will result in a less diverse guard selection process.
You can find the script here: https://git.gitorious.org/guards/guards.git https://gitorious.org/guards/guards
This is a thread to hear feature requests of what we need to find out before going ahead and implementing our ideas.
For now, I wrote an analysis.py script which reads a user-supplied consensus, calculates the entropy of the guard selection process (like we did in #6232), removes the slow guards from the consensus (using a user-supplied threshold value), recomputes the bandwidth weights, and recalculates the diversity of the guard selection process. It then prints the two entropy values, which should give us an idea of how much diversity we lost by pruning the slow guard nodes. For more info, see the analysis() function.
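For anyone who doesn't want to read the script, here is a minimal sketch of the core computation, assuming we already have the list of guard 'w Bandwidth=' values (treated as kB/s) extracted from a consensus. The function names are made up for illustration and don't necessarily match analysis.py, and the real guard probabilities also involve the consensus bandwidth-weights, which this sketch ignores:

"""
import math

def selection_entropy(weights):
    # Shannon entropy (in bits) of picking a guard proportionally to its weight.
    total = float(sum(weights))
    return -sum((w / total) * math.log(w / total, 2) for w in weights if w > 0)

def prune_and_compare(guard_weights, threshold_kbps):
    # Entropy of guard selection before and after dropping guards below the cutoff,
    # together with the max entropy (uniform choice) for each guard set.
    pruned = [w for w in guard_weights if w >= threshold_kbps]
    return (selection_entropy(guard_weights), math.log(len(guard_weights), 2),
            selection_entropy(pruned), math.log(len(pruned), 2))

# Toy weights (kB/s) and a 400 kB/s cutoff:
print(prune_and_compare([20, 150, 400, 800, 3000, 7530], 400))
"""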
Here is an execution of the script:

"""
$ python analysis.py consensus 400 # kB/s
WARNING:root:Entropy: 9.41949523094 (max entropy of 1999 guards: 10.9650627567).
WARNING:root:Before pruning: 1999 guards. After pruning: 1627 guards
WARNING:root:Entropy: 9.36671628404 (max entropy of 1627 guards: 10.6679985357).
WARNING:root:Original: 9.41949523094. Pruned: 9.36671628404
"""

In this case, the entropy of the original guard selection process was 9.4 bits, and after we pruned the slow guard nodes (below 400 kB/s) we got an entropy of 9.3 bits. Is this good or bad? It's unclear, since comparing two raw Shannon entropy values on their own does not say much (especially since they are on a logarithmic scale).
So here is a TODO list on how to make this analysis script more useful:
* Instead of printing the entropy, visualize the probability distribution of guard selection. A histogram of the probability distribution, for example, might make the loss of diversity more clear.
* Find a way to compare entropy values in a meaningful way. We can use the maximum entropy of each consensus to see how far from max entropy we are in each case. Or we can try to linearize the entropy value somehow.
* Find other ways to measure diversity of guard node selection.
* Given another speed threshold, print the probability that the guard selection will give us a guard below that threshold.
Even if we restrict guards to above 600kB/s, we still want to know the chance of ending up with a guard below 1000kB/s (see the sketch after this list).
* Fix errors on the current script.
For example, I'm not sure if I'm using the correct bandwidth values. I'm currently using the value in the 'w' lines of the consensus ('w Bandwidth=479'). I used to think that this was a unitless number, but looking at dir-spec.txt it seems to be "kilobytes per second". Is this the value I should be using to figure out which guards to cut off?
* Implement ideas (d) and (e) and use historical data to evaluate whether they are worth doing:
d) The fact that authorities assign flags based on knowledge they acquired while they were up. They don't use historical data to assign flags, which means that an authority that has been up for 1 month only knows 1 month's worth of information about each relay (#10968).
e) We discussed introducing a weight parameter that makes guards that have been guards for a long time more likely to be used as guards.
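Regarding the "probability of a guard below some speed" item above, here is the kind of sketch I have in mind, treating the consensus 'w Bandwidth=' value as kB/s and ignoring the consensus bandwidth-weights. The parsing is deliberately naive and the function names are mine, not analysis.py's:

"""
def guard_weights_from_consensus(path):
    # Naive parse: collect the 'w Bandwidth=' value of every relay with the Guard flag.
    weights, is_guard = [], False
    with open(path) as consensus:
        for line in consensus:
            if line.startswith("s "):
                is_guard = "Guard" in line.split()
            elif line.startswith("w Bandwidth=") and is_guard:
                weights.append(int(line.split("Bandwidth=")[1].split()[0]))
    return weights

def prob_below(weights, threshold_kbps):
    # Probability that weight-proportional guard selection lands on a guard
    # slower than the threshold.
    return sum(w for w in weights if w < threshold_kbps) / float(sum(weights))

weights = guard_weights_from_consensus("consensus")
print("P(guard slower than 1000 kB/s): %f" % prob_below(weights, 1000))
"""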
What else do we need to do?
PS: Here is another execution of the script with a higher cut-off value (100MB/s):

"""
$ python analysis.py consensus 100000 # kB/s
WARNING:root:Entropy: 9.41949523094 (max entropy of 1999 guards: 10.9650627567).
WARNING:root:Before pruning: 1999 guards. After pruning: 17 guards
WARNING:root:Entropy: 3.76986046797 (max entropy of 17 guards: 4.08746284125).
WARNING:root:Original: 9.41949523094. Pruned: 3.76986046797
"""
George Kadianakis desnacked@riseup.net writes:
George Kadianakis desnacked@riseup.net writes:
A main theme in the recent Tor development meeting was guard node security as discussed in Roger's blog post and in Tariq's et al. paper [0].
Over the course of the meeting we discussed various guard-related subjects. Here are some of them:
a) Reducing the number of guards to 1 or 2 (#9273).
b) Increasing the guard rotation period (to 9 months or so) (#8240).
<snip>
i) If we restrict the number of guards to 1, what happens to the unlucky users that pick a slow guard? What's the probability of being an unlucky user? Should we bump up the bandwidth threshold for being a guard node? How does that change the diversity of our guard selection process?
<snip>
To move forward, we decided that proposals should be written for (a) and (b). We also decided that we should first evaluate whether doing (a) and (b) are good ideas at all, especially with regards to (i).
I started working on evaluating whether reducing the number of guards to a single guard will result in horrible performance for users. Also, whether increasing the bandwidth threshold required for being a guard node will result in a less diverse guard selection process.
You can find the script here: https://git.gitorious.org/guards/guards.git https://gitorious.org/guards/guards
And because release-early-release-often, here is a graph: https://people.torproject.org/~asn/guards/guard_boxplot_4000.png
The middle boxplot is the probability distribution of our current guard selection process (e.g. the most likely to be selected guard node is selected with probability ~0.013). The right boxplot is the probability distribution we would have if we pruned the guard nodes that are slower than 4MB/s. We can see that in that case, the most popular guard node has probability of ~0.15 being selected.
The left boxplot is there for comparison. It's a uniform probability distribution which is our best case scenario with regards to security: all guard nodes have an equal chance of being selected. (However it's unachievable since we are doing bandwidth-based load balancing.)
Here is another graph for a lower cut-off threshold (800 kB/s): https://people.torproject.org/~asn/guards/guard_boxplot_800.png
(Pushed graphing code to gitorious. I used seaborn, a python visualization library I've wanted to try for a while: https://github.com/mwaskom/seaborn )
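In case it helps anyone reproduce the plots, a rough matplotlib-only version (the actual code in the repository uses seaborn, and the toy weights below are just placeholders for the parsed consensus values) might look like this:

"""
import matplotlib.pyplot as plt

def selection_probs(weights):
    total = float(sum(weights))
    return [w / total for w in weights]

# Toy per-guard weights (kB/s) standing in for the parsed consensus values.
original_weights = [20, 150, 400, 800, 3000, 4500, 7530, 12000]
pruned_weights = [w for w in original_weights if w >= 4000]   # 4 MB/s cutoff

uniform = [1.0 / len(original_weights)] * len(original_weights)
plt.boxplot([uniform, selection_probs(original_weights), selection_probs(pruned_weights)])
plt.xticks([1, 2, 3], ["uniform", "current", "pruned (>= 4 MB/s)"])
plt.ylabel("guard selection probability")
plt.savefig("guard_boxplot.png")
"""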
On Tue, Feb 25, 2014 at 02:06:39AM +0000, George Kadianakis wrote:
And because release-early-release-often, here is a graph: https://people.torproject.org/~asn/guards/guard_boxplot_4000.png
The middle boxplot is the probability distribution of our current guard selection process (e.g. the most likely to be selected guard node is selected with probability ~0.013). The right boxplot is the probability distribution we would have if we pruned the guard nodes that are slower than 4MB/s. We can see that in that case, the most popular guard node has probability of ~0.15 being selected.
You mean 0.015, right?
- Ian
Ian Goldberg iang@cs.uwaterloo.ca writes:
On Tue, Feb 25, 2014 at 02:06:39AM +0000, George Kadianakis wrote:
And because release-early-release-often, here is a graph: https://people.torproject.org/~asn/guards/guard_boxplot_4000.png
The middle boxplot is the probability distribution of our current guard selection process (e.g. the most likely to be selected guard node is selected with probability ~0.013). The right boxplot is the probability distribution we would have if we pruned the guard nodes that are slower than 4MB/s. We can see that in that case, the most popular guard node has probability of ~0.15 being selected.
You mean 0.015, right?
Yep. I meant ~0.015. :) Although actually it's 0.0145066975568...
(I'm using a consensus a few days old "valid-after 2014-02-21 14:00:00".)
Unfortunately, compass.torproject.org is down at the moment and I can't validate the results.
Hey George, Glad to see that guard questions are still being asked. Some thoughts from your plots.
On 24-Feb-14 9:06 PM, George Kadianakis wrote:
And because release-early-release-often, here is a graph: https://people.torproject.org/~asn/guards/guard_boxplot_4000.png
The middle boxplot is the probability distribution of our current guard selection process (e.g. the most likely to be selected guard node is selected with probability ~0.013). The right boxplot is the probability distribution we would have if we pruned the guard nodes that are slower than 4MB/s. We can see that in that case, the most popular guard node has probability of ~0.15 being selected.
A question: How much of the total BW was dropped due to the condition "guard BW must be greater than 4MB"?
From a security perspective: While the top guard did get ~0.015 rather than ~0.013, a change of +~15% on its original probability of being selected, all the other guards also got a boost. Thinking about it from a steady state: the increase in chance (+X%) of being picked is due to the fact that they _do_ now own +X% more bw than before. They haven't gained something for nothing. So it seems that dropping bandwidth is not harmful if we forget about the previous state of the network.
Have I got something wrong in this analysis?
Other thoughts: raising the bar on guards leads to good things(tm). Not amazing(R), though. One, you get fewer relays that shouldn't really be guards slowing things down. Two, an adversary can't take control of a large number of slow relays (like a botnet of residential computers) and run guards that in aggregate give them a lot of bandwidth (which is how guards are selected, i.e. one of the adversary's bots gets picked because the aggregate chance of any bot being picked is high), while at the same time slowing down service for a client who actually ends up using that low-bandwidth bad guard. The trick, as you have pointed out, is in picking this cut-off point. But dropping the bottom-most doesn't really hurt things, apart from the feeling of leaving bandwidth on the table.
Looking forward to seeing progress. :)
Cheers, Tariq
Hi,
I'm trying to follow this, so perhaps if someone could explain a little bit to me.... metrics reports.....
valid-after 2014-02-25 00:00:00
r phillw tNrlqRlQkVqUXVGLWFYhQeYBYI0 pR9ddClTA2Eo6AI989NA3BGXNs4 2014-02-24 14:29:23 176.31.156.199 9001 0
s Fast Guard Named Running Stable Valid
v Tor 0.2.4.20
w Bandwidth=7530
p reject 1-65535
I'm guessing that the proposed cut-off applies to the 'Bandwidth' value... below what level would a relay stop being used as a guard? Or am I totally wrong about what is being proposed?
Regards,
Phill.
On 25 February 2014 04:51, Tariq Elahi tariq.elahi@uwaterloo.ca wrote:
Hey George, Glad to see that guard questions are still being asked. Some thoughts from your plots.
On 24-Feb-14 9:06 PM, George Kadianakis wrote:
And because release-early-release-often, here is a graph: https://people.torproject.org/~asn/guards/guard_boxplot_4000.png
The middle boxplot is the probability distribution of our current guard selection process (e.g. the most likely to be selected guard node is selected with probability ~0.013). The right boxplot is the probability distribution we would have if we pruned the guard nodes that are slower than 4MB/s. We can see that in that case, the most popular guard node has probability of ~0.15 being selected.
A question: How much of the total BW was dropped due to the condition "guard BW must be greater than 4MB"?
From a security perspective:
While the top guard did get ~0.015 rather than ~0.013, a change of +~15% on its original probability of being selected, all the other guards also got a boost. Thinking about it from a steady state: the increase in chance (+X%) of being picked is due to the fact that they _do_ now own +X% more bw than before. They haven't gained something for nothing. So it seems that dropping bandwidth is not harmful if we forget about the previous state of the network.
Have I got something wrong in this analysis?
Other thoughts: raising the bar on guards leads to good things(tm). Not amazing(R), though. One, you get fewer relays that shouldn't really be guards slowing things down. Two, an adversary can't take control of a large number of slow relays (like a botnet of residential computers) and run guards that in aggregate give them a lot of bandwidth (which is how guards are selected, i.e. one of the adversary's bots gets picked because the aggregate chance of any bot being picked is high), while at the same time slowing down service for a client who actually ends up using that low-bandwidth bad guard. The trick, as you have pointed out, is in picking this cut-off point. But dropping the bottom-most doesn't really hurt things, apart from the feeling of leaving bandwidth on the table.
Looking forward to seeing progress. :)
Cheers, Tariq
Tariq Elahi tariq.elahi@uwaterloo.ca writes:
Hey George, Glad to see that guard questions are still being asked. Some thoughts from your plots.
On 24-Feb-14 9:06 PM, George Kadianakis wrote:
And because release-early-release-often, here is a graph: https://people.torproject.org/~asn/guards/guard_boxplot_4000.png
The middle boxplot is the probability distribution of our current guard selection process (e.g. the most likely to be selected guard node is selected with probability ~0.013). The right boxplot is the probability distribution we would have if we pruned the guard nodes that are slower than 4MB/s. We can see that in that case, the most popular guard node has probability of ~0.15 being selected.
A question: How much of the total BW was dropped due to the condition "guard BW must be greater than 4MB"?
From a security perspective: While the top guard did get ~0.015 rather than ~0.013, a change of +~15% on its original probability of being selected, all the other guards also got a boost. Thinking about it from a steady state: the increase in chance (+X%) of being picked is due to the fact that they _do_ now own +X% more bw than before. They haven't gained something for nothing. So it seems that dropping bandwidth is not harmful if we forget about the previous state of the network.
Have I got something wrong in this analysis?
That's a good point.
However, it's worth noting that the probability increase in the case of a cutoff is not constant; it depends on the previous probability of the guard and on how much of the network we disregarded. See the end of my post for the obligatory math masturb^W^Wanalysis.
So if the guard selection probability increase is not that alarming, we are left with one main factor for guard selection diversity: the number of guard nodes remaining after the cut-off. You can see a graph of this here: https://people.torproject.org/~asn/guards/guard_number_cutoff.png
And the main factor about performance implications is how much of the guard bandwidth we discard after the cut-off. Nick made a graph for this here: https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_bandwidth.png
To get an educated idea about how much bandwidth we should discard, we might want to look at the "advertised bandwidth"/"bandwidth history" graphs in metrics.tpo. For example in:
https://metrics.torproject.org/network.html#bandwidth-flags
we see that about 2/5 of the total guard (advertised) bandwidth is supposedly left unused. This might give us an idea of how much bandwidth we can discard without clogging up the Tor pipes. Maybe by doing a "fast guard nodes" campaign (like we did for exit nodes) the situation will improve vastly too (since guard nodes are easier to set up than exit nodes).
What other factors should we look at before deciding on our bandwidth cut-off value?
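To make the two factors above concrete, here is a rough sketch (toy weights again, and ignoring the consensus bandwidth-weights) of how the numbers behind the retained-guards and retained-bandwidth graphs could be computed:

"""
def retention(weights, cutoff_kbps):
    # Fraction of guards and fraction of total guard weight kept for a given cutoff.
    kept = [w for w in weights if w >= cutoff_kbps]
    return len(kept) / float(len(weights)), sum(kept) / float(sum(weights))

weights = [20, 150, 400, 800, 3000, 4500, 7530, 12000]   # toy per-guard kB/s values
for cutoff in (0, 400, 800, 2000, 4000, 10000):
    guards_kept, bw_kept = retention(weights, cutoff)
    print("cutoff %6d kB/s: %3.0f%% of guards, %3.0f%% of guard weight kept"
          % (cutoff, 100 * guards_kept, 100 * bw_kept))
"""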
Other thoughts: raising the bar on guards leads to good things(tm). Not amazing(R), though. One, you get fewer relays that shouldn't really be guards slowing things down. Two, an adversary can't take control of a large number of slow relays (like a botnet of residential computers) and run guards that in aggregate give them a lot of bandwidth (which is how guards are selected, i.e. one of the adversary's bots gets picked because the aggregate chance of any bot being picked is high), while at the same time slowing down service for a client who actually ends up using that low-bandwidth bad guard. The trick, as you have pointed out, is in picking this cut-off point. But dropping the bottom-most doesn't really hurt things, apart from the feeling of leaving bandwidth on the table.
Looking forward to seeing progress. :)
Appendix: How do guard probabilities change after a cutoff?
Currently, guard probabilities are calculated as follows:

"""
(1) foreach guard:
        guard_prob = guard_weight / sum_of_guard_weights
"""

where guard_weight is an integer assigned to each guard based on its bandwidth (and its relay flags), and sum_of_guard_weights is the sum of the guard_weights of all guards.
This means that after the cutoff, the new_guard_prob is now:
new_guard_prob = guard_weight / (new_sum_of_guard_weights)
where the guard_weight remains the same for all guards, but the denominator changed because the total guard weight was reduced (because we disregarded some guards by cutting them off). So the new denominator is something like:
new_sum_of_guard_weights = sum_of_guard_weights - guard_weight_difference
where sum_of_guard_weights is the same value as in (1).
By playing with the fraction we get:
new_guard_prob = (guard_weight/sum_of_guard_weights) * (1/(1 - (guard_weight_difference / sum_of_guard_weights)))
which is actually:
new_guard_prob = guard_prob * (1/(1 - (guard_weight_difference / sum_of_guard_weights)))
which means that the new guard probabilities depend on the previous guard probabilities and how much of the guard bandwidth we cropped.
For example, if we crop 1/4 of the guard bandwidth, we get:
new_guard_prob = guard_prob * 4/3
which means that all guard probabilities will be multiplied by 4/3, i.e. increased by a third. Or something like that...
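A tiny numeric sanity check of that derivation (toy weights, nothing Tor-specific):

"""
weights = [10, 20, 30, 40]   # toy guard weights
cutoff = 15                  # drops the weight-10 guard, i.e. crops 10/100 of the total
kept = [w for w in weights if w >= cutoff]

total, new_total = float(sum(weights)), float(sum(kept))
cropped_fraction = (total - new_total) / total

for w in kept:
    old_prob, new_prob = w / total, w / new_total
    # new_guard_prob == guard_prob * 1/(1 - guard_weight_difference/sum_of_guard_weights)
    assert abs(new_prob - old_prob / (1 - cropped_fraction)) < 1e-12

print("probabilities scale by 1/(1 - %.2f) = %.3f"
      % (cropped_fraction, 1 / (1 - cropped_fraction)))
"""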
On Mon, Feb 24, 2014 at 1:10 PM, George Kadianakis desnacked@riseup.net wrote:
For example, I'm not sure if I'm using the correct bandwidth values. I'm currently using the value in 'w' lines of the consensus ('w Bandwidth=479'). I used to think that this is a unitless number, but looking at dir-spec.txt it seems to be "kilobytes per second". Is this the value I should be using to figure out which guards I should cut-off?
I was also under the impression that these weights are unitless, but they do seem to have some correlation to advertised average bandwidth. For example, if I sort the valid-after 2100UTC consensus by weight and look at the 20 routers starting at weights of 1000, 2000, 4000, 8000, 16000 and 32000, the median advertised average bandwidth of these nodes is:
weight - median advertised bandwidth
 1000  -   795 KBps
 2000  -  1049 KBps
 4000  -  5181 KBps
 8000  -  6474 KBps
16000  - 12059 KBps
32000  - 31457 KBps
(For very low weights and very high weights the correlation breaks down pretty badly, though)
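For what it's worth, the comparison above can be expressed roughly like this, assuming one already has (consensus weight, advertised bandwidth) pairs built from the consensus plus the matching server descriptors (the pairing step is omitted, and the pairs below are toy values):

"""
from statistics import median

def median_advertised_near(pairs, weight_points, window=20):
    # For each weight point, the median advertised bandwidth of the 'window' routers
    # whose consensus weight is at or just above that point.
    pairs = sorted(pairs)
    return {point: median([adv for w, adv in pairs if w >= point][:window] or [0])
            for point in weight_points}

pairs = [(900, 700), (1100, 850), (2100, 1100), (4200, 5000), (8100, 6600), (16500, 12000)]
print(median_advertised_near(pairs, [1000, 2000, 4000, 8000, 16000], window=2))
"""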
Anyhow, we would want to use the weights for the cutoffs anyway, since otherwise "just lie to get above the guard threshold" becomes an interesting attack.
Another thought: we also should investigate how various thresholds affect the relationship between the cumulative guard weight total and the total exit weight.
On Tue, Feb 25, 2014 at 5:04 PM, Nicholas Hopper hopper@cs.umn.edu wrote:
Another thought: we also should investigate how various thresholds affect the relationship between the cumulative guard weight total and the total exit weight.
Well, that turns out not to be a real issue: even if we set the guard threshold to 20MBps, the total guard weight still exceeds the total exit weight.
Here is a chart showing what fraction of total guard bandwidth is retained as we vary the guard threshold from 0 to 10MBps:
https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_bandwidth.png
And here's a chart showing what fraction of clients will choose the highest (max%) and median (median%) guards as we vary the threshold over the same range:
https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_weight.png
2MBps and 6MBps look like interesting points on the curves.
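For reference, the guard-vs-exit comparison can be redone against any consensus roughly like this (naive flag parsing, no consensus bandwidth-weights applied, so only a first approximation; relays with both flags count on both sides):

"""
def weights_by_flag(path):
    # Return (guard_weights, exit_weights) from a consensus, keyed on the s-line flags.
    guards, exits, flags = [], [], set()
    with open(path) as consensus:
        for line in consensus:
            if line.startswith("s "):
                flags = set(line.split()[1:])
            elif line.startswith("w Bandwidth="):
                weight = int(line.split("Bandwidth=")[1].split()[0])
                if "Guard" in flags:
                    guards.append(weight)
                if "Exit" in flags:
                    exits.append(weight)
    return guards, exits

guards, exits = weights_by_flag("consensus")
threshold = 20000   # 20 MB/s expressed in kB/s consensus units
print("guard weight above threshold:", sum(w for w in guards if w >= threshold))
print("total exit weight:           ", sum(exits))
"""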
Nicholas Hopper hopper@cs.umn.edu writes:
On Tue, Feb 25, 2014 at 5:04 PM, Nicholas Hopper hopper@cs.umn.edu wrote:
Another thought: we also should investigate how various thresholds affect the relationship between the cumulative guard weight total and the total exit weight.
Well, that turns out not to be a real issue: even if we set the guard threshold to 20MBps, the total guard weight still exceeds the total exit weight.
Here is a chart showing what fraction of total guard bandwidth is retained as we vary the guard threshold from 0 to 10MBps:
https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_bandwidth.png
And here's a chart showing what fraction of clients will choose the highest (max%) and median (median%) guards as we vary the threshold over the same range:
https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_weight.png
2MBps and 6MBps look like interesting points on the curves.
Thanks, these are useful graphs.
And here is another one with the number of guard nodes over different cutoff values:
https://people.torproject.org/~asn/guards/guard_number_cutoff.png
We will want to choose a cutoff value that doesn't discard too many guard nodes.
Nicholas Hopper hopper@cs.umn.edu writes:
On Tue, Feb 25, 2014 at 5:04 PM, Nicholas Hopper hopper@cs.umn.edu wrote:
Another thought: we also should investigate how various thresholds affect the relationship between the cumulative guard weight total and the total exit weight.
Well, that turns out not to be a real issue: even if we set the guard threshold to 20MBps, the total guard weight still exceeds the total exit weight.
Here is a chart showing what fraction of total guard bandwidth is retained as we vary the guard threshold from 0 to 10MBps:
https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_bandwidth.png
And here's a chart showing what fraction of clients will choose the highest (max%) and median (median%) guards as we vary the threshold over the same range:
https://www-users.cs.umn.edu/~hopper/guards/guard_thresholds_weight.png
2MBps and 6MBps look like interesting points on the curves.
OK, let's get back to this. This subthread is blocking us from writing a proposal for this project, so we should resolve it soon.
There is one very important performance factor that I can't figure out how to measure well, and that's the impact on the "individual user performance" if we switch to one guard.
That is, how the performance of the average user would change if we switch to one guard. But also how the performance of an unlucky user (one who picked the slowest/overloaded guard) or a lucky user would be altered if we switch to one guard.
This is a very important factor to consider since the unlucky user scenario is what forced us to think about imposing more strict bandwidth cutoffs for guards. This factor is also relevant in the case where we increase the guard bandwidth thresholds, so we should find a way to evaluate it.
Nick, do you have any smart ideas on how to measure this?
Tariq's paper does this in 'Figure 10': it has a CDF with the "expected circuit performance", where you can clearly see that the number of clients having a super slow circuit (< 100kB/s) with three guards is extremely low (~0%), but when you switch to one guard they are not so few anymore (5% of clients). I'm curious to learn how that CDF was created, for example I guess they only considered the performance impact of the guard on the circuit, and not of the rest of the nodes.
Is this kind of measurement adequate to decide whether an unlucky client will get good enough performance?
For example, should we assume that a guard of 100kB/s is equally performant to users in a network where one guard is used and in a network where three guards are used?
And if Tariq's 'Figure 10' approach is indeed good enough for us, how should we proceed? Are we satisfied by the performance of the one guard approach in 'Figure 10'? If not, we will probably need to increase the guard bandwidth cutoffs. Should we make more 'Figure 10' graphs with different bandwidth cutoffs for the case of 1 guard, and try to find the bandwidth cutoff that will resemble the current Tor network most?
On 05-Mar-14 5:19 PM, George Kadianakis wrote:
OK, let's get back to this. This subthread is blocking us from writing a proposal for this project, so we should resolve it soon.
There is one very important performance factor that I can't figure out how to measure well, and that's the impact on the "individual user performance" if we switch to one guard.
That is, how the performance of the average user would change if we switch to one guard. But also how the performance of an unlucky user (one who picked the slowest/overloaded guard) or a lucky user would be altered if we switch to one guard.
This is a very important factor to consider since the unlucky user scenario is what forced us to think about imposing more strict bandwidth cutoffs for guards. This factor is also relevant in the case where we increase the guard bandwidth thresholds, so we should find a way to evaluate it.
Nick, do you have any smart ideas on how to measure this?
Tariq's paper does this in 'Figure 10': it has a CDF with the "expected circuit performance", where you can clearly see that the number of clients having a super slow circuit (< 100kB/s) with three guards is extremely low (~0%), but when you switch to one guard they are not so few anymore (5% of clients). I'm curious to learn how that CDF was created, for example I guess they only considered the performance impact of the guard on the circuit, and not of the rest of the nodes.
We picked one guard each for a large number of clients and then made the CDF (Fig. 10) of all the clients' guard list BW. In our paper, we then assume the guard will be the bottleneck and the client will see at most this amount of bandwidth.
We could do a similar study and get CDFs for the middle node BW and exit BW. Comparing the curves we would see where the bottleneck actually is, i.e. the fattest left side of the curve. It may very well be that the middle nodes are slower on average.
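If it helps, here is roughly how that kind of CDF can be reproduced, assuming weight-proportional selection over the consensus 'w Bandwidth=' values and treating the chosen guard's bandwidth as the client's bottleneck (this is just a reading of the method, not the paper's actual code; the weights are toy values):

"""
import random

def sample_guard_bandwidths(weights, num_clients):
    # Each simulated client picks one guard with probability proportional to its weight.
    return [random.choices(weights, weights=weights)[0] for _ in range(num_clients)]

def empirical_cdf(samples, point):
    # Fraction of clients whose bottleneck guard bandwidth is <= point.
    return sum(1 for s in samples if s <= point) / float(len(samples))

weights = [20, 150, 400, 800, 3000, 4500, 7530, 12000]   # toy per-guard kB/s values
samples = sample_guard_bandwidths(weights, 100000)
print("fraction of clients below 100 kB/s:", empirical_cdf(samples, 100))
"""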
For example, should we assume that a guard of 100kB/s is equally performant to users in a network where one guard is used and in a network where three guards are used?
Too bad we don't know what people use Tor for or in what distribution of use cases. :) Then we could try to ensure all nodes could handle that or have a mix of nodes that on average gave adequate performance across most use cases.
And if Tariq's 'Figure 10' approach is indeed good enough for us, how should we proceed? Are we satisfied by the performance of the one guard approach in 'Figure 10'? If not, we will probably need to increase the guard bandwidth cutoffs. Should we make more 'Figure 10' graphs with different bandwidth cutoffs for the case of 1 guard, and try to find the bandwidth cutoff that will resemble the current Tor network most?
Squinting at the Fig. 10 a bit I think that 1500 (consensus BW units) might be a good place to start.
Cheers, Tariq
Tariq Elahi tariq.elahi@uwaterloo.ca writes:
On 05-Mar-14 5:19 PM, George Kadianakis wrote:
OK, let's get back to this. This subthread is blocking us from writing a proposal for this project, so we should resolve it soon.
There is one very important performance factor that I can't figure out how to measure well, and that's the impact on the "individual user performance" if we switch to one guard.
That is, how the performance of the average user would change if we switch to one guard. But also how the performance of an unlucky user (one who picked the slowest/overloaded guard) or a lucky user would be altered if we switch to one guard.
This is a very important factor to consider since the unlucky user scenario is what forced us to think about imposing more strict bandwidth cutoffs for guards. This factor is also relevant in the case where we increase the guard bandwidth thresholds, so we should find a way to evaluate it.
Nick, do you have any smart ideas on how to measure this?
Tariq's paper does this in 'Figure 10': it has a CDF with the "expected circuit performance", where you can clearly see that the number of clients having a super slow circuit (< 100kB/s) with three guards is extremely low (~0%), but when you switch to one guard they are not so few anymore (5% of clients). I'm curious to learn how that CDF was created, for example I guess they only considered the performance impact of the guard on the circuit, and not of the rest of the nodes.
We picked one guard each for a large number of clients and then made the CDF (Fig. 10) of all the clients' guard list BW. In our paper, we then assume the guard will be the bottleneck and the client will see at most this amount of bandwidth.
We could do a similar study and get CDFs for the middle node BW and exit BW. Comparing the curves we would see where the bottleneck actually is, i.e. the fattest left side of the curve. It may very well be that the middle nodes are slower on average.
For example, should we assume that a guard of 100kB/s is equally performant to users in a network where one guard is used and in a network where three guards are used?
Too bad we don't know what people use Tor for or in what distribution of use cases. :) Then we could try to ensure all nodes could handle that or have a mix of nodes that on average gave adequate performance across most use cases.
I'm also wondering whether 'Figure 10' is a good way to understand the implications on individual client performance, mainly because of the load balancing that happens using bandwidth weights.
For example, is it an issue (from a performance PoV) if there is a 4*10^-9 probability for each client to use a super slow guard (with bandwidth 20kB/s)? This is the slowest guard we have and hence has the lowest guard probability.
Is that worse than the fact that currently nearly 1.5% of all clients go to a _single_ guard node (which is quite fast: bandwidth 341000kB/s)? This is the fastest guard we have and hence has the highest guard probability.
So with the above examples, and assuming that we have 500k Tor clients picking a guard node at the same time, the slowest node will get an expected number of 0.002 clients, whereas the fastest node will get an expected number of 7500 clients.
Continuing with even more assumptions, if we assume that the fastest guard splits its bandwidth evenly among all its clients, each client will get 341000/7500 ≈ 45 kB/s. That's not too much better than the 20kB/s guard node...
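Spelling out the back-of-the-envelope arithmetic above (the 500k clients figure and the two example guards are the same assumptions as in the text):

"""
num_clients = 500000

slow_prob, slow_bw = 4e-9, 20        # slowest guard: probability ~4*10^-9, 20 kB/s
fast_prob, fast_bw = 0.015, 341000   # fastest guard: probability ~1.5%, 341000 kB/s

slow_clients = slow_prob * num_clients   # ~0.002 expected clients
fast_clients = fast_prob * num_clients   # ~7500 expected clients

print("expected clients on slowest guard: %.3f" % slow_clients)
print("expected clients on fastest guard: %.0f" % fast_clients)
print("per-client share of fastest guard: %.0f kB/s" % (fast_bw / fast_clients))
"""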