I had a conversation with a vendor yesterday. They are interested in including Tor as their "private browsing mode" and basically shipping a re-branded tor browser which lets people toggle the connectivity to the Tor network on and off.
They very much like Tor Browser and would like to ship it to their customer base. Their product accounts for 10-20% of the global market, out of roughly 2.8 billion global Internet users.
As Tor Browser is open source, they are already working on it. However, their concern is scaling up to handle some percentage of global users with "tor mode" enabled. They're willing to entertain offering their resources to help us solve the scalability challenges of handling hundreds of millions of users and relays on Tor.
As this question keeps popping up from the business world, which is looking at privacy as the next "must have" feature in their products, I'm trying to compile a list of tasks to solve to help us scale. The old 2008 three-year roadmap looks at performance, https://www.torproject.org/press/2008-12-19-roadmap-press-release.html.en
I've been through the specs, https://gitweb.torproject.org/torspec.git/tree/HEAD:/proposals to see if there are proposals for scaling the network or directory authorities. I didn't see anything directly related.
The last research papers I see directly addressing scalability are Torsk (http://www.freehaven.net/anonbib/bibtex.html#ccs09-torsk) and PIR-Tor (http://www.freehaven.net/anonbib/bibtex.html#usenix11-pirtor)
Is there a better list available for someone new to Tor to read up on the scalability challenges?
On 9/26/14, 4:58 PM, Andrew Lewman wrote:
They very much like Tor Browser and would like to ship it to their customer base. Their product accounts for 10-20% of the global market, out of roughly 2.8 billion global Internet users.
..... WOW! .....
Is there a better list available for someone new to Tor to read up on the scalability challenges?
As a basic concept, I don't think that Tor could scale up to huge numbers without making end users an active part of the network routing.
Fabio Pietrosanti (naif):
On 9/26/14, 4:58 PM, Andrew Lewman wrote:
They very much like Tor Browser and would like to ship it to their customer base. Their product accounts for 10-20% of the global market, out of roughly 2.8 billion global Internet users.
..... WOW! .....
Presumably this vendor would not have all of their users using Tor all the time, with no other browser option.
Is there a better list available for someone new to Tor to read up on the scalability challenges?
As a basic concept, I don't think that Tor could scale up to huge numbers without making end users an active part of the network routing.
No, this would not work. We cannot make use of tons of junk relays in the current Tor network. See my other reply.
What would work is an optional mode where users only use the Private Browser when they need it. For example, if 5% of these 2.8 billion users decide to use the browser on a given day, that is something we can handle, if they helped pay for or add the network capacity for these users somehow. :)
We could also handle controlled rollouts to fractions of their userbase to test the waters, and slowly add high capacity nodes to the network to support these new users, to ensure we have the people ready to accept payment for running the servers, and maintain diversity.
On 9/27/14, 2:33 AM, Mike Perry wrote:
We could also handle controlled rollouts to fractions of their userbase to test the waters, and slowly add high capacity nodes to the network to support these new users, to ensure we have the people ready to accept payment for running the servers, and maintain diversity.
I read your very detailed estimates and improvement paths; I love it!
However, I see that the main suggestion to increase the "network capacity" can be simplified as follows:
- improve big nodes' ability to push even more traffic
- add more big nodes
Other improvements are to reduce the "consensus size" and "directory load", but those are not specifically about network capacity.
While this is the obvious way to "add more capacity", I feel it's going to have impacts such as:
1) reducing the "diversity" (and thus the anonymity, because a few players will handle most of the network's traffic)
2) making it "irrelevant" for anyone to run their own small/volunteer relay
That sounds like the "easier way" to scale up in a defined amount of time and with a defined budget, but IMHO also one with consequences and pre-defined limits.
I feel that the only way to scale up without limits and consequences is to have end users become "active elements" of the network, for which we have success stories such as Skype.
End-users have important network resources available that can be estimated and used (with care).
Not all end users are equal: I'm now on a 2M Hyperlan line (damn digital divide!), but someone else in Stockholm or San Francisco is on a 1000M/100M fiber connection at home (not in a datacenter), and when in Milan I have 100M/10M fiber!
Those bandwidth resources are amazing, usually quite cheap (home broadband lines), and widely available in end users' hands.
IMHO those widely available, cheap, very diverse/sparse bandwidth resources could help the Tor network scale up.
How to use it properly within/for the Tor network? That's a different topic.
But those big bandwidth resources are there, available under our feet, in our homes, and we're not using them!
-naif
Fabio Pietrosanti (naif):
On 9/27/14, 2:33 AM, Mike Perry wrote:
We could also handle controlled rollouts to fractions of their userbase to test the waters, and slowly add high capacity nodes to the network to support these new users, to ensure we have the people ready to accept payment for running the servers, and maintain diversity.
I read your very detailed estimates and improvement paths; I love it!
However, I see that the main suggestion to increase the "network capacity" can be simplified as follows:
- improve big nodes' ability to push even more traffic
- add more big nodes
Other improvements are to reduce the "consensus size" and "directory load", but those are not specifically about network capacity.
While this is the obvious way to "add more capacity", I feel it's going to have impacts such as:
1) reducing the "diversity" (and thus the anonymity, because a few players will handle most of the network's traffic)
2) making it "irrelevant" for anyone to run their own small/volunteer relay
That sounds like the "easier way" to scale up in a defined amount of time and with a defined budget, but IMHO also one with consequences and pre-defined limits.
I feel that the only way to scale up without limits and consequences is to have end users become "active elements" of the network, for which we have success stories such as Skype.
End-users have important network resources available that can be estimated and used (with care).
Not all end users are equal: I'm now on a 2M Hyperlan line (damn digital divide!), but someone else in Stockholm or San Francisco is on a 1000M/100M fiber connection at home (not in a datacenter), and when in Milan I have 100M/10M fiber!
Those bandwidth resources are amazing, usually quite cheap (home broadband lines), and widely available in end users' hands.
IMHO those widely available, cheap, very diverse/sparse bandwidth resources could help the Tor network scale up.
How to use it properly within/for the Tor network? That's a different topic.
It's the same topic: I'm arguing that we want to use the 100M fiber connection, and maybe the 10M connection, but definitely not the ADSL link with only 256kbit upstream. The latter costs more bytes to tell clients about than it contributes to the network.
We can cut these ADSL relays from the network and turn them into bridges using the bandwidth authorities. Or have the default relay mode be to start as a bridge and get promoted to a relay once you are measured.
As for diversity, we can better achieve diversity through proper network allocation based on the current node selection algorithms and load balancing, so we actually know that our desired percentage of traffic is going through the geography/jurisdictions/organizations we want.
Keeping thousands of junk nodes that only carry a tiny fraction of the Tor network capacity just so we can pretend we have diversity is no solution. It's wishful ostrich thinking.
Slow/junk home nodes also have worse mix properties than fast nodes, due to less concurrent traffic running through them. They are thus more useful to surveil externally for correlation, and probably also easier to compromise.
I think one of the important thoughts here, at least as an exit operator, is that having a large group like that can significantly influence how Tor is seen, and I am sure that kind of backing could open many avenues for us.
If we are to scale up, we can reduce (or optimise) CPU load per node, or we can have more ISPs who welcome Tor. I think that whilst there are lots of great people doing fantastic work expanding the set of ISPs that accept Tor, we perhaps need to revisit those ISPs that have shown themselves to be hostile towards us. Do you believe this person or group could assist in persuading ISPs to open up to Tor exit operators like myself?
If they are willing to offer their name as backing, I'd be more than happy to dedicate many hours per week to getting in touch with ISPs and trying to change their policies. If we see much success, I can easily co-ordinate a revamp of the good/bad ISP list, which has become a bit messy over the last few months. Given the sheer volume of traffic my exits have pushed (petabytes a month), the number of abuse complaints I've had, and even police raids, I am quite comfortable giving ISPs the honest picture. Some won't open up even with major backing, but I am sure we can convince some to change their policies when they see and hear from actual operators who aren't in prison (since many of them seem to equate running Tor exits with being a criminal or a guaranteed way to get in trouble with the police).
-T
On 26/09/2014 15:58, Andrew Lewman wrote:
I had a conversation with a vendor yesterday. They are interested in including Tor as their "private browsing mode" and basically shipping a re-branded tor browser which lets people toggle the connectivity to the Tor network on and off.
They very much like Tor Browser and would like to ship it to their customer base. Their product accounts for 10-20% of the global market, out of roughly 2.8 billion global Internet users.
As Tor Browser is open source, they are already working on it. However, their concern is scaling up to handle some percentage of global users with "tor mode" enabled. They're willing to entertain offering their resources to help us solve the scalability challenges of handling hundreds of millions of users and relays on Tor.
As this question keeps popping up from the business world, which is looking at privacy as the next "must have" feature in their products, I'm trying to compile a list of tasks to solve to help us scale. The old 2008 three-year roadmap looks at performance, https://www.torproject.org/press/2008-12-19-roadmap-press-release.html.en
I've been through the specs, https://gitweb.torproject.org/torspec.git/tree/HEAD:/proposals to see if there are proposals for scaling the network or directory authorities. I didn't see anything directly related.
The last research papers I see directly addressing scalability are Torsk (http://www.freehaven.net/anonbib/bibtex.html#ccs09-torsk) and PIR-Tor (http://www.freehaven.net/anonbib/bibtex.html#usenix11-pirtor)
Is there a better list available for someone new to Tor to read up on the scalability challenges?
Andrew Lewman:
I had a conversation with a vendor yesterday. They are interested in including Tor as their "private browsing mode" and basically shipping a re-branded tor browser which lets people toggle the connectivity to the Tor network on and off.
They very much like Tor Browser and would like to ship it to their customer base. Their product accounts for 10-20% of the global market, out of roughly 2.8 billion global Internet users.
As Tor Browser is open source, they are already working on it. However, their concern is scaling up to handle some percentage of global users with "tor mode" enabled. They're willing to entertain offering their resources to help us solve the scalability challenges of handling hundreds of millions of users and relays on Tor.
As this question keeps popping up from the business world, which is looking at privacy as the next "must have" feature in their products, I'm trying to compile a list of tasks to solve to help us scale. The old 2008 three-year roadmap looks at performance, https://www.torproject.org/press/2008-12-19-roadmap-press-release.html.en
I've been through the specs, https://gitweb.torproject.org/torspec.git/tree/HEAD:/proposals to see if there are proposals for scaling the network or directory authorities. I didn't see anything directly related.
The last research papers I see directly addressing scalability are Torsk (http://www.freehaven.net/anonbib/bibtex.html#ccs09-torsk) and PIR-Tor (http://www.freehaven.net/anonbib/bibtex.html#usenix11-pirtor)
These research papers basically propose a total network overhaul to deal with the problem of Tor relay directory traffic overwhelming the Tor network and/or Tor clients.
However, I believe that with only minor modifications, the current Tor network architecture could support 100M daily directly connecting users, assuming we focus our efforts on higher capacity relays and not simply adding tons of slower relays.
The core problem is that the fraction of network capacity that you spend telling users about the current relays in the network can be written as:
f = D*U/B
D is current Tor relay directory size in bytes per day, U is number of users, and B is the bandwidth per day in bytes provided by this Tor network. Of course, this is a simplification, because of multiple directory fetches per day and partially-connecting/idle clients, but for purposes of discussion it is good enough.
To put some real numbers on this, if you compare https://metrics.torproject.org/bandwidth.html#dirbytes with https://metrics.torproject.org/bandwidth.html#bandwidth, you can see that we're currently devoting about 2% of our network throughput to directory activity (~120MiB/sec out of ~5000MiB/sec). So we're not exactly hurting at this point in terms of our directory bytes per user yet.
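To make the arithmetic concrete, here is a small back-of-the-envelope sketch in Python. The throughput figures are the approximate metrics values cited above, and the 2.5M daily user count is an assumed round number taken from the metrics user graph:

    MIB = 1024 * 1024
    SECONDS_PER_DAY = 86400

    # Approximate figures from the metrics graphs linked above.
    dir_bytes_per_day = 120 * MIB * SECONDS_PER_DAY      # ~120 MiB/s of directory traffic
    total_bytes_per_day = 5000 * MIB * SECONDS_PER_DAY   # ~5000 MiB/s of network throughput
    users = 2_500_000                                    # assumed ~2.5M daily users

    # Directory overhead, computed directly from the totals...
    print("overhead: %.1f%%" % (100.0 * dir_bytes_per_day / total_bytes_per_day))

    # ...and via f = D*U/B, with D taken as directory bytes per user per day.
    D = dir_bytes_per_day / users
    f = D * users / total_bytes_per_day
    print("f = D*U/B = %.1f%%" % (100.0 * f))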
But, because this fraction rises with both D and U, these research papers rightly point out that you can't keep adding relays *and* users and expect Tor to scale.
However, when you look at this f=D*U/B formula, what it also says is that if you can reduce the relay directory size by a factor c, and also grow the network capacity by this same factor c, then you can multiply the userbase by c, and have the same fraction of directory bytes.
This means that rather than trying to undertake a major network overhaul like TorSK or PIR-Tor to try to support hundreds of thousands of slow junky relays, we can scale the network by focusing on improving the situation for high capacity relay operators, so we can provide more network bandwidth for the same number of directory bytes per user.
So, let's look at ways to reduce the size of the Tor relay directory, and each way we can find to do so means a corresponding increase in the number of users we can support:
1. Proper multicore support.
Right now, any relay with more than ~100Mbit of capacity really needs to run an additional tor relay instance on that link to make use of it. If they have AES-NI, this might go up to 300Mbit.
Each of these additional instances is basically wasted directory bytes for those relay descriptors.
But with proper multicore support, such high capacity relays could run only one relay instance on links as fast as 2.5Gbit (assuming an 8 core AES-NI machine).
Result: 2-8X reduction in consensus and directory size, depending on the number of high capacity relays on multicore systems we have.
2. Cut off relays below the median capacity, and turn them into bridges.
Relays in the top 10% of the network are 164 times faster than relays in the 50-60% range, 1400 times faster than relays in the 70-80% range, and 35000 times faster than relays in the 90-100% range.
In fact, many relays are so slow that they provide less bytes to the network than it costs to tell all of our users about them. There should be a sweet spot where we can set this cutoff such that the overhead from directory activity balances the loss of capacity from these relays, as a function of userbase size.
Result: ~2X reduction in consensus and directory size.
3. Switching to ECC keys only.
We're wasting a lot of directory traffic on uncompressible RSA1024 keys, which are 4X larger than ECC keys, and less secure. Right now, we're also listing both. When we finally remove RSA1024 entirely, the directory should get quite a bit smaller.
Result: ~2-4X reduction in consensus and directory size.
4. Consensus diffs.
With proposal 140, we can save 60% of the directory activity if we send diffs of the consensus for regularly connecting clients. Calculating the benefit from this is complicated, since if clients leave the network for just 16 hours, there is very little benefit to this optimization. These numbers are highly dependent on churn though, and it may be that by removing most of the slow junk relays, there is actually less churn in the network, and smaller diffs: https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/140-consensus...
Let's just ballpark it at 50% for the typical case.
Result: 2X reduction in directory size.
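As a rough illustration of the consensus-diff idea (the 60% figure above comes from proposal 140, not from this sketch), one could compare the compressed size of a full consensus against a compressed unified diff between two consecutive consensuses. The file names below are placeholders:

    import difflib
    import gzip

    # Compressed length as a stand-in for bytes on the wire.
    def gz_len(text):
        return len(gzip.compress(text.encode(), compresslevel=9))

    # Placeholder file names: two consecutively published consensus documents.
    old = open("consensus-2014-09-28-01-00-00").read()
    new = open("consensus-2014-09-28-02-00-00").read()

    diff = "".join(difflib.unified_diff(old.splitlines(keepends=True),
                                        new.splitlines(keepends=True)))

    print("full consensus, compressed:", gz_len(new))
    print("hourly diff, compressed:   ", gz_len(diff))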
5. Invest in the Tor network.
Based purely on extrapolating from the Noisebridge relays, we could add ~300 relays, and double the network capacity for $3M/yr, or about $1 per user per year (based on the user counts from: https://metrics.torproject.org/users.html).
Note that this value should be treated as a minimum estimate. We actually want to ensure diversity as we grow the network, which may make this number higher. I am working on better estimates using replies from: https://lists.torproject.org/pipermail/tor-relays/2014-September/005335.html
Automated donation/funding distribution mechanisms such as https://www.oniontip.com/ are especially interesting ways to do this (and can even automatically enforce our diversity goals) but more traditional partnerships are also possible.
Result: 100% capacity increase for each O($3M/yr), or ~$1 per new user per year.
So if we chain #1-4 all together, using the low estimates, we should be able to reduce directory size by 2*2*2*2 = 16X; to be conservative, call it 8X.
Going back to the f=D*U/B formula, this means that we should be able to add capacity to support 8X more users, while *still* maintaining our 2% directory overhead percentage. This would be 2.5M users * 8, or 20M directly connecting users.
If we were willing to tolerate 10% directory overhead this would allow for 5 times as many users. In other words, 100M daily connecting users.
We would still need to find some way to fund the growth of the network to support this 40X increase, but there are no actual *technical* reasons why it cannot be done.
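Spelling out the chained estimate (a sketch; the 2.5M baseline is the approximate metrics user count):

    current_users = 2_500_000   # ~2.5M directly connecting daily users today
    directory_shrink = 8        # conservative combined reduction from items 1-4 above
    overhead_factor = 5         # tolerating 10% rather than 2% directory overhead

    supported_users = current_users * directory_shrink * overhead_factor
    print(supported_users)                    # 100000000, i.e. 100M daily users
    print(supported_users // current_users)   # 40, the required growth in network capacity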
On 27 Sep 2014, at 02:18, Mike Perry mikeperry@torproject.org wrote:
If we were willing to tolerate 10% directory overhead this would allow for 5 times as many users. In other words, 100M daily connecting users.
We would still need to find some way to fund the growth of the network to support this 40X increase, but there are no actual *technical* reasons why it cannot be done.
One thing there isn't an automatic answer to is whether our current users and our would-be users differ in their usage patterns. Currently, my intuition would be that most of our users are responsible for a relatively small amount of traffic, whereas we have some users who pull a lot of data. I wonder what happens with Netflix, YouTube and other services. We might at least want to try and figure this into the equation by estimating our average daily bytes sent/received per user, and comparing that to the bytes sent/received by our target group.
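As a purely illustrative starting point (the throughput and user figures are the rough metrics values mentioned earlier in the thread):

    MIB = 1024 * 1024
    network_throughput = 5000 * MIB   # ~5000 MiB/s, application plus directory traffic
    daily_users = 2_500_000           # ~2.5M directly connecting users

    bytes_per_user_per_day = network_throughput * 86400 / daily_users
    print("%.0f MiB per user per day" % (bytes_per_user_per_day / MIB))   # ~173 MiB

A comparable per-user figure for the vendor's userbase would tell us how far our current usage profile is from theirs.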
Cheers Sebastian
Sebastian Hahn:
On 27 Sep 2014, at 02:18, Mike Perry mikeperry@torproject.org wrote:
If we were willing to tolerate 10% directory overhead this would allow for 5 times as many users. In other words, 100M daily connecting users.
We would still need to find some way to fund the growth of the network to support this 40X increase, but there are no actual *technical* reasons why it cannot be done.
One thing there isn't an automatic answer to is whether our current users and our would-be users differ in their usage patterns. Currently, my intuition would be that most of our users are responsible for a relatively small amount of traffic, whereas we have some users who pull a lot of data. I wonder what happens with Netflix, YouTube and other services. We might at least want to try and figure this into the equation by estimating our average daily bytes sent/received per user, and comparing that to the bytes sent/received by our target group.
This is a great point, Sebastian. Some of the browser vendors do this sort of analytics, and other parties have also made some inferences independently.
In particular, these two studies are old, and may not be accurate anymore, but they say that between 5-15% of each browser's userbase used Private Browsing Mode, depending on the browser (pg 9): https://www.usenix.org/legacy/event/sec10/tech/full_papers/Aggarwal.pdf
And moreover, that the typical usage duration for Private Browsing Mode was 4-22 minutes, with 10 minutes being most common: https://blog.mozilla.org/metrics/2010/08/23/understanding-private-browsing/
Refreshed studies of this type of data would be very helpful for determining what size rollout we could handle, and if we'd want to further limit it by making the Tor mode opt-in, not prominent, or otherwise less likely to be selected by most users, at least for the initial versions.
We could also capture more detailed usage frequency analytics and bytecount statistics in an alpha or other trial version.
Mike Perry:
Andrew Lewman:
I had a conversation with a vendor yesterday. They are interested in including Tor as their "private browsing mode" and basically shipping a re-branded tor browser which lets people toggle the connectivity to the Tor network on and off.
They very much like Tor Browser and would like to ship it to their customer base. Their product accounts for 10-20% of the global market, out of roughly 2.8 billion global Internet users.
The core problem is that the fraction of network capacity that you spend telling users about the current relays in the network can be written as:
f = D*U/B
D is current Tor relay directory size in bytes per day, U is number of users, and B is the bandwidth per day in bytes provided by this Tor network. Of course, this is a simplification, because of multiple directory fetches per day and partially-connecting/idle clients, but for purposes of discussion it is good enough.
To put some real numbers on this, if you compare https://metrics.torproject.org/bandwidth.html#dirbytes with https://metrics.torproject.org/bandwidth.html#bandwidth, you can see that we're currently devoting about 2% of our network throughput to directory activity (~120MiB/sec out of ~5000MiB/sec). So we're not exactly hurting at this point in terms of our directory bytes per user yet.
But, because this fraction rises with both D and U, these research papers rightly point out that you can't keep adding relays *and* users and expect Tor to scale.
However, when you look at this f=D*U/B formula, what it also says is that if you can reduce the relay directory size by a factor c, and also grow the network capacity by this same factor c, then you can multiply the userbase by c, and have the same fraction of directory bytes.
Actually, there's an obvious and basic dimensional analysis failure here in this paragraph. Rather embarrassing.
What I meant to say is that if you increase U and B by the same factor c but keep the overall directory size fixed, you get the same f ratio.
That's basically what I'm arguing: we can increase the capacity of the network by reducing directory waste while adding more high capacity relays to replace that waste, so that the overall directory stays the same size but with more capacity.
It's possible to formalize all of this more precisely, too. I will think about it when I start working on more detailed cost estimations using data from the tor-relays thread. I think that data is the real key to figuring out what we actually need to make this work (in terms of funding and/or new relays).
On 26 September 2014 22:28, Mike Perry mikeperry@torproject.org wrote:
That's basically what I'm arguing: we can increase the capacity of the network by reducing directory waste while adding more high capacity relays to replace that waste, so that the overall directory stays the same size but with more capacity.
I'm sure that diffs will make a huge difference, but if you're focusing on the directory documents, why not also change the consensus and related document formats to be something more efficient than ASCII text? Taking the latest consensus and doing some rough estimates, I found the following:
Original consensus, xz-ed: 407K
Change flags to uint16: ~399K
+Removing names: 363K
+Compressing IPv6 to 16 bytes + 4 bytes: 360K
+Compressing IPv4 to 4 bytes + 4 bytes + 4 bytes: 315K
+Compressing the datetime to 4 bytes: 291K
+Compressing the version string to 4 bytes: 288K
+Replacing reject 1-65K with a single byte: 287K
+Replacing Bandwidth=# with a 4-byte value: 273K
These numbers are optimistic - you won't see quite this much gain, but if I'm understanding you correctly that the consensus is painful, it seems like you could save at least 50K-70K out of 400K with relatively straightforward changes.
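To make the kind of encoding concrete, here is a hypothetical sketch that packs one consensus router entry into a fixed-size binary record. The field layout is purely illustrative and is not an actual Tor consensus format:

    import socket
    import struct

    def pack_router_entry(fingerprint_hex, ipv4, or_port, dir_port,
                          flags_bitmask, bandwidth, published, version):
        # Pack one router entry into a fixed-size binary record (illustrative only).
        return struct.pack(
            "!20s4sHHHIIBBBB",
            bytes.fromhex(fingerprint_hex),   # 20-byte identity fingerprint
            socket.inet_aton(ipv4),           # IPv4 address as 4 bytes
            or_port, dir_port,                # 2 bytes each
            flags_bitmask,                    # flags packed into a uint16 bitmask
            bandwidth,                        # bandwidth weight as a 4-byte integer
            published,                        # publication time as a 4-byte Unix timestamp
            *version,                         # version as 4 bytes, e.g. (0, 2, 5, 8)
        )

    entry = pack_router_entry("A" * 40, "203.0.113.7", 9001, 9030,
                              0b0000000010110101, 5120, 1411948800, (0, 2, 5, 8))
    print(len(entry))   # 42 bytes per relay, before any compression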
-tom
On 28 Sep 2014, at 02:12, Tom Ritter tom@ritter.vg wrote:
why not also change the consensus and related document formats to be something more efficient than ASCII text? Taking the latest consensus and doing some rough estimates, I found the following:
Original consensus, xz-ed: 407K
Change flags to uint16: ~399K
+Removing names: 363K
+Compressing IPv6 to 16 bytes + 4 bytes: 360K
+Compressing IPv4 to 4 bytes + 4 bytes + 4 bytes: 315K
+Compressing the datetime to 4 bytes: 291K
+Compressing the version string to 4 bytes: 288K
+Replacing reject 1-65K with a single byte: 287K
+Replacing Bandwidth=# with a 4-byte value: 273K
These numbers are optimistic - you won't see quite this much gain, but if I'm understanding you correctly that the consensus is painful, it seems like you could save at least 50K-70K out of 400K with relatively straightforward changes.
This analysis doesn't make much sense, I'm afraid. We use compression on the wire, so repeating flags as human-readable strings has a much lower overhead than you estimate, for example. Re-doing your estimates with actually compressed consensuses might make sense, but probably you'll see a lot less value.
Cheers Sebastian
On 28 September 2014 07:00, Sebastian Hahn sebastian@torproject.org wrote:
This analysis doesn't make much sense, I'm afraid. We use compression on the wire, so repeating flags as human-readable strings has a much lower overhead than you estimate, for example. Re-doing your estimates with actually compressed consensuses might make sense, but probably you'll see a lot less value.
All of those numbers were after compressing the consensus document using xz, which is the best compression method I know.
-tom
On 28 Sep 2014, at 16:33, Tom Ritter tom@ritter.vg wrote:
On 28 September 2014 07:00, Sebastian Hahn sebastian@torproject.org wrote:
This analysis doesn't make much sense, I'm afraid. We use compression on the wire, so repeating flags as human-readable strings has a much lower overhead than you estimate, for example. Re-doing your estimates with actually compressed consensuses might make sense, but probably you'll see a lot less value.
All of those numbers were after compressing the consensus document using xz, which is the best compression method I know.
Sorry, I completely missed that you showed compressed numbers. Interesting results!
There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using BitTorrent.
Ryan Carboni transcribed 1.1K bytes:
There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using BitTorrent.
I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay:
https://globe.torproject.org/#/relay/26F6A7570E0F655DFDD054E79ACBB127112C2D7... https://globe.torproject.org/#/relay/3BC76060924A1B8DE69DF57C27E6C1EBB95ABE5...
On Mon, Sep 29, 2014 at 4:36 PM, isis isis@torproject.org wrote:
Ryan Carboni transcribed 1.1K bytes:
There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using BitTorrent.
I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay:
You're missing the point. It would be trivial for a multibillion dollar organization to sybil attack Tor if you add excessive restrictions.
It would be trivial for a multimillion dollar organisation to sybil attack Tor even as it stands right now.
On 30/09/2014 01:12, Ryan Carboni wrote:
On Mon, Sep 29, 2014 at 4:36 PM, isis isis@torproject.org wrote:
Ryan Carboni transcribed 1.1K bytes:
There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using BitTorrent.
I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay:
You're missing the point. It would be trivial for a multibillion dollar organization to sybil attack Tor if you add excessive restrictions.
On Mon, Sep 29, 2014 at 7:12 PM, Ryan Carboni ryacko@gmail.com wrote:
You're missing the point. It would be trivial for a multibillion dollar organization to sybil attack Tor if you add excessive restrictions.
If you look at the numbers isis posted, all relays below the median contribute less than 3% of the overall bandwidth. I think cutting all of them off might be overkill, but even if you did, you would not make a Sybil attack appreciably harder.
- Nikita
If we got multithreading support, it would boost the bandwidth that is available, of that I'm sure. All my relays run at their CPU limit because I don't think wasting even more IPv4 addresses is great; these days you get more and more CPU cores, which does not scale linearly with the IPC increase.
E.g. if you have a server with 2x E5-2683 v3 and a 10 Gbit/s pipe, you would need at least 14 IPs to use most of the CPU. And every IP gets blacklisted; with that many Tor nodes in the same /24, maybe the entire /24. So most ISPs wouldn't be happy with that, and with the IPv4 shortage these days I'm also not very happy with that. We really shouldn't waste more IPv4 addresses than needed, and the only solutions are to raise the maximum number of Tor processes per IP from 2 to a higher number, move to IPv6, or get multithreading working.
Nikita Borisov schrieb:
On Mon, Sep 29, 2014 at 7:12 PM, Ryan Carboni ryacko@gmail.com wrote:
You're missing the point. It would be trivial for a multibillion dollar organization to sybil attack Tor if you add excessive restrictions.
If you look at the numbers isis posted, all relays below the median contribute less than 3% of the overall bandwidth. I think cutting all of them off might be overkill, but even if you did, you would not make a Sybil attack appreciably harder.
- Nikita
On 09/30/2014 06:28 AM, AFO-Admin wrote:
E.g. if you have a server with 2x E5-2683 v3 and a 10 Gbit/s pipe, you would need at least 14 IPs to use most of the CPU.
Multicore support is hard and needs developers. Raising the limit from 2 relays per IP to x per IP has been discussed in the past and would be an easy change.
Moritz Bartl transcribed 0.5K bytes:
Raising the limit from 2 relays per IP to x per IP has been discussed in the past and would be an easy change.
We *still* have that limit? I thought we killed it a long time ago.
Can we kill it now? It's not going to do anything to prevent Sybils, it'll only prevent good relay operators on larger machines from giving more bandwidth.
isis:
Moritz Bartl transcribed 0.5K bytes:
Raising the limit from 2 relays per IP to x per IP has been discussed in the past and would be an easy change.
We *still* have that limit? I thought we killed it a long time ago.
Can we kill it now? It's not going to do anything to prevent Sybils, it'll only prevent good relay operators on larger machines from giving more bandwidth.
If not kill it, raising it to 4-8 might make more sense and be a good interim step.
We could lower it again if we have real multicore support (which I still think we should aim for because it also improves the mix properties of fast relays to an external observer, but I realize is no small task).
I'd say that the idea to 'downgrade' people into being bridges is a good one, if done without requiring user input. 'Everyone run a relay' might only be useful because so many of the people we say it to have fast connections. It seems reasonable to filter out persistently low connections (and allow them back in if their connection speed improves). That is not to say that every potential bridge should actually be accepted as a bridge. The 28B/s bridge is nuts - either it's on an embedded device or their torrc is misconfigured.
What I usually recommend to users is based on their bandwidth and how frequently their IP changes. If their connection is fast and their IP never changes (e.g., a desktop or server), then run a non-exit relay [2]. For a laptop that moves to and from work, then a relay or bridge. If it moves a *lot*, use Cupcake (which is a wrapper for flashproxy). Running a relay on a raspi or a router (?!) is not a great idea -- though people attempt both. If things could gracefully switch from being a relay to a bridge based on their speed, then that would actually make it more straightforward for users because they don't have to worry about whether they should be a bridge or relay.
People can't really estimate their own bandwidth without something like NDT, but they have an idea of how fast it is. E.g., this connection is 21Mb/s up, 6Mb/s down, but that's mostly irrelevant because my perception of it is that it's Fast. That perception would be the same if I were getting 2Mb/s up/down. So maybe one non-technical change we can make is to user education and website documentation -- run a relay if you have a Fast connection.
Filtering people out based on advertised bandwidth is tricky - advertised bandwidth is only useful if it's based on reality. 250kb/s seems like a reasonable floor for both relays and bridges. 100kb/s is kind of the sanity check for a distributed bridge - if it's below that, it's not useful enough IMO.
The real questions for me are: how much of a gain is possible? and what is the right balance between number of relays and speed of those relays? and I suspect that until something is tried, it may just be speculation.
best, Griffin
[2] No one should be running an exit from home, and no one who is asking me about this at an event should be running an exit.
Griffin Boyce transcribed 2.7K bytes:
I'd say that the idea to 'downgrade' people into being bridges is a good one, if done without requiring user input. 'Everyone run a relay' might only be useful because so many of the people we say it to have fast connections. It seems reasonable to filter out persistently low connections (and allow them back in if their connection speed improves). That is not to say that every potential bridge should actually be accepted as a bridge. The 28B/s bridge is nuts - either it's on an embedded device or their torrc is misconfigured.
I do remember that the 28 B/s bridge had an Advertised Bandwidth which was a much higher number, but I don't recall precisely. (However, because there aren't currently any Bandwidth Authorities for bridge relays, bridges can easily lie about their bandwidth. Currently, we would have no way, once a slow/junk relay is downgraded to a bridge, to check its bandwidth.)
What I usually recommend to users is based on their bandwidth and how frequently their IP changes. If their connection is fast and their IP never changes (e.g., a desktop or server), then run a non-exit relay [2]. For a laptop that moves to and from work, then a relay or bridge.
Actually, anything with a constantly changing IP is a terrible idea for a bridge.
Think of it this way: BridgeDB hands you a bridge line on Monday. On Tuesday, the bridge's IP rotates to something else. Now your Tor doesn't work. Wah-wah, sucks for you!
Or how about this one: BridgeDB hands you a bridge line on Monday. On Monday night in whatever timezone the bridge's operator lives in, the bridge operator carries their laptop (and, hence, the bridge) home from work. Now your Tor doesn't work! On Tuesday morning, the bridge operator carries that laptop back to work. All of a sudden your Tor works again! Repeat! Now you're super confused, and you're likely to just complain that "Tor is janky/unreliable!"
If it moves a *lot*, use Cupcake (which is a wrapper for flashproxy).
Okay, running flashproxies on things which have constantly changing IPs *is* a good idea. But those are flashproxies, not real Tor bridges, and we should be careful not to confound them.
Running a relay on a raspi or a router (?!) is not a great idea -- though people attempt both. If things could gracefully switch from being a relay to a bridge based on their speed, then that would actually make it more straightforward for users because they don't have to worry about whether they should be a bridge or relay.
People can't really estimate their own bandwidth without something like NDT, but they have an idea of how fast it is. E.g., this connection is 21Mb/s up, 6Mb/s down, but that's mostly irrelevant because my perception of it is that it's Fast. That perception would be the same if I were getting 2Mb/s up/down. So maybe one non-technical change we can make is to user education and website documentation -- run a relay if you have a Fast connection.
Filtering people out based on advertised bandwidth is tricky - advertised bandwidth is only useful if it's based on reality.
Confusingly, the "Advertised Bandwidth" *is* actually reflective of reality: the Bandwidth Authorities determine it, and then (optionally) it is capped at whatever the relay set `MaxAdvertisedBandwidth` to in the relay's torrc.
250kb/s seems like a reasonable floor for both relays and bridges. 100kb/s is kind of the sanity check for a distributed bridge - if it's below that, it's not useful enough IMO.
You probably already know this, but for everyone else, see my note above about how bridge bandwidths currently aren't measured at all.
The real questions for me are: how much of a gain is possible?
Mike Perry gave some pretty concrete numbers.
and what is the right balance between number of relays and speed of those relays?
See my original reply to Mike's post, particularly the calculations underneath suggestion #2 where I came up with how we'd find a sliding "sweet spot" where the network is not degraded overall by advertising relays which are too slow to compensate for the cost of advertising them.
and I suspect that until something is tried, it may just be speculation.
I respectfully disagree. I'm pretty sure I just calculated and analysed some statistics on some pretty concrete numbers derived directly from current network usage and relays. :)
Mike Perry:
Invest in the Tor network.
Based purely on extrapolating from the Noisebridge relays, we could add ~300 relays, and double the network capacity for $3M/yr, or about $1 per user per year (based on the user counts from: https://metrics.torproject.org/users.html).
Note that this value should be treated as a minimum estimate. We actually want to ensure diversity as we grow the network, which may make this number higher. I am working on better estimates using replies from: https://lists.torproject.org/pipermail/tor-relays/2014-September/005335.html
Automated donation/funding distribution mechanisms such as https://www.oniontip.com/ are especially interesting ways to do this (and can even automatically enforce our diversity goals) but more traditional partnerships are also possible.
Result: 100% capacity increase for each O($3M/yr), or ~$1 per new user per year.
Naif's point about there being 100Mbit residential uplinks out there suggests that there may be a hybrid approach here.
If this vendor could detect super-high-speed client uplinks, they could ask only these users if they wanted to be non-exit relays. But this is complicated, as it also requires understanding if the user's ISP will get upset at the traffic consumption or the fact that a listening TCP service is running. For example, I know Comcast calls their residential service "unlimited", but yells at you if you transfer more than 250GB in a month, or if they discover any listening TCP ports on your IP address.
Even if we could figure these problems out by looking up ISP policy based on client IP address, I think we still need to fund exit relays. I don't think we can just enlist random home users' connections to be exits without giving them a wall of text explaining how to deal with issues that may arise.
So this may be something to consider to reduce network expenditure, but it won't completely eliminate it.
Mike Perry transcribed 9.3K bytes:
Andrew Lewman:
I had a conversation with a vendor yesterday. They are interested in including Tor as their "private browsing mode" and basically shipping a re-branded tor browser which lets people toggle the connectivity to the Tor network on and off.
They very much like Tor Browser and would like to ship it to their customer base. Their product accounts for 10-20% of the global market, out of roughly 2.8 billion global Internet users.
As Tor Browser is open source, they are already working on it. However, their concern is scaling up to handle some percentage of global users with "tor mode" enabled. They're willing to entertain offering their resources to help us solve the scalability challenges of handling hundreds of millions of users and relays on Tor.
As this question keeps popping up from the business world, which is looking at privacy as the next "must have" feature in their products, I'm trying to compile a list of tasks to solve to help us scale. The old 2008 three-year roadmap looks at performance, https://www.torproject.org/press/2008-12-19-roadmap-press-release.html.en
I've been through the specs, https://gitweb.torproject.org/torspec.git/tree/HEAD:/proposals to see if there are proposals for scaling the network or directory authorities. I didn't see anything directly related.
The last research papers I see directly addressing scalability are Torsk (http://www.freehaven.net/anonbib/bibtex.html#ccs09-torsk) and PIR-Tor (http://www.freehaven.net/anonbib/bibtex.html#usenix11-pirtor)
These research papers basically propose a total network overhaul to deal with the problem of Tor relay directory traffic overwhelming the Tor network and/or Tor clients.
However, I believe that with only minor modifications, the current Tor network architecture could support 100M daily directly connecting users, assuming we focus our efforts on higher capacity relays and not simply adding tons of slower relays.
The core problem is that the fraction of network capacity that you spend telling users about the current relays in the network can be written as:
f = D*U/B
D is current Tor relay directory size in bytes per day, U is number of users, and B is the bandwidth per day in bytes provided by this Tor network. Of course, this is a simplification, because of multiple directory fetches per day and partially-connecting/idle clients, but for purposes of discussion it is good enough.
To put some real numbers on this, if you compare https://metrics.torproject.org/bandwidth.html#dirbytes with https://metrics.torproject.org/bandwidth.html#bandwidth, you can see that we're currently devoting about 2% of our network throughput to directory activity (~120MiB/sec out of ~5000MiB/sec). So we're not exactly hurting at this point in terms of our directory bytes per user yet.
But, because this fraction rises with both D and U, these research papers rightly point out that you can't keep adding relays *and* users and expect Tor to scale.
However, when you look at this f=D*U/B formula, what it also says is that if you can reduce the relay directory size by a factor c, and also grow the network capacity by this same factor c, then you can multiply the userbase by c, and have the same fraction of directory bytes.
This means that rather than trying to undertake a major network overhaul like TorSK or PIR-Tor to try to support hundreds of thousands of slow junky relays, we can scale the network by focusing on improving the situation for high capacity relay operators, so we can provide more network bandwidth for the same number of directory bytes per user.
So, let's look at ways to reduce the size of the Tor relay directory, and each way we can find to do so means a corresponding increase in the number of users we can support:
Proper multicore support.
Right now, any relay with more than ~100Mbit of capacity really needs to run an additional tor relay instance on that link to make use of it. If they have AES-NI, this might go up to 300Mbit.
Each of these additional instances is basically wasted directory bytes for those relay descriptors.
But with proper multicore support, such high capacity relays could run only one relay instance on links as fast as 2.5Gbit (assuming an 8 core AES-NI machine).
Result: 2-8X reduction in consensus and directory size, depending on the number of high capacity relays on multicore systems we have.
Cut off relays below the median capacity, and turn them into bridges.
Relays in the top 10% of the network are 164 times faster than relays in the 50-60% range, 1400 times faster than relays in the 70-80% range, and 35000 times faster than relays in the 90-100% range.
In fact, many relays are so slow that they provide less bytes to the network than it costs to tell all of our users about them. There should be a sweet spot where we can set this cutoff such that the overhead from directory activity balances the loss of capacity from these relays, as a function of userbase size.
Result: ~2X reduction in consensus and directory size.
It's super frustrating when I publicly tell people that ― as much as we <3 them for running a relay ― doing so on a home connection, on wimpy hardware like Raspberry Pis, is likely only going to harm the Tor network. And then people point at "If you have at least 100 kilobytes/s each way, please help out Tor by configuring your Tor to be a relay" on our website [0] and stop listening to whatever other relay-running advice I have to give.
So... here's the background on the "sweet spot" Mike was talking about, and why he stated: "[...]many relays are so slow that they provide less bytes to the network than it cost to tell all of our users about them.":
Using Stem on my latest copy of the consensus to run some calculations on the relay advertised bandwidth (RAB), I get:
Average RAB: 3887.222911227154 KB/s
Median RAB: 249.5 KB/s
Combined RABs of all RABs < 249.5 KB/s: 162354 KB
Bandwidth used for directory requests [1]: ~125 MB/s
Current total bandwidth usage [2]: ~5700 MB/s
Meaning that, if we cut off all relays below the current median of 250KB/s, we lose 3064 relays, and lose 158 MB/s of network throughput.
Currently, 2.2% of our bandwidth usage goes toward directory requests (125MB/s / 5700MB/s). If we cut off the relays under 250 KB/s, we cut that 2.2% to 1.1%, saving roughly 75 MB/s in directory requests.
Overall, this means that we can halve the size of the current consensus and, rather than losing 158 MB/s, we only actually lose 83 MB/s in throughput. We could easily play with these numbers a bit, and find a "sweet spot" where the bandwidth cutoff rate is determined by whatever makes us net a positive change in overall bandwidth, taking directory requests into account. In other words: "If your relay costs us more to tell users about than the actual traffic it's providing, we don't want it!"
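For anyone who wants to reproduce a rough version of these numbers, here is a minimal Stem sketch. It assumes a locally cached consensus at the path shown, and it uses the consensus bandwidth weights rather than descriptor advertised bandwidths, so the exact figures will differ somewhat from the ones above:

    from statistics import median
    from stem.descriptor import parse_file

    CONSENSUS_PATH = "/var/lib/tor/cached-consensus"   # adjust for your system

    # Bandwidth weights from the consensus "w Bandwidth=" lines.
    weights = [r.bandwidth for r in parse_file(CONSENSUS_PATH)
               if r.bandwidth is not None]

    cutoff = median(weights)
    below = [w for w in weights if w < cutoff]

    print("relays:", len(weights), "median weight:", cutoff)
    print("relays below the median:", len(below))
    print("share of total capacity they carry: %.2f%%"
          % (100.0 * sum(below) / sum(weights)))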
Long term, I don't think we want to do "only 3000 relays are allowed at any given time", but instead, a compromise where:
2.a. Have a sliding definition of what a "real internet connection" is, by modifying the statistics above to find the "sweet spot", and set this as the cutoff rate for the required minimum bandwidth for being a relay.
2.b. The sliding minimum bandwidth for running a relay is *actually* enforced. If you're below the minimum, no one's going to stop you from running your relay, but it's not going to be in the consensus.
Result: Overall network bandwidth stays the same. The size of the current consensus is roughly chopped in half.
Also, BridgeDB doesn't want your slow relays as bridges. See Footnote [3].
Switching to ECC keys only.
We're wasting a lot of directory traffic on uncompressible RSA1024 keys, which are 4X larger than ECC keys, and less secure. Right now, we're also listing both. When we finally remove RSA1024 entirely, the directory should get quite a bit smaller.
Result: ~2-4X reduction in consensus and directory size.
I'm going to ignore microdescriptors for now, because I don't use them because they're a Bad Idea (see #5968). And I'm too lazy to go fetch some of them. :)
Mike, you said:
we're also listing both
Should we assume, then, that you're only talking about the `onion-key`s, but not the `signing-keys`s (which are also currently 1024-bit RSA)?
Also... removing `onion-key`s from the `@type server-descriptor`s would not result in a "~2-4X reduction in [...] directory size". (It might possibly for the cached-microdescriptors, but I'm still ignoring those.)
Taking for example a really small server-descriptor (I removed the contact line and did things like making the bandwidth numbers as small as possible), and one of the largest server descriptors I could find, then making copies of each of these descriptors without the `onion-key`s, and then compressing each one of the four files with `gzip -n -9 $FILE`, I got:
Small server-descriptor, with onion key, compressed: 905 B
Small server-descriptor, without onion key, compressed: 756 B
Large server-descriptor, with onion key, compressed: 1127 B
Large server-descriptor, without onion key, compressed: 980 B
Meaning that, without factoring in potential savings from gzipping multiple descriptors at a time, cutting out `onion-key`s would result in server-descriptors which are only 84% - 87% of the size. 13% savings isn't all that much.
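For reference, the comparison above can be reproduced with a few lines of Python; the descriptor file name is a placeholder:

    import gzip
    import re

    def gzipped_size(text):
        return len(gzip.compress(text.encode(), compresslevel=9))

    # Placeholder path: a single @type server-descriptor saved to a file.
    descriptor = open("example-server-descriptor").read()

    # Strip the onion-key line together with its PEM-encoded RSA key block.
    without_key = re.sub(
        r"onion-key\n-----BEGIN RSA PUBLIC KEY-----.*?-----END RSA PUBLIC KEY-----\n",
        "", descriptor, flags=re.DOTALL)

    print("with onion-key:   ", gzipped_size(descriptor), "bytes")
    print("without onion-key:", gzipped_size(without_key), "bytes")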
Plus, if you are proposing moving everything (including the `signing-key`s) to ECC, I'm not convinced yet that that is a good idea, especially if we're using only one curve. Putting all your eggs in one basket...
Consensus diffs.
With proposal 140, we can save 60% of the directory activity if we send diffs of the consensus for regularly connecting clients. Calculating the benefit from this is complicated, since if clients leave the network for just 16 hours, there is very little benefit to this optimization. These numbers are highly dependent on churn though, and it may be that by removing most of the slow junk relays, there is actually less churn in the network, and smaller diffs: https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/140-consensus...
Let's just ballpark it at 50% for the typical case.
Result: 2X reduction in directory size.
Not to mention that, by reducing the bytes used in directory fetches, consensus diffs also help by increasing the "sweet spot" in #2, and ergo raise the number of relays which the network can sustainably maintain.
Invest in the Tor network.
Based purely on extrapolating from the Noisebridge relays, we could add ~300 relays, and double the network capacity for $3M/yr, or about $1 per user per year (based on the user counts from: https://metrics.torproject.org/users.html).
Note that this value should be treated as a minimum estimate. We actually want to ensure diversity as we grow the network, which may make this number higher. I am working on better estimates using replies from: https://lists.torproject.org/pipermail/tor-relays/2014-September/005335.html
Automated donation/funding distribution mechanisms such as https://www.oniontip.com/ are especially interesting ways to do this (and can even automatically enforce our diversity goals) but more traditional partnerships are also possible.
Result: 100% capacity increase for each O($3M/yr), or ~$1 per new user per year.
isis transcribed 14K bytes:
[...]
Oopsie daisies. Forgot my footnotes and references! Voilà:
[0]: https://www.torproject.org/docs/tor-relay-debian.html.en [1]: https://metrics.torproject.org/bandwidth.html?graph=dirbytes&start=2014-... [2]: https://metrics.torproject.org/bandwidth.html?graph=bandwidth&start=2014...
[3]: Please, don't give all the shit relays to me as bridges. I think it's less important scalability-wise (right now) to have a strict cutoff rate for bridges, but eventually, when/if we ever have Bridge Bandwidth Authorities, BridgeDB should cut off anything below some completely arbitrary rate, like 100 KB/s. I've gotten a bridge (from http://bridges.torproject.org) which was 28 B/s. Yes, *bytes*. That thing was probably slowing down the rest of the Tor network just by *existing* via its molasses-speeds blocking the Exit from continuing the response after SENDME number of cells, which is probably eventually going to cause TCP timeouts on the Exit's side and a whole bunch of other messes.
30.09.2014, 01:12 isis:
isis [mash-up]
[3]: Please, don't give all the shit relays to me as bridges. I think it's less important scalability-wise (right now) to have a strict cutoff rate for bridges, but eventually, when/if we ever have Bridge Bandwidth Authorities, BridgeDB should cut off anything below some completely arbitrary rate, like 100 KB/s. I've gotten a bridge (from http://bridges.torproject.org) which was 28 B/s. Yes, *bytes*. That thing was probably slowing down the rest of the Tor network just by *existing* via its molasses-speeds blocking the Exit from continuing the response after SENDME number of cells, which is probably eventually going to cause TCP timeouts on the Exit's side and a whole bunch of other messes.
I think the Tor Project requires a high number of bridges to make collecting all of their addresses harder for some adversaries. I'm aware that adversaries can outrun the number of bridges. This might or might not be valid until you shut down all vanilla bridges.
Obviously, bridges that provide too little bandwidth should not take part in the network.
What I usually recommend to users is based on their bandwidth and how frequently their IP changes. If their connection is fast and their IP never changes (e.g., a desktop or server), then run a non-exit relay [2]. For a laptop that moves to and from work, then a relay or bridge.
Actually, anything with a constantly changing IP is a terrible idea for a bridge.
Think of it this way: BridgeDB hands you a bridge line on Monday. On Tuesday, the bridge's IP rotates to something else. Now your Tor doesn't work. Wah-wah, sucks for you!
The bridge being down is indeed a problem. I got the same recommendation -- running a bridge on my home network rather than a relay (or exit) -- from a person working for/on Tor. I passed that recommendation around, IIRC even on this list, and no one complained.
It is problematic to have this advice floating around and to have you, isis, only speaking up soooo late.
The changing IP part is beneficial for censorship circumvention. It would be broken if clients could learn the new IP address automatically.
Distribution of bridges just sucks for circumvention of censorship. (Not your work, isis.) That could be improved by handing out puzzles.
There's one issue: if you remove all the small relays, only relays run by the NSA will be around. Not many people have access to multi-megabit upload speeds. And those that do might also be using BitTorrent.
I'm quite certain that I'm definitely not the NSA, and I run a multi megabyte exit relay[.]
I don't think you are the Tor Project either, isis. You are working for the Tor Project. Well, depending on one's view, all employees are the Tor Project.
I'm not suggesting you would run those nodes for the NSA or that you work for them in any way.
Thank you all for your work in upscaling Tor.
Regards, Sebastian G. bastik
To avoid squashing the Tor network with all of these new clients, the company would almost certainly have to run some big relays to help compensate for the additional load. Another proposal would be some sort of incentive for running relays.
-V
Besides the fact that this could be a great opportunity for Tor in many ways, I see two problems we should consider:
- this vast growth would be "artificial". What happens to all the users and servers if they stop supporting the product or close down?
- IMHO it is a problem if the network expands because SOME users pay for it. Don't get me wrong: I know that many people are willing to spend their private money to run Tor relays ... but I know of no user who has to pay for it. (Except torplug, but that is a different story.)
It's possible that I'm getting something completely wrong, but just for the record!
-- M
M. Ziebell:
Besides the fact that this could be a great opportunity for Tor in many ways, I see two problems we should consider:
- this vast growth would be "artificial". What happens to all the users and servers if they stop supporting the product or close down?
In this case, presumably the users would disappear, as this vendor's browser product would no longer include Tor (and I would hope push out an update that removed the feature).
- IMHO it is a problem if the network expands because SOME users pay for it. Don't get me wrong: I know that many people are willing to spend their private money to run Tor relays ... but I know of no user who has to pay for it. (Except torplug, but that is a different story.)
Well, I think the model we are considering is that a vendor is monetizing their users some other way (perhaps because they purchased the phone that comes with Tor on it, or through advertising revenue during non-private mode).
This vendor would presumably then have an interest in contributing either relays, money, or both to make sure the Tor network is fast enough to be useful to their users.
I am worried that this particular vendor is thinking about just using our Tor Browser patches without talking to us on the engineering side, but I guess this is still in the initial stages of discussion.
Andrew Lewman transcribed 1.8K bytes:
The last research paper I see directly addressing scalability is Torsk (http://www.freehaven.net/anonbib/bibtex.html#ccs09-torsk) or PIR-Tor (http://www.freehaven.net/anonbib/bibtex.html#usenix11-pirtor)
Using Private Information Retrieval (PIR) to retrieve a 50 KB/s relay would likely make the current network [0] completely unusable.
_______________________________________________________________________________
First, the PIR-Tor paper makes a few naïve assumptions which grossly affect the analysis of the overhead for directory fetches:
1. The authors assume that a client's three [sic] Guard Nodes will be the directory mirrors for that client, accepting PIR queries for descriptors.
2. There is zero handling of cases where multiple directory mirrors have differing data.
3. There is zero handling of colluding directory mirrors.
4. The *increase* to descriptor size is not well factored into the analysis.
In the PIR-Tor paper, §4.2:

 | "Even if all three guard relays are compromised, they cannot actively
 | manipulate elements in the PIR database since they are signed by the
 | directory authorities[...]"

Single descriptors are *not* signed; the set of all descriptors is. Therefore, if a client wanted to actually check that all (or a majority) of the Directory Authorities had signed a descriptor, that client would need to either:
4.a. Download all of them and check. Which negates the whole point of this PIR thing.
4.b. All of the Directory Authorities would need to sign each and every descriptor by itself. This would result in the current microdescriptors, which are sized from roughly 900 bytes to some of the larger ones which are 2.5 KB, increasing in size by roughly 4 KB (that's the size of the DirAuth signatures on the current consensus).
Result: Each microdescriptor becomes roughly 4X larger (see the rough check below), and the Directory Authorities need to do several orders of magnitude more signatures.
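As a rough check on that factor, using only the sizes quoted above:

    # Rough check of the "roughly 4X larger" claim, using the sizes quoted above.
    small, large = 900, 2500   # current microdescriptor sizes, in bytes
    sig_overhead = 4 * 1024    # ~4 KB of per-descriptor DirAuth signatures

    for size in (small, large):
        total = size + sig_overhead
        print("%d -> %d bytes (%.1fx larger)" % (size, total, total / float(size)))
    # 900 -> 4996 bytes (5.6x larger)
    # 2500 -> 6596 bytes (2.6x larger)

So somewhere between roughly 2.5X and 5.5X depending on the descriptor; 4X is a fair middle figure.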
_______________________________________________________________________________
Second, because the PIR-Tor paper doesn't seem to describe what we would actually want to implement, the following is an analysis of a scheme closer to what we would want.
Using some safe assumptions about what security guarantees we would likely want to require from any PIR scheme we used... That is, we would likely want a 1-roundtrip (RT), b-private, b-Byzantine, k-out-of-m-databases [1] PIR scheme. (Meaning that there are m total directory mirrors, only k of those actually answer a set of your queries, and you want the privacy of your queries and the authenticity of the answers to your queries to be protected even if b of the directory mirrors are malicious and colluding with one another.)
Then, with those assumptions in mind, from Theorem 7.7 in §7.1 of this review on PIR schemes [2] (which is, admittedly, a bit dated) used in one of Dan Boneh's classes, we have a size complexity [3] of
O( (k/(3b)) * n^(1/[(k-1)/3]) * m * log_2 m )

where b < k/3 (Meaning that the number of malicious, colluding directory mirrors which answered is less than one third of the total number of mirrors which answered.)
>>> import math
>>> m = 3000   # total number of directory mirrors
>>> k = 30     # mirrors which answer a set of queries
>>> b = 4      # malicious, colluding mirrors (b < k/3)
>>> n = 160    # the number of bits in a relay fingerprint
>>> int(math.ceil((k / (3.0 * b)) * n ** (1.0 / ((k - 1) / 3.0)) * m * math.log(m, 2)))
146449
The above space complexity comes out to O(146449) bits per lookup. (Where, by "lookup", I mean "the set of all queries pertaining to fetching a single relay descriptor", since, in PIR, a "query" is usually for a single bit.) This would mean that adding any sane PIR scheme for directory requests would result in something like a 900X increase in the bytes used for each request.
While one of the benefits of PIR would be that clients would no longer need to request the descriptors for all relays listed in the consensus, this benefit actually doesn't help as much as it would seem, because switching to a PIR scheme would require a client to make three of these roughly O(146449)-bit requests, every ten minutes or so, when the client wants a new circuit.
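In more familiar units, using only the numbers above (three lookups per circuit, one new circuit every ten minutes):

    bits_per_lookup = 146449     # from the estimate above
    bytes_per_lookup = bits_per_lookup / 8.0
    lookups_per_circuit = 3
    circuits_per_hour = 6        # one new circuit roughly every ten minutes

    per_circuit = lookups_per_circuit * bytes_per_lookup
    per_hour = circuits_per_hour * per_circuit
    print("%.1f KB per circuit, %.1f KB per hour"
          % (per_circuit / 1024, per_hour / 1024))
    # 53.6 KB per circuit, 321.8 KB per hour (1 KB = 1024 bytes here)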
Don't get me wrong; PIR is awesome. But the current Tor network is likely *not* the correct real-world application of any of the current PIR schemes. At least, it isn't until we have some HUGE number of nodes in the network, which by #1 and #2 in my original reply to this thread, [4] we shouldn't.
[0]: Note, I said "the current network". An imaginary future Tor network which has 10,000,000 relays would be different. And then note the very last paragraph where I state that we probably shouldn't ever have 10,000,000 relays.
[1]: I assume that we need replication in Tor's use case. There are papers, such as the following:
Kushilevitz, Eyal, and Rafail Ostrovsky. "Replication is not needed: Single database, computationally-private information retrieval." Proceedings of the 38th Annual Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, 1997. http://www.cs.fiu.edu/~carbunar/teaching/cis5374/slides/pir-single.pdf
whose results don't apply here because they are aimed at computationally-hiding PIR schemes, and obviously Tor aims for information-theoretic security. Other than the PIR-Tor paper, I haven't found any literature which analyses PIR for anything close to Tor's use case. (Although I'd be stoked to hear about any such papers.)
[2]: Gasarch, William. "A survey on private information retrieval." Bulletin of the EATCS. University of Chicago, 2004. https://crypto.stanford.edu/~dabo/courses/cs355_fall07/pir.pdf
[3]: I'm completely ignoring the fact that I'm supposed to do polynomial interpolation for this complexity because I just want an estimate and don't feel like messing around with Lagrange polynomials or doing Gaussian elimination on a Vandermonde matrix of a highly ill-conditioned system of equations or anything. Consequently, I removed the `t` variable from the original complexity equation, effectively setting it to t=1.
[4]: https://lists.torproject.org/pipermail/tor-dev/2014-September/007558.html
Thanks isis. We worked on PIR-Tor a while ago, so my recollection could be a bit rusty, but here are some thoughts.
1) From a performance perspective, the best results are obtained by using the information-theoretic variant of PIR-Tor (ITPIR) (as opposed to the computational variant). As you correctly point out, we considered the three guard relays to be the PIR directory servers, with the assumption that not all three collude. (Our argument for the non-colluding assumption was that if all three guards are malicious, then the security offered by Tor is terrible already.) Note that ITPIR necessarily requires more than one server. Given Tor's move to a single guard design, the first challenge is to figure out alternative PIR servers for the client.
(I haven't thought much about this challenge, but it seems that the single guard could be one server, and two other relays (maybe with guard flags) could play the role of PIR directory servers. I'll have to think more about this, but it is possible that this design continues to provide security properties similar to Tor's.)
2) Even outside the context of PIR, there are several advantages to having some structure in the database, with fine-grained signatures on individual relay/micro-descriptors (relays could also be grouped into blocks, with signatures at the block level). For example, if the Tor network were to use 2-hop circuits, then clients would have close to 0 (or at least significantly less than 1%) overhead for maintaining network state.
Why? Because the first hop (guard) is fixed, fetching the directory to select the guard is only a *one-time* operation; note that much of the directory overhead in Mike Perry's analysis comes from *periodic fetches*. Secondly, clients do not need to use PIR to fetch the exit relay in this setting, since guards know the identity of exits in a two-hop design anyway (the bandwidth overhead of fetching a single exit is simply the size of the descriptor/micro-descriptor -- resulting in *several orders of magnitude* bandwidth savings). (This optimization was referred to in the paper in the context of fetching middle relays.)
(Of course there are anonymity implications of moving to two-hop designs. Perhaps the most significant concern is easy guard identification, but perhaps an additional 100 million clients + the move to a single guard reduces these concerns. The bandwidth savings could be very significant, especially in the context of mobile clients, which otherwise might use hundreds of MB per month just to maintain network state. Anyway, I digress.)
Please find some comments about your previous mail inline (below).
On Tue, Sep 30, 2014 at 1:04 AM, isis isis@torproject.org wrote:
- The authors assume that a client's three [sic] Guard Nodes will be the directory mirrors for that client, accepting PIR queries for
descriptors.
Right, please see (1) above.
- There is zero handling of colluding directory mirrors.
This seems incorrect? We used the Percy open source library to perform PIR, which does handle collusion among mirrors (to the best of my recollection). The parameters can be set such that one honest server is sufficient for security.
- The *increase* to descriptor size is not well factored in to the
analysis.
4.b. All of the Directory Authorities would need to sign each and every descriptor by itself. This would result in the current microdescriptors, which are sized from roughly 900 bytes to some
of the larger ones which are 2.5 KB, increasing in size by roughly 4 KB (that's the size of the DirAuth signatures on the current consensus).
I am not sure which signature scheme you are considering. In the paper, we talk about threshold BLS signatures, which have only 22 byte overhead. (This might increase for better security guarantees, but 4KB overhead seems very conservative.)
While one of the benefits of PIR would be that clients would no longer need to request the descriptors for all relays listed in the consensus, this benefit actually doesn't help as much as it would seem, because switching to a PIR scheme would require a client to make three of these roughly O(146449)-bit requests, every ten minutes or so, when the client wants a new circuit.
Right, clients would roughly need 18 middle/exit relays (using PIR queries) over a three hour interval to build a Tor circuit every 10 minutes. My recollection is that even in 2011, PIR seemed more efficient (we estimated a 12KB overhead per PIR query) than a full network fetch every three hours (12*18 = 216KB with PIR vs several MBs without PIR). Though these numbers might change depending on implementation details (including the use of micro-descriptors), the benefits would only improve with growing network size.
Thanks, Prateek
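(A quick back-of-the-envelope version of the comparison above; the "several MBs" full fetch is taken as 3 MB purely for illustration, and the 12 KB per-query figure is the estimate quoted in the previous message:)

    pir_query_kb = 12         # estimated overhead per PIR query, as above
    queries_per_3h = 18       # middle/exit fetches over a three-hour interval
    full_fetch_kb = 3 * 1024  # illustrative stand-in for "several MBs" of network state

    pir_total_kb = pir_query_kb * queries_per_3h
    print("PIR:  %d KB per 3 hours" % pir_total_kb)   # 216 KB
    print("Full: %d KB per 3 hours" % full_fetch_kb)  # 3072 KB
    print("Ratio: ~%.0fx" % (full_fetch_kb / float(pir_total_kb)))  # ~14x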
On Tue, Sep 30, 2014 at 6:29 AM, Prateek Mittal pmittal@princeton.edu wrote:
- The *increase* to descriptor size is not well factored in to the
analysis.
4.b. All of the Directory Authorities would need to sign each and
every descriptor by itself. This would result in the current microdescriptors, which are sized from roughly 900 bytes to some of the larger ones which are 2.5 KB, increasing in size by roughly 4 KB (that's the size of the DirAuth signatures on the current consensus).
I am not sure which signature scheme you are considering. In the paper, we talk about threshold BLS signatures, which have only 22 byte overhead. (This might increase for better security guarantees, but 4KB overhead seems very conservative.)
You can make this even lower because every PIR query returns a block which is roughly the square root of the size of the entire database; you would only need to have a signature on each block.
- Nikita
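(For a sense of scale on the block-level signing idea above; the relay count below is an arbitrary illustrative figure, not a measurement. With N descriptors in the database and blocks of roughly sqrt(N) descriptors, only about sqrt(N) block-level signatures are needed instead of N per-descriptor ones:)

    import math

    n_relays = 7000   # illustrative only, not a real relay count
    block_size = int(math.ceil(math.sqrt(n_relays)))          # descriptors per PIR block
    n_blocks = int(math.ceil(n_relays / float(block_size)))   # blocks that need a signature

    print("%d descriptors -> %d blocks of about %d descriptors each"
          % (n_relays, n_blocks, block_size))
    # 7000 descriptors -> 84 blocks of about 84 descriptors each
    print("signatures: %d block-level vs %d per-descriptor" % (n_blocks, n_relays))
    # signatures: 84 block-level vs 7000 per-descriptor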
On Tue, Sep 30, 2014 at 12:04 AM, isis isis@torproject.org wrote:
[1]: I assume that we need replication in Tor's use case. There are papers, such as the following:
Kushilevitz, Eyal, and Rafail Ostrovsky. "Replication is not needed: Single database, computationally-private information retrieval." Proceedings of the 38th Annual Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, 1997. http://www.cs.fiu.edu/~carbunar/teaching/cis5374/slides/pir-single.pdf whose results don't apply here because they are aimed at computationally-hiding PIR schemes, and obviously Tor aims for information-theoretic security. Other than the PIR-Tor paper, I haven't found any literature which analyses PIR for anything close to Tor's use case. (Although I'd be stoked to hear about any such papers.)
The current Tor design relies heavily on computational assumptions in all of its cryptography, so I don't think that the Tor network setting is a reason to use information-theoretic rather than computational PIR. We advocated ITPIR because it is much more efficient and because collusion among your three guards was already a significant problem. With the move to a single guard, that argument no longer makes as much sense. We do have an analysis in the paper of the computational PIR scenario, but it does impose significantly more (computational) overhead on the directory mirrors.
- Nikita