We sometimes see attacks from relays that are hosted on cloud platforms. I have been wondering if the benefit of having cloud-hosted relays outweighs the abuse we see from them.
To get an idea of the benefit, I analysed the bandwidth that is contributed by cloud-hosted relays. I first obtained the network blocks owned by three cloud providers (Amazon AWS, Google Cloud, Microsoft Azure), and determined the percent of bandwidth they contributed in July 2015. The results show that there were typically ~200 cloud-hosted relays online: https://nymity.ch/sybilhunting/png/cloud-hosted_relays_2015-07.png The spike shortly after hour 200 was caused by a lot of Amazon relays named "DenkoNet". The spike at the very beginning was caused by a number of relays that might very well belong together, too, based on their uptime pattern.
What counts, however, is bandwidth. Here's the total bandwidth fraction contributed by cloud-hosted relays over July 2015: https://nymity.ch/sybilhunting/png/cloud-hosted_bandwidth_2015-07.png There were no Google Cloud relays to contribute any bandwidth. Amazon AWS-powered relays contributed the majority of bandwidth, followed by Microsoft Azure-powered relays. Here's a summary of the time series in percent:
Min.    Mean    Median    Max.
0.2%    0.8%    0.79%     1.5%
In an average consensus in July 2015, cloud-hosted relays contributed only around 0.8% of bandwidth. Note, however, that this is just a lower bound. The netblocks I used for the analysis could have changed, and I didn't consider providers other than Google, Amazon, and Microsoft.
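The bandwidth fraction described above could be computed roughly as follows. This is only a minimal sketch, not the script used for the analysis: the function name, the (address, bandwidth) input format, and the sample addresses (taken from the reserved documentation ranges) are all assumptions made for illustration.

```python
import ipaddress


def cloud_bandwidth_fraction(relays, netblocks):
    """Return the share of total bandwidth contributed by relays whose
    address falls inside one of the given netblocks.

    relays    -- iterable of (ip_address_string, bandwidth) pairs (hypothetical format)
    netblocks -- iterable of CIDR strings, e.g. as published by a cloud provider
    """
    networks = [ipaddress.ip_network(block) for block in netblocks]
    relays = list(relays)

    total = sum(bandwidth for _, bandwidth in relays)
    if total == 0:
        return 0.0

    # A relay counts as cloud-hosted if any netblock contains its address.
    cloud = sum(
        bandwidth
        for address, bandwidth in relays
        if any(ipaddress.ip_address(address) in net for net in networks)
    )
    return cloud / total


# Made-up sample data: two "cloud" relays and one relay elsewhere.
sample_relays = [
    ("192.0.2.10", 100),
    ("198.51.100.7", 300),
    ("203.0.113.5", 600),
]
sample_netblocks = ["192.0.2.0/24", "198.51.100.0/24"]
fraction = cloud_bandwidth_fraction(sample_relays, sample_netblocks)
```

Running this once per consensus would give a time series like the one summarised above. Any netblock missing from the list makes the result an underestimate, which is why the figure is a lower bound.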
There are also cloud-hosted bridges. Tor Cloud, however, has shut down, and the number of EC2 bridges is declining: https://metrics.torproject.org/cloudbridges.html?graph=cloudbridges&start=2015-01-01&end=2015-07-31
The harm caused by cloud-hosted relays is more difficult to quantify. Getting rid of them also wouldn't mean getting rid of any attacks. At best, attackers would have to jump through more hoops.
If we were to decide to permanently reject cloud-hosted relays, we would have to obtain the netblocks that are periodically published by all three (and perhaps more) cloud providers: https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html https://msdn.microsoft.com/en-us/library/azure/Dn175718.aspx https://cloud.google.com/appengine/kb/general?hl=en#static-ip
Note that this should be done periodically because the netblocks are subject to change.
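As a rough illustration of the periodic refresh, the sketch below parses a netblock document into a de-duplicated list of CIDR ranges. It assumes the JSON layout of the AWS ip-ranges.json file ("prefixes" entries with "ip_prefix", "ipv6_prefixes" entries with "ipv6_prefix"). The other providers publish their ranges in different formats and would need their own parsers. The function name and sample document are hypothetical, and fetching the file is deliberately left out.

```python
import ipaddress
import json


def extract_netblocks(document_text):
    """Parse an AWS-style ip-ranges document and return sorted, unique CIDR strings."""
    doc = json.loads(document_text)
    blocks = set()

    # IPv4 and IPv6 prefixes live under separate keys in this (assumed) layout.
    for entry in doc.get("prefixes", []):
        blocks.add(ipaddress.ip_network(entry["ip_prefix"]))
    for entry in doc.get("ipv6_prefixes", []):
        blocks.add(ipaddress.ip_network(entry["ipv6_prefix"]))

    # Sort IPv4 before IPv6; networks of different versions cannot be compared directly.
    ordered = sorted(blocks, key=lambda net: (net.version, net))
    return [str(net) for net in ordered]


# Made-up sample using documentation ranges; one prefix is listed twice on purpose.
sample_document = json.dumps({
    "prefixes": [
        {"ip_prefix": "198.51.100.0/24", "service": "EC2"},
        {"ip_prefix": "192.0.2.0/24", "service": "EC2"},
        {"ip_prefix": "192.0.2.0/24", "service": "AMAZON"},
    ],
    "ipv6_prefixes": [
        {"ipv6_prefix": "2001:db8::/32", "service": "EC2"},
    ],
})
netblocks = extract_netblocks(sample_document)
```

Comparing the result with the previous run would show which ranges were added or dropped since the last refresh.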
Cheers, Philipp
On 1 Sep 2015, at 07:45, Philipp Winter phw@nymity.ch wrote:
We sometimes see attacks from relays that are hosted on cloud platforms. I have been wondering if the benefit of having cloud-hosted relays outweighs the abuse we see from them.
To get an idea of the benefit, I analysed the bandwidth that is contributed by cloud-hosted relays. I first obtained the network blocks owned by three cloud providers (Amazon AWS, Google Cloud, Microsoft Azure), and determined the percent of bandwidth they contributed in July 2015. The results show that there were typically ~200 cloud-hosted relays online: https://nymity.ch/sybilhunting/png/cloud-hosted_relays_2015-07.png The spike shortly after hour 200 was caused by a lot of Amazon relays named "DenkoNet". The spike at the very beginning was caused by a number of relays that might very well belong together, too, based on their uptime pattern.
What counts, however, is bandwidth. Here's the total bandwidth fraction contributed by cloud-hosted relays over July 2015: https://nymity.ch/sybilhunting/png/cloud-hosted_bandwidth_2015-07.png There were no Google Cloud relays to contribute any bandwidth. Amazon AWS-powered relays contributed the majority of bandwidth, followed by Microsoft Azure-powered relays. Here's a summary of the time series in percent:
Min.    Mean    Median    Max.
0.2%    0.8%    0.79%     1.5%
In an average consensus in July 2015, cloud-hosted relays contributed only around 0.8% of bandwidth. Note, however, that this is just a lower bound. The netblocks I used for the analysis could have changed, and I didn't consider providers other than Google, Amazon, and Microsoft.
There are also cloud-hosted bridges. Tor Cloud, however, has shut down, and the number of EC2 bridges is declining: https://metrics.torproject.org/cloudbridges.html?graph=cloudbridges&start=2015-01-01&end=2015-07-31
Can we preserve cloud-hosted bridges independently of whatever we decide to do to cloud-hosted relays?
The harm caused by cloud-hosted relays is more difficult to quantify. Getting rid of them also wouldn't mean getting rid of any attacks. At best, attackers would have to jump through more hoops.
If we were to decide to permanently reject cloud-hosted relays, we would have to obtain the netblocks that are periodically published by all three (and perhaps more) cloud providers: https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html https://msdn.microsoft.com/en-us/library/azure/Dn175718.aspx https://cloud.google.com/appengine/kb/general?hl=en#static-ip
Note that this should be done periodically because the netblocks are subject to change.
I wonder about the impact of this proposal on Tor research and on Tor developers.
Some may consider it a benefit if researchers have to take more steps to interact with the Tor network.
I wonder how many Tor developers develop using cloud machines, and whether it’s a benefit for them to be able to test changes on the live Tor network, or a drawback. I test my changes on Linux using a cloud machine, and have used it at times to ensure that my changes don’t break when deployed on the live network. (I don’t do this at home, for both legal and connectivity reasons.)
Of course, I use chutney to test my changes on a test network, before I use them on the live network. So that’s another option for both researchers and developers. As an aside, we're working on making chutney easier to use, and we’re getting there incrementally. Here is a very rough draft plan: https://trac.torproject.org/projects/tor/wiki/doc/TorChutneyGuide
Of course, if researchers or developers or others really need a machine, they can move to a smaller cloud provider. This has benefits for diversity, and reduces what Google, Amazon, and Microsoft can know about Tor.
Tim (teor)
Tim Wilson-Brown - teor transcribed 11K bytes:
On 1 Sep 2015, at 07:45, Philipp Winter phw@nymity.ch wrote:
There are also cloud-hosted bridges. Tor Cloud, however, has shut down, and the number of EC2 bridges is declining: https://metrics.torproject.org/cloudbridges.html?graph=cloudbridges&start=2015-01-01&end=2015-07-31
Can we preserve cloud-hosted bridges independently of whatever we decide to do to cloud-hosted relays?
Tor Cloud is deprecated for several reasons, [0] and it's possible that those bridges haven't been getting software updates. [1] Those bridges should probably die. But yes, in theory, if we decided to block cloud relays, we technically could preserve those bridges. If anything, I'd be more in favour of doing this the other way around: ban those EC2 bridges and keep the cloud relays (but perhaps create more/better automated scans to detect misbehaviour).
I wonder about the impact of this proposal on Tor research and on Tor developers.
The Tor Project does have an EC2 account that some Tor developers have access to, but we don't ever run non-TestingTorNetwork relays/bridges on it. Also, in general, (paid) Tor developers aren't supposed to run relays, due to concerns that doing so could possibly be legally interpreted as "The Tor Project runs the Tor network".
However, I agree with your concerns that this change might make it more difficult for other researchers to study tor (hopefully, ethically).
[0]: https://blog.torproject.org/blog/tor-cloud-service-ending-many-ways-remain-h...
[1]: https://trac.torproject.org/projects/tor/ticket/11502
We sometimes see attacks from relays that are hosted on cloud platforms. I have been wondering if the benefit of having cloud-hosted relays outweighs the abuse we see from them.
I don't think banning GCE, AWS and MS Azure is an efficient method to significantly increase the cost of attacks because it is trivial for an attacker to quickly spin up "a large number of disposable machines" at other ISPs as well.
Detecting new groups of relays in a single AS that all sign up in a short timeframe is trivial (DocTor does and did that already [1][2], OrNetRadar [3] does it as well).
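To make the idea concrete, here is a minimal sketch of such a check. It is not how DocTor or OrNetRadar actually work. The input format (nickname, AS, first-seen time), the 24-hour window, and the threshold of five relays are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_signup_bursts(relays, window=timedelta(hours=24), min_size=5):
    """Return {asn: [nicknames]} for every AS in which at least `min_size`
    relays first appeared within a single time window of length `window`.

    relays -- iterable of (nickname, asn, first_seen_datetime) tuples (hypothetical format)
    """
    by_as = defaultdict(list)
    for nickname, asn, first_seen in relays:
        by_as[asn].append((first_seen, nickname))

    bursts = {}
    for asn, entries in by_as.items():
        entries.sort()  # chronological order
        start = 0
        best = []
        # Sliding window: keep only entries no older than `window` before the current one.
        for end in range(len(entries)):
            while entries[end][0] - entries[start][0] > window:
                start += 1
            if end - start + 1 > len(best):
                best = entries[start:end + 1]
        if len(best) >= min_size:
            bursts[asn] = [nickname for _, nickname in best]
    return bursts


# Made-up sample: five relays appear in one AS within an hour, two others are unrelated.
base = datetime(2015, 7, 9, 12, 0)
sample = [("burst%d" % i, "AS64500", base + timedelta(minutes=10 * i)) for i in range(5)]
sample += [
    ("old1", "AS64501", base - timedelta(days=30)),
    ("old2", "AS64501", base),
]
flagged = find_signup_bursts(sample)
```

A real check would probably also compare nicknames, versions, and uptime patterns, since relays that merely share an AS and a sign-up day need not belong together.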
Should you decide to continue generally blacklisting entire ISPs/ASes/IP ranges:
Please add that info (including the banned ISPs/ASes/IP ranges) to the documentation (i.e. relay setup guides [4]) so volunteers don't waste their time and money setting up blacklisted relays [5].
[1] https://lists.torproject.org/pipermail/tor-consensus-health/2015-July/005955...
[2] https://lists.torproject.org/pipermail/tor-consensus-health/2015-July/005974...
[3] https://lists.riseup.net/www/info/ornetradar http://news.gmane.org/gmane.network.onion-routing.ornetradar
[4] https://www.torproject.org/getinvolved/relays.html.en
[5] https://lists.torproject.org/pipermail/tor-relays/2015-August/007655.html
On 1 Sep 2015, at 07:45, Philipp Winter phw@nymity.ch wrote:
The harm caused by cloud-hosted relays is more difficult to quantify. Getting rid of them also wouldn't mean getting rid of any attacks. At best, attackers would have to jump through more hoops.
If we were to decide to permanently reject cloud-hosted relays, we would have to obtain the netblocks that are periodically published by all three (and perhaps more) cloud providers: https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html https://msdn.microsoft.com/en-us/library/azure/Dn175718.aspx https://cloud.google.com/appengine/kb/general?hl=en#static-ip
Note that this should be done periodically because the netblocks are subject to change.
On 1 Sep 2015, at 08:58, nusenu nusenu@openmailbox.org wrote:
Should you decide to continue generally blacklisting entire ISPs/ASes/IP ranges:
Please add that info (including the banned ISPs/ASes/IP ranges) to the documentation (i.e. relay setup guides [4]) so volunteers don't waste their time and money setting up blacklisted relays [5].
[4] https://www.torproject.org/getinvolved/relays.html.en
[5] https://lists.torproject.org/pipermail/tor-relays/2015-August/007655.html
If the blocked IP ranges are going to become numerous, and change frequently, why not create a tool that volunteer relay operators can use to check an IP address?
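Such a checker could be quite small. The sketch below is one possible shape for it, assuming the blocked ranges are available as a list of CIDR strings. The function name and the sample ranges (documentation addresses) are hypothetical, and no such published list or tool exists yet.

```python
import ipaddress


def find_blocking_range(address, blocked_ranges):
    """Return the first blocked CIDR range containing `address`, or None.

    address        -- IPv4 or IPv6 address as a string
    blocked_ranges -- iterable of CIDR strings (hypothetical published blocklist)
    """
    ip = ipaddress.ip_address(address)
    for cidr in blocked_ranges:
        network = ipaddress.ip_network(cidr)
        # Membership is simply False when the address and network versions differ.
        if ip in network:
            return str(network)
    return None


# Made-up blocklist using documentation ranges.
sample_blocklist = ["192.0.2.0/24", "2001:db8::/32"]
hit = find_blocking_range("192.0.2.55", sample_blocklist)
miss = find_blocking_range("203.0.113.9", sample_blocklist)
```

Returning the matching range rather than a bare yes/no lets an operator see which entry affects them before paying for a machine.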
Tim (teor)
My sense of tor-relays is that "end users" as relay operators (who presumably operate most relays, with places like torservers doing the rest) just go looking for VPS accounts, i.e., compute platforms aren't their thing.
Which leaves the only real users of compute to be attackers and researchers. The former we don't want, the latter we do.
Blocking compute seems fine based on its tiny resource contribution. Researchers could come to Tor to get unblocked and share their project, though that could be discouraging, and there's currently no mechanism for that. Attackers often need lots of IPs and programmability at good cost, which may not readily exist with VPS. Governments excepted.
On Mon, Aug 31, 2015 at 6:58 PM, nusenu nusenu@openmailbox.org wrote:
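Detecting new groups of relays in a single AS that all sign up in a short timeframe is trivial (DocTor does and did that already [1][2], OrNetRadar [3] does it as well).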
Blocking compute may limit the ability to openly survey the attack space by forcing it to hide more.
Please add that info (including the banned ISPs/ASes/IP ranges) to the documentation (i.e. relay setup guides [4]) so volunteers don't waste their time and money setting up blacklisted relays [5].
+1
Philipp Winter transcribed 2.6K bytes:
The harm caused by cloud-hosted relays is more difficult to quantify. Getting rid of them also wouldn't mean getting rid of any attacks. At best, attackers would have to jump through more hoops.
Does anyone know which attacks were carried out via relays running on cloud platforms?
The only one I remember was the "One cell is enough" [0] tagging attack in 2009, but IIRC, their malicious/colluding exit was run on PlanetLab (also, via the nature of the attack, it probably wouldn't have caused any harm to real users). Were there any others?
[0]: https://blog.torproject.org/blog/one-cell-enough
Does anyone know which attacks were carried out via relays running on cloud platforms?
The Lizard Squad thing from last year was substantially Google Cloud (GAE/GCE), if I recall correctly (there's a list from consensus-health here[0]). Lots of research takes place on EC2, but it doesn't seem to be a sybil-magnet - after all, there are cheaper providers who will ask fewer questions (e.g., OVH, which has a much bigger consensus fraction).
[0] https://lists.torproject.org/pipermail/tor-consensus-health/2014-December/00...