I reimplemented doctor's sybil checker [0] in Go [1], which makes it possible to (somewhat) quickly analyse archived consensuses. The algorithm is quite simple. It iterates over every consensus ever published, keeps track of all relay fingerprints, and reports how many previously unseen relay fingerprints each consensus contains. I put the results, time series ranging from 2007 to 2014, online [2]. One can see a bunch of suspicious spikes in some of the years. I manually checked the events and summed them up below. But first, here are some basic statistics about the number of new fingerprints per consensus:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
   0.000    4.000    6.000    6.377    8.000 3020.000
The median number of new fingerprints in a consensus is six. The maximum number observed is 3,020, which was caused by the sybil attack last December.
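The counting step described above can be sketched in a few lines. This is only an illustration in Python, not the sybilhunter code itself: the function name `count_new_fingerprints` and the toy fingerprints are made up, and it assumes each consensus has already been parsed into a list of fingerprint strings.

```python
from typing import Iterable, List, Set


def count_new_fingerprints(consensuses: Iterable[Iterable[str]]) -> List[int]:
    """For each consensus, count the fingerprints not seen in any earlier one."""
    seen: Set[str] = set()       # every fingerprint observed so far
    counts: List[int] = []
    for fingerprints in consensuses:
        current = set(fingerprints)
        counts.append(len(current - seen))  # previously unseen fingerprints
        seen |= current
    return counts


# Toy input (assumption): three consecutive consensuses, made-up fingerprints.
toy = [["A", "B"], ["A", "B", "C"], ["B", "C", "D", "E"]]
print(count_new_fingerprints(toy))  # [2, 1, 2]
```

Note that the very first consensus counts every relay as new. In the same way, relays that joined during a gap in the archive all show up as new in the first consensus after the gap, which fits the "missing consensuses" spikes in the notes below.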
Here are some preliminary notes about the most significant spikes. I'll have a more detailed analysis at some point in the future.
2007-11-12: Missing consensuses.
2008-07-22: Missing consensuses.
2008-09-19: Some missing consensuses, and a small group called "torism" came online.
2008-10-25: Missing consensuses.
2010-06-26: Several hundred PlanetLab relays came online. At least, their nicknames contained "planetlab" or some variation thereof.
2010-09-23: The trotsky relays, which were suspected to be part of a botnet.
2010-10-02: Again trotsky relays.
2012-11-15: Several hundred clearly related relays, at least some of which were in Amazon's EC2 IP address space, came online.
2013-02-04: A group very similar to the previous one came online.
2014-01-30: A clearly related group of relays came online, presumably the one from the pulled Blackhat talk.
2014-11-17: Several probably related relays in the Google cloud came online.
2014-12-26: Many relays named LizardNSA and FuslVZTOR came online.
2014-12-30: Many relays named anonpoke came online.
[0] https://gitweb.torproject.org/doctor.git/tree/sybil_checker.py
[1] https://gitweb.torproject.org/user/phw/sybilhunter.git/
[2] http://www.nymity.ch/new_fingerprints/
Cheers, Philipp
On Thu, Jan 15, 2015 at 04:25:10PM +0100, Philipp Winter wrote:
2014-01-30: A clearly related group of relays came online, presumably the one from the pulled Blackhat talk. (A)
2014-11-17: Several probably related relays in the Google cloud came online. (B)
2014-12-26: Many relays named LizardNSA and FuslVZTOR came online. (C)
2014-12-30: Many relays named anonpoke came online. (D)
The visualizer program only works on archived microdescriptors, which go back only as far as 2014. But I ran it on all of 2014, and you can see the four incidents above.
The stripes in the background are months.
https://people.torproject.org/~dcf/graphs/microdescs/microdescs-2014.png (8760×62986 pixels)
https://people.torproject.org/~dcf/graphs/microdescs/microdescs-2014-short.p... (8760×2048 pixels)
(Wow, who knew there were over 60000 distinct descriptors in 2014?)
Maybe the checker should also look for cases where a lot of relays go away at once. It looks like that happened in mid-April, when relays that had been started at different times in the beginning of the year all stopped at once.
(Oh, on further reflection, that must have been Heartbleed!)
David Fifield
On Thu, Jan 15, 2015 at 01:34:01PM -0800, David Fifield wrote:
Maybe the checker should also look for cases where a lot of relays go away at once. It looks like that happened in mid-April, when relays that had been started at different times in the beginning of the year all stopped at once.
(Oh, on further reflection, that must have been Heartbleed!)
That's a good idea. Here's the visualisation: http://www.nymity.ch/hunting_sybils/leaving_relays/
Some of the spikes represent the same sybils already present in the previous visualisation. That makes sense: it's not surprising that these groups disappeared just as quickly as they appeared.
However, there are some additional groups which were not present in the previous visualisation:
- 2008-05-14: More than 200 relays disappear. I am not sure why.
- 2008-08-19: More than 150 relays disappear. Also not sure why.
- 2009-09-24: About 60 relays disappear. There's a group of 10 relays with the same nickname, so this might be a false positive.
- 2012-01-23: About 100 relays disappear. Not sure why.
- 2012-09-16: More than 150 relays disappear. Many of them are in the same /24 and many have the same nickname pattern.
- 2013-04-11: Same as above.
- 2014-04-17: 247 relays disappear. These relays were rejected from the consensus because of the Heartbleed bug.
- 2014-04-18: 906 relays disappear. These relays were rejected from the consensus because of the Heartbleed bug.
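For illustration, here is one way the "leaving relays" count could be computed. The thread does not say exactly how a disappearance was defined, so this sketch assumes a relay "leaves" when its fingerprint is in one consensus but missing from the next one. The function name and the toy data are hypothetical.

```python
from typing import Iterable, List, Optional, Set


def count_leaving_fingerprints(consensuses: Iterable[Iterable[str]]) -> List[int]:
    """Count fingerprints present in one consensus but absent from the next.

    Returns one value per pair of consecutive consensuses.
    """
    counts: List[int] = []
    previous: Optional[Set[str]] = None
    for fingerprints in consensuses:
        current = set(fingerprints)
        if previous is not None:
            counts.append(len(previous - current))  # relays that went away
        previous = current
    return counts


# Toy input (assumption): B and C vanish between the first two consensuses.
toy = [["A", "B", "C"], ["A"], ["A", "D"]]
print(count_leaving_fingerprints(toy))  # [2, 0]
```

Under this definition a relay that merely skips one consensus is counted as leaving. A stricter variant could require that the fingerprint stays away for some number of consensuses before it is counted.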
In the next step, I'll work on a similarity metric to compare and cluster relay descriptors. That should help with manual analysis.
Cheers, Philipp
On Thu, Jan 15, 2015 at 10:25 AM, Philipp Winter phw@nymity.ch wrote:
The median number of new fingerprints in a consensus is six. [...]
Here are some preliminary notes about the most significant spikes. [...]
2008-10-25: Missing consensuses.
FYI, between these two dates there was a tor-talk thread, 'many new relays', about a possible event from around the end of 2009-06 to the beginning of 2009-07, along with the usual posts from people about potential things to detect.
2010-06-26: Several hundred PlanetLab relays came online. At least
On Thu, Jan 15, 2015 at 06:11:25PM -0500, grarpamp wrote:
On Thu, Jan 15, 2015 at 10:25 AM, Philipp Winter phw@nymity.ch wrote:
The median number of new fingerprints in a consensus is six. [...]
Here are some preliminary notes about the most significant spikes. [...]
2008-10-25: Missing consensuses.
FYI, between these two dates there was a tor-talk thread, 'many new relays', about a possible event from around the end of 2009-06 to the beginning of 2009-07, along with the usual posts from people about potential things to detect.
Interesting, thanks for pointing this out. This event is visible in the diagram, not as a sudden spike but as a temporary increase in the base rate: http://www.nymity.ch/new_fingerprints/2009_new_fingerprints.png
This event also shows why a simple threshold-based detection mechanism is insufficient: Sybils can be added slowly over several hours or days in order to stay under the threshold.
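To make the point concrete, here is a small sketch with made-up numbers. The baseline of 6 new fingerprints per consensus matches the median above; the thresholds, the window size and the attack pattern are assumptions chosen only for illustration. A per-consensus threshold misses a slow ramp, while a sum over a sliding window of recent consensuses catches it.

```python
from collections import deque
from typing import List


def per_consensus_alerts(counts: List[int], threshold: int) -> List[int]:
    """Indices of consensuses whose own count exceeds the threshold."""
    return [i for i, c in enumerate(counts) if c > threshold]


def windowed_alerts(counts: List[int], window: int, threshold: int) -> List[int]:
    """Indices where the sum over the last `window` consensuses exceeds the threshold."""
    recent = deque(maxlen=window)  # keeps only the most recent counts
    flagged: List[int] = []
    for i, c in enumerate(counts):
        recent.append(c)
        if len(recent) == window and sum(recent) > threshold:
            flagged.append(i)
    return flagged


# Hypothetical series: baseline of 6, then 20 extra relays per consensus
# for five consensuses in a row.
counts = [6, 6, 6, 6, 26, 26, 26, 26, 26, 6, 6, 6]
print(per_consensus_alerts(counts, threshold=50))        # []
print(windowed_alerts(counts, window=4, threshold=80))   # [6, 7, 8, 9]
```

A window only raises the bar, of course. A sufficiently patient attacker can spread the relays over a period longer than any fixed window.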
Cheers, Philipp
On Mon, Jan 19, 2015 at 11:47 AM, Philipp Winter phw@nymity.ch wrote:
On Thu, Jan 15, 2015 at 06:11:25PM -0500, grarpamp wrote:
FYI, between these two dates there was a tor-talk thread, 'many new relays', about a possible event from around the end of 2009-06 to the beginning of 2009-07, along with the usual posts from people about potential things to detect.
Interesting, thanks for pointing this out. This event is visible in the diagram, not as a sudden spike but as a temporary increase in the base rate: http://www.nymity.ch/new_fingerprints/2009_new_fingerprints.png
Cool, nice to see this graphed, really obvious something happened.
This event also shows why a simple threshold-based detection mechanism is insufficient: Sybils can be added slowly over several hours or days in order to stay under the threshold.
Re detection methods, consider checking for exceeded bounds in the first and second derivatives of the various data you are collecting, RRD, etc.
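One possible reading of that suggestion, as a rough sketch: approximate the derivatives by discrete differences of a per-consensus series (here, a hypothetical total relay count) and flag the points where the absolute value exceeds a bound. The series, the bound of 30 and the function names are all assumptions for illustration.

```python
from typing import List


def differences(series: List[int]) -> List[int]:
    """Discrete derivative: change between consecutive samples."""
    return [b - a for a, b in zip(series, series[1:])]


def out_of_bounds(series: List[int], bound: int) -> List[int]:
    """Indices whose absolute value exceeds the bound."""
    return [i for i, v in enumerate(series) if abs(v) > bound]


# Hypothetical number of relays in seven consecutive consensuses.
totals = [1000, 1005, 1010, 1030, 1070, 1150, 1155]

first = differences(totals)   # rate of change
second = differences(first)   # change in the rate of change

print(first)                      # [5, 5, 20, 40, 80, 5]
print(second)                     # [0, 15, 20, 40, -75]
print(out_of_bounds(first, 30))   # [3, 4]
print(out_of_bounds(second, 30))  # [3, 4]
```

Here `first[i]` is the change between sample `i` and sample `i + 1`. The second difference reacts both when the growth accelerates and when it stops abruptly, which might help with groups that appear or vanish in a short burst.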