I spent some time improving the existing relay uptime visualisation [0]. Inspired by a research paper [1], the new algorithm uses single-linkage clustering with Pearson's correlation coefficient as the distance function. The idea is that relays are placed next to each other if their uptime sequences (basically binary sequences, one bit per consensus) are highly correlated. Check out the following gallery. It contains monthly relay uptime images, dating back to 2007: https://nymity.ch/sybilhunting/uptime-visualisation/
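To make the idea concrete, here is a minimal sketch of the clustering step in plain Python. It is not the actual implementation: the function names are made up for illustration, turning the correlation r into a distance as 1 - r is an assumption, and so is treating a relay that never changes state as uncorrelated with everything (the coefficient is undefined in that case). It is also a naive version that would be far too slow for a real month of consensuses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length 0/1 sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a relay that was always online (or always offline)
        # has no defined correlation, so treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def distance(x, y):
    # Assumption: highly correlated sequences should be "close",
    # so the correlation r is mapped to the distance 1 - r.
    return 1.0 - pearson(x, y)


def single_linkage_order(uptimes):
    """Cluster relays and return their names in merge order.

    `uptimes` maps a relay name to its 0/1 uptime sequence.
    Relays that end up in the same cluster are adjacent in the result.
    """
    names = list(uptimes)
    dist = {(a, b): distance(uptimes[a], uptimes[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: the distance between two clusters is
                # the distance between their two closest members.
                d = min(dist[a, b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters[0]
```

The returned list can be used directly as the column order of the image, so relays with near-identical uptime patterns end up side by side.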
If you aren't familiar with this type of visualisation: every image shows the uptime of all Tor relays that were online in a given month. Every row is a consensus and every column is a relay. White pixels mean that a relay was offline and black pixels mean that a relay was online. Red pixels are used to highlight suspiciously similar clusters. A nice example is the Heartbleed incident: https://nymity.ch/sybilhunting/uptime-visualisation/slide_2014-04.html The huge red block on the left shows all the relays that were removed by the directory authorities because they didn't rotate their key pairs in time.
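The pixel layout described above can be sketched in a few lines of Python. This is only an illustration, not the code that produced the gallery: the function name, the plain-text PPM output format, and the choice to draw only the online pixels of a flagged relay in red are all assumptions.

```python
def render_ppm(uptimes, order, suspicious=()):
    """Return a plain-text PPM (P3) image as a string.

    One row per consensus, one column per relay, in the given column order.
    `uptimes` maps a relay name to its 0/1 sequence; `suspicious` is a
    collection of relay names whose online pixels are drawn in red.
    """
    white, black, red = "255 255 255", "0 0 0", "255 0 0"
    n_rows = len(uptimes[order[0]])
    lines = ["P3", f"{len(order)} {n_rows}", "255"]
    for row in range(n_rows):
        pixels = []
        for name in order:
            if not uptimes[name][row]:
                pixels.append(white)   # relay was offline
            elif name in suspicious:
                pixels.append(red)     # online and part of a flagged cluster
            else:
                pixels.append(black)   # relay was online
        lines.append("  ".join(pixels))
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file with a .ppm extension gives an image that most viewers and converters can open.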
The downside of single-linkage clustering is that it takes longer to compute. On my laptop, I can create an image covering one month in under three minutes, so it's tolerable.
Another practical problem is that it's cumbersome to learn the relay fingerprint of a given column. I'm looking into JavaScript/HTML tricks that can show text when you hover over a region in the image. Perhaps somebody knows more?
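One approach that might work without any JavaScript is to overlay one invisible, absolutely positioned element per relay column and put the fingerprint in its title attribute, which browsers typically show as a tooltip on hover. The sketch below generates such HTML. The function name, its parameters, and the example file name are hypothetical, and I haven't checked how well this scales to images with thousands of columns.

```python
from html import escape


def hover_overlay(image_url, fingerprints, height, col_width=1):
    """Return HTML that overlays one titled <div> per relay column.

    `fingerprints` lists the relay fingerprints in column order,
    `height` is the image height in pixels, and `col_width` is the
    width of one relay column in pixels.
    """
    parts = [
        '<div style="position:relative;display:inline-block">',
        f'<img src="{escape(image_url, quote=True)}" alt="relay uptime">',
    ]
    for col, fingerprint in enumerate(fingerprints):
        style = (
            f"position:absolute;top:0;left:{col * col_width}px;"
            f"width:{col_width}px;height:{height}px"
        )
        # The title attribute is what shows up as hover text.
        parts.append(
            f'<div title="{escape(fingerprint, quote=True)}" '
            f'style="{style}"></div>'
        )
    parts.append("</div>")
    return "\n".join(parts)
```

With columns that are only one pixel wide, hitting the right relay with the mouse is fiddly, so scaling the image horizontally and raising col_width accordingly would probably help.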
[0] https://bugs.torproject.org/12813
[1] http://nms.csail.mit.edu/papers/clustering-imw2002.pdf, Section 2
Cheers, Philipp