[tor-dev] Better relay uptime visualisation

7 Dec 2015


I spent some time improving the existing relay uptime visualisation [0].
Inspired by a research paper [1], the new algorithm uses single-linkage
clustering with Pearson's correlation coefficient as the distance function.
The idea is that relays are grouped next to each other if their uptime
(basically a binary sequence) is highly correlated.  Check out the
following gallery.  It contains monthly relay uptime images, dating back
to 2007:
https://nymity.ch/sybilhunting/uptime-visualisation/
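
For readers who want to play with the idea, here is a minimal sketch of
the clustering step in plain Python.  It is not the code behind the
gallery.  The function names, the use of 1 - r as the distance, and the
handling of constant sequences (for which Pearson's coefficient is
undefined) are all assumptions made for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: the coefficient is undefined for a constant sequence,
        # so treat identical sequences as perfectly correlated, others as 0.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / sqrt(var_a * var_b)


def distance(a, b):
    # Assumption: turn the correlation into a distance via 1 - r.
    return 1.0 - pearson(a, b)


def single_linkage_order(uptimes):
    """Return relay names ordered so that merged clusters sit side by side.

    `uptimes` maps a relay name to its binary uptime sequence.
    Naive O(n^3) agglomeration; fine for a toy example only.
    """
    names = list(uptimes)
    dist = {}
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            dist[frozenset((p, q))] = distance(uptimes[p], uptimes[q])

    def linkage(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(dist[frozenset((p, q))] for p in c1 for q in c2)

    clusters = [[name] for name in names]
    while len(clusters) > 1:
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0] if clusters else []


# Hypothetical toy data: one entry per consensus, 1 = online, 0 = offline.
toy_uptimes = {
    "A": [1, 1, 0, 0, 1, 1],
    "B": [0, 1, 0, 1, 0, 1],
    "C": [1, 1, 0, 0, 1, 1],
    "D": [0, 1, 0, 1, 0, 0],
}
order = single_linkage_order(toy_uptimes)
```

In this toy input, A and C have identical uptime sequences, so they are
merged first and end up in neighbouring columns.
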
If you aren't familiar with this type of visualisation: every image
shows the uptime of all Tor relays that were online in a given month.
Every row is a consensus and every column is a relay.  White pixels mean
that a relay was offline and black pixels mean that a relay was
online.  Red pixels are used to highlight suspiciously similar clusters.
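
To make the pixel layout concrete, here is a small sketch that turns
already-ordered uptime columns into a plain-text PPM image.  The function
name, the choice of PPM, and the rule that only the online pixels of a
flagged column are drawn in red are assumptions for illustration, not a
description of how the gallery images are produced.

```python
WHITE, BLACK, RED = "255 255 255", "0 0 0", "255 0 0"


def render_ppm(columns, highlighted=()):
    """Render uptime columns as a plain (P3) PPM string.

    `columns` is a list of binary sequences, one per relay, all of the
    same length (one entry per consensus).  `highlighted` holds the
    indices of columns that belong to a suspicious cluster.
    """
    width = len(columns)
    height = len(columns[0]) if columns else 0
    lines = ["P3", f"{width} {height}", "255"]
    for row in range(height):          # one image row per consensus
        pixels = []
        for col in range(width):       # one image column per relay
            if not columns[col][row]:
                pixels.append(WHITE)   # relay offline
            elif col in highlighted:
                pixels.append(RED)     # online and in a flagged cluster
            else:
                pixels.append(BLACK)   # relay online
        lines.append(" ".join(pixels))
    return "\n".join(lines) + "\n"


# Hypothetical example: two relays, two consensuses, second relay flagged.
ppm_text = render_ppm([[1, 0], [1, 1]], highlighted={1})
```

Writing `ppm_text` to a file ending in .ppm gives an image that most
viewers can open.
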
A nice example is the Heartbleed incident:
https://nymity.ch/sybilhunting/uptime-visualisation/slide_2014-04.html
The huge red block on the left shows all the relays that were removed by
the directory authorities because they didn't rotate their key pairs in
time.
The downside of single-linkage clustering is that it takes longer to
compute.  On my laptop, I can create an image covering one month in
under three minutes, so it's tolerable.
Another practical problem is that it's cumbersome to learn the relay
fingerprint of a given column.  I'm looking into JavaScript/HTML tricks
that can show text when you hover over a region in the image.  Perhaps
somebody knows more?

[0] https://bugs.torproject.org/12813
[1] http://nms.csail.mit.edu/papers/clustering-imw2002.pdf, Section 2

Cheers,
Philipp
