On Tue, Dec 08, 2015 at 10:47:08AM +1100, Tim Wilson-Brown - teor wrote:
On 8 Dec 2015, at 10:43, Tom Ritter <tom@ritter.vg> wrote:

On 7 December 2015 at 13:51, Philipp Winter <phw@nymity.ch> wrote:

I spent some time improving the existing relay uptime visualisation [0]. Inspired by a research paper [1], the new algorithm uses single-linkage clustering with Pearson's correlation coefficient as distance function. The idea is that relays are grouped next to each other if their uptime (basically a binary sequence) is highly correlated.

Check out the following gallery. It contains monthly relay uptime images, dating back to 2007:
<https://nymity.ch/sybilhunting/uptime-visualisation/>

If you aren't familiar with this type of visualisation: Every image shows the uptime of all Tor relays that were online in a given month. Every row is a consensus and every column is a relay. White pixels mean that a relay was offline and black pixels means that a relay was online. Red pixels are used to highlight suspiciously similar clusters.

That's really cool. It seems to imply that the majority of the tor network stop operating halfway through the month though...

Do the other tor graphs take into account hibernating relays? For example, I would expect the time-to-download graph would be somewhat affected:
https://metrics.torproject.org/torperf.html?graph=torperf&start=2015-10-01&end=2015-10-31&source=all&filesize=5mb

Hibernating relays run from the start of their first period to gauge load. Then they start at a random time during the day/month, but early enough that they think they'll still use all their bandwidth.
I wonder if we're seeing another phenomenon? (daily / monthly server restarts?) Or we could be seeing hibernation failing to work as intended.
Relays turn on or off all the time. Of all the descriptors seen in a year, fewer than 10% are continuously running the whole time. The rest either started at some time or stopped at some time or both. See an example here for 2014: https://people.torproject.org/~dcf/graphs/microdescs/microdescs-2014-short.p...

All we're seeing is the distribution of the dates at which the subset of relays that stopped during the month actually stopped, which seems pretty uniform. I'll bet that if you look at those relays in the previous month, they are running at the end of the month, not hibernating.
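
If anyone wants to check this against one of the monthly images, here is a rough sketch of the bookkeeping. It is only an illustration, not the code behind either visualisation. It assumes the uptime data is already available as a matrix with one row per consensus and one column per relay (1 = online, 0 = offline), as in the images. The function name `classify_relays` and the toy matrix are made up for the example, and "stopped" is taken to mean "seen at least once but absent from the last consensus of the period", which is my own simplification.

```python
from collections import Counter


def classify_relays(uptime):
    """Summarise an uptime matrix (hypothetical helper for illustration).

    uptime: list of rows, one per consensus; each row is a list of 0/1
    flags, one per relay (1 = online in that consensus).

    Returns (fraction of relays online in every consensus,
             Counter mapping row index -> number of relays last seen there).
    """
    n_rows = len(uptime)
    n_relays = len(uptime[0])
    continuous = 0
    stop_rows = []  # last row in which each stopped relay was online

    for col in range(n_relays):
        column = [uptime[row][col] for row in range(n_rows)]
        if all(column):
            continuous += 1
            continue
        online_rows = [row for row, up in enumerate(column) if up]
        # Assumption: a relay "stopped" if it is missing from the final row.
        if online_rows and online_rows[-1] < n_rows - 1:
            stop_rows.append(online_rows[-1])

    return continuous / n_relays, Counter(stop_rows)


# Toy data: 4 consensuses (rows) x 4 relays (columns).
uptime = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]

fraction_continuous, stop_histogram = classify_relays(uptime)
print(fraction_continuous)
print(sorted(stop_histogram.items()))
```

If the stop rows for a real month come out roughly evenly spread, that would fit the "relays just come and go" explanation. A spike at one row would point more towards a restart or hibernation effect.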