David Goulet:
Hello everyone!
Since July 2017, there has been a steady decline in relays from ~7k to now ~6.5k. This is a bit unusual: we don't often see relays going offline at such a steady rate (at least not that I can remember...).
It could certainly be something normal. However, we shouldn't rule out a bug in tor either. The steadiness of the decline makes me a bit more worried than usual.
You can see that the decline started around July 2017:
https://metrics.torproject.org/networksize.html?start=2017-06-01&end=201...
What happened around July in terms of tor releases:
2017-06-08 09:35:17 -0400 802d30d9b7 (tag: tor-0.3.0.8)
2017-06-08 09:47:44 -0400 e14006a545 (tag: tor-0.2.5.14)
2017-06-08 09:47:58 -0400 aa89500225 (tag: tor-0.2.9.11)
2017-06-08 09:55:28 -0400 f833164576 (tag: tor-0.2.4.29)
2017-06-08 09:55:58 -0400 21a9e5371d (tag: tor-0.2.6.12)
2017-06-08 09:56:15 -0400 3db01d3b56 (tag: tor-0.2.7.8)
2017-06-08 09:58:36 -0400 64ac28ef5d (tag: tor-0.2.8.14)
2017-06-08 10:15:41 -0400 dc47d936d4 (tag: tor-0.3.1.3-alpha)
...
2017-06-29 16:56:13 -0400 fab91a290d (tag: tor-0.3.1.4-alpha)
2017-06-29 17:03:23 -0400 22b3bf094e (tag: tor-0.3.0.9)
...
2017-08-01 11:33:36 -0400 83389502ee (tag: tor-0.3.1.5-alpha)
2017-08-02 11:50:57 -0400 c33db290a9 (tag: tor-0.3.0.10)
Note that on August 1st 2017, 0.2.4, 0.2.6 and 0.2.7 went end of life.
That being said, I don't have an easy way to list which relays went offline during the decline (since July basically) to see if a common pattern emerges.
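To make the idea concrete, here is a rough Python sketch of what such a listing could look like. It assumes two snapshots of the relay list have already been parsed into dicts keyed by fingerprint; the parsing itself, the field names ("version", "platform") and the sample data are all invented for illustration and are not taken from any existing tool.

```python
from collections import Counter

def vanished_relays(before, after):
    """Return the sorted fingerprints present in `before` but missing from `after`."""
    return sorted(set(before) - set(after))

def tally_by(before, gone, key):
    """Count the vanished relays by one attribute (e.g. version) as seen in `before`."""
    return Counter(before[fp].get(key, "unknown") for fp in gone)

# Made-up sample data; real input would come from parsed consensus documents.
before = {
    "AAAA": {"version": "0.2.7.8", "platform": "Linux"},
    "BBBB": {"version": "0.3.0.9", "platform": "FreeBSD"},
    "CCCC": {"version": "0.2.7.8", "platform": "Linux"},
}
after = {
    "BBBB": {"version": "0.3.0.10", "platform": "FreeBSD"},
}

gone = vanished_relays(before, after)
by_version = tally_by(before, gone, "version")
print(gone)
print(by_version.most_common())
```

If one version or platform dominates the tally far beyond its share of the network, that would be a pattern worth digging into.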
So, a few things. First, if anyone on this list noticed that their relay dropped out of the consensus while tor was still running, now is a good time to tell this thread :).
Second, does anyone have an idea of what might be going on, that is, one or more theories? Even better, if you have some tooling to list which relays went offline, that would be _awesome_.
Third, knowing the state of packaging in Debian/Redhat/Ubuntu/... around July could be useful. What if a package in distro X is broken and the updates have been killing the relays? Or something like that...
Last, looking at the dirauths would be a good idea. Basically, when did the majority switch to 030 and then 031? Starting in July, what was the state of the dirauth versions?
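For the dirauth question, a minimal Python sketch of the check could be the following. It assumes someone has already collected dated snapshots of which version each dirauth was running; the function name, the dirauth names, the dates and the versions below are placeholders, not real observations.

```python
def first_majority_date(snapshots, series):
    """Return the first date on which more than half of the listed dirauths
    ran a version in `series` (e.g. "0.3.0"), or None if that never happened.

    `snapshots` is a list of (date, {dirauth_name: version}) sorted by date.
    """
    prefix = series + "."
    for date, versions in snapshots:
        running = sum(1 for v in versions.values() if v.startswith(prefix))
        if running * 2 > len(versions):  # strict majority
            return date
    return None

# Placeholder data for illustration only.
snapshots = [
    ("2017-06-20", {"authA": "0.2.9.11", "authB": "0.2.9.11", "authC": "0.3.0.8"}),
    ("2017-07-05", {"authA": "0.3.0.9", "authB": "0.2.9.11", "authC": "0.3.0.9"}),
    ("2017-08-03", {"authA": "0.3.1.5-alpha", "authB": "0.3.0.10", "authC": "0.3.1.5-alpha"}),
]

print(first_majority_date(snapshots, "0.3.0"))
print(first_majority_date(snapshots, "0.3.1"))
```

Lining those dates up against the start of the decline would show whether a dirauth version switch is a plausible trigger.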
Any help is very welcome! Again, this decline could be from natural causes, but for now I just don't want to rule out an issue in tor or packaging.
(Replying to OP since it went OT)
As some of you know, TDP did a little suite of shell scripts based on OONI data to look at diversity statistics:
https://torbsd.github.io/oostats.html
With the source here for further tinkering:
https://github.com/torbsd/tdp-onion-stats/
Maybe something we could look at is "exception reports", which in some industries means regular reports that look at anomalies or "exceptions" which display out-of-the-ordinary statistics, generally prompting some sort of action.
In other words, daily reports would be run on, say, bw consensus by country, and if there was some statistically significant change over N periods of time, it would be noted. Or if a particular OS drops or jumps. Or if a particular AS jumps or declines for relays, bridges, whatever.
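A rough sketch of one such check, in Python rather than shell: it flags a daily value that sits more than a chosen number of standard deviations away from the mean of the preceding N days. The function name, the window, the threshold and the sample counts are all assumptions made up for illustration; nothing here is part of tdp-onion-stats.

```python
from statistics import mean, stdev

def exceptions(series, window=7, threshold=3.0):
    """Return (index, value, z) for each point that deviates from the
    trailing `window` values by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history: this sketch has no scale to judge against
        z = (series[i] - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, series[i], z))
    return flagged

# Invented daily relay counts for one country; the last day drops sharply.
daily_counts = [100, 102, 99, 101, 100, 98, 101, 100, 60]

flagged = exceptions(daily_counts)
for index, value, z in flagged:
    print(f"day {index}: value {value} (z = {z:.1f})")
```

Note that a check like this catches sudden jumps or drops. A slow, steady decline may never trip a short trailing window, so a longer window or a comparison against a fixed baseline would probably be needed for that case.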
If done right, a bunch of these reports could point to particular changes to the network that need further investigation, and in some cases, might quickly point to the related issue. E.g., countryX shut down an ISP with a particular AS number, etc.
The more reports coupled with careful optimization over time could become an alarm system for Tor network changes, instead of just "er, such-and-such distro didn't update their packages then, I just found out in git."
Thoughts?
g