Hi Christian and other Tor relay fans,
I'm looking for some volunteers, hopefully including Christian, to work on metrics and visualization of impact from new relays.
We're working with EFF to do another "Tor relay challenge" [*], to both help raise awareness of the value of Tor, and encourage many people to run relays -- probably non-exit relays for the most part, since that's the easiest for normal volunteers to step up and do.
You can read about the first round from several years ago here: https://www.eff.org/torchallenge
To make it succeed, the challenge for us here is to figure out what to measure to track progress, and then measure it and graph it for everybody.
I'm figuring that like last time, EFF will collect a list of fingerprints of relays that signed up "because of the challenge".
One of the main pushes we're aiming for this year is longevity: it's easy to sign up a relay for two weeks and then stop. We want to emphasize consistency and encourage having the relays up for many months.
So what are the things we'd want to track?
- Number of relays signed up that are Running, over time.
- Total bandwidth history of these running relays, over time.
- Maybe a graph showing the total number of bytes ever contributed by these relays? That would perhaps impress people.
- Total consensus weight of these running relays, over time.
- Something emphasizing duration -- e.g. the total consensus weight of the subset of the relays that have been in the consensus for 90% of the past month, 2 months, 6 months, etc. Are there better ideas here? I hope so. We'll want to be cognizant that if we're in the first week of the challenge, the 2 month graph will be empty and thus look sad.
- Something comparing the above numbers to the total numbers. Given how huge some of the relays are lately, it would be easy to visualize the new contribution as a tiny irrelevant fraction, which could be disheartening to new relay operators even if their relays will actually become a big deal with some patience. What are some strategies for making this work right? E.g. a layer graph showing y layered on top of x where y is the new contribution, rather than a percentage-of-total graph that shows approximately 0%.
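To make the duration idea a bit more concrete, here is a rough sketch of how that number could be computed. Everything in it is an assumption for illustration: the function names are made up, the input is assumed to be one {fingerprint: consensus weight} dict per consensus in the window (oldest first), "90%" is read as "present in at least 90% of the consensuses in the window", and the weight is taken from the most recent consensus. It is not tied to any existing metrics or Globe code.

```python
from typing import Dict, List, Set


def stable_relays(consensuses: List[Dict[str, int]],
                  challenge_fps: Set[str],
                  min_fraction: float = 0.9) -> Set[str]:
    """Return challenge relays present in at least min_fraction of the window.

    consensuses is a hypothetical input: one dict per consensus,
    mapping relay fingerprint -> consensus weight, oldest first.
    """
    total = len(consensuses)
    if total == 0:
        return set()
    # Count in how many consensuses each challenge relay appears.
    seen: Dict[str, int] = {}
    for consensus in consensuses:
        for fp in consensus:
            if fp in challenge_fps:
                seen[fp] = seen.get(fp, 0) + 1
    return {fp for fp, n in seen.items() if n / total >= min_fraction}


def stable_weight(consensuses: List[Dict[str, int]],
                  challenge_fps: Set[str],
                  min_fraction: float = 0.9) -> int:
    """Sum the latest consensus weight of the stable challenge relays."""
    if not consensuses:
        return 0
    stable = stable_relays(consensuses, challenge_fps, min_fraction)
    latest = consensuses[-1]
    # A stable relay missing from the latest consensus contributes 0.
    return sum(latest.get(fp, 0) for fp in stable)
```

Running this once per window length (1 month, 2 months, 6 months) would give one number per graph. A window longer than the challenge has been running simply yields 0, which is the "empty graph looks sad" case mentioned above.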
We could also imagine more niche categories. For example, if we're hoping to get people to sign up relays at universities, we could imagine that the folks running the challenge give us a list of fingerprints of relays that self-identify as being at universities, and then we do up the same set of graphs with that subset of relays.
So, Christian, others, how much of this is possible as-is or with some limited tweaking, with Globe and related scripts? I am hoping the answer is most of it. :) I'm also cc'ing Karsten because a lot of this overlaps with the metrics scripts, but I am expecting Karsten to push back against the idea of integrating these measurements more with the metrics project.
Any other ideas for what to measure to help people know whether their contribution is worthwhile?
[*] Please don't take this mail as any official announcement, or timeline, or any of that. At this point we need to collect people to help make this happen, not collect news stories.
Thanks!
--Roger