Metrics for assessing EFF's Tor relay challenge? - tor-relays

27 Mar 2014


Hi Christian, other Tor relay fans,
I'm looking for some volunteers, hopefully including Christian, to work
on metrics and visualization of impact from new relays.
We're working with EFF to do another "Tor relay challenge" [*], to both
help raise awareness of the value of Tor, and encourage many people to
run relays -- probably non-exit relays for the most part, since that's
the easiest for normal volunteers to step up and do.
You can read about the first round from several years ago here:
https://www.eff.org/torchallenge
To make it succeed, the challenge for us here is to figure out what to
measure to track progress, and then measure it and graph it for everybody.
I'm figuring that like last time, EFF will collect a list of fingerprints
of relays that signed up "because of the challenge".
One of the main pushes we're aiming for this year is longevity: it's
easy to sign up a relay for two weeks and then stop. We want to emphasize
consistency and encourage having the relays up for many months.
So what are the things we'd want to track?
- Number of relays signed up that are Running, over time.
- Total bandwidth history of these running relays, over time.
- Maybe a graph showing the total number of bytes ever contributed
  by these relays? That would impress people perhaps.
- Total consensus weight of these running relays, over time.
- Something emphasizing duration -- e.g. the total consensus weight of
  the subset of the relays that have been in the consensus for 90% of
  the past month, 2 months, 6 months, etc. I hope there are better
  ideas here. We'll want to be cognizant that if we're in the first week
  of the challenge, the 2 month graph will be empty and thus look sad.
- Something comparing the above numbers to the total numbers. Given how
  huge some of the relays are lately, it would be easy to visualize
  the new contribution as a tiny irrelevant fraction, which could be
  disheartening to new relay operators even if their relays will actually
  become a big deal with some patience. What are some strategies for
  making this work right? E.g. a layer graph showing y layered on top of
  x where y is the new contribution, rather than a percentage-of-total
  graph that shows approximately 0%.
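
As a rough sketch of the duration idea above: assume each relay is a
record with a consensus weight and the set of hourly consensus timestamps
it appeared in. This data layout, and the names uptime_fraction and
steady_weight, are hypothetical and not part of Globe or the metrics
scripts. The function sums the weight of the relays that were present in
at least 90% of the consensuses in a window. Clamping the window to the
start of the challenge is only one possible way to keep the longer-window
graphs from being empty in the first weeks; it is an assumption, not a
settled design.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)  # consensuses are published hourly


def uptime_fraction(seen, start, end):
    """Fraction of hourly consensuses in [start, end) that listed the relay.

    `seen` is a set of datetimes, one per consensus the relay appeared in
    (a hypothetical input format).
    """
    total = int((end - start) / HOUR)
    if total <= 0:
        return 0.0
    present = sum(1 for t in seen if start <= t < end)
    return present / total


def steady_weight(relays, now, window, challenge_start, threshold=0.9):
    """Total consensus weight of relays present in >= threshold of the window.

    The window start is clamped to challenge_start (an assumption), so a
    "2 month" figure computed in week one covers only that first week.
    """
    start = max(now - window, challenge_start)
    return sum(
        r["weight"]
        for r in relays
        if uptime_fraction(r["seen"], start, now) >= threshold
    )


# Made-up example data: one week into a challenge that started 1 April.
challenge_start = datetime(2014, 4, 1)
now = datetime(2014, 4, 8)
relays = [
    # present in all 168 hourly consensuses
    {"weight": 100, "seen": {challenge_start + i * HOUR for i in range(168)}},
    # dropped out after 100 hours (about 60%)
    {"weight": 50, "seen": {challenge_start + i * HOUR for i in range(100)}},
    # present in 160 of 168 (about 95%)
    {"weight": 20, "seen": {challenge_start + i * HOUR for i in range(160)}},
]

print(steady_weight(relays, now, timedelta(days=60), challenge_start))  # 120
```

Running the same function over a series of `now` values would give the
over-time line for each window length.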
We could also imagine more niche categories. For example, if we're hoping
to get people to sign up relays at universities, we could imagine that
the folks running the challenge give us a list of fingerprints of relays
that self-identify as being at universities, and then we do up the same
set of graphs with that subset of relays.
So, Christian, others, how much of this is possible as-is or with some
limited tweaking, with Globe and related scripts? I am hoping the answer
is most of it. :) I'm also cc'ing Karsten because a lot of this overlaps with
the metrics scripts, but I am expecting Karsten to push back against
the idea of integrating these measurements more with the metrics project.
Any other ideas for what to measure to help people know whether their
contribution is worthwhile?
[*] Please don't take this mail as any official announcement, or timeline,
or any of that. At this point we need to collect people to help make
this happen, not collect news stories.
Thanks!
--Roger