New subject: Metrics for assessing EFF's Tor relay challenge?

4 Apr 2014

Christian, Lukas, everyone,
I learned today that we should have something working in a week or two.
That's why I started hacking on this today and produced some code:
https://github.com/kloesing/challenger
Here are a few things I could use help with:
- Anybody want to help turn this script into a web app, possibly
using Flask?  See the first next step in README.md.
- Lukas, you announced OnionPy on tor-dev@ the other day.  Want to look
into the "Add local cache for ..." bullet points under "Next steps"?  Is
this something OnionPy could support?  Want to write the glue code?
- Christian, want to help write the graphing code that visualizes the
`combined-*.json` files produced by that tool?  The README.md suggests a
few possible graphs.
Thanks in advance!  You're all helping grow the Tor network!
Also replying to Christian's mail inline.
On 28/03/14 09:07, Christian wrote:
...
On 27.03.2014 16:25, Karsten Loesing wrote:
...
On 27/03/14 11:57, Roger Dingledine wrote:
...
Hi Christian, other tor relay fans,
I'm looking for some volunteers, hopefully including Christian, to work
on metrics and visualization of impact from new relays.
We're working with EFF to do another "Tor relay challenge" [*], to both
help raise awareness of the value of Tor, and encourage many people to
run relays -- probably non-exit relays for the most part, since that's
the easiest for normal volunteers to step up and do.
You can read about the first round from several years ago here:
https://www.eff.org/torchallenge
To make it succeed, the challenge for us here is to figure out what to
measure to track progress, and then measure it and graph it for everybody.
I'm figuring that like last time, EFF will collect a list of fingerprints
of relays that signed up "because of the challenge".
One of the main pushes we're aiming for this year is longevity: it's
easy to sign up a relay for two weeks and then stop. We want to emphasize
consistency and encourage having the relays up for many months.
Do you want the challenge application to simply provide some graphs or
give some sort of interactive dashboard (clientside JavaScript)?
You asked Roger, and I'm not Roger, but I'd say let's start with some
graphs.  We can always make it more interactive later.  Though I doubt
it will be necessary.
...
...
Before going through your list of things we'd want to track below, let's
first talk about our options to turn a list of fingerprints into fancy
graphs:

1. Write a new metrics-web module and put graphs on the metrics
website.  This means parsing relay descriptors and storing certain
per-relay statistics for all relays.  That gives us maximum flexibility
in the kinds of statistics, but is also most expensive in terms of
developer hours.  I don't want to do this.

2. Extend Globe to show details pages for multiple relays.  This
requires us to move to the server-based Globe-node, because the poor
browser shouldn't download graph data for all relays, but the server
should return a single graph for all relays.  It's also unclear if the
new graphs will be of general interest for Globe users, and if the rest
of the Globe details will be confusing to people interested in the relay
challenge.  Probably not a great idea, but I'm not sure.
I agree that Globe isn't the best place to display the challenge graphs.
Currently the only focus for Globe is to provide data for single relays
and bridges.
Imo it would be better if the challenge participants list adds links to
atlas, blutmagie and globe.
Agreed!
...
...

3. Extend Onionoo to return aggregate graph data for a given set of
fingerprints.  Seems useful.  But has the big disadvantage that Onionoo
would suddenly have to create responses dynamically.  I'm worried about
creating a new performance bottleneck there, and this is certainly not
possible with poor overloaded yatei.

4. Write a new little tool that fetches Onionoo documents once (or
twice) per day for all relays participating in the relay challenge and
that produces graph data.  That new tool could probably re-use some
Compass code for the backend and some Globe code for the frontend.
Graphs could be integrated directly into EFF's website.  This is
currently my favorite approach.
I like this idea.
Glad to hear!  I slightly moved away from the "fetches once or twice per
day" idea to a more elaborate approach.  But the general idea is still
the same.
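For illustration only, here is a minimal sketch of the "fetch Onionoo documents for a list of fingerprints and keep a local copy" idea.  It is not the code in the challenger repository.  The base URL, the `lookup` parameter, the one-file-per-fingerprint cache layout, and the names `onionoo_url` and `update_cache` are assumptions made for this example.  The local copy matters because Onionoo currently only serves relays that were running in the past 7 days.

```python
import json
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public Onionoo endpoint; adjust if you use a mirror.
ONIONOO_BASE = "https://onionoo.torproject.org"


def onionoo_url(doc_type, fingerprint):
    """Build a lookup URL for one relay, e.g. doc_type="uptime"."""
    return f"{ONIONOO_BASE}/{doc_type}?{urlencode({'lookup': fingerprint})}"


def fetch_json(url):
    """Download and parse one JSON document."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def update_cache(fingerprints, doc_type, cache_dir, fetch=fetch_json):
    """Refresh cached documents for all fingerprints Onionoo still knows.

    Relays that Onionoo no longer returns are skipped, so the last
    cached copy (hypothetical layout: <cache_dir>/<fingerprint>.json)
    stays in place.  Returns the fingerprints that were refreshed.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    updated = []
    for fingerprint in fingerprints:
        document = fetch(onionoo_url(doc_type, fingerprint))
        relays = document.get("relays", [])
        if not relays:
            continue  # not served anymore; keep whatever we cached earlier
        path = cache_dir / f"{fingerprint}.json"
        path.write_text(json.dumps(relays[0]))
        updated.append(fingerprint)
    return updated
```

Passing `fetch` as a parameter keeps the download step replaceable, for example by a batched request or by a client library, without touching the caching logic.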
...
...
Note for 2--4: Onionoo currently only gives out data for relays that
have been running in the past 7 days.  I'd have to extend it to give out
all data for a list of fingerprints, regardless of when relays were
running the last time.  That's 2--3 days of coding and testing for me.
It's also potentially creating a bottleneck, so we should first have a
replacement for yatei.
...
So what are the things we'd want to track?

Number of relays signed up that are Running, over time.

We can do something here with Onionoo's new uptime documents.
...

Total bandwidth history of these running relays, over time.

We can sum up data from bandwidth documents for this.
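Both of the items above come down to summing per-relay histories on a shared time axis.  A rough sketch follows, assuming each history is a dict with `first`, `interval`, `factor` and `values` keys as in Onionoo's graph history objects (please check the field names against the Onionoo protocol page), and assuming all histories use the same interval and are aligned to the same grid.  The names `expand_history` and `sum_histories` are made up for this example.  Summing fractional uptimes gives the average number of running relays per interval; summing bandwidth values gives total bytes per second.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format, UTC


def expand_history(history):
    """Turn one history object into {timestamp: denormalized value}.

    Missing data points (None) are left out rather than treated as zero.
    """
    first = datetime.strptime(history["first"], TIME_FORMAT)
    first = first.replace(tzinfo=timezone.utc)
    step = timedelta(seconds=history["interval"])
    return {
        first + index * step: value * history["factor"]
        for index, value in enumerate(history["values"])
        if value is not None
    }


def sum_histories(histories):
    """Add up several relays' histories, keyed by timestamp."""
    totals = defaultdict(float)
    for history in histories:
        for timestamp, value in expand_history(history).items():
            totals[timestamp] += value
    return dict(sorted(totals.items()))
```

The result maps each timestamp to the total across all signed-up relays and can be handed to whatever produces the graph.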
...

Maybe a graph showing the total number of bytes ever contributed
by these relays? That would impress people perhaps.

Sure, same data as above.
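A sketch of that "total bytes ever" number, assuming bandwidth history values are average bytes per second over each interval (my reading of the Onionoo bandwidth documents) and that the histories passed in do not overlap in time for the same relay.  The function names are hypothetical.

```python
def history_bytes(history):
    """Total bytes in one history: sum of rate * interval length.

    Missing data points (None) contribute nothing.
    """
    rate_sum = sum(
        value * history["factor"]
        for value in history["values"]
        if value is not None
    )
    return rate_sum * history["interval"]


def total_bytes_contributed(histories):
    """Add up the bytes of all given histories.

    Assumption: callers pass one non-overlapping history per relay and
    direction; mixing e.g. 1-week and 1-month histories of the same
    relay would count the overlapping period twice.
    """
    return sum(history_bytes(history) for history in histories)
```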
...

Total consensus weight of these running relays, over time.

We only have total consensus weight *fraction*, but yes.
...

Something emphasizing duration -- e.g. the total consensus weight of
the subset of the relays that have been in the consensus for 90% of
the past month, 2 months, 6 months, etc. Are there better ideas here
I hope? We'll want to be cognizant that if we're in the first week
of the challenge, the 2 month graph will be empty and thus look sad.

Not sure what the 90% part is for, but yes, graphs with total consensus
weight fraction are doable.
Regarding the sad-looking 2 month graph, we can easily define the date
when the challenge starts and not show graphs until they make sense.
Note that the current intervals for most data are 1 week, 1 month, 3
months, 1 year, and 5 years.
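One possible reading of the "in the consensus for 90% of the past month" idea, as a sketch: keep only relays whose mean uptime over the period is at least 0.9 and sum their consensus weight fractions.  The input layout (`uptime_1_month`, `consensus_weight_fraction` per relay) is a hypothetical combined record invented for this example, and counting missing data points as downtime is an assumption.

```python
def mean_uptime(history):
    """Average fractional uptime of one history object.

    Missing data points (None) are counted as downtime.
    """
    values = history["values"]
    if not values:
        return 0.0
    return sum((value or 0) * history["factor"] for value in values) / len(values)


def long_running_weight(relays, min_uptime=0.9):
    """Sum consensus weight fractions of relays with enough uptime."""
    total = 0.0
    for relay in relays:
        if mean_uptime(relay["uptime_1_month"]) >= min_uptime:
            total += relay["consensus_weight_fraction"]
    return total
```

Running this once per interval (1 month, 3 months, and so on) would give one number per duration bucket.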
...

Something comparing the above numbers to the total numbers. Given how
huge some of the relays are lately, it would be easy to visualize
the new contribution as a tiny irrelevant fraction, which could be
disheartening to new relay operators even if their relays will actually
become a big deal with some patience. What are some strategies for
making this work right? E.g. a layer graph showing y layered on top of
x where y is the new contribution, rather than a percentage-of-total
graph that shows approximately 0%.

Absolute contributions to consensus weight are not available, just
relative fractions.
...
We could also imagine more niche categories. For example, if we're hoping
to get people to sign up relays at universities, we could imagine that
the folks running the challenge give us a list of fingerprints of relays
that self-identify as being at universities, and then we do up the same
set of graphs with that subset of relays.
Sure, that's doable.
...
So, Christian, others, how much of this is possible as-is or with some
limited tweaking, with Globe and related scripts? 
is most of it. :) I also cc Karsten because a lot of this overlaps with
the metrics scripts, but I am expecting Karsten to push back against
the idea of integrating these measurements more with the metrics project.
Right, adding this to the metrics website is not a good idea, because
then we'd have to parse raw relay descriptors.
Somebody else to include here is Sreenatha who has done a pretty good
job processing Onionoo data for the t-shirt yes/no ticket #9889.
...
Any other ideas for what to measure to help people know whether their
contribution is being worthwhile?
Not yet, but new ideas may arise when we start working on the code.
...
[*] Please don't take this mail as any official announcement, or timeline,
or any of that. At this point we need to collect people to help make
this happen, not collect news stories.
What's the timeline for this?  This requires some non-trivial coding
time, and I'm not sure how to prioritize this over existing things on my
todo list.
All the best,
Karsten
(Found nothing else to comment on.)
Thanks!
All the best,
Karsten

Re: [tor-relays] Metrics for assessing EFF's Tor relay challenge?