Hello devs,
the Tor Metrics website [0] claims to be "the primary place to learn interesting facts about the Tor network" and invites its visitors who "come across something that is missing" to contact the website authors about it. That's a bold statement I put there! :)
Yet, there's a considerable product backlog of possible enhancements [1] that never seems to get shorter. Even worse, the backlog can be expected to refill quickly once the community notices that feature requests are suddenly being considered. The main reason for this unfortunate situation is that Tor Metrics contains many moving parts, including some heavy database lifting below the surface, that all need to be maintained. Adding more parts just makes the whole thing even more likely to break. At the same time, it's painful to know that Tor Metrics has become almost closed to contributions.
This post discusses possible solutions. The goal is to let Tor Metrics grow in a healthy fashion that encourages contributions from the community. These solutions are not mutually exclusive, and the best solution may combine parts of more than one of the solutions sketched out here.
1 Make Tor Metrics better and bigger, internally
The obvious solution is that the maintainers of Tor Metrics could just work harder to overcome the problems stated above. Let's think this through.
1.1 Add more development resources
If only the current Tor Metrics maintainers had more time to devote to cleaning up existing parts and adding new ones, that would solve our problem. They could refactor parts that are hard to maintain, and they could work off the serious backlog that has piled up. Of course, this means dropping or handing over responsibilities for other products, and it may mean finding (and paying) new developers to help maintain Tor Metrics. It's unclear whether anything like this would fit into Tor's budget, and whether these changed priorities would upset users of the tools that had to be dropped or handed over.
1.2 Rewrite internal parts of Tor Metrics to encourage external contributions
Most of Tor Metrics would have run 10 or 15 years ago with only minor modifications. It's not necessarily a bad thing to use established technologies. But maybe, if we rewrite it using modern data-processing, web, and visualization frameworks, it becomes more attractive for other developers to contribute code and help maintain existing (well, then rewritten) code. The result would be a larger Tor Metrics website that is easier to maintain and hopefully maintained by more people. It's unclear how realistic this plan is, though, and it requires attention from the Tor Metrics maintainers to get it into good enough shape for external contributors to get involved.
2 Add more ways to contribute to Tor Metrics externally
It may be possible to further grow Tor Metrics without adding more code to it, and hence without making it any harder to maintain. However, if the code that generates visualizations is run elsewhere, there's a certain risk that results are not perceived to be as trustworthy as if that code were run as part of Metrics. This is primarily a matter of setting user expectations right. We could add different ways of contributing to Tor Metrics, depending on the level of commitment that contributors are willing to make. Possible new ways (in addition to filing a Trac ticket, which is already possible, though not very effective) are:
2.1 Accept contribution of static data or static graphs
Somebody might contribute data (in a tarball, download link, etc.) or a static graph (static as in "doesn't break, ever", not "static HTML with a tiny amount of JavaScript that will surely never break"). The Tor Metrics team reviews that and puts it on the Tor Metrics website, together with a short description, author information, license, etc. There are plenty of visualizations on Trac and on the mailing lists, so we'll have to define criteria for what we add and what we don't, and we'll need a good process for making that happen.
2.2 Link to external websites
Somebody might write a website that visualizes Tor network data. The Tor Metrics team reviews the idea behind it, but doesn't necessarily look at its code, and adds an external link to Tor Metrics. It's obvious to users that the authors remain responsible for their visualization, so there's no risk involved for Tor Metrics, but users may not trust it as much, because it doesn't have the Tor Metrics label. Note that we're already taking this approach by linking to the visualizations showing "Tor users as percentage of larger Internet population" [2] and "Data flow in the Tor network" [3]. Also note that we could just as well have hosted the former directly on Tor Metrics with appropriate attribution, because it's a static image. This is not the case with the latter.
2.3 Run an externally developed website as if it were part of Tor Metrics
Let's imagine that somebody produces a visualization of Tor network data and would like to make it part of Tor Metrics, but without limiting themselves to the technology used by Tor Metrics. We could let them write their visualization as a website and integrate it into Tor Metrics after reviewing its code.
Technically, part of this integration would be to "redress" the website by applying the Tor Metrics design (which has lots of room for improvement, but let's just say the result will look as seamlessly integrated into Tor Metrics as the "Network bubble graphs" [4]). Another part would probably be to rewrite web requests, so that users still think they're talking to https://metrics.torproject.org/, but really they're talking to another webserver behind that.
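To make the request-rewriting idea a bit more concrete, here is a minimal sketch in Python of how a front-end might map a public Tor Metrics URL to the webserver that actually serves it. Everything in it is an assumption made for illustration: the `/external-viz/` path prefix, the backend addresses, and the `rewrite` function are hypothetical and do not describe how Tor Metrics is set up. In practice this mapping would more likely live in the webserver's reverse-proxy configuration than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical routing table: public path prefix -> internal backend base URL.
# Neither the prefix nor the backend addresses exist on the real Tor Metrics.
BACKENDS = {
    "/external-viz/": "http://viz-backend.example:8080/",
}

# Hypothetical address of the Tor Metrics application itself.
DEFAULT_BACKEND = "http://localhost:8000/"


def rewrite(public_url):
    """Return the internal URL that should serve a request for public_url."""
    parts = urlsplit(public_url)
    path = parts.path or "/"
    for prefix, backend in BACKENDS.items():
        if path.startswith(prefix):
            base = urlsplit(backend)
            # Strip the public prefix and append the rest to the backend path.
            new_path = base.path + path[len(prefix):]
            return urlunsplit(
                (base.scheme, base.netloc, new_path, parts.query, ""))
    # Anything else keeps its path and goes to the main application.
    base = urlsplit(DEFAULT_BACKEND)
    return urlunsplit((base.scheme, base.netloc, path, parts.query, ""))
```

With this table, a request for https://metrics.torproject.org/external-viz/index.html would be answered by the hypothetical backend, while the address in the user's browser never changes. That is exactly why the review and hosting questions discussed next matter.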
Regarding hosting and maintenance, in theory, the website could be hosted by the original creators, but that effectively means that the Tor Metrics team gives up part of its control over what's on the Tor Metrics website. The creators of the external website could change parts or add new parts that wouldn't be reviewed by Tor Metrics developers, but that would still be perceived as part of Metrics, which seems bad. Alternatively, the Tor Metrics team could run the externally developed website on a separate host or on the same host as Tor Metrics. We could imagine variants where the original creator stays around to fix any issues as they come up, or where they donate their visualization, which the Tor Metrics people will then maintain. We could even imagine that the Tor Metrics maintainers some day decide to integrate the originally external website into Tor Metrics proper, but that would not be required for this model to work.
All these ideas require writing down guidelines, criteria, and processes. In particular, they require more thought and input from other people who are not currently involved in Tor Metrics maintenance and who can be expected to be more objective. And once these ideas are implemented, we'll need more than just one Tor Metrics maintainer.
What are your thoughts?
All the best, Karsten
[0] https://metrics.torproject.org/
[1] https://trac.torproject.org/projects/tor/query?status=!closed&component=...
[2] https://metrics.torproject.org/oxford-anonymous-internet.html
[3] https://metrics.torproject.org/uncharted-data-flow.html
[4] https://metrics.torproject.org/bubbles.html