Hello devs,
I'm continuously tweaking the Metrics Portal [0] in an attempt to make it more useful. My latest idea is to finally spin off the Directory Archive part from it, which is the part that serves descriptor tarballs. I'd like to hear what people think about that.
Let me give you some more context. The Metrics Portal serves three main purposes:
1. Graphs [1]: there are graphs on network size, network diversity, user number estimates, and performance measurements. This is probably what most visitors are interested in. In addition to graphs, there are also tables and .csv files for download.
2. Research [2]: we offer descriptor tarballs for download and explain the data formats. This is mostly interesting for researchers and developers. (This is the part that I'd like to spin off and move to a separate place.)
3. Status [3]: we provided (and still provide) services that are based on archived descriptors. This includes ExoneraTor and Consensus Health, both of which have moved to their own sites, and it includes Relay Search which is high on the list of endangered services. While we (did) provide these services on the Metrics Portal, they're almost gone.
So, my plan is to move the Research part, number 2 above, to a separate website and re-organize the remaining Graphs part to address visitors who are not necessarily researchers or developers. The new Directory Archive website would contain the following content:
- Start page: possibly re-using parts from the current start page [0].
- Data: How to obtain the data, possibly re-using parts from the current Data page [4], but without separate pages for file lists.
- Formats: similar to the current Data Formats page [5].
- Tools: similar to the current Tools page [6].
What would be a good name for the new website holding the Tor Directory Archive? How about:
- CollecTor, collector.torproject.org (not available yet) or
- AggregaTor, aggregator.torproject.org (not available yet)
On the software side, I'd like to remove all dynamic (Java) parts from the new website and have it served by Apache alone instead of Tomcat. The only parts that still need to be dynamically generated would be file lists, and I'd like to solve that by using Apache directory listings or some other Apache module.
The re-organized Metrics Portal is going to have the following content:
- Start page: possibly re-using parts from the current start page [0].
- Graphs: all four sub pages from the current Graphs page [1], so Network, Bubbles, Users, and Performance.
- Aggregated data: the processed data behind graphs that is currently available on the Statistics page [7].
Speaking of, if anybody wants to help design the new website (or help re-design the existing Metrics Portal website once the cruft is gone), your help would be much appreciated. Bonus points if no JavaScript is required, at least for the Directory Archive website. Please contact me if you're interested.
Any feedback welcome! Thanks!
All the best, Karsten
[0] https://metrics.torproject.org/
[1] https://metrics.torproject.org/graphs.html
[2] https://metrics.torproject.org/research.html
[3] https://metrics.torproject.org/status.html
[4] https://metrics.torproject.org/data.html
[5] https://metrics.torproject.org/formats.html
[6] https://metrics.torproject.org/tools.html
[7] https://metrics.torproject.org/stats.html
On 25/05/14 10:35, Karsten Loesing wrote:
I'm continuously tweaking the Metrics Portal [0] in the attempt to make it more useful. My latest idea is to finally spin off the Directory Archive part from it, which is the part that serves descriptor tarballs.
Ta-da! ===> https://collector.torproject.org/ <=== New website!
Here's what changed compared to providing directory archive data on the metrics website:
- Archive tarballs are now provided in a directory structure rather than a single directory: https://collector.torproject.org/archive/
- Recently published descriptors can now be accessed much more easily: https://collector.torproject.org/recent/
- Updated documentation of descriptor formats: https://collector.torproject.org/formats.html
- Please note that descriptors via rsync, both metrics-archive and metrics-recent, will no longer be available, because everything is now available via https. Grace period runs until August 4, 2014.
- Preliminary logo suggested by Jeroen and very quickly put together: https://people.torproject.org/~karsten/volatile/collector-logo.png -- if you're a graphic designer and want to contribute one hour of your time to design that for real, please contact me!
Thanks to Damian, Jeroen, and Lunar for doing a first quick review round. Further feedback is much appreciated.
And now, it's finally time to clean up the metrics website...
All the best, Karsten
On Wed, Jun 04, 2014 at 04:54:03PM +0200, Karsten Loesing wrote:
On 25/05/14 10:35, Karsten Loesing wrote:
I'm continuously tweaking the Metrics Portal [0] in the attempt to make it more useful. My latest idea is to finally spin off the Directory Archive part from it, which is the part that serves descriptor tarballs.
Ta-da! ===> https://collector.torproject.org/ <=== New website!
Looks great!
I added the service to: https://trac.torproject.org/projects/tor/wiki/org/operations/Infrastructure
- Recently published descriptors can now be accessed much more easily:
That's a very useful feature.
- Preliminary logo suggested by Jeroen and very quickly put together:
https://people.torproject.org/~karsten/volatile/collector-logo.png -- if you're a graphic designer and want to contribute one hour of your time to design that for real, please contact me!
Hmm, that seems to be the octopus which is part of USA-247's logo: http://en.wikipedia.org/wiki/USA-247
Hopefully, somebody can contribute a better one.
Cheers, Philipp
On Fri, Jun 6, 2014 at 1:18 PM, Philipp Winter phw@nymity.ch wrote:
On Wed, Jun 04, 2014 at 04:54:03PM +0200, Karsten Loesing wrote:
On 25/05/14 10:35, Karsten Loesing wrote:
I'm continuously tweaking the Metrics Portal [0] in the attempt to make it more useful. My latest idea is to finally spin off the Directory Archive part from it, which is the part that serves descriptor tarballs.
Ta-da! ===> https://collector.torproject.org/ <=== New website!
Looks great!
Seconded - very awesome indeed!
I added the service to: https://trac.torproject.org/projects/tor/wiki/org/operations/Infrastructure
- Recently published descriptors can now be accessed much more easily:
That's a very useful feature.
Am I right to assume that any service/program/client that relied on metrics "rsync the recent/ folder" feature should migrate to using https://collector.torproject.org/recent/ ?
One thing that's neat with rsync is that it can take care of any lapses in service (on either the metrics data backend side, or on the client-which-is-downloading-the-data side) - it will just automagically mirror all the consensuses (if this is needed by the client/program/etc.)
Of course, it's very easy to just make the client check if it has any lapses/holes in its (historical) view of the needed data, and to make it re-download (wget, whatever) the missing parts as needed.
Just wanted to make sure there'll be no rsync-recent-metrics-data service any more (correct me if I got this wrong.)
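The gap check described above could look roughly like the following minimal sketch. It assumes that consensuses are published once per hour and that files are named like "2014-06-06-12-00-00-consensus"; both the naming pattern and the function name missing_consensuses are assumptions made for illustration, not something confirmed in this thread.

```python
from datetime import datetime, timedelta

# Assumption: one consensus per hour, stored under a name such as
# "2014-06-06-12-00-00-consensus". Adjust the pattern if the real
# file names differ.
NAME_FORMAT = "%Y-%m-%d-%H-%M-%S-consensus"


def missing_consensuses(have, first, last):
    """Return the expected hourly file names between first and last
    (both inclusive) that are not contained in the set `have`."""
    missing = []
    current = first
    while current <= last:
        name = current.strftime(NAME_FORMAT)
        if name not in have:
            missing.append(name)
        current += timedelta(hours=1)
    return missing


if __name__ == "__main__":
    # Hypothetical local state: the 11:00 and 13:00 consensuses are absent.
    local = {
        "2014-06-06-10-00-00-consensus",
        "2014-06-06-12-00-00-consensus",
    }
    for name in missing_consensuses(
        local, datetime(2014, 6, 6, 10), datetime(2014, 6, 6, 13)
    ):
        print("need to re-download:", name)
```

The client would then fetch each missing name over https (wget, or whatever it already uses) instead of relying on rsync to fill the holes.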
- Preliminary logo suggested by Jeroen and very quickly put together:
https://people.torproject.org/~karsten/volatile/collector-logo.png -- if you're a graphic designer and want to contribute one hour of your time to design that for real, please contact me!
Hmm, that seems to be the octopus which is part of USA-247's logo: http://en.wikipedia.org/wiki/USA-247
Quite sure this was some cheeky intended satire :) Really like the logo, btw ;)
Hopefully, somebody can contribute a better one.
Cheers, Philipp
Kostas
On 06/06/14 12:49, Kostas Jakeliunas wrote:
Am I right to assume that any service/program/client that relied on metrics "rsync the recent/ folder" feature should migrate to using https://collector.torproject.org/recent/ ?
One thing that's neat with rsync is that it can take care of any lapses in service (on either the metrics data backend side, or on the client-which-is-downloading-the-data side) - it will just automagically mirror all the consensuses (if this is needed by the client/program/etc.)
Of course, it's very easy to just make the client check if it has any lapses/holes in its (historical) view of the needed data, and to make it re-download (wget, whatever) the missing parts as needed.
Just wanted to make sure there'll be no rsync-recent-metrics-data service any more (correct me if i got this wrong.)
Yes, I'm planning to shut down the rsync service after August 4, 2014. I created a new file and put it into the root of both metrics-archive and metrics-recent:
NOTICE-THIS-SERVICE-WILL-BE-SHUT-DOWN-AFTER-AUGUST-4-2014.txt
""" Dear rsync user,
the rsync service on this host will be SHUT DOWN after August 4, 2014.
The contents in this directory are now available on the CollecTor homepage and will remain available after that date:
https://collector.torproject.org/
https://collector.torproject.org/archive/
https://collector.torproject.org/recent/
The reason for shutting down the rsync service is that every service we run causes maintenance costs and makes it harder to change things in the future. That's why we have to shut down services sometimes, even when people are still using them. In this case we think we can provide a good alternative by serving all files securely via https.
Sorry for any inconvenience caused by this!
Karsten Loesing, Tor Project """
I'm currently extending metrics-lib to fetch descriptors from CollecTor. Maybe Damian extends Stem to do the same (or accepts patches for that purpose).
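For clients that want to do something similar on their own, here is a rough sketch of how file names could be pulled out of an Apache-style directory listing page. This is not how metrics-lib or Stem do it; the class name ListingParser, the function list_files, and the filtering rules for sort links and parent-directory links are assumptions based on what typical Apache listings look like.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collects link targets that look like plain files in a listing."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Assumption: column-sort links start with "?", the parent
        # directory link is absolute or "..", and subdirectories end
        # with "/". Everything else is treated as a file name.
        if href.startswith(("?", "/", "..")) or href.endswith("/"):
            return
        self.names.append(href)


def list_files(listing_html):
    """Return the file names linked from a directory listing page."""
    parser = ListingParser()
    parser.feed(listing_html)
    return parser.names


if __name__ == "__main__":
    # Hypothetical listing snippet, for illustration only.
    sample = (
        '<a href="?C=N;O=D">Name</a>'
        '<a href="/recent/">Parent Directory</a>'
        '<a href="2014-06-06-12-00-00-consensus">consensus</a>'
    )
    print(list_files(sample))
```

The listing HTML itself would come from an https request to the directory in question; the resulting names can be compared against what the client already has on disk.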
I'm also planning to send out a reminder about the August 4 date four weeks before shutting down the service.
- Preliminary logo suggested by Jeroen and very quickly put together:
https://people.torproject.org/~karsten/volatile/collector-logo.png -- if you're a graphic designer and want to contribute one hour of your time to design that for real, please contact me!
Hmm, that seems to be the octopus which is part of USA-247's logo: http://en.wikipedia.org/wiki/USA-247
Quite sure this was some cheeky intended satire :) Really like the logo, btw ;)
I like it, too. :) But I'd understand if others think it's a bad idea.
All the best, Karsten
On 06/06/14 16:11, Karsten Loesing wrote:
On 06/06/14 12:49, Kostas Jakeliunas wrote:
Am I right to assume that any service/program/client that relied on metrics "rsync the recent/ folder" feature should migrate to using https://collector.torproject.org/recent/ ?
One thing that's neat with rsync is that it can take care of any lapses in service (on either the metrics data backend side, or on the client-which-is-downloading-the-data side) - it will just automagically mirror all the consensuses (if this is needed by the client/program/etc.)
Of course, it's very easy to just make the client check if it has any lapses/holes in its (historical) view of the needed data, and to make it re-download (wget, whatever) the missing parts as needed.
Just wanted to make sure there'll be no rsync-recent-metrics-data service any more (correct me if i got this wrong.)
Yes, I'm planning to shut down the rsync service after August 4, 2014. I created a new file and put it into the root of both metrics-archive and metrics-recent:
NOTICE-THIS-SERVICE-WILL-BE-SHUT-DOWN-AFTER-AUGUST-4-2014.txt
""" Dear rsync user,
the rsync service on this host will be SHUT DOWN after August 4, 2014.
The contents in this directory are now available on the CollecTor homepage and will remain available after that date:
https://collector.torproject.org/
https://collector.torproject.org/archive/
https://collector.torproject.org/recent/
The reason for shutting down the rsync service is that every service we run causes maintenance costs and makes it harder to change things in the future. That's why we have to shut down services sometimes, even when people are still using them. In this case we think we can provide a good alternative by serving all files securely via https.
Sorry for any inconvenience caused by this!
Karsten Loesing, Tor Project """
I'm currently extending metrics-lib to fetch descriptors from CollecTor. Maybe Damian extends Stem to do the same (or accepts patches for that purpose).
I'm also planning to send out a reminder about the August 4 date four weeks before shutting down the service.
Reminder: the rsync service running on metrics.torproject.org will be shut down after August 4, 2014.
All the best, Karsten
On 06/06/14 12:18, Philipp Winter wrote:
On Wed, Jun 04, 2014 at 04:54:03PM +0200, Karsten Loesing wrote:
On 25/05/14 10:35, Karsten Loesing wrote:
I'm continuously tweaking the Metrics Portal [0] in the attempt to make it more useful. My latest idea is to finally spin off the Directory Archive part from it, which is the part that serves descriptor tarballs.
Ta-da! ===> https://collector.torproject.org/ <=== New website!
Looks great!
I added the service to: https://trac.torproject.org/projects/tor/wiki/org/operations/Infrastructure
Thanks, forgot to do that.
- Recently published descriptors can now be accessed much more easily:
That's a very useful feature.
It is!
- Preliminary logo suggested by Jeroen and very quickly put together:
https://people.torproject.org/~karsten/volatile/collector-logo.png -- if you're a graphic designer and want to contribute one hour of your time to design that for real, please contact me!
Hmm, that seems to be the octopus which is part of USA-247's logo: http://en.wikipedia.org/wiki/USA-247
Hopefully, somebody can contribute a better one.
You don't like octopuses? :) That being said, I'm open to other suggestions. And not having any logo might be okay, too.
All the best, Karsten