[Replying on list with Damian's permission.]
On 06/06/14 17:21, Damian Johnson wrote:
>> I'm currently extending metrics-lib to fetch descriptors from CollecTor. Maybe Damian extends Stem to do the same (or accepts patches for that purpose).
> Probably obvious, but what's the advantage of this over fetching new descriptors from authorities and mirrors? What kind of CollecTor methods did you have in mind?
If you're only interested in new descriptors, then the authorities and mirrors are the place to go. But if you care about having the full history of descriptors, then you should go to CollecTor. Imagine that your tool breaks for a few hours. Afterwards, there's no way to get the missing data from the directory authorities or mirrors. For applications like ExoneraTor, where you need to store every single consensus, this can make all the difference. The same holds for Onionoo and the metrics website, though missing a few hours is less critical there.
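To make the "tool breaks for a few hours" case concrete, here is a minimal Python sketch of how a tool could work out which consensuses it missed and would need to backfill from CollecTor. It assumes consensuses have hourly valid-after times on the full hour; the function name and arguments are made up for illustration and are not part of metrics-lib or Stem.

```python
from datetime import datetime, timedelta


def missing_consensus_hours(stored, start, end):
    """Return hourly valid-after times in [start, end] that are not in `stored`.

    Hypothetical helper: `stored` is an iterable of datetime objects for the
    consensuses we already have; `start` and `end` bound the period to check.
    Assumes one consensus per hour, valid from the full hour.
    """
    have = set(stored)
    # Round `start` up to the next full hour if it is not already on one.
    current = start.replace(minute=0, second=0, microsecond=0)
    if current < start:
        current += timedelta(hours=1)
    missing = []
    while current <= end:
        if current not in have:
            missing.append(current)
        current += timedelta(hours=1)
    return missing


if __name__ == "__main__":
    # The tool was down between 11:00 and 14:00 and missed two consensuses.
    stored = [datetime(2014, 6, 6, h) for h in (10, 11, 14)]
    for gap in missing_consensus_hours(
            stored, datetime(2014, 6, 6, 10), datetime(2014, 6, 6, 14)):
        print("need to backfill consensus valid after", gap.isoformat())
```

The directory authorities and mirrors can't help with the hours this reports once they are far enough in the past; an archive can.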
Also, if you only care about relay descriptors, then you should be fine asking directory authorities and mirrors. But sanitized bridge descriptors, exit lists, bridge pool assignments, and Torperf measurements are only available from CollecTor.
Note that there's also a disadvantage to fetching from CollecTor: it's a single point of failure. Now, we could add mirrors, but it would be even better to set up a second CollecTor instance (ideally run by another person) in parallel to the current instance, and have the two exchange data. Future work, on my list.
I guess the above should be added to the CollecTor homepage. Do you have any suggestions for where you would have expected to read about this, or do you have further points to add? I'll add something next week.
All the best, Karsten