[Replying on list with Damian's permission.]
On 06/06/14 17:21, Damian Johnson wrote:
>> I'm currently extending metrics-lib to fetch descriptors from CollecTor.
>> Maybe Damian extends Stem to do the same (or accepts patches for that
>> purpose).
>
> Probably obvious, but what's the advantage of this over fetching new
> descriptors from authorities and mirrors? What kind of CollecTor
> methods did you have in mind?

If you're only interested in new descriptors, then the authorities and
mirrors are the place to go. But if you care about having the full
history of descriptors, then you should go to CollecTor. Imagine that
your tool breaks for a few hours. Afterwards, there's no way to get the
missing data from the directory authorities or mirrors. For
applications like ExoneraTor, where you need to store every single
consensus, this can make all the difference. The same holds for Onionoo
and the metrics website, though missing a few hours is less critical
there.
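
A minimal sketch of how a tool could notice such a gap, assuming one
consensus per hour and timestamps that fall exactly on the hour. The
helper name missing_consensus_hours and the sample data are hypothetical
and not part of metrics-lib, Stem, or CollecTor:

```python
from datetime import datetime, timedelta


def missing_consensus_hours(stored, start, end):
    """Return the hourly timestamps in [start, end] that are not in stored.

    Assumption: one consensus per hour, identified by its valid-after time.
    """
    have = set(stored)
    missing = []
    current = start
    while current <= end:
        if current not in have:
            missing.append(current)
        current += timedelta(hours=1)
    return missing


# Hypothetical local archive: the tool was down from 03:00 through 05:00.
stored = [datetime(2014, 6, 6, hour) for hour in (0, 1, 2, 6, 7)]
gaps = missing_consensus_hours(
    stored, datetime(2014, 6, 6, 0), datetime(2014, 6, 6, 7)
)
print([gap.strftime("%Y-%m-%d %H:00") for gap in gaps])
```

The hours it reports are the ones that would have to be backfilled from
CollecTor, since the authorities and mirrors no longer serve them.
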
Also, if you only care about relay descriptors, then you should be fine
asking directory authorities and mirrors. But sanitized bridge
descriptors, exit lists, bridge pool assignments, and Torperf
measurements are only available from CollecTor.

Note that there's also a disadvantage of fetching from CollecTor: it's a
single point of failure. Now, we could add mirrors, but it would be
even better to set up a second CollecTor instance (ideally run by
another person) in parallel to the current first instance, and have the
two exchange data. Future work, on my list.

I guess the above should be added to the CollecTor homepage. Do you
have any suggestions for where you would have expected to read about
this, or do you have further points to add? I'll add something next
week.

All the best,
Karsten