Hi Damian,
On 5/21/12 5:55 PM, Damian Johnson wrote:
Hi Karsten.
- Bridge network statuses contain a "published" line
Oh, I didn't realize that there was a consensus that included bridges. Mind explaining where they come from and what they're for?
The bridge authority generates a bridge network status that it copies to the BridgeDB host and to the metrics server twice per hour. The bridge network status contains (relay) flags like Running and Stable that BridgeDB uses to decide which bridges to give out. Metrics uses the bridge network status to graph the number of running bridges.
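As a rough illustration of the counting step, here is a minimal Python sketch. It assumes each bridge entry in the status carries an "s" line listing its flags, as in ordinary network statuses. The function name count_running_bridges is made up for this example and is not taken from the metrics or BridgeDB code.

```python
def count_running_bridges(status_text):
    """Count status entries whose "s" line includes the Running flag."""
    running = 0
    for line in status_text.splitlines():
        # An "s" line looks like: "s Fast Guard Running Stable Valid"
        if line == "s" or line.startswith("s "):
            flags = line.split()[1:]
            if "Running" in flags:
                running += 1
    return running
```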
Which category can I find these in on the metrics data page?
The bridge descriptor tarballs contain bridge network statuses, server descriptors, and extra-info descriptors. See:
https://metrics.torproject.org/data.html#bridgedesc
I haven't implemented network status entries yet so changes there aren't a concern, though it would be useful for me to have one as an example.
You'll find an example here:
https://metrics.torproject.org/formats.html#bridgedesc
(I'll also include an example of the suggested format below.)
Server descriptors and extra-info descriptors are stored under the SHA1 hashes of the descriptor identifiers of their non-scrubbed forms.
Stem provides its caller with the descriptor's path but doesn't try to do anything with it, so this isn't a concern.
Okay.
Server descriptors and extra-info descriptors contain a new "router-digest" line with the hex-formatted descriptor identifier.
Not following. Is this new 'router-digest' entry only in the bridge descriptors?
Yes, it would be only in bridge server descriptors and in bridge extra-info descriptors. For relay descriptors, you'd determine the descriptor identifier by calculating the SHA1 of "router [...]\nrouter-signature\n" or "extra-info [...]\nrouter-signature\n". This wouldn't be possible anymore with bridge descriptors, because we'd change some lines or line parts in the sanitizing process. Therefore the extra "router-digest" line.
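To make the difference concrete, here is a minimal Python sketch of both cases. It assumes the descriptor is available as a single string with "\n" line endings. The function names relay_descriptor_digest and bridge_descriptor_digest are hypothetical and not part of stem.

```python
import hashlib


def relay_descriptor_digest(text, first_keyword="router"):
    """SHA1 over the span from the first keyword through "router-signature\\n".

    Pass first_keyword="extra-info" for extra-info descriptors.
    """
    start_marker = first_keyword + " "
    end_marker = "\nrouter-signature\n"
    if text.startswith(start_marker):
        start = 0
    else:
        # Skip anything before the descriptor, e.g. an "@type" annotation.
        index = text.find("\n" + start_marker)
        if index == -1:
            raise ValueError("no '%s' line found" % first_keyword)
        start = index + 1
    end = text.find(end_marker, start)
    if end == -1:
        raise ValueError("no 'router-signature' line found")
    signed_part = text[start:end + len(end_marker)]
    return hashlib.sha1(signed_part.encode("ascii")).hexdigest().upper()


def bridge_descriptor_digest(text):
    """Read the identifier from the "router-digest" line of a sanitized descriptor."""
    for line in text.splitlines():
        if line.startswith("router-digest "):
            return line.split(" ", 1)[1].strip().upper()
    return None  # no router-digest line present
```

The bridge variant does no hashing at all, since the sanitized text no longer matches what was originally signed.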
Is it a bridge equivalent for a relay server descriptor's 'fingerprint' field?
No, the fingerprint is the identity key digest, whereas the descriptor identifier is the descriptor digest.
Again, an example of the new descriptors would be nice to have.
Sure. The bridge network status entry below references the server descriptor via AG/Za6N (base64) = 006FD96B (hex) which in turn references the extra-info descriptor via 068A2E28.
@type bridge-network-status 1.0
published 2012-04-16 11:37:05
[...]
r ididnteditheconfig Pp+Rv3MgCzkgeoaIx4uHnGaz0Yo AG/Za6Ned4Wmo7i3X+LiQ1oTvbQ 2012-04-15 22:04:22 10.32.143.78 40187 0
s Fast Guard Running Stable Valid
w Bandwidth=55
p reject 1-65535
[...]

@type bridge-server-descriptor 1.0
router ididnteditheconfig 10.32.143.78 40187 0 0
platform Tor 0.2.2.35 (git-73ff13ab3cc9570d) on Linux x86_64
opt protocols Link 1 2 Circuit 1
published 2012-04-15 22:04:22
opt fingerprint 3E9F 91BF 7320 0B39 207A 8688 C78B 879C 66B3 D18A
uptime 2
bandwidth 204800 204800 55794
opt extra-info-digest 068A2E28D4C934D9490303B7A645BA068DCA0504
opt hidden-service-dir
reject *:*
router-digest 006FD96BA35E7785A6A3B8B75FE2E2435A13BDB4

@type bridge-extra-info 1.0
extra-info ididnteditheconfig 3E9F91BF73200B39207A8688C78B879C66B3D18A
published 2012-04-15 22:04:22
[...]
router-digest 068A2E28D4C934D9490303B7A645BA068DCA0504
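The base64/hex correspondence in the example above can be checked with a few lines of Python. This is only a sketch: the helper name base64_to_hex is made up here, and it assumes the status entry uses standard base64 with the trailing "=" padding stripped.

```python
import base64


def base64_to_hex(value):
    """Convert an unpadded base64 digest from an "r" line to upper-case hex."""
    padded = value + "=" * (-len(value) % 4)  # restore the stripped padding
    return base64.b64decode(padded).hex().upper()


# Descriptor digest from the "r" line, matching the server descriptor's router-digest:
print(base64_to_hex("AG/Za6Ned4Wmo7i3X+LiQ1oTvbQ"))
# prints 006FD96BA35E7785A6A3B8B75FE2E2435A13BDB4

# Identity digest from the "r" line, matching the "opt fingerprint" line:
print(base64_to_hex("Pp+Rv3MgCzkgeoaIx4uHnGaz0Yo"))
# prints 3E9F91BF73200B39207A8688C78B879C66B3D18A
```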
Bridge nicknames (#5684) in all descriptor types
Minor tweak for the is_scrubbed() method, but that's all.
Great.
... and dirreq-* statistics lines (#5807) in extra-info descriptors are not sanitized anymore.
I didn't realize that bridge extrainfo descriptors _were_ sanitized. What section of the format page details the scrubbing for those?
Aha, good catch, that's not mentioned on the format page. Right now, dirreq-*, cell-*, and exit-* lines are completely removed. #5807 is about leaving dirreq-* lines in. I'll update the format page next week when the new tarballs are available.
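For illustration only, the line removal described here could look like the following Python sketch. It is not the actual sanitizer code; the function name strip_statistics_lines and its parameter are assumptions made for this example.

```python
def strip_statistics_lines(extra_info_text, removed_prefixes=("cell-", "exit-")):
    """Drop every line whose keyword starts with one of the given prefixes.

    The default mirrors the behavior proposed in #5807 (dirreq-* lines stay).
    Pass ("dirreq-", "cell-", "exit-") to mimic the current behavior.
    """
    kept = [
        line
        for line in extra_info_text.splitlines()
        if not line.startswith(tuple(removed_prefixes))
    ]
    return "".join(line + "\n" for line in kept)
```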
I've never tried running the stem parser over a bridge extrainfo descriptor, so again an example would be useful. :)
Plenty of examples available, e.g.,
https://metrics.torproject.org/data/bridge-descriptors-2012-05.tar.bz2
Thanks, Karsten