Over at https://bugs.torproject.org/9316, we are working on having BridgeDB export metrics. The patch is almost done and I deployed the work-in-progress code on BridgeDB, so we can take a look at the metrics and think of ways to improve them. The metrics format encodes the approximate number of requests per distribution mechanism per transport per country per success/fail. All numbers are rounded up to the next multiple of 10. The last field, "none", will be used for an anomaly score and is currently unused.
For example, the line
bridgedb-metric-count email.obfs4.riseup.success.none 10
tells us that there have been 1-10 successful email requests for obfs4 coming from Riseup addresses.
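To make the format concrete, here is a minimal Python sketch of the rounding and of parsing a single line. It is not taken from the BridgeDB patch: the function names `round_up` and `parse_metric_line` and the field names in the returned dictionary are made up for illustration, and the sketch assumes that every key consists of exactly five dot-separated fields.

```python
def round_up(n, bin_size=10):
    """Round n up to the next multiple of bin_size (0 stays 0)."""
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-n // bin_size) * bin_size


def parse_metric_line(line):
    """Split one 'bridgedb-metric-count' line into its fields.

    Assumption: the key always has five dot-separated fields, as in
    the example line above.
    """
    keyword, key, count = line.split()
    if keyword != "bridgedb-metric-count":
        raise ValueError("unexpected keyword: %r" % keyword)
    distributor, transport, origin, outcome, anomaly = key.split(".")
    return {
        "distributor": distributor,
        "transport": transport,
        "origin": origin,  # country code, or email provider for email
        "outcome": outcome,
        "anomaly": anomaly,  # currently always "none"
        "count": int(count),
    }


metric = parse_metric_line(
    "bridgedb-metric-count email.obfs4.riseup.success.none 10")
```

Because of the rounding, a parsed count of 10 only says that the real number was somewhere between 1 and 10.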
I attached 24 hours' worth of metrics to this email. Keep the following issues in mind:
* My feature branch hasn't been reviewed yet and likely still has bugs, so take all numbers with a grain of salt.
* The country codes are based on Debian stretch's geoip-database, which is slightly outdated and uses MaxMind's far-from-perfect GeoLite database.
* The country code "??" refers to geo-location failure or lack of IP addresses (in the case of moat). The country code "zz" refers to a request from a Tor exit relay.
Some observations:
* Gmail sees much more use than Riseup. That's no surprise.
* The email distributor sees more vanilla than obfs4 requests. I wonder to what degree this is caused by the poor UX of the email distributor.
* For HTTPS, many countries have a fail and success bucket of 10 each. I would expect this to be at least one user who failed the captcha at least once before finally getting it right.
* The captcha success rate for obfs4 over moat is 54%. That's very low and must cause lots of frustration for users. This is a known issue that's tracked in this ticket: https://bugs.torproject.org/29695
* Ignore the large number of HTTPS requests from "zz" -- I expect the vast majority of these to be a bot that's interacting with BridgeDB over exit relays.
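A success rate like the one mentioned for moat can be estimated from the binned counts roughly as follows. This is only a sketch: the helper name `success_rates` is hypothetical, the sample lines are invented and not taken from the attached data, and it assumes the outcome field is literally "success" or "fail".

```python
from collections import defaultdict


def success_rates(lines):
    """Return {(distributor, transport): success fraction}."""
    totals = defaultdict(lambda: defaultdict(int))
    for line in lines:
        _keyword, key, count = line.split()
        distributor, transport, _origin, outcome, _anomaly = key.split(".")
        totals[(distributor, transport)][outcome] += int(count)

    rates = {}
    for pair, outcomes in totals.items():
        success = outcomes.get("success", 0)
        fail = outcomes.get("fail", 0)
        if success + fail > 0:
            rates[pair] = success / (success + fail)
    return rates


# Invented sample lines for illustration only.
sample = [
    "bridgedb-metric-count moat.obfs4.??.success.none 30",
    "bridgedb-metric-count moat.obfs4.??.fail.none 10",
]
rates = success_rates(sample)
```

Since every count is rounded up to the next multiple of 10, the resulting rate is only approximate, especially when it is summed over many small buckets.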
After a cursory look at the numbers, I would like to aggregate the data, to make it easier to compare distributors, transports, and countries. For example: how do moat, email, and HTTPS rank in popularity? I'll improve the patch to keep track of these numbers in separate metrics.
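Until the patch exports such aggregates itself, a rough ranking can be derived from the existing lines by summing over one field. The sketch below is an illustration, not part of the patch: `FIELDS` and `aggregate` are made-up names, the sample lines are invented, and the key is again assumed to have five dot-separated fields.

```python
from collections import Counter

# Assumed order of the dot-separated fields in a metric key.
FIELDS = ("distributor", "transport", "origin", "outcome", "anomaly")


def aggregate(lines, field):
    """Sum the binned counts per value of the given field."""
    index = FIELDS.index(field)
    totals = Counter()
    for line in lines:
        _keyword, key, count = line.split()
        totals[key.split(".")[index]] += int(count)
    return totals


# Invented sample lines for illustration only.
sample = [
    "bridgedb-metric-count moat.obfs4.??.success.none 30",
    "bridgedb-metric-count email.vanilla.gmail.success.none 10",
    "bridgedb-metric-count email.obfs4.riseup.success.none 10",
    "bridgedb-metric-count https.obfs4.zz.fail.none 10",
]
ranking = aggregate(sample, "distributor").most_common()
```

Note that summing already rounded buckets overestimates the true totals, because each bucket may be inflated by up to 9 requests. Counting the aggregates separately before rounding avoids that.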
Any thoughts or suggestions?
Cheers, Philipp