On 5/10/12 5:46 PM, Nick Mathewson wrote:
On Thu, May 10, 2012 at 4:31 AM, Karsten Loesing karsten@torproject.org wrote:
here is the proposal as discussed in #5807 to improve our bridge usage statistics.
This is now proposal 201. Thanks!
Thanks for adding the proposal!
However, I found that we may already have the data that I proposed to collect in that proposal. Here's why:
Some bridges report statistics that were originally designed for relays only, including statistics on directory requests. We throw these statistics away in the sanitizing process, so they don't end up in the tarballs. I reported this problem in #5824. But after thinking about it more, I concluded that we might as well use the data instead of starting to collect the very same thing in proposal 201.
I did a quick analysis of the April 2012 bridge descriptors to see whether the directory request statistics they contain are usable. The result is promising:
https://trac.torproject.org/projects/tor/ticket/5807#comment:5
I suggest doing the following four things:
1. Kill proposal 201. Oops.
2. Discuss whether or not it's safe to leave directory requests reported by bridges in the sanitized bridge descriptor tarballs. I think it's fine to leave them in, because they don't reveal any information that an adversary could use to locate bridges. Here's a sample bridge descriptor with directory request statistics (the dirreq-* lines):
http://freehaven.net/~karsten/volatile/d2549adbc83f5bdffd9bb8a5f525e23556ec2...
3. If there are no concerns about 2., publish bridge descriptor tarballs containing directory request statistics two weeks from now.
4. Resume counting directory requests by country on bridges like we do on relays. Right now, we only count total requests in "dirreq-v3-resp ok=304 [...]" lines, but we don't count requests by country, which is why the "dirreq-v3-reqs" lines are empty. This isn't problematic for the moment, because we can infer country distributions from "bridge-ips" lines. But we should start collecting by-country data, so that we can simply look at "dirreq-v3-reqs" lines in the future.
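
To make the inference in 4. concrete, here is a minimal Python sketch of one way it could work: if "dirreq-v3-reqs" is empty, split the "ok" total from "dirreq-v3-resp" across countries in proportion to the "bridge-ips" counts. The proportional split, the function names, and the sample values are all assumptions made for illustration; this is not necessarily how the analysis in #5807 was done.

```python
def parse_counts(value):
    """Parse 'k1=v1,k2=v2' into a dict of ints; an empty string gives {}."""
    counts = {}
    for item in value.split(","):
        if item:
            key, number = item.split("=", 1)
            counts[key] = int(number)
    return counts


def estimate_requests_by_country(descriptor_text):
    """Return per-country directory request estimates for one descriptor.

    Uses 'dirreq-v3-reqs' directly when it has data. Otherwise falls back
    to splitting the 'ok' total of 'dirreq-v3-resp' in proportion to the
    'bridge-ips' counts (an assumed heuristic, not a documented method).
    """
    wanted = ("dirreq-v3-resp", "dirreq-v3-reqs", "bridge-ips")
    parsed = {keyword: {} for keyword in wanted}
    for line in descriptor_text.splitlines():
        fields = line.split(None, 1)
        if fields and fields[0] in wanted:
            parsed[fields[0]] = parse_counts(fields[1].strip() if len(fields) > 1 else "")

    reqs = parsed["dirreq-v3-reqs"]
    if reqs:
        # By-country data is present, so no inference is needed.
        return {cc: float(n) for cc, n in reqs.items()}

    total_ok = parsed["dirreq-v3-resp"].get("ok", 0)
    ips = parsed["bridge-ips"]
    total_ips = sum(ips.values())
    if total_ips == 0:
        return {}
    return {cc: total_ok * n / total_ips for cc, n in ips.items()}


# Made-up sample lines; only "ok=304" is taken from the text above.
SAMPLE = """\
dirreq-v3-resp ok=304,not-found=0,busy=0
dirreq-v3-reqs
bridge-ips us=16,de=8,ir=8
"""

if __name__ == "__main__":
    print(estimate_requests_by_country(SAMPLE))
```

Once bridges report by-country counts again, the fallback branch is never taken and the "dirreq-v3-reqs" values are used as they are.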
Sorry for the chaos. But I'm actually quite excited that we have good data for estimating daily bridge users and don't have to start collecting them now in the hope that they'll be useful in a year or two. :)
Best, Karsten