In the past few months of bridge user graphs, there is an apparent negative correlation between obfs3 users and vanilla users: when one goes up, the other goes down. If you draw a horizontal line at about 5500, they are almost mirror images of each other. I don't see it with any other transport pairs. Any idea why it might be?
I can see what could cause a simultaneous decrease in vanilla and increase in obfs3: Tor gets blocked somewhere and users switch to obfs3. But I wouldn't expect blocking events to look so smooth or happen so frequently, and it doesn't explain why the reverse change happens later (obfs3 being blocked while Tor is unblocked is less plausible). I can also understand the overall long-term trend of obfs3 increasing and vanilla decreasing. But I don't see why they should mirror each other so closely over short time periods.
Some hypotheses:
1. There are lots of users who have a mix of vanilla and obfs3 bridges configured. Their tor (randomly?) chooses one of them, which usually works. The number of such users is constant over the short term; i.e. the sum of obfs3+vanilla is constant, but the proportion of obfs3 and vanilla fluctuates randomly.
2. Maybe vanilla-down/obfs3-up is caused by blocking events, and vanilla-up/obfs3-down is caused by natural new-user churn and/or coincidence.
3. There is something about the way BridgeDB hands out bridges, or the way in which users use it, that causes it to give out obfs3 bridges at the expense of vanilla and vice versa.
4. Some kind of feedback loop: obfs3 bridges get used and get congested, so users switch to vanilla, which then get used and congested, etc.
David Fifield
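The first hypothesis can be checked numerically. The sketch below is only an illustration under assumed numbers: the pool of 11000 users (twice the roughly 5500 midline mentioned above) and the 40-60% daily split are hypothetical, not taken from the metrics data. It shows that when the sum of the two series is constant, their Pearson correlation is -1, which is the "mirror image" effect.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical model of hypothesis 1: a fixed pool of users whose split
# between obfs3 and vanilla fluctuates at random from day to day.
rng = random.Random(0)
total = 11000  # assumed figure, not from the real data
obfs3 = [total * rng.uniform(0.4, 0.6) for _ in range(60)]
vanilla = [total - o for o in obfs3]

r = pearson(obfs3, vanilla)
print(round(r, 6))
```

A correlation this close to -1 in real data would suggest that users are being moved between the two categories rather than arriving or leaving.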
On Thursday, October 23, 2014 at 10:32 AM, David Fifield wrote:
Some hypotheses:
- There are lots of users who have a mix of vanilla and obfs3 bridges configured. Their tor (randomly?) chooses one of them, which usually works. The number of such users is constant over the short term; i.e. the sum of obfs3+vanilla is constant, but the proportion of obfs3 and vanilla fluctuates randomly.
- Maybe vanilla-down/obfs3-up is caused by blocking events, and vanilla-up/obfs3-down is caused by natural new-user churn and/or coincidence.
My guess is that there’s an order:
0. vanilla
1. obfs3
2. else
and it *always* starts from the top and tries going down the list until it finds one that works.
So when vanilla is unreachable it tries obfs3. At a later date, when vanilla is no longer blocked, it moves back.
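A minimal sketch of the ordered-fallback guess described here. This is not tor's actual bridge selection logic; the function name, the preference list, and the set of reachable transports are all hypothetical and exist only to make the guess concrete.

```python
def pick_transport(preference, reachable):
    """Return the first transport in `preference` that is reachable, else None."""
    for transport in preference:
        if transport in reachable:
            return transport
    return None

# Assumed order from the guess above; "other" stands in for "else".
ORDER = ["vanilla", "obfs3", "other"]

# Vanilla reachable: the client stays on vanilla.
print(pick_transport(ORDER, {"vanilla", "obfs3"}))
# Vanilla blocked: the client falls back to obfs3.
print(pick_transport(ORDER, {"obfs3"}))
```

Under this model a user flips from one series to the other whenever vanilla reachability changes, which would keep the sum of the two counts unchanged.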
- There is something about the way BridgeDB hands out bridges, or the way in which users use it, that causes it to give out obfs3 bridges at the expense of vanilla and vice versa.
- Some kind of feedback loop: obfs3 bridges get used and get congested, so users switch to vanilla, which then get used and congested, etc.
David Fifield
Attachments:
- userstats-bridge-transport-<OR>-obfs3-2014-08-01-2014-10-23.png
On Thu, Oct 23, 2014 at 10:32:41AM -0700, David Fifield wrote:
Some hypotheses:
- There are lots of users who have a mix of vanilla and obfs3 bridges configured. Their tor (randomly?) chooses one of them, which usually works. The number of such users is constant over the short term; i.e. the sum of obfs3+vanilla is constant, but the proportion of obfs3 and vanilla fluctuates randomly.
- Maybe vanilla-down/obfs3-up is caused by blocking events, and vanilla-up/obfs3-down is caused by natural new-user churn and/or coincidence.
- There is something about the way BridgeDB hands out bridges, or the way in which users use it, that causes it to give out obfs3 bridges at the expense of vanilla and vice versa.
- Some kind of feedback loop: obfs3 bridges get used and get congested, so users switch to vanilla, which then get used and congested, etc.
I meant to include a link to the source graph (where you can also experiment with adding other transports).
https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a...
David Fifield
On 23/10/14 19:32, David Fifield wrote:
In the past few months of bridge user graphs, there is an apparent negative correlation between obfs3 users and vanilla users: when one goes up, the other goes down. If you draw a horizontal line at about 5500, they are almost mirror images of each other. I don't see it with any other transport pairs. Any idea why it might be?
Hi David,
I briefly looked at the raw data behind this graph, but didn't find any obvious problems with the algorithm. I'm running out of time now, but I can share some preliminary results in case you want to dig deeper:
- https://people.torproject.org/~karsten/volatile/bridge-users-obfs3-or-mean.p... is the graph that you posted with a third line for mean values.
- https://people.torproject.org/~karsten/volatile/bridge-responses.csv.xz contains numbers of responses (for requested consensuses) by bridge, transport, and time interval.
- https://people.torproject.org/~karsten/volatile/bridge-responses-by-transpor... shows responses by transport.
- https://people.torproject.org/~karsten/volatile/bridge-responses-over-time.p... shows only <OR> and obfs3 responses over time.
- https://people.torproject.org/~karsten/volatile/bridge-responses-or-by-finge... shows only <OR> responses higher than 1000 by fingerprint.
- https://onionoo.torproject.org/details?fingerprint=231E2DE81DC4314F2035D2C0D... is the bridge reporting those high numbers for <OR> responses. Is PacificSunset maybe one of the bundled bridges?
That's all. Maybe it helps?
All the best, Karsten
On Sun, Oct 26, 2014 at 09:08:49AM +0100, Karsten Loesing wrote:
I briefly looked at the raw data behind this graph, but didn't find any obvious problems with the algorithm. I'm running out of time now, but I can share some preliminary results in case you want to dig deeper:
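- https://people.torproject.org/~karsten/volatile/bridge-users-obfs3-or-mean.p... is the graph that you posted with a third line for mean values.
- https://people.torproject.org/~karsten/volatile/bridge-responses.csv.xz contains numbers of responses (for requested consensuses) by bridge, transport, and time interval.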
I don't understand this file. Is it the number of times a bridge answered a directory request? The number of times a bridge appeared in a consensus? The number of times it was given out by BridgeDB?
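- https://people.torproject.org/~karsten/volatile/bridge-responses-by-transpor... shows responses by transport.
- https://people.torproject.org/~karsten/volatile/bridge-responses-over-time.p... shows only <OR> and obfs3 responses over time.
- https://people.torproject.org/~karsten/volatile/bridge-responses-or-by-finge... shows only <OR> responses higher than 1000 by fingerprint.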
I had trouble opening these in Evince, so I'm attaching PNGs I generated with
$ convert -density 150 x.pdf x.png
- https://onionoo.torproject.org/details?fingerprint=231E2DE81DC4314F2035D2C0D... is the bridge reporting those high numbers for <OR> responses. Is PacificSunset maybe one of the bundled bridges?
It is indeed:
pref("extensions.torlauncher.default_bridge.obfs3.5", "obfs3 208.79.90.242:35658 BA61757846841D64A83EA2514C766CB92F1FB41F");
I don't understand your line of reasoning in singling it out, though. What do high numbers for <OR> responses suggest to you?
David
On Fri, Oct 31, 2014 at 03:42:06PM -0700, David Fifield wrote:
On Sun, Oct 26, 2014 at 09:08:49AM +0100, Karsten Loesing wrote:
I briefly looked at the raw data behind this graph, but didn't find any obvious problems with the algorithm. I'm running out of time now, but I can share some preliminary results in case you want to dig deeper:
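- https://people.torproject.org/~karsten/volatile/bridge-users-obfs3-or-mean.p... is the graph that you posted with a third line for mean values.
- https://people.torproject.org/~karsten/volatile/bridge-responses.csv.xz contains numbers of responses (for requested consensuses) by bridge, transport, and time interval.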
I don't understand this file. Is it the number of times a bridge answered a directory request? The number of times a bridge appeared in a consensus? The number of times it was given out by BridgeDB?
Actually maybe I understand. It must be the number of times a bridge answered a directory request, à la https://research.torproject.org/techreports/counting-daily-bridge-users-2012....
- https://onionoo.torproject.org/details?fingerprint=231E2DE81DC4314F2035D2C0D... is the bridge reporting those high numbers for <OR> responses. Is PacificSunset maybe one of the bundled bridges?
It is indeed:
pref("extensions.torlauncher.default_bridge.obfs3.5", "obfs3 208.79.90.242:35658 BA61757846841D64A83EA2514C766CB92F1FB41F");
I don't understand your line of reasoning in singling it out, though. What do high numbers for <OR> responses suggest to you?
This might be the key to the mystery. It must be that PacificSunset, despite being an obfs3 bridge, doesn't have ExtORPort enabled, so all its obfs3 connections are being counted as <OR> connections.
$ grep 231E2DE81DC4314F2035D2C0D0D043A425FF8999 bridge-responses.csv
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-03 19:14:20,2014-08-04 00:00:00,21.4
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-04 00:00:00,2014-08-04 19:14:20,86.6
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-05 14:26:12,2014-08-06 00:00:00,3151.1
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-06 00:00:00,2014-08-07 00:00:00,8054.7
...
That would mean that when a user configures obfs3, their tor (randomly?) chooses one of the 7 default obfs3 bridges. When they happen to get PacificSunset, the obfs3 user count goes down and the <OR> count goes up by an equal amount, the sum remaining unchanged.
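The accounting effect described here can be sketched as a small simulation. Everything numeric below except the count of 7 bundled bridges is a hypothetical assumption (the user totals, the uniform random choice, the seed); the point is only that moving users from the obfs3 column to the vanilla (<OR>) column leaves the sum unchanged.

```python
import random

N_BUNDLED = 7         # default obfs3 bridges, the figure given in the thread
OBFS3_USERS = 7000    # hypothetical number of users of the bundled obfs3 bridges
VANILLA_USERS = 3000  # hypothetical number of genuine vanilla users

def reported_counts(rng):
    """Return (obfs3, vanilla) user counts as the statistics would report them."""
    # Assumption: each user independently lands on one of the bundled bridges
    # uniformly at random; bridge 0 stands in for the misreporting bridge.
    on_misreporting = sum(rng.randrange(N_BUNDLED) == 0 for _ in range(OBFS3_USERS))
    # Users on the misreporting bridge are counted as vanilla instead of obfs3.
    return OBFS3_USERS - on_misreporting, VANILLA_USERS + on_misreporting

rng = random.Random(2014)
days = [reported_counts(rng) for _ in range(30)]
for obfs3, vanilla in days[:5]:
    print(obfs3, vanilla, obfs3 + vanilla)
```

With purely uniform choice the day-to-day swings in this toy model are small; larger swings in the real graphs would need another driver, such as the misreporting bridge going up and down.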
Here are the bundled obfs3 bridges and their hashed fingerprints:
https://gitweb.torproject.org/builders/tor-browser-bundle.git/blob/a7f6793b9...
$ for a in A09D536DD1752D542E1FBB3C9CE4449D51298239 AF9F66B7B04F8FF6F32D455F05135250A16543C9 58D91C3A631F910F32E18A55441D5A0463BA66E2 BA61757846841D64A83EA2514C766CB92F1FB41F 1E05F577A0EC0213F971D81BF4D86A9E4E8229ED 4C331FA9B3D1D6D8FB0D8FBBF0C259C360D97E6A; do echo $a "->" $(python -c 'import hashlib; print hashlib.sha1("'$a'".decode("hex")).hexdigest().upper()'); done
A09D536DD1752D542E1FBB3C9CE4449D51298239 -> 3E0908F131AC417C48DDD835D78FB6887F4CD126
AF9F66B7B04F8FF6F32D455F05135250A16543C9 -> 6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C
58D91C3A631F910F32E18A55441D5A0463BA66E2 -> FAEABF422ECB91C1D96492B06DE2539EDD6BFB0E
BA61757846841D64A83EA2514C766CB92F1FB41F -> 231E2DE81DC4314F2035D2C0D0D043A425FF8999
1E05F577A0EC0213F971D81BF4D86A9E4E8229ED -> A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87
4C331FA9B3D1D6D8FB0D8FBBF0C259C360D97E6A -> 73D8FF840444F84EC50DD755FBAD44CF1F0DE28B
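The shell loop above relies on the Python 2 print statement and on str.decode("hex"), neither of which exists in Python 3. A rough Python 3 equivalent of the same computation (SHA-1 over the raw fingerprint bytes, printed as uppercase hex) might look like this; the function name is made up for illustration.

```python
import hashlib

def hashed_fingerprint(fingerprint_hex):
    """SHA-1 of the raw fingerprint bytes, returned as uppercase hex."""
    raw = bytes.fromhex(fingerprint_hex)  # 40 hex digits -> 20 raw bytes
    return hashlib.sha1(raw).hexdigest().upper()

# Fingerprint of the bundled bridge discussed in this thread; the thread
# reports its hashed form as 231E2DE81DC4314F2035D2C0D0D043A425FF8999.
print(hashed_fingerprint("BA61757846841D64A83EA2514C766CB92F1FB41F"))
```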
It looks like PacificSunset is the only one of the bundled bridges that has this bug (it reports <OR> and no other transports). The others report small numbers for <OR>, presumably because they get a few <OR> users from BridgeDB.
$ for a in A09D536DD1752D542E1FBB3C9CE4449D51298239 AF9F66B7B04F8FF6F32D455F05135250A16543C9 58D91C3A631F910F32E18A55441D5A0463BA66E2 BA61757846841D64A83EA2514C766CB92F1FB41F 1E05F577A0EC0213F971D81BF4D86A9E4E8229ED 4C331FA9B3D1D6D8FB0D8FBBF0C259C360D97E6A; do grep -i ^$(python -c 'import hashlib; print hashlib.sha1("'$a'".decode("hex")).hexdigest()') bridge-responses.csv; done | awk -F, '{print $1,$3}' | uniq
3E0908F131AC417C48DDD835D78FB6887F4CD126 obfs2
3E0908F131AC417C48DDD835D78FB6887F4CD126 obfs3
3E0908F131AC417C48DDD835D78FB6887F4CD126 obfs4
3E0908F131AC417C48DDD835D78FB6887F4CD126 <OR>
3E0908F131AC417C48DDD835D78FB6887F4CD126 scramblesuit
6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C obfs2
6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C obfs3
6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C <OR>
FAEABF422ECB91C1D96492B06DE2539EDD6BFB0E obfs3
FAEABF422ECB91C1D96492B06DE2539EDD6BFB0E <OR>
231E2DE81DC4314F2035D2C0D0D043A425FF8999 <OR>
A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87 obfs2
A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87 obfs3
A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87 <OR>
73D8FF840444F84EC50DD755FBAD44CF1F0DE28B obfs2
73D8FF840444F84EC50DD755FBAD44CF1F0DE28B obfs3
73D8FF840444F84EC50DD755FBAD44CF1F0DE28B <OR>
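The same grouping can be done in Python instead of grep, awk and uniq. The sketch below assumes the column layout suggested by the sample rows quoted earlier in this message (hashed fingerprint, the literal "responses", transport, interval start, interval end, value); that layout is inferred, not documented here. It runs on those four sample rows only, so that it is self-contained.

```python
import csv
import io
from collections import defaultdict

# The four PacificSunset rows quoted in this thread, used as inline sample data.
SAMPLE = """\
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-03 19:14:20,2014-08-04 00:00:00,21.4
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-04 00:00:00,2014-08-04 19:14:20,86.6
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-05 14:26:12,2014-08-06 00:00:00,3151.1
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-06 00:00:00,2014-08-07 00:00:00,8054.7
"""

def transports_by_bridge(lines):
    """Collect the set of transports and the summed responses per bridge."""
    seen = defaultdict(set)
    totals = defaultdict(float)
    # Assumed column order: fingerprint, metric, transport, start, end, value.
    for fingerprint, _metric, transport, _start, _end, value in csv.reader(lines):
        seen[fingerprint].add(transport)
        totals[(fingerprint, transport)] += float(value)
    return seen, totals

seen, totals = transports_by_bridge(io.StringIO(SAMPLE))
for fingerprint, transports in seen.items():
    print(fingerprint, sorted(transports))
```

To run it against the full file, one could pass an open file handle for the decompressed bridge-responses.csv in place of the io.StringIO object.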
I'll also note that PacificSunset is one of the problematic bridges listed at https://trac.torproject.org/projects/tor/ticket/13504#comment:2.
David Fifield
On 01/11/14 01:44, David Fifield wrote:
This might be the key to the mystery. It must be that PacificSunset, despite being an obfs3 bridge, doesn't have ExtORPort enabled, so all its obfs3 connections are being counted as <OR> connections.
Hi David,
sorry for not replying earlier and for not replying in more detail.
I just wanted to say that your analysis looks plausible to me. Fixing PacificSunset's ExtORPort might indeed fix the metrics graphs.
Thanks for hunting down this issue!
All the best, Karsten
On Wed, Nov 05, 2014 at 12:17:50PM +0100, Karsten Loesing wrote:
I just wanted to say that your analysis looks plausible to me. Fixing PacificSunset's ExtORPort might indeed fix the metrics graphs.
Here's one more graph on the subject. The shading in the background indicates when PacificSunset was running (blue) and not running (white). It looks like the periods of highest correlation come while it is running.
I got the uptime information from https://onionoo.torproject.org/uptime?search=pacificsunset.
David Fifield