In comparing the user graphs of pluggable transports, I found that there seems to be a correlation between the graphs of flashproxy and meek. They are not equal, but they seem to go up and down at the same time. Other transports don't show such an effect as far as I can tell. What could the cause be? I am trying to rule out a measurement error. Could it be an artifact of both transports running on the same bridge?
After the "How to use meek" blog post on August 15, there was a noticeable increase in the number of meek users, from about 5 to about 20 concurrent users:
https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a...
The blog post also seems to have had a positive effect on flashproxy and scramblesuit. I suspect this is because the post named other transports as well and suggested trying them first:
https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a... https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a...
On the other hand, I don't see any noticeable rise for obfs3 and fte:
https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a... https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a...
What's interesting is when we plot meek and flashproxy together (this is the first attached image). There seems to be a strong correspondence between the two. They both jump on August 15, as expected, but then they both jump again together on August 21. In early September they settle back down, but still mostly match each other's up/down patterns:
https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a...
What's really intriguing is that the apparent correlation goes back in time, even before August 15. Compare the shapes of the graphs during July, for example.
The correlation between meek and scramblesuit appears to be not as strong (this is the second attached image). There's a rise around August 15, but no big secondary spike, although the first part of September matches pretty well.
https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a...
There's no obvious positive correlation between meek and fte (this is the third attached image) nor meek and obfs3 (this is the fourth attached image).
https://metrics.torproject.org/users.html?graph=userstats-bridge-transport&a...
Why could this be? Some hypotheses:
* meek and flashproxy have similar capabilities (resistance to IP blocking), so maybe the circumstances where they work are similar. If one works, so does the other, so we expect them to more or less track one another.
* There is a bug in metrics collection, so that, for example, sometimes a meek connection is counted as flashproxy or vice versa. The backend bridge for both meek and flashproxy is the same; i.e., tor2.bamsoftware.com. (Before May 8, 2014, it was a different bridge, namely tor1, but both transports were still running on it together.)
* After my blog post, a bunch of people just picked transports at random until they found one that worked, which was likely one of {flashproxy, meek, scramblesuit}, and now we're just seeing the aggregation of those users' daily usage. That doesn't explain the match before August 15, though.
To figure this out I'm thinking of i) counting bytes transferred on the flashproxy and meek external ports, or ii) moving one to a different bridge (or different tor instance), to see if the effect remains. Do you have any other ideas?
David Fifield
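One way to put a number on the visual impression before changing anything on the bridge is to correlate the day-over-day changes of two transports' user estimates. The following is only a minimal sketch: the three series are made-up placeholder values, not data from metrics.torproject.org, and the helper names `pearson` and `daily_changes` are chosen here purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


def daily_changes(series):
    """Day-over-day differences, which removes a shared long-term trend."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical daily user estimates (placeholders, NOT real metrics data).
meek = [5, 6, 5, 7, 20, 22, 30, 28, 21, 19]
flashproxy = [40, 44, 41, 50, 90, 95, 120, 115, 92, 88]
fte = [12, 11, 13, 12, 12, 13, 11, 12, 13, 12]

r_meek_flashproxy = pearson(daily_changes(meek), daily_changes(flashproxy))
r_meek_fte = pearson(daily_changes(meek), daily_changes(fte))

print(f"meek vs flashproxy: r = {r_meek_flashproxy:.3f}")
print(f"meek vs fte:        r = {r_meek_fte:.3f}")
```

Correlating the changes rather than the raw counts keeps a shared upward trend (such as the bump after the blog post) from dominating the result. A high coefficient alone would not separate the hypotheses above; it only confirms that the up/down pattern is not an optical illusion.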
On 16/09/14 03:36, David Fifield wrote:
> In comparing the user graphs of pluggable transports, I found that there seems to be a correlation between the graphs of flashproxy and meek. [...]
> To figure this out I'm thinking of i) counting bytes transferred on the flashproxy and meek external ports, or ii) moving one to a different bridge (or different tor instance), to see if the effect remains. Do you have any other ideas?
Hi David,
here's what I think might cause this: we're counting consensuses downloaded from a bridge via any supported transport, and then we're attributing those downloads to specific transports based on what fraction of IPs connected per transport.
What we should do instead is count consensus downloads by transport. There's a ticket for this, but nobody is currently working on it:
https://trac.torproject.org/projects/tor/ticket/8786
Your idea ii) should fix this.
Of course, you'd be in a good position to test a patch for #8786. Would you want to hack on that?
All the best, Karsten
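The attribution Karsten describes can be sketched in a few lines. This is a simplified model of the explanation above, not the actual metrics code: the function name `attribute_by_ip_fraction` and all numbers are assumptions made for illustration.

```python
def attribute_by_ip_fraction(total_downloads, unique_ips_by_transport):
    """Split a bridge-wide download count across transports in proportion
    to the number of unique IPs seen per transport (simplified model)."""
    total_ips = sum(unique_ips_by_transport.values())
    if total_ips == 0:
        return {t: 0.0 for t in unique_ips_by_transport}
    return {
        t: total_downloads * n / total_ips
        for t, n in unique_ips_by_transport.items()
    }


# Hypothetical: the IP shares stay fixed while the bridge-wide number of
# consensus downloads varies from day to day.
unique_ips = {"flashproxy": 90, "meek": 10}
daily_totals = [100, 150, 300, 200]

estimates = [attribute_by_ip_fraction(t, unique_ips) for t in daily_totals]

for total, est in zip(daily_totals, estimates):
    print(total, est)
```

Because every transport's share is cut from the same bridge-wide total, a change in that total moves all per-transport estimates in the same direction, whichever transport actually caused it. Under this model, two transports on one bridge would be expected to track each other even if their real user counts were unrelated.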
On Tue, Sep 16, 2014 at 05:43:45AM +0200, Karsten Loesing wrote:
> On 16/09/14 03:36, David Fifield wrote:
>> In comparing the user graphs of pluggable transports, I found that there seems to be a correlation between the graphs of flashproxy and meek. [...]
>> To figure this out I'm thinking of i) counting bytes transferred on the flashproxy and meek external ports, or ii) moving one to a different bridge (or different tor instance), to see if the effect remains. Do you have any other ideas?
> Hi David,
> here's what I think might cause this: we're counting consensuses downloaded from a bridge via any supported transport, and then we're attributing those downloads to specific transports based on what fraction of IPs connected per transport.
I see! Thank you. I imagine it would make a big difference in this case, because flash proxy and meek are polar opposites: flash proxy gets connections from tons of random IPs (often different IPs for the same client), and meek is always getting connections from the same CDN edge servers (the same IP for many different clients). If I understand it right, we are over-counting flash proxy and over-counting meek.
> What we should do instead is count consensus downloads by transport. There's a ticket for this, but nobody is currently working on it:
> https://trac.torproject.org/projects/tor/ticket/8786
> Your idea ii) should fix this.
> Of course, you'd be in a good position to test a patch for #8786. Would you want to hack on that?
We'll see :) For the time being I'll try isolating the transports and see what effect it has.
David Fifield
On 16/09/14 06:07, David Fifield wrote:
> On Tue, Sep 16, 2014 at 05:43:45AM +0200, Karsten Loesing wrote:
>> On 16/09/14 03:36, David Fifield wrote:
>>> In comparing the user graphs of pluggable transports, I found that there seems to be a correlation between the graphs of flashproxy and meek. [...]
>>> To figure this out I'm thinking of i) counting bytes transferred on the flashproxy and meek external ports, or ii) moving one to a different bridge (or different tor instance), to see if the effect remains. Do you have any other ideas?
>> Hi David,
>> here's what I think might cause this: we're counting consensuses downloaded from a bridge via any supported transport, and then we're attributing those downloads to specific transports based on what fraction of IPs connected per transport.
> I see! Thank you. I imagine it would make a big difference in this case, because flash proxy and meek are polar opposites: flash proxy gets connections from tons of random IPs (often different IPs for the same client), and meek is always getting connections from the same CDN edge servers (the same IP for many different clients). If I understand it right, we are over-counting flash proxy and over-counting meek.
Under-counting meek, but yes.
>> What we should do instead is count consensus downloads by transport. There's a ticket for this, but nobody is currently working on it:
>> https://trac.torproject.org/projects/tor/ticket/8786
>> Your idea ii) should fix this.
>> Of course, you'd be in a good position to test a patch for #8786. Would you want to hack on that?
> We'll see :) For the time being I'll try isolating the transports and see what effect it has.
Please keep us posted how that works out.
All the best, Karsten
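The direction of the bias discussed in this message (flash proxy over-counted, meek under-counted) follows from the IP-fraction attribution once the two transports differ in how many IPs each client appears to come from. Below is a rough worked example under stated assumptions: all figures are invented, and both transports are assumed to have the same number of real clients and the same number of downloads per client.

```python
# Hypothetical numbers, chosen only to show the direction of the bias.
true_clients = {"flashproxy": 20, "meek": 20}

# Unique IPs the bridge sees per transport: flash proxy clients arrive via
# many different browser proxies, meek clients via a few CDN edge servers.
unique_ips = {"flashproxy": 95, "meek": 5}

downloads_per_client = 10  # assumed equal for both transports

total_downloads = downloads_per_client * sum(true_clients.values())
total_ips = sum(unique_ips.values())

bias = {}
for transport, clients in true_clients.items():
    # Attribute the bridge-wide downloads by share of unique IPs.
    attributed = total_downloads * unique_ips[transport] / total_ips
    estimated_clients = attributed / downloads_per_client
    bias[transport] = estimated_clients - clients
    print(f"{transport}: true={clients} estimated={estimated_clients:.1f}")
```

With these inputs the flash proxy estimate comes out above the true value and the meek estimate below it by the same amount, since the total is fixed and only its split is wrong.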
On Tue, Sep 16, 2014 at 06:21:04AM +0200, Karsten Loesing wrote:
> On 16/09/14 06:07, David Fifield wrote:
>> We'll see :) For the time being I'll try isolating the transports and see what effect it has.
> Please keep us posted how that works out.
I split the bridge into two processes, following this guide:
https://www.torservers.net/wiki/setup/server#multiple_tor_processes
Now we'll see how the graphs change over the coming days.
David Fifield