On Thu, Sep 21, 2023 at 09:26:58PM -0600, David Fifield wrote:
I made a graph of the bandwidth on the two bridges since this started happening.
The two vertical lines mark:
2023-09-20 14:00:00 earliest known case of domain resolving to Cloudflare
2023-09-21 18:00:00 change to foursquare.com in rdsys
https://gitlab.torproject.org/tpo/anti-censorship/rdsys-admin/-/merge_reques...
- snowflake-02 bandwidth has dwindled to almost nothing. Seriously almost nothing: it's around 3 MB/s currently.
- There's a huge almost instantaneous step in snowflake-01 at around 2023-09-21 13:00:00. At first, I thought this might have been a consequence of the rdsys change, but it's about 5 hours earlier than that. What could it be? Some unrelated unblocking event that just happened to happen while this domain stuff is happening?
The non-use of snowflake-02 continues -- see the attached graph. I'm racking my brain trying to understand why that is. snowflake-01 usage has decreased a lot too -- the graph appears to be at about the same level, but you can see it's not brickwalled at the upper end of the range as it was before. Even ignoring the step anomaly at 2023-09-21 13:00:00, it didn't go to zero like snowflake-02 did.
It may be that whatever decides whether you get a Fastly or a Cloudflare edge server correlates highly with whether your client is capable of using snowflake-02. My working assumption, so far, has been that Tor Browser has had multi-bridge support since 12.0 (2022-12-07), while Orbot only has multi-bridge support in the unreleased Orbot 17 (https://github.com/guardianproject/orbot/releases/tag/17.0.0-BETA-2-tor.0.4.... is the first beta release to have it). If Tor Browser users are mostly on desktop, and Orbot users are mostly on mobile/cellular, and DNS resolution for cdn.sstatic.net also correlates with desktop vs. mobile, then that could explain it. It would mean that ~100% of Tor Browser users are getting a Cloudflare IP address, and <100% of Orbot users are.
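To make the correlation hypothesis concrete, here is a toy calculation in Python. Every number in it is made up for illustration; the function name, the user counts, and the blocked fractions are assumptions, not measurements from the bridges. It only shows how a bridge used solely by one client population could go to zero while a bridge shared by two populations merely shrinks.

```python
def surviving_fraction(groups):
    """Fraction of a bridge's users that remain after the domain change.

    groups: list of (users_before, fraction_blocked) pairs, one per
    client population. All values are hypothetical.
    """
    before = sum(users for users, _ in groups)
    after = sum(users * (1 - blocked) for users, blocked in groups)
    return after / before


# Hypothetical: snowflake-02 is reachable only by Tor Browser users,
# all of whom resolve the front domain to Cloudflare (fully blocked).
snowflake_02 = surviving_fraction([(100, 1.0)])

# Hypothetical: snowflake-01 also serves Orbot users, only half of
# whom resolve the front domain to Cloudflare.
snowflake_01 = surviving_fraction([(100, 1.0), (300, 0.5)])

print(snowflake_02, snowflake_01)
```

Under these invented inputs snowflake-02 loses everyone while snowflake-01 keeps a bit over a third of its users, which is the qualitative shape the hypothesis would predict.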
But it's not the case that Orbot 17 is totally unreleased. The Play Store currently has 16.6.3-RC-1-tor.0.4.7.10, released 2022-11-01:
https://web.archive.org/web/20230925022736/https://play.google.com/store/app...
But Orbot 17 betas are available (most recent is 2023-08-09 https://github.com/guardianproject/orbot/releases/tag/17.1.0-BETA-3-tor.0.4....), and version 17 is in F-Droid:
http://meetbot.debian.net/tor-meeting/2023/tor-meeting.2023-09-21-15.57.log....
<dcf1> Orbot 17 has both bridges, but it's not released yet, except in beta, afaik. I always thought that was the cause of the low use of snowflake-02, that we were just waiting for Orbot to make a full release of v17. But maybe it is more complicated.
<meskio> mmm, I have here orbot 17, so I guess I'm using the beta...
<meskio> is in fdroid
So even if the correlation hypothesis were correct, I wouldn't expect snowflake-02 to drop as far as it has.
Maybe the bridge selection at the client is not as random as we intend? Even though there are two bridge lines, maybe tor systematically prefers the one that's listed first? The idea here is that maybe snowflake-02 only gets used when snowflake-01 is past its capacity and starts to fail connection attempts. With the suddenly reduced overall level of users, there's enough headroom that snowflake-02 essentially never gets used.
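The "prefers the first bridge line" idea can be sketched as a toy model in Python. This is not how tor actually selects bridges; the function, the capacity figure, and the client counts are all hypothetical, chosen only to show why snowflake-02 would see essentially no use once total demand drops below snowflake-01's capacity.

```python
def split_load(clients, capacity_first, prefer_first):
    """Toy two-bridge model: return (on_first, on_second).

    prefer_first=True  : every client tries the first bridge; only the
                         overflow beyond its capacity lands on the second.
    prefer_first=False : clients split evenly, as intended random
                         selection would do on average.
    Units and numbers are hypothetical.
    """
    if prefer_first:
        on_first = min(clients, capacity_first)
        return on_first, clients - on_first
    on_first = clients // 2
    return on_first, clients - on_first


# Before the drop: demand exceeds the first bridge's capacity,
# so the second bridge picks up the overflow.
print(split_load(1000, 600, prefer_first=True))   # (600, 400)

# After the drop: demand fits within the first bridge's headroom,
# so the second bridge gets nothing.
print(split_load(300, 600, prefer_first=True))    # (300, 0)

# With truly random selection the second bridge would still get half.
print(split_load(300, 600, prefer_first=False))   # (150, 150)
```

If this model were right, snowflake-02 usage before the drop should track the overflow from snowflake-01 rather than a fixed share of all clients, which might be checkable against the older bandwidth graphs.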
A possible explanation for the sudden step in snowflake-01 usage at 2023-09-21 13:00:00 is that there's a population of Snowflake clients out there other than the ones we are responsible for. Whoever is distributing the clients for that population may have noticed the cdn.sstatic.net change and deployed their own mitigation, separate from anything we have done. The step only took about 15 minutes (see the second attached graph), which is a pretty fast deployment. If that other imagined deployment only knows about snowflake-01, that could explain why the step appears in snowflake-01's graph and not snowflake-02's. It still doesn't explain why, before the step, snowflake-01 took a big hit to users but did not go to zero, while snowflake-02 kept declining.
Maybe we have an undetected bug in multi-bridge support that favors snowflake-01? The broker is supposed to reject proxies that do not have multi-bridge support since 2022-10-03: https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowfla... But maybe it's not working the way it's supposed to? Maybe it's easier to get a proxy for snowflake-01 than for snowflake-02?
Maybe there's something wrong with the snowflake-02 bridge? I've been using snowflake-02 all day today (using AMP cache rendezvous). In the morning, it did seem to be a little screwy -- I couldn't get a YouTube video to play without frequent stops. One time, I happened to notice these messages in the log; they may be unrelated, but perhaps there is some weird interaction with Conflux:
2023-09-24 19:03:16.550 [NOTICE] Failed to find node for hop #1 of our path. Discarding this circuit.
2023-09-24 19:03:16.552 [NOTICE] Our circuit 0 (id: 38) died due to an invalid selected path, purpose Unlinked conflux circuit. This may be a torrc configuration issue, or a bug.
2023-09-24 19:22:50.237 [NOTICE] Failed to find node for hop #1 of our path. Discarding this circuit.
I checked the bridge to ensure that the expected version of the server software was deployed (commit 0a6aeda9), and it was.
While I was using Tor Browser, I let it upgrade to 13.0a5. 13.0a5 has a fix to the default bridge lines, but I use a manual bridge line so that I would only be on snowflake-02. After the upgrade, it started working better and I could watch YouTube as normal. Maybe it was just a coincidence that the upgrade to 13.0a5 seemed to improve the performance? In any case, from watching bandwidth on the bridge, it looks like I've had the bridge mostly to myself all day.
Just for good measure, I upgraded tor on snowflake-02 from 0.4.7.13-1~focal+1 to 0.4.8.6-1~focal+1 at 2023-09-24 20:48:25.