Linus Nordberg and I wrote a short paper that was presented at FOCI 2023. The topic is how to use all the available CPU capacity of a server running a Tor relay.
This is how the Snowflake bridges are set up. It might also be useful for anyone running a relay that is bottlenecked on the CPU. If you have ever run multiple relays on one IP address for better scaling (for example, if you are one of the relay operators affected by the recent AuthDirMaxServersPerAddr change), you might want to experiment with this setup. The difference is that all the instances of Tor have the same relay fingerprint, so they operate like one big relay instead of many small relays.
https://www.bamsoftware.com/papers/pt-bridge-hiperf/
The pluggable transports model in Tor separates the concerns of anonymity and circumvention by running circumvention code in a separate process, which exchanges information with the main Tor process over local interprocess communication. This model leads to problems with scaling, especially for transports, like meek and Snowflake, whose blocking resistance does not rely on there being numerous, independently administered bridges, but which rather forward all traffic to one or a few centralized bridges. We identify what bottlenecks arise as a bridge scales from 500 to 10,000 simultaneous users, and then from 10,000 to 50,000, and show ways of overcoming them, based on our experience running a Snowflake bridge. The key idea is running multiple Tor processes in parallel on the bridge host, with externally synchronized identity keys.
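As a rough illustration of what "synchronized identity keys" means in practice, here is a minimal sketch that checks whether several tor instances report one shared fingerprint while using distinct nicknames. It assumes each instance's DataDirectory holds tor's `fingerprint` file with a line of the form `nickname FINGERPRINT`. The function name `check_instances` and the sample values in the usage note are hypothetical and do not come from the paper.

```python
def check_instances(fingerprint_lines):
    """Return (shared_fingerprint, nicknames), or raise ValueError.

    Each input line is assumed to look like "nickname FINGERPRINT".
    Whitespace inside the fingerprint is tolerated and removed.
    """
    fingerprints = set()
    nicknames = []
    for line in fingerprint_lines:
        parts = line.split()
        if len(parts) < 2:
            raise ValueError(f"unparseable line: {line!r}")
        nicknames.append(parts[0])
        # Join the remaining tokens so grouped hex digits also parse.
        fingerprints.add("".join(parts[1:]).upper())
    if len(fingerprints) != 1:
        raise ValueError("instances do not share exactly one identity")
    if len(set(nicknames)) != len(nicknames):
        raise ValueError("nicknames must be distinct")
    return fingerprints.pop(), nicknames
```

For example, `check_instances(["demo1 " + "AB" * 20, "demo2 " + "AB" * 20])` returns the shared placeholder fingerprint together with `["demo1", "demo2"]`, and it raises `ValueError` if the fingerprints differ.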
On Mon, Sep 04, 2023 at 02:09:50AM -0600, David Fifield wrote:
Linus Nordberg and I wrote a short paper that was presented at FOCI 2023. The topic is how to use all the available CPU capacity of a server running a Tor relay.
This is how the Snowflake bridges are set up. It might also be useful for anyone running a relay that is bottlenecked on the CPU. If you have ever run multiple relays on one IP address for better scaling (for example, if you are one of the relay operators affected by the recent AuthDirMaxServersPerAddr change), you might want to experiment with this setup. The difference is that all the instances of Tor have the same relay fingerprint, so they operate like one big relay instead of many small relays.
The workshop presentation video (22 minutes) of this paper has just become available on YouTube. The paper homepage has a copy of the video too.
https://www.youtube.com/watch?v=UkUQsAJB-bg&list=PLWSQygNuIsPc8bOJ2szObl...
The other FOCI 2023 issue 2 videos are online as well:
https://www.youtube.com/playlist?list=PLWSQygNuIsPc8bOJ2szOblMK4i6T79S1m
This is very timely, as I am testing multiple relays globally and locally, plus a Snowflake instance. Appreciated.
On Sun, Nov 5, 2023 at 11:18 PM, David Fifield <david@bamsoftware.com> wrote:
The workshop presentation video (22 minutes) of this paper has just become available on YouTube. The paper homepage has a copy of the video too.
https://www.youtube.com/watch?v=UkUQsAJB-bg&list=PLWSQygNuIsPc8bOJ2szOblMK4i6T79S1m&index=5
The other FOCI 2023 issue 2 videos are online as well:
https://www.youtube.com/playlist?list=PLWSQygNuIsPc8bOJ2szOblMK4i6T79S1m
_______________________________________________
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
Hi David
https://www.bamsoftware.com/papers/pt-bridge-hiperf/
https://www.youtube.com/watch?v=UkUQsAJB-bg&list=PLWSQygNuIsPc8bOJ2szObl...
The other FOCI 2023 issue 2 videos are online as well:
https://www.youtube.com/playlist?list=PLWSQygNuIsPc8bOJ2szOblMK4i6T79S1m
Thank you for the paper and the presentation.
Chapter 3 (Multiple Tor processes) shows the structure:
mypt - HAproxy = multiple tor services
At the end of chapter 3.1 it is written:
"the loss of country- and transport-specific metrics"
How will the metrics data be pulled out of the multiple tor services to fetch *all* metrics data? Or will only one of them be looked at, without full data representation?
I am asking primarily about an obfs4 setup, though the question might apply to Snowflake and friends too.
On Mon, Dec 11, 2023 at 08:13:17PM +0100, Felix wrote:
How will the metrics data be pulled out of the multiple tor services to fetch *all* metrics data? Or will only one of them be looked at, without full data representation?
The key is that every instance of tor must have a different nickname. That way, even though they all have the same relay identity key, Tor Metrics knows to count all the descriptors separately.
So, for instance, on one snowflake bridge (identity 2B280B23E1107BB62ABFC40DDCC8824814F80A72), we use the nicknames flakey1, flakey2, …, flakey12, and on another bridge (identity 8838024498816A039FCBBAB14E6F40A0843051FA) we use the nicknames crusty1, crusty2, …, crusty12.
Instructions for setting up nicknames can be found at https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Survival-Guid...
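To make the nickname scheme concrete, here is a minimal sketch that generates one torrc fragment per instance, each with its own nickname. The option names (`BridgeRelay`, `SocksPort`, `ORPort`, `ExtORPort`, `Nickname`) are standard torrc options, but the particular selection, the loopback addresses, the base port 10000, and the helper name `torrc_for_instance` are assumptions made for illustration. The guide linked above remains the authoritative setup. The sketch does not cover the load balancer or the copying of identity keys between instances.

```python
import re

# tor requires nicknames of 1 to 19 alphanumeric characters.
NICKNAME_RE = re.compile(r"[A-Za-z0-9]{1,19}")


def torrc_for_instance(prefix, n, base_extor_port=10000):
    """Return illustrative torrc text for instance number n (1-based).

    The port layout (base_extor_port + n) is a hypothetical choice.
    """
    nickname = f"{prefix}{n}"
    if not NICKNAME_RE.fullmatch(nickname):
        raise ValueError(f"invalid nickname: {nickname!r}")
    lines = [
        "BridgeRelay 1",
        "SocksPort 0",
        "ORPort 127.0.0.1:auto",
        f"ExtORPort 127.0.0.1:{base_extor_port + n}",
        # A distinct nickname per instance keeps the descriptors separable.
        f"Nickname {nickname}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for n in range(1, 4):
        print(torrc_for_instance("demo", n))
```

Every instance gets the same settings except for `Nickname` and the `ExtORPort` port, which matches the idea that the instances differ only in what is needed to tell them apart locally and in the metrics.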
It used to be the case that Tor Metrics did not understand the descriptors of this kind of multi-instance bridge. If you had N instances, it would count only 1 of them per time period. But Tor Metrics has now known about this kind of bridge (multiple descriptors per time period with the same identity key but different nicknames) for more than a year:
https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/40...
https://gitlab.torproject.org/tpo/network-health/metrics/website/-/merge_req...
Relay Search still does not know about multi-instance bridges, though. If you look up such a bridge, it will display one of the multiple instances more or less at random. In the case of the current snowflake bridges, you have to multiply the numbers on Relay Search pages by 12 to get the right numbers.
https://metrics.torproject.org/rs.html#details/2B280B23E1107BB62ABFC40DDCC88...
https://metrics.torproject.org/rs.html#details/8838024498816A039FCBBAB14E6F4...
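The difference between counting every instance and scaling up a single one can be sketched as follows. The record layout `(identity, nickname, value)` is hypothetical and is not the actual descriptor format. The multiplication is only an approximation that assumes the load balancer spreads load evenly across instances.

```python
from collections import defaultdict


def total_by_identity(records):
    """Sum one value per instance across all nicknames sharing an identity.

    records: iterable of (identity, nickname, value) tuples for one period.
    """
    per_instance = {}
    for identity, nickname, value in records:
        # Keep one value per (identity, nickname) pair; later records win.
        per_instance[(identity, nickname)] = value
    totals = defaultdict(float)
    for (identity, _nickname), value in per_instance.items():
        totals[identity] += value
    return dict(totals)


def scale_single_instance(value, instances):
    """Estimate a whole-bridge number from one instance (even split assumed)."""
    return value * instances
```

With placeholder records such as `[("ID", "demo1", 10), ("ID", "demo2", 12), ("ID", "demo3", 11)]`, `total_by_identity` gives 33 for `"ID"`, while `scale_single_instance(10, 3)` gives 30. The gap between the two shows why summing per-nickname data is preferable to multiplying when it is available.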
There's a special repository for making graphs of snowflake users. This was necessary in the time before Tor Metrics natively understood multi-instance bridges, and I still use it because it offers some extra flexibility over what metrics.torproject.org provides. With some small changes, the same code could work for other pluggable transports, or even single bridges.
https://gitlab.torproject.org/dcf/snowflake-graphs
This is a sample of the graph output:
https://forum.torproject.org/t/snowflake-daily-operations-november-2023-upda...