On Tue, Sep 27, 2022 at 08:22:21PM +0200, Linus Nordberg wrote:
David Fifield <david@bamsoftware.com> wrote Tue, 27 Sep 2022 08:54:53 -0600:
I checked the number of sockets connected to the haproxy frontend port, thinking that we may be running out of localhost 4-tuples. It's still in bounds: with one source and one destination address, the 15000-64000 ephemeral port range allows about 49000 simultaneous connections to the frontend, and we're at about 27000 (but we may have to figure something out for that eventually).
# ss -n | grep -c '127.0.0.1:10000\s*$'
27314
# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 15000 64000
Would more IP addresses and DNS round robin work?
By more IP addresses you mean more localhost IP addresses, I guess? All of 127.0.0.0/8 is localhost, so we can expand the range of four-tuples by using more addresses from that address range in either the source or destination address position. haproxy probably has an option to listen on multiple addresses. The trick is actually using the multiple addresses.

I don't think DNS will work directly, because snowflake-server gets the address of its upstream from the TOR_PT_ORPORT environment variable, which is specified to take an IP:port, not a DNS name (and is implemented that way in goptlib).
https://gitweb.torproject.org/torspec.git/tree/pt-spec.txt?id=ec77ae643f3e47...
https://gitweb.torproject.org/pluggable-transports/goptlib.git/tree/pt.go?h=...

You could try using more addresses from 127.0.0.0/8 in the *source* address position, by specifying the second parameter of net.DialTCP to set the source address here:
https://gitweb.torproject.org/pluggable-transports/goptlib.git/tree/pt.go?h=...
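For illustration, here is a rough sketch of that kind of change (not the actual goptlib code; the 127.0.0.2 source address and the 127.0.0.1:10000 destination are just examples):

    package main

    import (
        "log"
        "net"
    )

    func main() {
        // Example only: use a second loopback address as the source, so
        // each local address gets its own pool of ephemeral ports.
        // Port 0 means the kernel picks the source port as usual.
        laddr := &net.TCPAddr{IP: net.ParseIP("127.0.0.2")}

        // Example destination; in goptlib this comes from TOR_PT_ORPORT.
        raddr, err := net.ResolveTCPAddr("tcp", "127.0.0.1:10000")
        if err != nil {
            log.Fatal(err)
        }

        conn, err := net.DialTCP("tcp", laddr, raddr)
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()
        log.Printf("connected %v -> %v", conn.LocalAddr(), conn.RemoteAddr())
    }

To actually spread the load you'd rotate laddr over several 127.0.0.0/8 addresses, one choice per connection; on Linux, binding to those addresses needs no extra configuration.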
It may be something inside snowflake-server, for example some central scheduling algorithm that cannot run any faster. (Though if that were the case, I'd expect to see one CPU core at 100%, which I do not.) I suggest doing another round of profiling now that we have taken care of the more obvious hotspots in https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
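If it helps, a low-effort way to get such a profile is Go's built-in net/http/pprof. This is just a sketch, assuming we add a localhost-only debug listener to snowflake-server (if it doesn't already have one); the port 6060 is arbitrary:

    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/ on the default mux
    )

    func main() {
        // Listen only on localhost so the profiling endpoints are not
        // reachable from outside the bridge.
        go func() {
            log.Println(http.ListenAndServe("127.0.0.1:6060", nil))
        }()

        // ... the rest of snowflake-server's startup would go here ...
        select {}
    }

Then running "go tool pprof http://127.0.0.1:6060/debug/pprof/profile?seconds=30" collects 30 seconds of CPU samples, and /debug/pprof/heap does the same for memory.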
After an interesting chat with anarcat, I think that we are CPU bound, in particular by the cost of handling so many interrupts from the NIC and by the high number of context switches. I have two suggestions for how to move forward with this.
First, let's patch tor to get rid of the extor processes, as suggested by David earlier when discussing RAM pressure. This should bring down context switches.
The easiest way to do this is probably to comment out the re-randomization of the ExtORPort auth cookie file on startup, and replace the existing cookie files with static files. Or even just comment out the failure case in connection_ext_or_auth_handle_client_hash. https://gitweb.torproject.org/tor.git/tree/src/feature/relay/ext_orport.c?h=...
The uncontrollable rerandomization of auth cookies is the whole reason for extor-static-cookie: https://forum.torproject.net/t/tor-relays-how-to-reduce-tor-cpu-load-on-a-si...
Here's my post requesting support in core tor: https://lists.torproject.org/pipermail/tor-dev/2022-February/014695.html