On Mon, Dec 27, 2021 at 04:00:34PM -0500, Roger Dingledine wrote:
On Mon, Dec 27, 2021 at 12:05:26PM -0700, David Fifield wrote:
I have the impression that tor cannot use more than one CPU core -- is that correct? If so, what can be done to permit a bridge to scale beyond 1×100% CPU? We can fairly easily scale the Snowflake-specific components around the tor process, but ultimately, a tor client process expects to connect to a bridge having a certain fingerprint, and that is the part I don't know how to easily scale.
- Surely it's not possible to run multiple instances of tor with the same fingerprint? Or is it? Does the answer change if all instances are on the same IP address? If the OR ports are never used?
Good timing -- Cecylia pointed out the higher load on Flakey a few days ago, and I've been meaning to post a suggestion somewhere. You actually *can* run more than one bridge with the same fingerprint. Just set it up in two places, with the same identity key, and then whichever one the client connects to, the client will be satisfied that it's reaching the right bridge.
Thanks for this information. I've done a test with one instance of obfs4proxy forwarding through a load balancer to two instances of tor that have the same keys, and it seems to work. It seems like this could work for Snowflake.
(A) Even though the bridges will have the same identity key, they won't have the same circuit-level onion key, so it will be smart to "pin" each client to a single bridge instance -- so when they fetch the bridge descriptor, which specifies the onion key, they will continue to use that bridge instance with that onion key. Snowflake in particular might also want to pin clients to specific bridges because of the KCP state.
(Another option, instead of pinning clients to specific instances, would be to try to share state among all the bridges on the backend, e.g. so they use the same onion key, can resume the same KCP sessions, etc. This option seems hard.)
Let's make a distinction between the "frontend" snowflake-server pluggable transport process, and the "backend" tor process. These don't necessarily have to be 1:1; either one could be run in multiple instances. Currently, the "backend" tor is the limiting factor, because it uses only 1 CPU core. The "frontend" snowflake-server can scale to multiple cores in a single process and is comparatively unrestrained. So I propose to keep snowflake-server as a single process, and to run multiple tor processes. That eliminates the dimension of KCP state coordination, and should last us until snowflake-server outgrows the resources of a single host.
The snowflake-server program is a managed proxy; i.e., it expects to run with certain environment variables set by a managing process, normally tor. We'll need to instead run snowflake-server apart from any single tor instance. Probably the easiest way to do that in the short term is with ptadapter (https://github.com/twisteroidambassador/ptadapter), which converts a pluggable transport into a TCP proxy, forwarding to an address you specify.
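For contrast, here is a sketch of the usual managed-proxy arrangement that ptadapter would replace: in that setup it is tor itself that launches the transport and supplies the environment variables, via torrc lines roughly like these (the listen address is only an example, not our actual bridge configuration):

    ExtORPort auto
    ServerTransportPlugin snowflake exec /usr/bin/snowflake-server
    ServerTransportListenAddr snowflake 0.0.0.0:443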
Then we can have ptadapter forward to a load balancer like haproxy. The load balancer will then round-robin over the ORPorts of the available tor instances. The tor instances can all be on the same host (run as many instances as you have CPU cores), which may or may not be the same host on which snowflake-server is running.
Currently we have this:

    ________________     ___
-->|snowflake-server|-->|tor|
    ----------------     ---
     (run by tor)

The design I'm proposing is this:

                                      ___
                                  .->|tor|
    ________________     _______  |   ---
-->|snowflake-server|-->|haproxy|-+->|tor|
    ----------------     -------  |   ---
   (run by ptadapter)             '->|tor|
                                       ---
I believe that the "pinning" of a client session to a particular tor instance will work automatically, because snowflake-server keeps an outgoing connection alive (i.e., through the load balancer) as long as a KCP session exists.
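One way to sanity-check that (assuming the single-host demo layout below, where the tor ORPorts are 9001 and 9002) is to watch the established connections into the backends; each connection should stay on the same instance for the life of its KCP session:

    # ss -tn state established '( dport = :9001 or dport = :9002 )'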
One complication we'll have to work out is that ptadapter doesn't have a setting for ExtORPort forwarding. ptadapter absorbs any ExtORPort information and forwards an unadorned connection onward. The idea I had to work around this limitation is to have ptadapter, rather than execute snowflake-server directly, execute a shell script that sets TOR_PT_EXTENDED_SERVER_PORT to a hardcoded address (i.e., to haproxy) before running snowflake-server. Though, I am not sure what to do about the extended_orport_auth_cookie file, which will be different for different tor instances.
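As a sketch of that workaround (the snowflake-server path and the ExtORPort address below are placeholders, and the cookie file is exactly the unresolved part):

    #!/bin/sh
    # Hypothetical wrapper that ptadapter executes instead of snowflake-server.
    # Hardcode the extended ORPort address (e.g. an haproxy frontend in front of
    # the tor instances' ExtORPorts), since ptadapter does not pass one through.
    export TOR_PT_EXTENDED_SERVER_PORT=127.0.0.1:10001
    # Unresolved: each tor instance generates its own extended_orport_auth_cookie,
    # so pointing every connection at one instance's cookie is only a guess.
    export TOR_PT_AUTH_COOKIE_FILE=/var/lib/tor-instances/o1/extended_orport_auth_cookie
    exec /usr/bin/snowflake-server "$@"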
## Demo instructions
This is what I did to test one instance of obfs4proxy communicating with two instances of tor that have the same keys, on Debian 11.
Install a first instance of tor and configure it as a bridge:

# apt install tor
# tor-instance-create o1

/etc/tor/instances/o1/torrc:
    BridgeRelay 1
    PublishServerDescriptor 0
    AssumeReachable 1
    SocksPort 0
    ORPort 127.0.0.1:9001

Start the first instance, which will generate keys:

# systemctl start tor@o1
Install a second instance of tor and configure it as a bridge (with a different ORPort):

# tor-instance-create o2

/etc/tor/instances/o2/torrc:
    BridgeRelay 1
    PublishServerDescriptor 0
    AssumeReachable 1
    SocksPort 0
    ORPort 127.0.0.1:9002

But before starting the second instance the first time, copy keys from the first instance:

# cp -r /var/lib/tor-instances/o1/keys /var/lib/tor-instances/o2/
# chown -R _tor-o2:_tor-o2 /var/lib/tor-instances/o2/keys/
# systemctl start tor@o2
The two instances should have the same fingerprint:

# cat /var/lib/tor-instances/*/fingerprint
Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F
Unnamed 4808CD98E4C1D4F282DA741A860A44D755701F2F
Install haproxy and configure it to forward to the two tor instances:

# apt install haproxy

/etc/haproxy/haproxy.cfg:
    frontend tor
        mode tcp
        bind 127.0.0.1:9000
        default_backend tor-o

    backend tor-o
        mode tcp
        server o1 127.0.0.1:9001
        server o2 127.0.0.1:9002

Restart haproxy with the new configuration:

# systemctl restart haproxy
Install ptadapter and configure it to listen on an external address and forward to haproxy:

# apt install python3-pip
# pip3 install ptadapter

ptadapter.ini:
    [server]
    exec = /usr/bin/obfs4proxy
    state = pt_state
    forward = 127.0.0.1:9000
    tunnels = server_obfs4

    [server_obfs4]
    transport = obfs4
    listen = [::]:443

Run ptadapter:

ptadapter -S ptadapter.ini
On the client, make a torrc file with the information from the pt_state/obfs4_bridgeline.txt file created by ptadapter:

    UseBridges 1
    SocksPort auto
    Bridge obfs4 172.105.3.197:443 4808CD98E4C1D4F282DA741A860A44D755701F2F cert=1SCzqyYyPh/SiXTJa9nLFxMyjWQITVCKeICME+SwxgNcTTSUQ7+vM/ghofU7oaalIRBILg iat-mode=0
    ClientTransportPlugin obfs4 exec /usr/bin/obfs4proxy
    DataDir datadir

Then run tor with the torrc:

tor -f torrc
If you restart tor multiple times on the client, you can see haproxy alternating between the two backend servers (o1 and o2) in /var/log/haproxy.log:

Dec 31 04:30:31 localhost haproxy[9707]: 127.0.0.1:55500 [31/Dec/2021:04:30:21.235] tor tor-o/o1 1/0/10176 11435 -- 1/1/0/0/0 0/0
Dec 31 04:30:51 localhost haproxy[9707]: 127.0.0.1:55514 [31/Dec/2021:04:30:46.925] tor tor-o/o2 1/0/4506 17682 -- 1/1/0/0/0 0/0
Dec 31 04:38:41 localhost haproxy[9707]: 127.0.0.1:55528 [31/Dec/2021:04:30:55.540] tor tor-o/o1 1/0/466049 78751 -- 1/1/0/0/0 0/0
Dec 31 05:34:52 localhost haproxy[9707]: 127.0.0.1:55594 [31/Dec/2021:05:34:50.083] tor tor-o/o2 1/0/2209 13886 -- 1/1/0/0/0 0/0