max TCP interruption before Tor circuit teardown? - tor-relays

20 Oct 2013


      -----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Hello,
I'm working on building support scaffolding[1] for Tor on Raspberry Pi
and other small ARM single-board computers (SBCs).
With the slower computers, sometimes too many attempts to connect to
the ORPort (I am almost positive as part of TAP circuit building, but
not *really* sure) can eventually cause Tor to consume more physical
memory than is available and cause the oom-killer to kill Tor.  Also, depending
on the crappiness of the user's router, it's effectively a SYN flood,
and can crash or impair consumer routers.
My solution, so far, is to define (through trial and error on a
per-machine basis, since [1] is only officially supporting 3 SBCs
right now) limits on how many SYNs may be sent to the ORPort and the
DirPort per second.  This is done with iptables.  I experimented,
tuned the parameters and watched traffic for weeks and came up with a
pretty good set of limits for a 950MHz Raspberry Pi:  4 SYNs/sec burst
10.  (For those about to say the Pi is thus too slow to be used as a
relay, it's quite capable of relaying *at least* 2.5Mbps, but *not*
when it's getting SYN flooded.)
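
To make the "4 SYNs/sec burst 10" figure concrete, here is a rough Python sketch of a token bucket, which is a simplified model of how a rate limit with a burst allowance behaves.  It is not the actual iptables rule set from [1]; the class and function names are made up for illustration, and the exact arithmetic inside the kernel may differ.

```python
class TokenBucket:
    """Simplified token bucket: refills at `rate` tokens/sec, capped at `burst`."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # the bucket starts full
        self.last = None            # time of the previous arrival

    def allow(self, now):
        """Return True if a SYN arriving at time `now` (seconds) is within the limit."""
        if self.last is not None:
            elapsed = now - self.last
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(arrival_times, rate=4, burst=10):
    """Return one True/False verdict per SYN arrival time (must be non-decreasing)."""
    bucket = TokenBucket(rate, burst)
    return [bucket.allow(t) for t in arrival_times]


if __name__ == "__main__":
    # Hypothetical load: 20 SYNs arriving 10 ms apart.
    verdicts = simulate([i * 0.01 for i in range(20)])
    print(verdicts.count(True), "within limit,", verdicts.count(False), "over limit")
```

In this model a quiet port can absorb 10 SYNs at once, after which only 4 per second get through until the bucket refills.
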
So, sometimes hosts exceed this limit.  Once the limit is exceeded, my
current strategy is to use iptables REJECT to send an ICMP Destination
Unreachable (port unreachable is the REJECT default) back to the
hosts that triggered the filter.  This is on a per-SYN basis.
After watching the data, I noticed that some hosts just try to connect
once or twice, or try to connect (during overload conditions) at
reasonable intervals of tens of seconds to a few minutes.  Other hosts
will quadruple-tap the ORPort with SYNs, four in a row, and otherwise
be much more aggressive with sending SYNs.
I'm currently testing fail2ban[2] as a way to ban aggressive peers by
changing that iptables REJECT to a DROP for a short period, in order
to accomplish two things:
1. Encourage them to knock off their bad behavior (i.e., go away for a
   little while).
2. Free up CPU time, RAM and bandwidth because we don't have to
   construct and send ICMP packets to banned peers.
Currently, if a peer violates the 4/sec burst 10 SYN limit more than 5
times in 60 seconds, that peer will be banned for 90 seconds.  I'm
trying to trim this down to the minimum that will protect the relay,
and 90 seconds is a guess given some of my fears, read on...
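
The ban policy just described can be sketched in Python as a sliding-window counter.  This is only an illustration of the logic (more than 5 violations within 60 seconds leads to a 90 second ban), not fail2ban's own implementation; the class name, method names and the peer address are hypothetical.

```python
from collections import defaultdict, deque


class BanTracker:
    """Ban a peer that exceeds `max_violations` within `window` seconds."""

    def __init__(self, max_violations=5, window=60.0, ban_time=90.0):
        self.max_violations = max_violations
        self.window = window
        self.ban_time = ban_time
        self.violations = defaultdict(deque)  # peer -> recent violation times
        self.banned_until = {}                # peer -> time the ban expires

    def is_banned(self, peer, now):
        return now < self.banned_until.get(peer, float("-inf"))

    def record_violation(self, peer, now):
        """Record one over-limit SYN; return True if it triggers a ban."""
        recent = self.violations[peer]
        recent.append(now)
        # Forget violations that have fallen out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) > self.max_violations:  # "more than 5 times"
            self.banned_until[peer] = now + self.ban_time
            recent.clear()
            return True
        return False

    def verdict(self, peer, now, within_limit):
        """What the filter does with a SYN: ACCEPT, REJECT (ICMP) or DROP."""
        if self.is_banned(peer, now):
            return "DROP"
        if within_limit:
            return "ACCEPT"
        self.record_violation(peer, now)
        return "REJECT"


if __name__ == "__main__":
    tracker = BanTracker()
    peer = "192.0.2.1"  # documentation address, purely hypothetical
    for t in range(8):
        print(t, tracker.verdict(peer, float(t), within_limit=False))
```

In this sketch the SYN that trips the ban is still answered with a REJECT; only later SYNs from that peer are silently dropped until the ban expires.
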
During an overload condition, my primary priority is to protect the
relay, but of course I wish to do so with as little disruption to the
Tor network as possible.  So, here is a potential problem with my
approach that I can think of, which could degrade service (mildly, for
a few end users) on the Tor network.
First, during a SYN flood type overload, some peers that have
*existing* circuits built through the relay and are sending SYNs as
normal traffic will stochastically get "caught" in the filter and be
banned for a short time.  If these hosts already have circuits open
through the relay which is overloaded, I would prefer to preserve
those circuits rather than break them.  My defensive strategy versus
overload here is to throttle new circuit creation requests, *not* to
break existing circuits.
So here's the $64,000 question:
If a Tor relay has a circuit built through a peer, and the peer starts
dropping 100% of packets, how long will it take before the relay with
the circuit "gives up" on the circuit and tears it down?  I want to
set my temp ban time *below* this timeout.  Thus, unlucky peers that
were caught in the filter and have circuits already built through the
relay will experience a brief performance degradation, but they
won't lose their active circuits through the overloaded relay, and in
the meantime the overload condition will hopefully be resolving.
Is there such a timeout?  There must be.  Can someone tell me what it is?
Or, is there a better way to protect low-resource machines (slow CPU,
512MB RAM) against the SYN flood "circuit creation storm" conditions
which occasionally arise on the Tor network?  Again, I must reiterate,
machines with specs like these can be very good relays for home
broadband users.  The true goal of my project[1] is to build a set of
software which enables a "plug and forget" relay for home broadband
users that costs well under $100.
[1] https://github.com/gordon-morehouse/cipollini
[2] http://www.fail2ban.org/wiki/index.php/Main_Page
Best,
- -Gordon M.
-----BEGIN PGP SIGNATURE-----

iQEcBAEBCgAGBQJSZAfXAAoJED/jpRoe7/ujGQ4H/jUUjdufoWw1qthjsqdf4ICO
ejJXBuZ60O8TpuiT07EtZq9tSDsb5hLqjJMORNJWPJe3ffD7/BHv/6St0y0fkjFh
svEFBwVMmkNfaDd65z2JaFBRnSQTsgtiOeOnQFQt1evWQMeVmFsvoMvVIbqGSkf6
NipJnfFeoAtt/6cCMl7+yIxmGGb1Udl0ZEmlVacIYtFr8MjgIo59vT94k7SuzV1N
4Na9PNeQ9WIBlZf9vyesHgnuzJA8hEkFyoP5Fc3bPT/e3WYdAIifswE6rhZVT+AQ
9rHRvBUVoYpMNyy0fqMY34rLYegelIYstsy26dmW+robJKDxGVC3zBeGbOtcEOE=
=mSYu
-----END PGP SIGNATURE-----