Greetings!
As some of you know, a bunch of onion services were or are still under heavy DDoS on the network. More specifically, they are bombarded with introduction requests (INTRODUCE2 cells) which forces them to rendezvous for each of them by creating a ton of circuits.
This basically leads to a resource exhaustion attack on the service side, with its CPU heavily used for path selection, opening new circuits and continuously handling INTRODUCE2 cells.
Unfortunately, our circuit-level flow control does not apply to the service introduction circuit, which means that the intro point is allowed, by the Tor protocol, to send an arbitrarily large number of cells down the circuit. For the service, this means that even after the DoS has stopped, it would still receive massive amounts of cells because some are either in flight on the circuit or queued at the intro point ready to be sent (towards the service).
All that being said, our short-term goal here is to add INTRODUCE2 rate-limiting *at* the intro point (similar to the Guard DoS subsystem deployed early last year, but much simpler). The goal is to soak up the introduction load directly at the intro points, which would help reduce the load on the network overall and thus preserve its health.
Please have a look at https://trac.torproject.org/15516 for some discussions and ongoing code work. We are at the point where we have a branch that rate limits INTRODUCE2 cells at the intro point but we need to figure out proper values for the rate per second and the burst allowed.
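To make "rate per second" and "burst" concrete, here is a minimal token-bucket sketch in Python. This is not the code in the branch, just an illustration of the general technique; the `TokenBucket` class, `count_relayed` helper and all parameter names are made up for this example.

```python
class TokenBucket:
    """Toy token bucket: refills `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate            # tokens added per second
        self.burst = burst          # maximum number of tokens stored
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = now             # time of the last refill, in seconds

    def allow(self, now):
        """Return True if one INTRODUCE2 cell may be relayed at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: the intro point would drop the cell


def count_relayed(bucket, cells_per_second, seconds):
    """Feed evenly spaced cells into the bucket and count how many pass."""
    relayed = 0
    for i in range(cells_per_second * seconds):
        if bucket.allow(i / cells_per_second):
            relayed += 1
    return relayed
```

For example, `count_relayed(TokenBucket(50, 200), 250, 60)` simulates one minute of 250 cells per second against a bucket with rate 50 and burst 200: the bucket lets through the initial burst plus about 50 cells per second afterwards, and drops the rest.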
One naive approach is to see how many cells an attacker can send towards a service. George and I have conducted an experiment where, with 10 *modified* tor clients bombarding a service at a much faster rate than 1 per second (what vanilla tor does if asked to connect a lot), we see ~15000 INTRODUCE2 cells at the service in 1 minute. This varies by thousands depending on different factors, but overall that is a good average for our experiment.
That works out to 15000/60 = 250 cells per second.
Considering that this is an absurd amount of INTRODUCE2 cells (maybe?), we can set the rate per second to, let's say, a fifth of that, meaning 50, with a burst of 200.
Over the normal 3 intro points a service has, it means 150 introductions per second are allowed with a burst of 600 in total. Or in other words, 150 clients can reach the service every second, up to a burst of 600 at once. This will probably ring alarm bells for very popular services that probably get 1000+ users a second, so please check the next section.
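The arithmetic above is small enough to check by hand, but here it is as a short Python snippet in case anyone wants to play with other values. The helper names are just for illustration; the numbers are the ones from our experiment and the proposed per-intro-point values.

```python
def per_second(cells, seconds):
    """Average cell rate over a measurement window."""
    return cells / seconds


def aggregate(rate, burst, intro_points):
    """Total rate and burst a service sees across all its intro points."""
    return rate * intro_points, burst * intro_points


attack_rate = per_second(15000, 60)  # cells seen at the service in 1 minute
rate, burst = 50, 200                # proposed per-intro-point values
total_rate, total_burst = aggregate(rate, burst, 3)
print(attack_rate, total_rate, total_burst)  # prints: 250.0 150 600
```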
I'm not that excited about hardcoded network-wide values, which is why the next section is more exciting but much more work for us!
One step further: we have not really decided yet if this is something we want, nor whether we have time to tackle it, but an idea here would be for a service to inform the intro point, using the ESTABLISH_INTRO cell payload, of the parameters it wants for its DoS defenses. So, let's say, a very popular .onion with OnionBalance and 10 intro points could tell its intro points that it wants much higher values for the DoS defenses (or even turn them off).
However, it doesn't change the initial building block of being able to rate limit at the introduction point. As a second step, we can add this new type of ESTABLISH_INTRO cell. It is always dicey to introduce a new cell since, for instance, this would leak information to the intro point that the service is ">= version". Thus, this needs to be done carefully.
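To give a rough feel for that second step, here is a purely hypothetical Python sketch of how a service could pack its wanted parameters and how an intro point could fall back to the network-wide defaults when nothing is sent. The byte layout, the function names and the defaults-as-arguments are all assumptions made up for this example; nothing here is an actual or proposed Tor wire format.

```python
import struct

# Hypothetical layout, NOT a Tor wire format: a 1-byte enable flag followed
# by the rate and the burst as unsigned 32-bit big-endian integers.
_FMT = "!BII"


def encode_dos_params(enabled, rate, burst):
    """Service side: pack the DoS parameters it wants its intro point to use."""
    return struct.pack(_FMT, 1 if enabled else 0, rate, burst)


def decode_dos_params(payload, default_rate=50, default_burst=200):
    """Intro point side: return (enabled, rate, burst).

    An empty payload means the service sent nothing, so the hardcoded
    defaults apply.
    """
    if not payload:
        return True, default_rate, default_burst
    if len(payload) != struct.calcsize(_FMT):
        raise ValueError("unexpected parameter blob length")
    flag, rate, burst = struct.unpack(_FMT, payload)
    return bool(flag), rate, burst
```

A service that sends nothing keeps getting the default behavior, which is the point of doing the hardcoded rate limit first and this part second.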
Time for your thoughts and help! :)
Thanks everyone!
David