On 20/12/2017 at 23:15, teor wrote:
On 21 Dec 2017, at 08:51, teor teor2345@gmail.com wrote:
- Why didn't we see this abuse wave coming? We kept replying to reporters of the dreaded "Failing because we have XXX connections already. Please read doc/TUNING for guidance" about how they could amend their config to accept more connections. The global scale of those events should have been detected, though, instead of most of us assuming it was due to bad configuration on individual nodes.
Load spikes are normal, particularly with the HSDir flag, because HSDir usage is not bandwidth-weighted.
Allowing more connections *is* the right thing to do with this attack, if your OS has the resources. Several of my relays never went down, because they were over-provisioned with RAM and CPU.
Others only went down temporarily, during the most intense phases. (And then their excessive bandwidth weight was redistributed, and they have been coping well.)
If you don't have the resources to handle that many connections, then limiting connections is the right thing to do. If you can't do it using tor, then a firewall is the way to go.
This has been put in place, and the relay is now able to sustain the still-ongoing flood.
(There are some bugs in Tor that make the attack more effective than it should be. We're working on fixing them.)
To mitigate this attack, we recommend setting MaxMemInQueues to the amount of RAM you have available per tor instance (or maybe a few hundred MB less).
Tor estimates it, but the estimate isn't very good.
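As a rough illustration of the recommendation above, here is a minimal Python sketch that turns "RAM available per tor instance, minus a few hundred MB" into a torrc line. The function names, the even split of RAM across instances, and the 300 MB default headroom are assumptions made for this example, not values from tor itself; only the MaxMemInQueues option name comes from the message.

```python
def max_mem_in_queues_mb(ram_mb: int, instances: int = 1, headroom_mb: int = 300) -> int:
    """Return a candidate MaxMemInQueues value (in MB) for one tor instance.

    Assumption: RAM is split evenly across instances, and headroom_mb
    (hypothetical default: 300) is left for the rest of the process and the OS.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    per_instance_mb = ram_mb // instances
    value_mb = per_instance_mb - headroom_mb
    if value_mb <= 0:
        raise ValueError("not enough RAM per instance for the requested headroom")
    return value_mb


def torrc_line(value_mb: int) -> str:
    """Format the value as a torrc option line."""
    return f"MaxMemInQueues {value_mb} MB"


if __name__ == "__main__":
    # Example: 4 GB of RAM shared by two tor instances.
    print(torrc_line(max_mem_in_queues_mb(4096, instances=2)))
```

The printed line would go into the torrc of each instance, followed by a reload (for example a SIGHUP). Adjust the headroom to whatever else runs on the machine.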
This was added about 12 hours ago (and the relay SIGHUPed), and I still cannot see any trace of circuit OOM kills in the relay logs.
And the 2 most recent heartbeat reports show a 'normal' circuit count.
Thanks for all the fish :)