Hi folks,
Thanks for your patience with the relay overload issues.
We've merged https://bugs.torproject.org/24902 into tor git master. We'll be putting out an 0.3.3.2-alpha release in not too long for wider testing, and eventually backporting it all the way back to 0.2.9, but if you're the sort who enjoys running code from git, now is a great time to try it and let us know of problems and/or successes.
Here's the changelog stanza:
o Major features:
- Give relays some defenses against the recent network overload. We start with three defenses (default parameters in parentheses). First: if a single client address makes too many connections (>100), hang up on further connections. Second: if a single client address makes circuits too quickly (more than 3 per second, with an allowed burst of 90) while also having too many connections open (3), refuse new create cells for the next while (1-2 hours). Third: if a client asks to establish a rendezvous point to you directly, ignore the request. These defenses can be manually controlled by new torrc options, but relays will also take guidance from consensus parameters, so there's no need to configure anything manually. Implements ticket 24902.
To repeat that last part: there are a bunch of torrc options you can use to tweak stuff, but you can leave it all at the defaults and it will read its instructions out of the consensus parameters: https://consensus-health.torproject.org/#consensusparams
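For those who want to experiment anyway, here is a rough torrc sketch that just restates the defaults above (option names are taken from the 24902 branch; double-check them against the man page in the upcoming alpha before relying on them, and remember that leaving everything unset lets the relay follow the consensus parameters):

  # Sketch only: restates the changelog defaults; verify names in the
  # 0.3.3.2-alpha man page. Options left unset follow the consensus.
  DoSCircuitCreationEnabled 1
  DoSCircuitCreationMinConnections 3
  DoSCircuitCreationRate 3
  DoSCircuitCreationBurst 90
  DoSConnectionEnabled 1
  DoSConnectionMaxConcurrentCount 100
  DoSRefuseSingleHopClientRendezvous 1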
Woo, --Roger
And packages for Debian-based OSes are probably in the next nightly master builds available at https://deb.torproject.org/torproject.org/dists/
nusenu:
I just added support for tor nightly build repos to ansible-relayor (Debian/Ubuntu only), to make it very easy to test bleeding edge #Tor features such as the new denial of service mitigations. https://github.com/nusenu/ansible-relayor
On 31 Jan 2018, at 20:37, nusenu nusenu-lists@riseup.net wrote:
We've merged https://bugs.torproject.org/24902 into tor git master. ...
If you compile using clang, there are some warnings that appear to be harmless: https://trac.torproject.org/projects/tor/ticket/25094
The overall design is solid and the defences seem to work. But we are still doing minor fixes before the release and backport. (That's why it's master :-)
And packages for Debian-based OSes are probably in the next nightly master builds available at https://deb.torproject.org/torproject.org/dists/
We merged about 30 minutes before the final debs were written to deb.torproject.org. So I'm not sure if they will all have these changes.
Your best bet is to wait 12 hours from now, and they'll be there.
T
-- Tim Wilson-Brown (teor)
teor2345 at gmail dot com PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B ricochet:ekmygaiu4rzgsk6n
teor:
We merged about 30 minutes before the final debs were written to deb.torproject.org. So I'm not sure if they will all have these changes.
Your best bet is to wait 12 hours from now, and they'll be there.
that is why I said _next_ nightly builds ;)
Woo, for sure!
On 01/31/2018 10:16 AM, Roger Dingledine wrote:
but if you're the sort who enjoys running code from git, now is a great time to try it and let us know of problems and/or successes.
At first glance, master (tor-0.3.3.1-alpha-42-g2294e330b) works like a charm here on hardened stable Gentoo with vanilla kernel 4.14.16 on both Tor exit relays.
Is that with or without additional firewall rules to combat the abundant connection issues?
On 01/31/2018 10:16 AM, Roger Dingledine wrote:
the sort who enjoys running code from git, now is a great time to try it and let us know of problems and/or successes.
tor-0.3.3.1-alpha-58-ga846fd267 is bad here; the inbound connections stay at 5-10.
tor-0.3.3.1-alpha-42-g2294e330b works fine so far (tested with additional firewall rules, now started w/o firewall rules).
Hi everybody,
On 31 Jan 2018 at 10:16, Roger Dingledine wrote:
now is a great time to try it and let us know of problems and/or successes.
Currently just success. NTor is still pretty high, circuits and TAP 'normal'. CPU is difficult to say; still pumping lots of circuits anyway. Settings are left to the consensus.
Two guards have been running for 6 hours and both show something like: DoS mitigation since startup: 19085 circuits rejected, 14 marked addresses. 0 connections closed. 12 single hop clients refused.
A middle relay (a long-term guard that will get its flag back soon), running for 10 hours, shows: DoS mitigation since startup: 67877 circuits rejected, 6 marked addresses. 0 connections closed. 263 single hop clients refused.
All are FreeBSD and still behind a firewall: 20 connects per /32 IP, rate-limited to 3 per second, immediate connection flushing, a blocking table shared across the relays, blocking duration 1 day per IP.
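For anyone curious, a rough pf.conf sketch of rules like those (interface, ORPort, and table name are placeholders; the cross-relay sharing and the 1-day expiry live outside pf, e.g. a periodic "pfctl -t tor_abuse -T expire 86400"):

  ext_if = "em0"                  # placeholder external interface
  table <tor_abuse> persist
  block in quick on $ext_if proto tcp from <tor_abuse> to any port 9001
  pass in on $ext_if proto tcp to any port 9001 flags S/SA keep state \
      (max-src-conn 20, max-src-conn-rate 3/1, overload <tor_abuse> flush global)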
Going to reduce fw after 24 hours step-by-step.
Thanks for the nice piece of software!
Has #2 been eval regarding onion indexing engines, oniontorrent, etc? They use a lot of resources for agnostic purposes. Censoring them as collateral damage would be bad.
On 1 Feb 2018, at 18:59, grarpamp grarpamp@gmail.com wrote:
Has #2 been eval regarding onion indexing engines, oniontorrent, etc? They use a lot of resources for agnostic purposes. Censoring them as collateral damage would be bad.
Applications that use a lot of resources will have to rate-limit themselves. Otherwise, relays will rate-limit them.
Since traffic is encrypted, there is no way we can distinguish "good" scanners or downloaders from "bad" ones. All applications will need to respect these new limits.
T
Applications that use a lot of resources will have to rate-limit themselves. Otherwise, relays will rate-limit them.
It's possible if relays figure that stuff by #2 might not be an attack per se, but could be user activities... that relays might push back on that one by...
- Seeking significantly higher default values committed
- Seeking default action committed as off
- Setting similar on their own relays if commits don't work.
And by not being default off, it should be prominently documented if #2 affects users activities [1].
Indexers will distribute around it, yielding zero sum gain for the network and nodes. Multiparty onion p2p protocols could suffer though if #2 is expected to affect such things.
Was it ever discovered / confirmed what tool / usage was actually behind this recent ongoing 'DoS' phase? Whether then manifesting itself at the IP or tor protocol level.
Sorry if I missed this in all these threads.
[1] There could even be a clear section with simple named list of all options for further operator reading that could affect users activities / protocols.
On 01 Feb (04:01:10), grarpamp wrote:
Applications that use a lot of resources will have to rate-limit themselves. Otherwise, relays will rate-limit them.
It's possible if relays figure that stuff by #2 might not be an attack per se, but could be user activities... that relays might push back on that one by...
- Seeking significantly higher default values committed
- Seeking default action committed as off
- Setting similar on their own relays if commits don't
work. And by not being default off, it should be prominently documented if #2 affects users activities [1].
That I agree. We've set up default values for now and they will probably be adapted over time, so for now this is experimental to see how much we make people unhappy (well, except for the people doing the DoS ;).
But I do agree that we should document some "real life" use cases that could trigger defenses at the relay in some very public way (blog post or wiki) before this goes wide in the network. A large number of tor clients behind NAT is one I have in mind; IPv6 as well...
Indexers will distribute around it, yielding zero sum gain for the network and nodes. Multiparty onion p2p protocols could suffer though if #2 is expected to affect such things.
I just want to clarify the #2 defense, which is the circuit creation mitigation. The circuit rate is something we can adjust over time and we'll see how that plays out, as I said above.
However, to be identified as malicious, the IP address needs to be above 3 concurrent TCP connections (also a parameter we can adjust if too low). Normal usage of a "tor" client should never make more than a single TCP connection to the Guard; everything is multiplexed in that connection.
So let's assume some person wants to "scan the onion space" and fires up 100 tor clients behind one single IP address, which results in a massive amount of HS connections to every .onion it can find. These tor clients in general won't pick the same Guard, but let's say 3 of them do, which will trigger the concurrent connection threshold for circuit creation.
Doing 3 circuits a second continuously, up to a burst of 90, is still a _large_ number that a relay needs to mitigate in some way so it can operate properly and be fair to the rest of the clients doing way less in normal circumstances.
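To make the rate/burst mechanics concrete, here is an illustration only (not the code from the branch, and in the real thing it only applies once the 3-concurrent-connection threshold is met) of how a per-address token bucket with those numbers behaves: a CREATE cell is accepted while tokens remain, and tokens refill at 3 per second up to the burst of 90.

  /* Illustration only; not tor's actual implementation. */
  #include <stdint.h>
  #include <time.h>

  #define CIRC_RATE  3    /* circuits allowed per second */
  #define CIRC_BURST 90   /* maximum bucket size */

  typedef struct {
    uint32_t tokens;      /* circuits still allowed right now */
    time_t last_refill;   /* last time tokens were added */
  } circ_bucket_t;

  static int
  allow_create_cell(circ_bucket_t *b, time_t now)
  {
    uint64_t refill = (uint64_t)(now - b->last_refill) * CIRC_RATE;
    b->tokens = (refill + b->tokens > CIRC_BURST)
                  ? CIRC_BURST : (uint32_t)(b->tokens + refill);
    b->last_refill = now;
    if (b->tokens == 0)
      return 0;           /* over the limit: the defense kicks in */
    b->tokens--;
    return 1;             /* within the limit: process the CREATE cell */
  }

So a client that bursts 90 circuits at once exhausts the bucket and then gets at most 3 new circuits per second until it backs off.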
IMO, just because someone can buy big servers and big uplinks doesn't mean they should be allowed to saturate links on the network. Unfortunately, the network has limitations, and this latest DoS is showing us that relays have to rate-limit stuff in a fair way if possible.
I bet there will be collateral damage from people currently using the network in insane or unique ways. But overall, I don't expect that it will hurt most use cases, because 1) we made it so that only rare (or unknown) patterns of tor client usage can trigger this, and 2) we can adjust any single parameter through the consensus if needed.
We'll break some eggs in the beginning and we should act accordingly, but one thing is certain: the current situation is not sustainable for any user on the network.
From now on, we can only improve this DoS mitigation subsystem! :)
Cheers! David
I've updated my entire fleet (https://atlas.torproject.org/#search/family:2F9A6B5ADBE91EC69F55AAFB7DC49619...) today around 11:30AM to 0.3.3.1-alpha-dev (git-d1c2597096cac27e) and so far it looks like the mitigations are working nicely. Pretty graphs supporting that claim: https://tor.0x3d.lu/DoS/
Olaf is 0x3d001 + 002 which was pretty much under constant fire those past few weeks. Elsa would be 0x3d006 + 007 which was not targeted that much but it sure cut memory usage drastically.
Keep up the good work everyone!
---- Andy Weber andy@0x3d.lu
Cc'ing torservers for bridge OutboundBindAddress, and Mike for vanguards.
Here are the mitigations again:
o Major features:
- Give relays some defenses against the recent network overload. We start with three defenses (default parameters in parentheses). First: if a single client address makes too many connections (>100), hang up on further connections. Second: if a single client address makes circuits too quickly (more than 3 per second, with an allowed burst of 90) while also having too many connections open (3), refuse new create cells for the next while (1-2 hours).
We could patch clients so they never exceed this number of circuits by default. But that would penalise large clients that have a dedicated IP address.
Should we warn once instead?
"Your client is making a large number of circuits (%u over %u seconds). If multiple (%u) Tor clients are connected from your IP address, your guards make refuse to make circuits."
I think there would be so many false positives, it wouldn't be worth it.
Third: if a client asks to establish a rendezvous point to you directly, ignore the request. These defenses can be manually controlled by new torrc options, but relays will also take guidance from consensus parameters, so there's no need to configure anything manually. Implements ticket 24902.
On 2 Feb 2018, at 03:04, David Goulet dgoulet@torproject.org wrote:
That I agree. We've set up default values for now and they will probably be adapted over time, so for now this is experimental to see how much we make people unhappy (well, except for the people doing the DoS ;).
But I do agree that we should document some "real life" use cases that could trigger defenses at the relay in some very public way (blog post or wiki) before this goes wide in the network. A large number of tor clients behind NAT is one I have in mind; IPv6 as well...
We did some analysis when we were choosing these figures.
It takes a few hundred clients behind an IP address to have a 50% probability of 3 clients choosing the same large guard. That's unusual. And if clients see their guard time out, they will move to another guard.
Here are some other scenarios:
Peer-to-peer clients like Ricochet, when the user has >90 contacts, but only when there are hundreds of other clients on the same IP address.
Any Tor client that doesn't use guards. For example:
Bridges with multiple users, but only when there are >=3 bridges per outbound IP address. (This is unlikely, because bridges need their own IPv4 address. If you have multiple bridges using the default route, and multiple IP addresses, set OutboundBindAddress on each bridge; see the sketch after this list.) We will need to consider this issue when we allow IPv6 bridges without a public IPv4 address. Perhaps an appropriate solution is to make bridge clients use vanguards.
Multiple (>=3) Tor2web or single onion services using separate tor instances, behind a single IP address, making large numbers of circuits. This is a likely source of our current issues.
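For the multi-bridge case above, a minimal torrc sketch with placeholder addresses: each instance binds its outbound connections to its own address, so the relays it connects to see distinct client IPs.

  # bridge instance 1 (placeholder addresses)
  ORPort 192.0.2.10:443
  OutboundBindAddress 192.0.2.10

  # bridge instance 2
  ORPort 192.0.2.11:443
  OutboundBindAddress 192.0.2.11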
T
Hi all,
Not sure where to hook into the discussion; apologies if I offend anyone by spinning off a new thread from this first message.
On 31 Jan 2018, at 10:16, Roger Dingledine wrote:
Hi folks,
Thanks for your patience with the relay overload issues.
We've merged https://bugs.torproject.org/24902 into tor git master. We'll be putting out an 0.3.3.2-alpha release in not too long for wider testing, and eventually backporting it all the way back to 0.2.9, but if you're the sort who enjoys running code from git, now is a great time to try it and let us know of problems and/or successes.
One relay has been running for 3 days, with all FW rate limiting removed, the other ~2 days. Is any feedback expected/appreciated?
I can share the heartbeat logging (the now three lines - Heartbeat / Circuit handshake stats / DoS) or anything else? [I assume it's safe to share without the bandwidth on the Heartbeat line, but please confirm.]
From a "how are things running?" perspective: the "DoS attacks" come and go, and for now it looks good. Nothing out of the ordinary. CPU/mem usage seems comparable while not under attack; traffic volumes seem a bit lower. The only visible thing is being marked on "Atlas" as a version with possible issues :-)
Thx, Stijn
On Wed, Jan 31, 2018 at 04:16:52AM -0500, Roger Dingledine wrote:
Thanks for your patience with the relay overload issues.
Early indications are that the overloaders have stopped. At least for now, but hopefully for longer.
https://metrics.torproject.org/userstats-relay-country.html?start=2017-12-01...
We'll know in a few days if this is real or just a misreading in our stats. But I think/hope it's real. :)
Woo, --Roger
On 03/11/2018 08:33 AM, Roger Dingledine wrote:
On Wed, Jan 31, 2018 at 04:16:52AM -0500, Roger Dingledine wrote:
Thanks for your patience with the relay overload issues.
Early indications are that the overloaders have stopped. At least for now, but hopefully for longer.
https://metrics.torproject.org/userstats-relay-country.html?start=2017-12-01...
But https://metrics.torproject.org/versions.html doesn't show a strong correlation in the decrease/increase of a specific Tor version, so I do wonder how to interpret the user numbers.
33% of guard capacity and 37% of consensus weight is running on tor versions with DoS mitigation features.
On 03/11/2018 09:44 AM, nusenu wrote:
33% of guard capacity and 37% of consensus weight is running on tor versions with DoS mitigation features.
But there was no abrupt change around that time where the # of users dropped down - so there's no strong correlation IMO.
On 3/11/18 10:15, Toralf Förster wrote:
But there was no abrupt change around that time where the # of users dropped down - so there's no strong correlation IMO.
Does there have to be a correlation between the number of Tor users and the relays updating to versions including DoS mitigation?
Couldn't the extra abusive users have just gone away, now that they're done with whatever they were doing?
Matt