Hey,
new non-exit relay, Debian 11, tor 0.4.7.8-1~d11.bullseye+1, ~ 1 week old (-> no guard)
KVM VM with atm 4 cores, host passthrough AMD EPYC (-> AES HW accel.).
As can be seen in the attached screenshots, memory consumption is puzzling, as is the quite high CPU load.
All was fine while it carried ~100 Mbit/s, but then onion skins exploded (110 per second -> up to 4k per second), and CPU and memory usage went up with them.
Tor complains:
Your computer is too slow to handle this many circuit creation requests!
Please consider using the MaxAdvertisedBandwidth config option or choosing a more restricted exit policy.
And from time to time the kernel's out-of-memory killer takes action.
torrc is pretty basic:
Nickname 123
ContactInfo 123
RunAsDaemon 1
Log notice syslog
RelayBandwidthRate 2X MBytes
RelayBandwidthBurst 2X MBytes
SocksPort 0
ControlSocket 0
CookieAuthentication 0
AvoidDiskWrites 1
Address xxxx
OutboundBindAddress yyyy
ORPort xxxx:yyy
Address [zzzz]
OutboundBindAddress [zzz]
ORPort [zzz]:xxx
MetricsPort hhhh:sss
MetricsPortPolicy accept fffffff
DirPort yy
Sandbox 1
NoExec 1
CellStatistics 1
ExtraInfoStatistics 1
ConnDirectionStatistics 1
EntryStatistics 1
ExitPortStatistics 1
HiddenServiceStatistics 1
Ideas/suggestions (apart from limiting BW) to fix this?
Thanks
fran
On 22 Jul (23:28:51), Fran via tor-relays wrote:
Ideas/suggestions (apart from limiting BW) to fix this?
We are currently seeing huge memory pressure on relays. So far I have been unable to find any kind of memory leak, so there is a distinct possibility that relays are somehow accumulating legitimate memory. We are still investigating all this heavily and working on ways to reduce the footprint.
In the meantime, we know that the "CellStatistics" option is very, very memory hungry, so you could disable that one and see if this stabilizes things for you.
The other option that usually helps with memory pressure is "MaxMemInQueues" (man 1 tor). Essentially, it tells "tor" when to start running its "out of memory handler" (OOM). It is usually set to around 75% of your total memory, but you could reduce it and see if this helps.
However, under the current network conditions I would really NOT put it below 2GB. And if the OOM handler gets triggered too many times and you can spare the memory, bump it up to at least 4GB.
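As a rough illustration of the arithmetic above, here is a small Python sketch that derives a candidate "MaxMemInQueues" value from a /proc/meminfo MemTotal line. It only follows the rule of thumb from this message (about 75% of total memory, never below 2GB); it is not taken from the tor source, and the function names and the sample input are made up for the example.

```python
GIB = 1024 ** 3
FLOOR_BYTES = 2 * GIB  # lower bound suggested in this thread, not a tor constant


def parse_memtotal_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("no MemTotal line found")


def suggest_max_mem_in_queues(memtotal_kb, fraction=0.75):
    """Return a candidate limit in bytes: a fraction of RAM, floored at 2 GiB."""
    wanted = int(memtotal_kb * 1024 * fraction)
    return max(wanted, FLOOR_BYTES)


def torrc_line(limit_bytes):
    """Format the limit as a torrc line, rounded down to whole MBytes."""
    return "MaxMemInQueues {} MBytes".format(limit_bytes // (1024 * 1024))


# Hypothetical 8 GiB machine; on a real relay, pass in the contents of
# /proc/meminfo instead of this sample string.
sample = "MemTotal:        8388608 kB\nMemFree:         1234567 kB\n"
limit = suggest_max_mem_in_queues(parse_memtotal_kb(sample))
print(torrc_line(limit))
```

Lowering the `fraction` argument corresponds to the "reduce it and see if this helps" advice; the floor keeps the result from dropping below the 2GB mark mentioned above.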
The current network conditions are abnormal and, often coupled with other things, create the kind of resource pressure on relays that we rarely experience, so our team has to look for a needle in a haystack when it happens.
Thanks for the report! And thanks to everyone for helping us through these difficult times for our relays and users.
Cheers! David
On 25 Jul (19:31:16), Toralf Förster wrote:
On 7/25/22 14:48, David Goulet wrote:
It is usually set around 75% of your total memory
Is there a max limit?
It is capped at "SIZE_MAX", which on 64 bit is gigantic, around 18k petabytes. On Linux, we use /proc/meminfo (MemTotal line), so whatever maximum the kernel imposes there applies as well.
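For reference, the "18k petabytes" figure can be checked with a few lines of Python. This assumes a 64-bit size_t, so SIZE_MAX is 2^64 - 1 bytes; the variable names are only for illustration.

```python
SIZE_MAX_64 = 2**64 - 1  # largest value of a 64-bit size_t, in bytes

petabytes = SIZE_MAX_64 / 10**15  # decimal petabytes (PB)
pebibytes = SIZE_MAX_64 / 2**50   # binary pebibytes (PiB)

print("{:,.0f} PB".format(petabytes))
print("{:,.0f} PiB".format(pebibytes))
```

In decimal units this comes out at a bit over 18,000 PB, which matches the number quoted above; in binary units it is about 16,384 PiB.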
Cheers! David
On 7/25/22 19:56, David Goulet wrote:
On Linux, we use /proc/meminfo (MemTotal line), so whatever maximum the kernel imposes there applies as well.
Here, both Tor relays use about 4 GB each:
$ pgrep tor | xargs -n 1 pmap | grep total
total 4211476K
total 4226580K
whilst more would be available:
$ head /proc/meminfo
MemTotal:       131830284 kB
MemFree:          3395252 kB
MemAvailable:   105765448 kB
Buffers:               40 kB
Cached:          97275636 kB
SwapCached:         47656 kB
Active:          79333376 kB
Inactive:        34977700 kB
Active(anon):     6927704 kB
Inactive(anon):  14077848 kB
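To put the two outputs above side by side, a short Python sketch can sum the pmap totals and compare them with MemTotal. The sample strings are copied from the outputs above; the helper names are made up for this example. Note that pmap's "total" line counts mapped address space, which can be larger than what is actually resident.

```python
import re


def pmap_totals_kb(pmap_output):
    """Extract the per-process 'total NNNK' values (in kB) from pmap output."""
    return [int(m) for m in re.findall(r"total\s+(\d+)K", pmap_output)]


def memtotal_kb(meminfo_text):
    """Extract MemTotal (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    return int(match.group(1))


pmap_sample = "total 4211476K\ntotal 4226580K\n"
meminfo_sample = "MemTotal:       131830284 kB\nMemFree:          3395252 kB\n"

totals = pmap_totals_kb(pmap_sample)
share = sum(totals) / memtotal_kb(meminfo_sample)
print("{:.2f} GiB mapped, {:.1%} of MemTotal".format(sum(totals) / 1024**2, share))
```

With these numbers the two relays together map about 8 GiB, roughly 6% of MemTotal, so they are nowhere near 75% of the machine's memory.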
-- Toralf
Hej,
thanks David!
In the meantime, we know that the "CellStatistics" option is very, very memory hungry, so you could disable that one and see if this stabilizes things for you.
This helped with the memory consumption. It still goes up to 6G.
The current network conditions are abnormal and, often coupled with other things, create the kind of resource pressure on relays that we rarely experience, so our team has to look for a needle in a haystack when it happens.
I cut the BW nearly in half, but tor is still complaining a lot:
"Your computer is too slow to handle this many circuit creation requests!"
At times it receives more than 6k ntor_v3 c/s.
CPU and memory as well as bandwidth are not saturated.
So I wonder what would be best to do: let it complain, since the overload is due to the network conditions, or reduce BW further until the log messages disappear?
What would be best for the network?
Thanks a lot!
Fran
tor-relays@lists.torproject.org