Re: [tor-relays] Did 'Sandbox 1' break Tor for anyone else on 0.4.5.6?

16 Mar 2021


Hi William,
William Kane:
...
Hi Peter,
...
Would be great if you could get details about the failing call.
I already thought of gathering those details by tracing the process,
but did not want to risk my uptime statistics, which would inevitably
suffer if I had to restart the server and service over and over (I
disabled tracing globally through the Yama LSM as a security measure,
i.e. kernel.yama.ptrace_scope == 3). Recently I lost the guard flag
multiple times because of some sort of attack that I already reported
on this list (tor-relays): someone kept creating a fuckton of
circuits through my relay (averaging 90k per minute), causing tor
to run out of memory and get oom-killed by the kernel before it could
even step in and close the circuits, if it was even trying to. It
would make sense for the DoS mitigation code to be active only for the
first link in the circuit, aka the guard, and with my node simply being
a middle-only relay, it got completely stomped by said attack.
After somewhat mitigating this attack by tweaking MaxMemInQueues,
creating a bigger swap file and tuning vm.swappiness, I regained the
guard flag, but then the hypervisor my KVM box is running under
experienced some issues and had to be rebooted - once again, I
received no notice of that until the relay was already offline for a
few days, causing me to lose the guard flag again.
Seems like luck is just not on my side these days, or well, it's been weeks now.
You could try to just run a second instance of Tor by copying the
systemd config and Tor settings. You probably don't need to enable
OrPort and ControlPort to reproduce the issue.
...
...
You should simply see a Permission Denied if the capability is the problem.
Here's a copy from stdout; this only happens if Sandbox is set to 1:
Mar 15 20:15:20.000 [notice] Configured to measure statistics. Look
for the *-stats files that will first be written to the data directory
in 24 hours from now.
Mar 15 20:15:21.000 [warn] fstat() on directory /var/lib/tor_debug failed.
Mar 15 20:15:21.000 [err] Can't create/check datadirectory /var/lib/tor_debug
Mar 15 20:15:21.000 [err] Error initializing keys; exiting
Running it as a privileged user does not change anything, so it is not a permissions issue:
Mar 15 20:17:24.000 [notice] Configured to measure statistics. Look
for the *-stats files that will first be written to the data directory
in 24 hours from now.
Mar 15 20:17:24.000 [warn] You are running Tor as root. You don't need
to, and you probably shouldn't.
Mar 15 20:17:25.000 [warn] fstat() on directory /var/lib/tor_debug failed.
Mar 15 20:17:25.000 [err] Can't create/check datadirectory /var/lib/tor_debug
Mar 15 20:17:25.000 [err] Error initializing keys; exiting
I've traced down the origin of the fstat() call to this piece of code:
https://github.com/torproject/tor/blob/master/src/lib/fs/dir.c#L158
However, looking at the code that establishes and populates the seccomp
rules, it seems that fstat and its 64-bit counterpart are not subject
to (parameter) filtering, i.e. seccomp_rule_add_0 is invoked with the
action SCMP_ACT_ALLOW. The manpage for seccomp_rule_add(3) says of that
action: "The seccomp filter will have no effect on the thread calling
the syscall if it matches the filter rule."
References:
https://github.com/torproject/tor/blob/master/src/lib/sandbox/sandbox.c#L148
https://github.com/torproject/tor/blob/master/src/lib/sandbox/sandbox.c#L159...
https://man7.org/linux/man-pages/man3/seccomp_rule_add.3.html
So, even though technically, seccomp should allow these syscalls to be
invoked, no matter which parameters are passed, somehow enabling the
whole sandbox subsystem still breaks fstat.
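For readers who want to see the shape of the failing check, here is a minimal Python sketch. It assumes the check opens the data directory and calls fstat() on the resulting descriptor, which is what the "fstat() on directory ... failed" warning suggests. The function name check_directory and its messages are hypothetical and are not Tor's code. Python's os.fstat() goes through the C library, so which kernel syscall ends up being issued is decided by libc, not by this script.

```python
import os
import stat
import tempfile


def check_directory(path):
    """Open `path` and fstat() the descriptor; return (ok, message)."""
    try:
        # O_NOFOLLOW refuses to follow a symlink in the final path component.
        fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    except OSError as exc:
        return False, f"open() on {path} failed: {exc.strerror}"
    try:
        try:
            st = os.fstat(fd)
        except OSError as exc:
            # If a seccomp filter denied the underlying syscall,
            # the error would surface at this point.
            return False, f"fstat() on directory {path} failed: {exc.strerror}"
        if not stat.S_ISDIR(st.st_mode):
            return False, f"{path} is not a directory"
        return True, "ok"
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(check_directory(tmp))
```

Outside a sandbox this succeeds for any readable directory, so it only illustrates where the error is raised, not why the filter rejects the call.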
fstat() in the log above refers to the fstat() function in libc, but libc
can use various syscalls in the background to implement it. I could
find fstat, fstat64 and fstatat64, and newer kernels may have even more
syscalls that could be used. Usually, when seccomp starts failing, it
is because a library (like libc) was updated and started using another
syscall to implement a function (like fstat()), or because the kernel was
updated, which the library detected, and it started using a new, "improved"
syscall. To be sure which syscall is used, the auditd logs would be
invaluable. The performance impact should be negligible if you don't
manually add any auditing rules.
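A rough sketch of how the audit log could be checked, assuming auditd actually wrote a type=SECCOMP record for the denied call (that depends on the filter action and the kernel's logging settings). The helper name blocked_syscalls is hypothetical. The sample record is made up for illustration, not taken from this thread, and its syscall number is chosen only to exercise the lookup. The table is a small excerpt of the x86_64 syscall numbers; other architectures use different numbers.

```python
import re

# Excerpt of the x86_64 syscall table (stat family only).
X86_64_STAT_SYSCALLS = {
    4: "stat",
    5: "fstat",
    6: "lstat",
    262: "newfstatat",
    332: "statx",
}

# Matches the comm="..." and syscall=N fields of a SECCOMP audit record.
SECCOMP_RE = re.compile(
    r'type=SECCOMP\b.*?\bcomm="(?P<comm>[^"]*)".*?\bsyscall=(?P<nr>\d+)'
)

# Hypothetical record, for illustration only.
SAMPLE = (
    'type=SECCOMP msg=audit(1615839321.000:123): auid=4294967295 uid=0 '
    'gid=0 ses=4294967295 pid=4242 comm="tor" exe="/usr/bin/tor" sig=0 '
    'arch=c000003e syscall=262 compat=0 ip=0x7f0000000000 code=0x50001'
)


def blocked_syscalls(lines, comm="tor"):
    """Return (number, name) pairs from SECCOMP records of process `comm`."""
    found = []
    for line in lines:
        match = SECCOMP_RE.search(line)
        if match and match.group("comm") == comm:
            nr = int(match.group("nr"))
            found.append((nr, X86_64_STAT_SYSCALLS.get(nr, "unknown")))
    return found


if __name__ == "__main__":
    for nr, name in blocked_syscalls([SAMPLE]):
        print(f"syscall {nr} ({name})")
```

On a real system the lines would come from the audit log (commonly /var/log/audit/audit.log) rather than from SAMPLE. Any number that maps to a name missing from the sandbox's allow-list is the candidate to investigate.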
