Hi everyone,
I stumbled upon an issue while testing torsocks[1] with firefox. I'm still wondering how it can be fixed, so I need more eyes on this :). The issue is that torsocks gets into a deadlock inside the libc during its initialization phase.
Here it is. This new torsocks version hijacks the "syscall" symbol (syscall(2)) in order to intercept applications that decide to do network operations through that interface. To do that, the torsocks library constructor (executed before the application's main()) looks up the original symbol in the libc (dlopen(3)); that original symbol is then used for unhandled syscall values (for instance open(2)).
Now, the issue was detected with firefox, which uses a custom malloc hook, meaning that it handles its own memory allocation. This hook uses mmap(), which firefox redefines to be a direct syscall(__NR_mmap, ...), and remember that this symbol is hijacked by torsocks.
The torsocks constructor calls dlsym() to get the original libc syscall symbol. This call takes a "loading lock" inside the libc:
dlfcn/dlsym.c +68: __rtld_lock_lock_recursive (GL(dl_load_lock));
Just after that, dlerror_run is called, which does a calloc(), which in turn calls the firefox malloc hook, which calls syscall() for mmap, which torsocks hijacks. In torsocks, syscall() checks whether the original libc syscall pointer is NULL and, if it is, tries to look it up with dlsym(). And there you have the deadlock.
dlsym --> LOCK --> dlerror_run --> calloc --> syscall() --> dlsym() --> dlerror_run --> DEADLOCK.
It's a bit of a catch-22, because torsocks is basically looking for the libc syscall symbol but then gets called inside that very lookup code path...
To be honest, I am not sure what the right fix is here, or whether there is any way to look up the symbol in a "special" way that would help. Any ideas or questions are VERY welcome :).
I hope this explanation is clear enough; this is a "not that trivial" issue.
Cheers! David
On Tue, Oct 29, 2013 at 2:38 PM, David Goulet dgoulet@ev0ke.net wrote:
> To be honest, I am not sure what the right fix is here, or whether there is any way to look up the symbol in a "special" way that would help. Any ideas or questions are VERY welcome :).
My first thought -- and I don't know how good it is -- is that perhaps you could just *not* look at syscalls that occur during the dlsym calls that you launch? In other words, disable the syscall override if the current thread is already inside the dlsym() call inside your syscall override.
Would that work? What would it break, if anything?
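A minimal sketch of the idea suggested above, and not torsocks code: a per-thread flag records that the wrapper is already inside its own lookup, so a nested call skips the lookup instead of starting a second one. Every name here (wrapped_syscall, fake_lookup, stub_libc_syscall, reentrant_calls) is hypothetical, and fake_lookup() only stands in for dlsym() calling back into the wrapper through the allocator.

```c
#include <stddef.h>

typedef long (*syscall_fn)(long number);

/* Hypothetical stand-in for the real libc syscall(): just tags the number. */
long stub_libc_syscall(long number) { return number + 1000; }

syscall_fn real_syscall = NULL;       /* resolved lazily, like the libc pointer */
_Thread_local int in_lookup = 0;      /* set while this thread runs the lookup */
int reentrant_calls = 0;              /* counts nested calls, for illustration */

long wrapped_syscall(long number);

/* Hypothetical stand-in for dlsym(): during the lookup, the allocator ends
 * up calling the hijacked syscall() again, which is modelled here. */
syscall_fn fake_lookup(void)
{
    wrapped_syscall(9);               /* e.g. the malloc hook's mmap request */
    return stub_libc_syscall;
}

long wrapped_syscall(long number)
{
    if (real_syscall == NULL) {
        if (in_lookup) {
            /* Nested call from inside our own lookup: do not look up again.
             * There is still no libc pointer to forward to at this point. */
            reentrant_calls++;
            return -1;
        }
        in_lookup = 1;
        real_syscall = fake_lookup();
        in_lookup = 0;
    }
    return real_syscall(number);
}
```

The first call to wrapped_syscall() resolves the pointer once, and the nested call made during the lookup is only counted and refused. The guard avoids the second lookup, but it does not give the nested call anything to forward to, and the pointer itself is not protected against concurrent first calls from several threads.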
On 29 Oct (14:58:44), Nick Mathewson wrote:
> On Tue, Oct 29, 2013 at 2:38 PM, David Goulet dgoulet@ev0ke.net wrote:
>> To be honest, I am not sure what the right fix is here, or whether there is any way to look up the symbol in a "special" way that would help. Any ideas or questions are VERY welcome :).
> My first thought -- and I don't know how good it is -- is that perhaps you could just *not* look at syscalls that occur during the dlsym calls that you launch? In other words, disable the syscall override if the current thread is already inside the dlsym() call inside your syscall override.
That would work if there were a way I could "defer" the hijack of the syscall symbol... Unfortunately, this is done at link time, so at run time the syscall symbol is already hijacked by torsocks.
Let's say we don't try to look up the syscall symbol: the issue is that the original libc syscall pointer will NOT exist within the torsocks code, so we can't handle calls to syscall() because we can't route them to the libc. :S
It's really that we get into a kind of "infinite loop" where dlsym calls syscall, which calls dlsym, and so on. But in the first place, we at least need the libc syscall symbol so we can handle them.
David
> Would that work? What would it break, if anything?
Nick _______________________________________________ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On Tue, Oct 29, 2013 at 03:10:50PM -0400, David Goulet wrote:
> That would work if there were a way I could "defer" the hijack of the syscall symbol... Unfortunately, this is done at link time, so at run time the syscall symbol is already hijacked by torsocks.
> Let's say we don't try to look up the syscall symbol: the issue is that the original libc syscall pointer will NOT exist within the torsocks code, so we can't handle calls to syscall() because we can't route them to the libc. :S
> It's really that we get into a kind of "infinite loop" where dlsym calls syscall, which calls dlsym, and so on. But in the first place, we at least need the libc syscall symbol so we can handle them.
Might it be possible to use objcopy tricks like --prefix-symbols or --redefine-sym to make the exported version of syscall different from the imported version? Then the torsocks code could just call syscall() as a normal libc function, linked by ld.so, but when firefox called syscall, it would call torsocks's torsocks_syscall(), or something?
- Ian
On 29 Oct (16:41:02), Ian Goldberg wrote:
> On Tue, Oct 29, 2013 at 03:10:50PM -0400, David Goulet wrote:
>> That would work if there were a way I could "defer" the hijack of the syscall symbol... Unfortunately, this is done at link time, so at run time the syscall symbol is already hijacked by torsocks.
>> Let's say we don't try to look up the syscall symbol: the issue is that the original libc syscall pointer will NOT exist within the torsocks code, so we can't handle calls to syscall() because we can't route them to the libc. :S
>> It's really that we get into a kind of "infinite loop" where dlsym calls syscall, which calls dlsym, and so on. But in the first place, we at least need the libc syscall symbol so we can handle them.
> Might it be possible to use objcopy tricks like --prefix-symbols or --redefine-sym to make the exported version of syscall different from the imported version? Then the torsocks code could just call syscall() as a normal libc function, linked by ld.so, but when firefox called syscall, it would call torsocks's torsocks_syscall(), or something?
I've played a bit with objcopy, and redefining dynamic symbols is not possible. A stripped binary also makes things harder...
Unless you know a way to do that, I'll look in another direction.
Big thanks Ian! David
> - Ian
David Goulet:
> Now, the issue was detected with firefox, which uses a custom malloc hook, meaning that it handles its own memory allocation. This hook uses mmap(), which firefox redefines to be a direct syscall(__NR_mmap, ...), and remember that this symbol is hijacked by torsocks. […] It's a bit of a catch-22, because torsocks is basically looking for the libc syscall symbol but then gets called inside that very lookup code path...
Wouldn't one way out be to also hook malloc to use a static buffer until dlsym() is done? The code snippet in the following answer is doing just that: http://stackoverflow.com/a/10008252
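A rough sketch of the static-buffer idea, written independently of the linked answer: a calloc-like function that hands out zeroed memory from a fixed pool, which an interposed calloc() could use while the symbol lookup is still in progress. The names boot_calloc, boot_owns and BOOT_POOL_SIZE, as well as the pool size, are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOT_POOL_SIZE 4096           /* assumed size, a multiple of the alignment */

static _Alignas(max_align_t) unsigned char boot_pool[BOOT_POOL_SIZE];
static size_t boot_used = 0;          /* bytes handed out so far */

/* calloc-like bump allocator: zeroed memory from the static pool, or NULL
 * when the request overflows or no longer fits. Memory is never reused. */
void *boot_calloc(size_t nmemb, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t bytes, rounded;
    void *p;

    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;                  /* nmemb * size would overflow */
    bytes = nmemb * size;
    if (bytes > BOOT_POOL_SIZE)
        return NULL;
    /* Round up so that every returned pointer stays suitably aligned. */
    rounded = (bytes + align - 1) / align * align;
    if (rounded > BOOT_POOL_SIZE - boot_used)
        return NULL;

    p = boot_pool + boot_used;
    boot_used += rounded;
    memset(p, 0, bytes);
    return p;
}

/* Lets a matching free() recognise pool pointers and leave them alone. */
int boot_owns(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t start = (uintptr_t)boot_pool;

    return a >= start && a < start + BOOT_POOL_SIZE;
}
```

An interposed calloc() would call boot_calloc() until the real allocator has been resolved, and the matching free() would ignore any pointer for which boot_owns() returns nonzero. The sketch is not thread-safe, and whether an interposed calloc() is actually the one being called depends on how the application wires up its allocator.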
Lunar:
> David Goulet:
>> Now, the issue was detected with firefox, which uses a custom malloc hook, meaning that it handles its own memory allocation. This hook uses mmap(), which firefox redefines to be a direct syscall(__NR_mmap, ...), and remember that this symbol is hijacked by torsocks. […] It's a bit of a catch-22, because torsocks is basically looking for the libc syscall symbol but then gets called inside that very lookup code path...
> Wouldn't one way out be to also hook malloc to use a static buffer until dlsym() is done? The code snippet in the following answer is doing just that: http://stackoverflow.com/a/10008252
Meh… scratch that. It looks like defining calloc() in libtorsocks.so is not enough to have our own function called. Not sure why.
With the attached patch, at least we panic cleanly.
On 30 Oct (12:28:19), Lunar wrote:
> Lunar:
>> David Goulet:
>>> Now, the issue was detected with firefox, which uses a custom malloc hook, meaning that it handles its own memory allocation. This hook uses mmap(), which firefox redefines to be a direct syscall(__NR_mmap, ...), and remember that this symbol is hijacked by torsocks. […] It's a bit of a catch-22, because torsocks is basically looking for the libc syscall symbol but then gets called inside that very lookup code path...
>> Wouldn't one way out be to also hook malloc to use a static buffer until dlsym() is done? The code snippet in the following answer is doing just that: http://stackoverflow.com/a/10008252
> Meh… scratch that. It looks like defining calloc() in libtorsocks.so is not enough to have our own function called. Not sure why.
> With the attached patch, at least we panic cleanly.
OK, I managed to make it work with Firefox. The fix is simply to handle mmap/munmap inside the torsocks syscall code. This allows torsocks to find the syscall symbol in the libc and work well afterwards. It works because the firefox mmap() redefinition is not applied in libtorsocks, so we can directly call the mmap() symbol linked to the libc.
However, and it's a BIG however, this is a special fix for the specific case where memory allocation is handled by the application AND syscall() is used. It will not cover the broader issue of other syscalls being used within a malloc hook, for instance.
After two days, this is the only solution I see for now as a "working fix" for applications that use syscall() directly for their memory allocation.
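To illustrate the approach described here, a hedged sketch (not the actual torsocks change) of a variadic syscall-style wrapper that serves SYS_mmap and SYS_munmap by calling the libc mmap() and munmap() functions directly, so these two requests never need the looked-up syscall pointer. The name sketch_syscall is hypothetical, and the default branch only reports ENOSYS where real code would forward to the libc syscall().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>

/* Hypothetical wrapper: memory-mapping requests go straight to the libc
 * functions, everything else is refused in this sketch. */
long sketch_syscall(long number, ...)
{
    va_list args;
    long ret;

    va_start(args, number);
    switch (number) {
#ifdef SYS_mmap
    case SYS_mmap: {
        void *addr = va_arg(args, void *);
        size_t len = va_arg(args, size_t);
        int prot = va_arg(args, int);
        int flags = va_arg(args, int);
        int fd = va_arg(args, int);
        off_t offset = va_arg(args, off_t);

        /* The libc mmap() symbol is used directly, no syscall pointer. */
        ret = (long)(intptr_t)mmap(addr, len, prot, flags, fd, offset);
        break;
    }
#endif
    case SYS_munmap: {
        void *addr = va_arg(args, void *);
        size_t len = va_arg(args, size_t);

        ret = munmap(addr, len);
        break;
    }
    default:
        /* Real code would forward to the original libc syscall() here. */
        errno = ENOSYS;
        ret = -1;
        break;
    }
    va_end(args);
    return ret;
}
```

A caller has to pass arguments with exactly the types read by va_arg, for example sketch_syscall(SYS_mmap, (void *)NULL, (size_t)4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, (off_t)0). As noted above, this only covers allocators that reach mmap/munmap through syscall(); any other syscall issued from a malloc hook would still hit the lookup.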
Thoughts?
Cheers! David
-- Lunar lunar@torproject.org
diff --git a/src/lib/syscall.c b/src/lib/syscall.c
index 0edd460..d520c0a 100644
--- a/src/lib/syscall.c
+++ b/src/lib/syscall.c
@@ -17,6 +17,8 @@
 #include <assert.h>
 #include <stdarg.h>
+#include <stdlib.h>
+#include <stdio.h>
 
 #include <common/log.h>
 
@@ -112,6 +114,19 @@ LIBC_SYSCALL_DECL
     LIBC_SYSCALL_RET_TYPE ret;
     va_list args;
 
+#if defined(SYS_mmap) || defined(SYS_mmap2)
+    if (NULL == tsocks_libc_syscall) {
+        switch (__number) {
+        case SYS_mmap:
+#ifdef SYS_mmap2
+        case SYS_mmap2:
+#endif
+            fprintf(stderr, "Panic! mmap has been called before we had our hands on the real syscall()\n");
+            exit(EXIT_FAILURE);
+            break;
+        }
+    }
+#endif
     /* Find symbol if not already set. Exit if not found. */
     tsocks_libc_syscall = tsocks_find_libc_symbol(LIBC_SYSCALL_NAME_STR, TSOCKS_SYM_EXIT_NOT_FOUND);
David Goulet:
> OK, I managed to make it work with Firefox.
Yeah! :)
> However, and it's a BIG however, this is a special fix for the specific case where memory allocation is handled by the application AND syscall() is used. It will not cover the broader issue of other syscalls being used within a malloc hook, for instance.
> After two days, this is the only solution I see for now as a "working fix" for applications that use syscall() directly for their memory allocation.
> Thoughts?
As long as the case is detected and we print a nice error message instead of deadlocking, I think it'll be good enough. That's why I suggested that patch (actually broken, it needs better #if/#else).
On 30 Oct (23:07:18), Lunar wrote:
> David Goulet:
>> OK, I managed to make it work with Firefox.
> Yeah! :)
>> However, and it's a BIG however, this is a special fix for the specific case where memory allocation is handled by the application AND syscall() is used. It will not cover the broader issue of other syscalls being used within a malloc hook, for instance.
>> After two days, this is the only solution I see for now as a "working fix" for applications that use syscall() directly for their memory allocation.
>> Thoughts?
> As long as the case is detected and we print a nice error message instead of deadlocking, I think it'll be good enough. That's why I suggested that patch (actually broken, it needs better #if/#else).
Yah, agreed. For that, I'll simply check whether the libc symbol pointer for syscall() is NULL; if so, we are in a lookup loop and stop right there instead of deadlocking.
I'll push that fix soon for mmap/munmap(), with a BIG FAT comment explaining why and describing this special use case (malloc hook + direct syscall() in the hook).
Cheers! David