Re: [tor-dev] gettimeofday() Syscall Issues

2 Jan 2015

      I also completely forgot to mention this comment:
https://trac.torproject.org/projects/tor/ticket/14056#comment:6
I'll look into doing the replacements. If anyone has concerns about
whether approx_time is exact enough, please let me know - I had never
heard of it previously. I'll look into it myself as well.
On 01/01/2015 11:42 PM, Libertas wrote:
...
On 01/01/2015 03:47 PM, Nick Mathewson wrote:
...
What part of tor is accounting for all these time/gettimeofday calls?
I set up an OpenBSD 5.6 VM and built Tor 0.2.5.10 on it. I couldn't get
chutney working because of a number of strange errors that seemed
platform-specific, so I went for a less thorough solution. I can't run a
Tor relay from home, so the only activity the Tor process had was some
Firefox traffic I created.
There are surprisingly few syscall analysis tools available on OpenBSD,
and I'm new to syscall/interrupt tuning, so I eventually ended up
selecting the advanced "printf()" technique.
As a quick refresher to those not familiar, there is a linear heirarchy
of time functions in the Tor code, each calling the one below it:
gettimeofday() < tor_gettimeofday() < tor_gettimeofday_cached() <
tor_gettimeofday_cached_monotonic()
The first two will always make the syscall, while the other two will
first try the cache. I applied print statements to instances of all four.
There was a steady stream of tor_gettimeofday() calls from
circuit_expire_building() - about two per second with a single idle TCP
connection.
When there was active traffic flow, there were many more gettimeofday()
calls that were almost exclusively the product of
tor_gettimeofday_cached_monotonic(). This function is called in three
places:

once in src/or/relay.c
once in src/or/buffers.c
once in src/or/circuitlist.c

The first two account for the bulk of the calls, as they are in the core
data relaying logic.
Ultimately, the problem seems to be that the caching is very weak. At
most, only half of the calls to tor_gettimeofday_cached_monotonic() use
the cache. It appears in the vomiting print statements that loading a
single simple HTML page (http://www.openbsd.org/faq/ports/guide.html to
be exact) will cause >30 gettimeofday() syscalls. You can imagine how
that would accumulate for an exit carrying 800 KB/s if the caching
doesn't improve much with additional circuits.
If I understand correctly, this implies either a bug in the caching
logic or the simple fact that libevent calls back and clears the cache
too much for the caching to be very useful.
The cache is only cleared in two places (both pretty self-explanatory,
and both triggered by libevent callbacks):

connection_handle_read()
connection_handle_write()

Apparently, the above functions are called >30 times when fetching a
JS-less non-HTTPS HTML page. I don't know enough about TCP to comment on
whether this makes sense.
Sorry for the magnum opus. I thought I'd try to clarify the context for
those interested, and typing this helped organize my thoughts.
Let me know if I've made any mistakes thus far. I'll start looking into
potential improvements. I'm eager to work with someone on this if
anyone's interested. Also, if you have any ideas or suggestions, please
share.
Thanks for reading!
Libertas

tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Re: [tor-dev] gettimeofday() Syscall Issues