On Nov 21, 2014, at 1:06 PM, David Goulet dgoulet@ev0ke.net wrote:
On 21 Nov (12:59:43), Rob Jansen wrote:
On Nov 21, 2014, at 10:40 AM, David Goulet dgoulet@ev0ke.net wrote:
Please see https://trac.torproject.org/projects/tor/ticket/13802 about the instrumentation part. We'll definitely have to talk more about the integration of Shadow and a userspace tracer, but from what I gathered from Nick, it sounds totally doable without too much trouble.
If we want the tracer to also work inside of Shadow, then the biggest potential problem I can think of right now is thread safety. Shadow uses several worker threads, each of which is assigned to run hundreds to thousands of Tor nodes. If Tor is using lttng as a dynamic library and it is not thread-safe, we will run into issues.
One way to avoid those issues could be to statically link lttng to Tor. However, even this could go bad if lttng uses global state, because that would mean that those hundreds of Tor nodes assigned to a Shadow worker thread would be sharing that state. Probably not what we want. To get around the global state issue, Shadow would have to compile lttng specially, using the same LLVM pass to hoist out the global variables as we use for Tor. That may get messy.
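A minimal sketch of the global-state concern described above. All names here are hypothetical stand-ins, not LTTng's or Shadow's actual code. It contrasts a library that keeps its state in one global with the same logic after the state has been hoisted into a per-node struct, which is roughly what hoisting globals aims at (the actual LLVM pass is not shown).

```c
/* Hypothetical tracer state; the names are illustrative only. */
struct tracer_state {
    unsigned long events_recorded;
};

/* Variant A: a single global, shared by every node living in the process. */
static struct tracer_state g_state;

static void record_event_global(void)
{
    g_state.events_recorded++;
}

/* Variant B: the state is hoisted into a struct the caller supplies,
 * so each virtual node can own a separate copy. */
static void record_event_hoisted(struct tracer_state *st)
{
    st->events_recorded++;
}
```

With variant A, two virtual nodes that each record one event end up incrementing the same counter. With variant B, each node passes its own struct tracer_state and the counts stay separate.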
LTTng is an in-process library and spawns a thread to handle all the tracing and interaction with the main tracing registry of LTTng (which manages the buffers, clients, consumers, streaming, etc.).
Nick told me that Shadow moves the clock forward, so as long as you hijack clock_gettime for monotonic time, we'll be fine :).
Great! Shadow does interpose clock_gettime (among other time functions).
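To illustrate the idea of answering time queries from a simulated clock, here is a minimal sketch. It is not Shadow's implementation: sim_now_ns, sim_advance_ns and sim_clock_gettime are hypothetical names, and the clock id is taken as a plain int and ignored, whereas the real clock_gettime takes a clockid_t. In an actual interposition setup the replacement would carry the name clock_gettime and be resolved ahead of libc's version.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical simulator clock, in nanoseconds since simulation start. */
static uint64_t sim_now_ns;

/* The simulator, not the wall clock, decides when time moves. */
static void sim_advance_ns(uint64_t delta_ns)
{
    sim_now_ns += delta_ns;
}

/* Stand-in for an interposed clock_gettime: every clock id is answered
 * from the simulated clock. Returns 0 on success, -1 on a NULL pointer. */
static int sim_clock_gettime(int clock_id, struct timespec *ts)
{
    (void)clock_id; /* ignored in this sketch */
    if (ts == NULL)
        return -1;
    ts->tv_sec = (time_t)(sim_now_ns / 1000000000ULL);
    ts->tv_nsec = (long)(sim_now_ns % 1000000000ULL);
    return 0;
}
```

Because time only changes when sim_advance_ns is called, two reads taken without an intervening advance return the same value, no matter how much real time has passed between them.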
So it really depends on how robust lttng is, and as I have no experience with it, I can only speculate. But if you let me know when you have some minimal instrumentation ready, I can test in Shadow early enough that we could adjust if needed.
The LTTng userspace tracer is thread safe, no issue with that :).
That’s a relief!
I already have a couple of tracepoints in the HS client subsystem as we speak, and I'm currently adding more to do some very basic measurements on the timings of each client HS cell (in rend_process_relay_cell()).
Once I have something that you can try, I'll send you a link to the branch with the instrumentation and you can see if you can make it happen with Shadow :).
OK, great!
-Rob
Cheers! David
-Rob