On 21 Nov (10:14:23), Rob Jansen wrote:
On Nov 20, 2014, at 4:59 PM, David Goulet dgoulet@ev0ke.net wrote:
On 20 Nov (14:45:12), Rob Jansen wrote:
Are there other HS performance improvements that we think may be ready by January?
On my part, I have a chutney network with an HS and clients that fetch data from it. I'm currently working on instrumenting the HS subsystem so we can gather performance data and analyze it for meaningful pointers on where the contention points are, confirm expected behaviors, etc. I'll soon begin updating the following ticket with more information on the work I'm doing. (I'm in Boston right now collaborating with Nick for the week, so things are a bit slower on this front until Monday.)
https://trac.torproject.org/projects/tor/ticket/13792
This could also be used with Shadow, I presume. Since the deadline is near, I chose chutney here for simplicity.
Chutney is the right tool for tracing CPU resource problems. Shadow is the right tool when trying to gather realistic network level performance statistics, and testing code at scale. Also, Shadow potentially runs faster than real time if you are only using a handful of nodes. If you are not using Shadow because it is too complex, then please, please let me help with that.
Yes, considering the amount of resources you have for a big private Tor network, Shadow is definitely a good idea!
The plan right now is for me to start instrumenting the Tor code base (for now, only the HS subsystem) and start collecting data under chutney to get the instrumentation working and useful.
Once this is done, we should definitely move that experiment to Shadow and run it on a huge network with multiple kinds of activity on it (clients, non-Tor traffic, etc.).
Please see https://trac.torproject.org/projects/tor/ticket/13802 about the instrumentation part. We'll definitely have to talk more about integrating Shadow with a userspace tracer, but from what I gathered from Nick, it sounds totally doable without too much trouble.
I'll have a talk with Nick tomorrow on how we can possibly get this instrumentation upstream (as logs, controller events, and/or tracing).
That would be great! Making it easy to gather data, even if only in TestingTorNetwork mode, will pay dividends.
Yes, and having it upstream will make performance analysis easier to scale in the future. :)
Things are moving forward; we still have some work ahead to gather the HS performance baseline and start trying to improve it. I'm fairly confident that the performance statistics from a private network will give us good insight into the current situation.
Feel free to propose anything that could be useful to make this thing more efficient/faster/useful :).
I totally agree that a private network is the right approach. A small network will be useful to isolate some performance issues, but I think we also need to make sure we test at a larger scale with the addition of realistic background traffic, etc, so that we understand the performance benefits in a more realistic environment. Shadow allows us to do this and have stats across the entire network on the order of hours. I have the resources to run at least 6000 relays and 30000 clients in a private ShadowTor deployment, and I hope that having results on this scale will impress our funder in January.
That is a HUGE network, love it. For sure, we should definitely run the HS stats in your setup ;).
Perhaps after you finish your traces in chutney and work out some of the code bottlenecks, I can run some more realistic network experiments in Shadow. (Separate branches for each improvement would help here.) Would this actually be helpful? Or do we think that by the time we get to the Shadow step we would have already learned everything we need to know?
I think I mostly answered these above, but to answer the last question here: I think having a network this large will most probably show us things we can NOT observe in a small chutney network.
Cheers! David
-Rob