On 9/25/13 10:30 PM, Kevin wrote:
Hi Karsten, Sathya,
Sorry for the delayed response, I've been having connection issues all week on anything other than a phone ~_~. I've also included updates from Sathya's later mails below, with additional comments added.
I don't see how we could make new experiments language agnostic. These new experiments will want to re-use Torperf classes/components, which they can't do if they're just "shell scripts". They need to implement an interface provided by Torperf, and Torperf needs to provide functionality via an interface that the experiments understand. If an experiment is in fact just a shell script, it should be trivial to write a wrapper class to call that. But I think that's the exception, not the rule.
Or maybe we have a different understanding of an "experiment". Can you give an example for an experiment that is not listed in Section 3 of the design document and state how you'd want to integrate that into Torperf without touching its sources?
I don't think it's that hard to achieve, if you consider the Alexa & Selenium experiment. We're going to need a basic HTTP proxy implementation inside the Torperf service (proxying to the socks proxy of a specific tor version, specified by the experiment's config).
If you imagine this, the HTTP proxy is already an interface that applies most of the logic required (records the appropriate HTTP and SOCKS timings, socks ports, tor version, time started, etc.), so the selenium client is really only responsible for its unique data.
Then, assuming the result format is not hard to produce (as you've specified it currently, it should be simple), achieving language-agnostic experiments would not be difficult.
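To make this concrete, here's a minimal sketch (in Python) of the kind of timing collection that proxy interface would do on behalf of each experiment. The phase names are illustrative assumptions, not the final data format:

```python
import time

class RequestTimings:
    """Collects wall-clock timestamps for the phases of one proxied
    request, so individual experiments don't have to."""

    def __init__(self):
        self.stamps = {}

    def mark(self, phase):
        # Record when a phase (e.g. socks_connect, first_byte) happened.
        self.stamps[phase] = time.time()

    def elapsed(self, start_phase, end_phase):
        # Seconds between two recorded phases.
        return self.stamps[end_phase] - self.stamps[start_phase]

# The proxy would call mark() as each phase completes; the experiment
# only adds whatever data is unique to it.
timings = RequestTimings()
timings.mark("socks_connect")
timings.mark("request_sent")
timings.mark("first_byte")
timings.mark("last_byte")
total = timings.elapsed("socks_connect", "last_byte")
```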
A concrete example that differs from Section 3: to change the Alexa experiment to fetch, say, the top 5 sites in France, it should hopefully be trivial to just change a text file and be done with it, instead of having to be familiar with Python or whatever.
That said, it's an 'ideally we could' kind of point, so it's not a blocker if we don't aim for it. Either way, the user will be free to hack on whatever experiments, so I'm sure it won't be hard for them to do the above by just hacking on the final implementation :) The real users are likely technically adept, I guess!(?)
Users are likely technically adept, yes, but it might be that some just do us a favor by running Torperf on their well-connected machine and don't want to think hard what they're doing.
But it seems we agree here that we shouldn't include this in the first version.
I also added an Appendix A with suggested data formats. Maybe these data formats make it clearer what I'd expect as output from an experiment.
This is great, thanks for this. We're thinking along the same lines for that experiment at least :) I think it would be useful to also add desired/required information on other experiments as we progress as it can definitely help clarify what is required from each implementation.
Want to help define the remaining data formats? I think we need these formats:
- file_upload would be quite similar to file_download, but for the GET/POST performance experiment. Or maybe we can generalize file_download to cover either GET or POST requests and the respective timings.
- We'll need a document for hidden_service_request that does not only contain timings, but also references the client-side circuit used to fetch the hidden service descriptor, rendezvous circuit, and introduction circuit, and server-side introduction and rendezvous circuits.
- These data formats are all for fetching/posting static files. We should decide on a data format for actual website fetches. Rob van der Hoeven suggested HAR, which I included in a footnote. So, maybe we should extend HAR to store the tor-specific stuff, or we should come up with something else.
- Are there any other data formats missing?
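As a strawman for the first of these, a file_download result might look like the following JSON; all field names here are hypothetical, the actual format is whatever Appendix A ends up specifying:

```python
import json

# Hypothetical file_download result record; field names are assumptions.
result = {
    "experiment": "file_download",
    "tor_version": "0.2.4.17-rc",
    "file_size_bytes": 51200,
    "start": 1380147000.0,           # unix timestamp of request start
    "socks_connect_seconds": 0.4,    # time to complete socks handshake
    "first_byte_seconds": 1.2,       # time to first response byte
    "complete_seconds": 3.7,         # time to last response byte
}
doc = json.dumps(result, sort_keys=True)
```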
I agree with you that this is a rather unusual requirement and that adding new experiments to Torperf is the better approach. That's why the paragraph said "should" and "ideally". I added your concerns to the design document to make this clearer. (Maybe we should mark requirements as either "must-do", "should-do", or "could-do"?)
Well, "ideally" implies that we want to do this at some point. Do we?
I don't feel strongly. I'd prefer a design that makes it easy to add new experiments, but I'm fine with an approach that requires merging patches. We can always add the functionality to drop something in a directory and make Torperf magically detect and run the new experiment, but that can happen later. Maybe we shouldn't let that distract us from getting the first version done. I commented out this section.
I think the magically detect and run part can definitely be left for future, but installation should still be this easy.
Surely it's just as easy to implement detecting new experiments on service startup as to implement not doing that. (while still somehow allowing experiments to be added... is this implying hardcoded experiments?)
I guess experiment types will be hard-coded, but experiment instances will be configurable.
Also, perhaps you don't want to support this, but how does the patch and merge system work for quick deployments of short lived experiments? (Is there ever such a thing? Karsten?)
Yes, there might be such a thing as short-lived experiments. We'd probably commit such a patch to a separate branch and decide after the experiment if it's worth adding the experiment to the master branch.
Or what if someone does develop a neat set of experiments for their own personal use that doesn't really apply to the project as a whole, are we expected to merge them upstream? What if they don't want to share?
I think we should only merge experiments that are general enough for others to run.
It should be possible to run different experiments with different tor versions or binaries in the same Torperf service instance.
I don't think we need this now. I'm totally ok with having users run different torperf instances for different tor versions.
Running multiple Torperf instances has disadvantages that I'm not sure how to work around. For example, we want a single web server listening on port 80 for all experiments and for providing results.
Oh. I did not mean running multiple torperf instances *simultaneously*; I just meant sequentially.
But what if we want to run multiple experiments at the same time? That's a quite common requirement. Right now, we run 3 Torperf experiments on ferrinii at the same time. A few years ago, we ran 15 experiments with tors using different guard selection strategies and downloading different file sizes.
I disagree with removing this requirement.
Yes, so do I.
Why do you think it's hard to run different tor versions or binaries in the same Torperf service instance?
Then each experiment needs to deal with locating, bootstrapping, and shutting down Tor. We could just run a torperf test against a particular tor version, once that's completed, we can run against another tor version and so on. I'm not against this idea -- it can be done. I just don't think it's high priority.
Torperf should help with bootstrapping and shutting down tor, because that's something that all experiments need. Locating tor could just be a question of passing the path to a tor binary to Torperf. See above for sequential vs. parallel experiments.
Locating Tor should just be a setting in 'the Torperf config'. { ... tor_versions: { '0.X.X' => '/Path/to/0/x/x/' } ... }
Is there a requirement to run the *same* experiment across different Tor versions at the same time (literally parallel), or just to have "I as a user set this up to run for X,Y,Z versions and ran it one time and got all my results"?
You would typically not run an experiment a single time, but set it up to run for a few days. And you'd probably set up parallel experiments to start with a short time offset. (Not sure if this answers your question.)
I think this is what Sathya is saying with:
We could just run a torperf test against a particular tor version, once that's completed, we can run against another tor version and so on.
i.e. for each experiment, there's only one instance of Tor allocated for it at any time, and it does its versioned runs sequentially.
For each experiment there's one tor instance running a given version. You wouldn't stop, downgrade/upgrade, and restart the tor instance while the experiment is running. If you want to run an experiment on different tor versions, you'd start multiple experiments. For example:
1. download 50KiB static file, use tor 0.2.3.x on socks port 9001, start every five minutes starting at :01 of the hour.
2. download 50KiB static file, use tor 0.2.4.x on socks port 9002, start every five minutes starting at :02 of the hour.
3. download 50KiB static file, use tor 0.2.5.x on socks port 9003, start every five minutes starting at :03 of the hour.
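In code, those three experiment instances might be configured along these lines; this is only a sketch, and the key names are assumptions:

```python
# Three instances of the same experiment type, differing only in tor
# version, socks port, and start offset within each five-minute slot.
EXPERIMENTS = [
    {"type": "static_file_download", "file_size_kib": 50,
     "tor_version": "0.2.3.x", "socks_port": 9001,
     "interval_minutes": 5, "start_offset_minutes": 1},
    {"type": "static_file_download", "file_size_kib": 50,
     "tor_version": "0.2.4.x", "socks_port": 9002,
     "interval_minutes": 5, "start_offset_minutes": 2},
    {"type": "static_file_download", "file_size_kib": 50,
     "tor_version": "0.2.5.x", "socks_port": 9003,
     "interval_minutes": 5, "start_offset_minutes": 3},
]

# Each instance gets its own tor process and socks port, so no port
# collisions and no stopping/restarting tor mid-experiment.
ports = [e["socks_port"] for e in EXPERIMENTS]
```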
I think the discussion above is talking about two different things; it would be beneficial to decide what needs to be actually parallel and what just needs to be a one-time setup for the user.
Are there any concerns around parallel requests causing noise in the timing information? Or are we happy to live with a small 1-2(?)ms noise level per experiment in order to benefit from faster experiment runtimes in aggregate?
Not sure which of these questions are still open. We should definitely get this clear in the design document. What would we write, and where in the document would we put it?
On that, can we be clear with our vocabulary, "Torperf tests" means "Torperf experiments", right?
Yes. Hope my "experiment type" vs. "experiment instance" was not too confusing. ;)
It might be beneficial to provide a mechanism to download and verify the signature of new tor versions as they are released. The user could specify if they plan to test stable, beta or alpha versions of tor with their Torperf instance.
IMHO, torperf should just measure performance, not download Tor or verify signatures. We have good package managers that do that already.
Ah, we don't just want to measure packaged tors. We might also want to measure older versions which aren't contained in package repositories anymore, and we might want to measure custom branches with performance tweaks. Not sure if we actually want to verify signatures of tor versions.
I think we should take Shadow's approach (or something similar). Shadow can download a user-defined tor version ('--tor-version'), or it can build a local tor path ('--tor-prefix'):
If the user wants to run torperf against tor versions that are not present in the package managers, then the user should download and build tor -- not torperf. Once a local binary is present, the user can run torperf against it with a '--tor-prefix'.
It's perfectly fine if the first version only supports a '--tor-binary' option and leaves downloading and building of custom tor versions to the user. Of course, Torperf should be able to support the default tor binary that comes with the operating system for non-expert users. But supporting a '--tor-version' option that downloads and builds a tor binary can come in version two. I tried to describe this approach in the design document. (Please rephrase as needed.)
https://github.com/shadow/shadow/blob/master/setup#L109
Do you see any problems with this?
Nope, this is perfectly fine. I just don't want torperf to download, verify and build tor.
Perfectly fine to ignore for now. It's not a crazy feature. But let's do this later.
Agree on this, no need to do downloading / verifying / installing Tor in initial releases, it's likely a huge PITA. But I think we should have the tor binary locations listed in the config rather than a command flag. (Listing multiple Tor versions via command flag seems a lot more error prone to me)
Ah, my mistake with the command flag. Yes, config option.
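So something like the following in the Torperf config file. JSON is used here purely for illustration; the concrete config format and key names are assumptions:

```python
import json

# Hypothetical config section mapping tor versions to local binaries;
# the user builds custom versions themselves and points Torperf at them.
CONFIG = json.loads("""
{
  "tor_binaries": {
    "0.2.3.25": "/usr/bin/tor",
    "0.2.4.17-rc": "/home/user/src/tor/src/or/tor"
  }
}
""")

def tor_binary_for(version):
    # Look up the path to the tor binary configured for a version.
    return CONFIG["tor_binaries"][version]
```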
A Torperf service instance should be able to accumulate results from its own experiments and remote Torperf service instances.
Torperf should not accumulate results from remote Torperf service instances. If by "accumulate", you mean read another file from /results which the *user* has downloaded, then yes. Torperf shouldn't *download* result files from remote instances.
Why not? The alternative is to build another tool that downloads result files from remote instances. That's what we do right now (see footnote: "For reference, the current Torperf produces measurement results which are re-formatted by metrics-db and visualized by metrics-web with help of metrics-lib. Any change to Torperf triggers subsequent changes to the other three codebases, which is suboptimal.")
This could just be a wget script that downloads the results from another server. I just don't want that to be a part of torperf. Torperf should just measure performance and display data, IMHO -- not worry about downloading and aggregating results from another system. Or maybe we can do this later and change it to "Ideally torperf should .."
This isn't the most urgent feature to build, though we need it before we can kill the current Torperf and replace it with the new one. However, using wget to download results from another service is exactly the approach that brought us to the current situation of Torperf being a bunch of scripts. I'd rather not write a single-purpose script for each thing Torperf is supposed to do, but design it in a way that it can already do all the things we want it to do. Accumulating results and presenting them is part of these things.
"Torperf should just measure performance and display data", displaying aggregate data is displaying data! :P
But, surely if Torperf just achieves this by wget'ing stuff, and the user doesn't have to worry about anything other than setting a remote server and an interval to poll, that would be considered done? (Torperf handles the scheduling and managing of the data files)
This isn't going to be difficult code, but I'd want to avoid relying on wget if you mean the command-line tool.
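For what it's worth, the accumulation logic itself stays tiny regardless of how the files get fetched. A sketch of merging remote records into local ones, deduplicating on source and start time (field names are assumptions):

```python
def merge_results(local, remote):
    """Merge result records fetched from a remote Torperf instance into
    the local set, deduplicating on a (source, start) key.
    Field names are assumptions, not the final data format."""
    seen = {(r["source"], r["start"]) for r in local}
    merged = list(local)
    for record in remote:
        key = (record["source"], record["start"])
        if key not in seen:
            seen.add(key)
            merged.append(record)
    return merged

local = [{"source": "ferrinii", "start": 1380147000.0}]
remote = [
    {"source": "ferrinii", "start": 1380147000.0},  # duplicate, dropped
    {"source": "otherhost", "start": 1380147300.0},
]
merged = merge_results(local, remote)
```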
results database Store request details, retrieve results, periodically delete old results if configured.
Not sure if we really need a database. These tests look pretty simple to me.
Rephrased to data store. I still think a database makes sense here, but this is not a requirement. As long as we can store, retrieve, and periodically delete results, everything's fine.
Cool!
I don't think we need a database for the actual results (but a flat file structure is just a crap database! :). I do however think, once we start to provide the data visualisation aspects, it will need a database for performance when doing queries that are more than simple listings.
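As a middle ground, something as small as SQLite would already cover store, retrieve, and periodic expiry without running a separate database server. A sketch, where the schema is entirely an assumption:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # a file path in a real deployment
conn.execute("""CREATE TABLE results (
    start REAL,        -- unix timestamp of the request
    experiment TEXT,   -- experiment instance name
    doc TEXT           -- the full result document, e.g. JSON
)""")

now = time.time()
conn.execute("INSERT INTO results VALUES (?, ?, ?)",
             (now, "file_download", "{}"))
conn.execute("INSERT INTO results VALUES (?, ?, ?)",
             (now - 90 * 86400, "file_download", "{}"))  # 90 days old

# Periodically delete results older than a configured retention period.
RETENTION_SECONDS = 30 * 86400
conn.execute("DELETE FROM results WHERE start < ?",
             (now - RETENTION_SECONDS,))
remaining = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```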
Yup, don't feel strongly.
Thanks for your feedback!
All the best, Karsten