Yes :-) I've seen projects whose tests take nearly 10 hours to run. However, the longer the tests take, the less likely developers are to run them.
Agreed. I run stem's unit tests more often than the integ tests since those have a runtime of around five seconds. For the integ tests I usually supply the '--test' argument so the runner only executes the test I'm interested in.
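For instance, something like the following runs just one integ test module (the module path here is illustrative):

    % ./run_tests.py --integ --test control.controller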
... except that ideally Tor needs about 100 times as many tests to get code coverage and quality (of Tor itself) into the 90%-plus range. So if this few tests take 34 seconds, 100 times as many would take many minutes or hours.
Not necessarily. It mostly depends on what the tests do - there are a few tests that take around five seconds each and a whole lot of others that take a few milliseconds. We could greatly expand stem's test coverage of tor without impacting the runtime much, and could probably lower the runtime a fair bit with some more effort.
It would be great if the tests themselves reported their own times.
Feel free to add an option for this, it would be reasonably easy to do. Personally I find that it's enough to know the test module that's taking a while, but I could see per-test runtimes being helpful.
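If someone wants a starting point, here's a minimal sketch against the plain unittest API (stem's runner does its own output handling, so treat the names here as illustrative):

    import time
    import unittest

    class TimedTestResult(unittest.TextTestResult):
        """TextTestResult that prints how long each passing test took."""

        def startTest(self, test):
            self._test_started = time.time()
            super(TimedTestResult, self).startTest(test)

        def addSuccess(self, test):
            elapsed = time.time() - self._test_started
            self.stream.writeln('%-60s %0.2fs' % (test.id(), elapsed))
            super(TimedTestResult, self).addSuccess(test)

    # With verbosity=0 the runner itself stays quiet, so the per-test
    # times above are the only per-test output. Plug a suite in with
    # runner.run(...).
    runner = unittest.TextTestRunner(resultclass=TimedTestResult, verbosity=0)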
And also if they had an output format in common with the standard Tor 'make test' results.
I haven't looked at tor's testing output. Is it better? I invested quite a bit of time in making stem's test output nice and easily readable.
However, during those pauses I'm seeing almost no CPU, network, or disk activity, which leads me to believe that some tests are not written as well as they could be.
I just ran the integ tests and they pegged the CPU of my poor little netbook (they also took 63 seconds - it would be nice if they only took 34 seconds like on your system...). I'm not sure why they aren't showing significant resource use on your system.
It looks like the tests are starting up daemons on fixed ports, which stops other tests from running in parallel.
This shouldn't be an issue. Multiple controllers can connect to the control port at the same time.
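For example, the following attaches two controllers to the same ControlPort at once (9051 here is tor's common default, so adjust it to match your torrc):

    from stem.control import Controller

    # Two controllers connected to the same control port simultaneously.
    with Controller.from_port(port=9051) as first:
        with Controller.from_port(port=9051) as second:
            first.authenticate()
            second.authenticate()

            # Both connections work concurrently.
            print('first sees tor version %s' % first.get_version())
            print('second sees tor version %s' % second.get_version())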
So what's the difference between Stem tests and 'Chutney'?
Stem is a controller library with integration tests to check its interaction with a live tor instance. Its tests focus on the behavior of tor's control interface.
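A test in that vein is little more than an assertion about what the control port replies. Here's a simplified sketch rather than an actual test from the suite, and it assumes a tor instance listening on 9051:

    import unittest
    from stem.control import Controller

    class TestGetInfo(unittest.TestCase):
        def test_version(self):
            # Query a live tor over its control port and sanity check the reply.
            with Controller.from_port(port=9051) as controller:
                controller.authenticate()
                self.assertTrue(len(controller.get_info('version')) > 0)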
Chutney, however, is a framework specifically for testing how multiple tor instances interact. It's under very light development compared to stem...
https://gitweb.torproject.org/nickm/chutney.git/shortlog
https://gitweb.torproject.org/stem.git/shortlog
Why is neither set of tests included in the Tor repo so that they can be run using 'make test'?
Because they're both applications separate from the core tor executable. Mixing the projects (and their git histories) would be confusing. I would like to see stem be more actively used in core tor development for testing, though. In an ideal world new tor controller features would include a corresponding test in stem...