On Wed, Dec 19, 2012 at 1:49 PM, Nick Mathewson <nickm@alum.mit.edu> wrote:
> On Wed, Dec 19, 2012 at 2:29 PM, Simon <simonhf@gmail.com> wrote:
> [...]
>> Tor seems to have good planning compared to most open source
>> projects. So I would be interested in hearing why testing is
>> apparently 'falling between the cracks'. Why isn't there just 10
>> times more test LOC?
> Not because of any hatred or disapproval of tests -- just because
> nobody's written that 100 kloc of testing code yet.
>
> I think that's for two main reasons:
>
> - We were in a hurry when we wrote lots of the stuff we wrote.
I'm not trying to start a flame war, but this sounds similar to the excuses from fat people who don't want to do exercise :-) So I'm just going to pretend you never wrote this :-)
> - Large parts of the codebase have been written in a tightly coupled
>   style that needs refactoring before it can be tested without a live
>   Tor network at hand.
Much automated (unit) testing is done by mocking the data structures used by functions and/or mocking the functions called by other functions. This is possible even with tight coupling.
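To illustrate the technique (with invented names -- this is just a sketch, not Tor's actual code or API): production code can call its collaborator through a function pointer, and the test swaps in a mock, so even a "networky" function can be exercised without any network:

    /* Sketch of function-pointer mocking in C. All names here are
     * invented for illustration; this is not Tor's real API. */
    #include <assert.h>
    #include <string.h>

    /* In production, calls go through this pointer to the real
     * implementation, which would write to the network. */
    static int real_send_cell(const char *payload) {
        (void)payload; /* ... actual network write elided ... */
        return 0;
    }
    static int (*send_cell_fn)(const char *) = real_send_cell;

    /* Function under test: coupled to "the network", but only
     * through the pointer above. */
    static int relay_payload(const char *payload) {
        if (payload == NULL || payload[0] == '\0')
            return -1;
        return send_cell_fn(payload);
    }

    /* The test installs a mock that records what was sent. */
    static char last_sent[64];
    static int mock_send_cell(const char *payload) {
        strncpy(last_sent, payload, sizeof(last_sent) - 1);
        return 0;
    }

    int main(void) {
        send_cell_fn = mock_send_cell;        /* install the mock */
        assert(relay_payload("hello") == 0);  /* happy path */
        assert(strcmp(last_sent, "hello") == 0);
        assert(relay_payload(NULL) == -1);    /* error path, no
                                                 network involved */
        send_cell_fn = real_send_cell;        /* restore */
        return 0;
    }

Link-time substitution (linking the test binary against a mock object file instead of the real one) achieves the same effect without touching the production source at all.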
Personally I think the most effective way to test with code coverage is to test at the system/integration level first, to get the majority of the low-hanging-fruit coverage, and then make up the rest of the coverage with the more-complicated-to-write unit tests.

For the system/integration level testing it would be great to actually start up a complete test Tor network, e.g. on localhost, containing all the components necessary for end-to-end testing using real UDP & TCP traffic. Maybe some of the production code doesn't lend itself to such an end-to-end automated test right now, but that's the beauty of developers writing their own tests: they can change the production code to make it support such activities more easily.

With an end-to-end Tor network test I would guess that 'happy path' coverage would jump to somewhere between 40% and 60%. At least, numbers in this range are what I have seen on other projects.
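To make 'end-to-end' concrete: once such a network is up on localhost, even a tiny probe can verify that a node's SocksPort answers a SOCKS5 handshake over real TCP. A minimal sketch in C -- port 9050 is an assumption, substitute whatever the test torrc configures:

    /* Minimal SOCKS5 "no authentication" handshake against a local
     * SocksPort. A sketch: 9050 is an assumed test-network port,
     * and partial reads are ignored for brevity. */
    #include <arpa/inet.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(9050);
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        /* SOCKS5 greeting: version 5, 1 auth method, method 0. */
        unsigned char hello[3] = { 0x05, 0x01, 0x00 };
        unsigned char reply[2];
        if (write(fd, hello, sizeof(hello)) != sizeof(hello) ||
            read(fd, reply, sizeof(reply)) != sizeof(reply)) {
            perror("handshake");
            return 1;
        }
        /* Expect version 5, method 0 ("no auth accepted"). */
        if (reply[0] != 0x05 || reply[1] != 0x00) {
            fprintf(stderr, "unexpected SOCKS reply %02x %02x\n",
                    reply[0], reply[1]);
            return 1;
        }
        printf("SocksPort answered the SOCKS5 handshake\n");
        close(fd);
        return 0;
    }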
> - Until Tor 0.2.2, our testing framework didn't let us have tests
>   that touched global state, which made our tests in general pretty
>   fragile.
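For what it's worth, the usual workaround for that restriction is a setup/teardown pair that saves and restores the global around each test -- sketched here with invented names:

    /* Sketch of a save/restore fixture around global state; the
     * global and the function under test are invented examples. */
    #include <assert.h>

    int global_bandwidth_limit = 100; /* imagine process-wide state */

    int clamp_to_limit(int requested) {
        return requested > global_bandwidth_limit
                   ? global_bandwidth_limit : requested;
    }

    void test_clamp_to_limit(void) {
        int saved = global_bandwidth_limit; /* setup: save global */
        global_bandwidth_limit = 10;
        assert(clamp_to_limit(50) == 10);
        assert(clamp_to_limit(5) == 5);
        global_bandwidth_limit = saved;     /* teardown: restore */
    }

    int main(void) {
        test_clamp_to_limit();
        return 0;
    }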
>> What about implementing a new policy immediately: any new production
>> LOC committed must be covered by tests, or peer reviewed and
>> democratically excluded?
> Goodness knows we need more review and testing.
> It doesn't seem fair, though, to reject patches for lacking test
> coverage when they are patches to code that itself lacks coverage. If
> you write a 5-line patch to connection_ap_rewrite_and_attach, for
> instance, you probably shouldn't be on the hook for refactoring the
> whole function to make it testable -- yet you will be hard-pressed to
> write any tests for that monster without a complete refactor.
I agree with you that it seems unfair. But the alternative is systematically writing tests to cover all existing code, which is unrealistic and will never happen. Is there another alternative? The developer who submits a patch has already comprehended the code in question and is therefore in an excellent position to create the necessary automated tests. And even if a patch arrives without tests, the person reviewing and integrating it can start a discussion and/or add the tests for coverage themselves if necessary.
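Concretely, even a small patch can often be made testable by pulling the decision logic out into a pure helper and testing that, leaving the big I/O-driven function as a thin caller. A made-up sketch -- the names and the rule below are invented, not the real connection_ap_rewrite_and_attach logic:

    /* Sketch: extract a pure decision out of a monolithic function.
     * The helper and its rule are invented for illustration. */
    #include <assert.h>
    #include <string.h>

    /* Pure helper: no sockets, no globals -- trivially testable. */
    int address_needs_rewrite(const char *addr) {
        size_t len = strlen(addr);
        return len > 6 && strcmp(addr + len - 6, ".onion") != 0;
    }

    /* The monster then becomes a thin caller around the helper:
     *
     *   if (address_needs_rewrite(conn->address))
     *       ... do the rewrite and attach, as before ...
     */

    int main(void) {
        assert(address_needs_rewrite("example.com") == 1);
        assert(!address_needs_rewrite("abcdefghijklmnop.onion"));
        return 0;
    }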
> It might be a reasonable goal to try to set a plan for increasing
> test coverage by a certain percentage with each release.
One thing is for sure: coverage figures won't improve much without developer discipline :-( I've also seen teams where coverage is enforced, but only prior to releasing, with the coverage expectation set below 100%, e.g. at 70% to 90%. Personally I think this is not good, for a bunch of reasons:

- The code being covered is long forgotten by the developer, so writing the test code takes unnecessarily long.
- The developers/testers writing the test code will just go for the low-hanging fruit, just to get the coverage numbers up.
- Having to go back and revisit old code seems like a chore, and it makes 'coverage' a dirty word in the team instead of the joyous word it should be :-)

It's a case of 'look after the pennies and the pounds will look after themselves' :-)
> If you like and you have time, it would be cool to stop by the
> tickets on trac.torproject.org for milestone "Tor: 0.2.4.x-final" in
> state "needs_review" and look to see whether you think any of them
> have code that would be amenable to new tests, or to look through
> currently untested functions and try to figure out how to make more
> of them tested and testable.
If I were you, I'd first try to create an end-to-end system/integration test, via localhost, that works via make test. This might involve refactoring the production code or even re-arranging the source base, etc. The test script would build and/or mock all necessary parts, bring up the localhost Tor network, run a variety of end-to-end tests, and shut down the localhost Tor network. Next, the makefiles could be doctored so that it is easier to discover the coverage, e.g. via something like make test-coverage.

At this point the happy-path coverage should be much larger than it is today, but still way off the desirable 80% to 100% range. This is when I would consider adding the discipline of covering all new lines. The patch author then has the personal choice of using unit and/or system/integration level testing to achieve coverage. And there is also a chance that no extra coverage is necessary at all, because the patch is already covered by the happy path.
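To sketch what the core of such a harness could look like (the node count, torrc paths, and the crude 30-second bootstrap wait are all placeholder assumptions; a shell or Python wrapper would do equally well):

    /* Sketch of a bring-up/tear-down harness core: spawn N local tor
     * processes, give them time to bootstrap, run checks, kill them.
     * Node count and torrc paths are placeholders. */
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define N_NODES 3

    int main(void) {
        const char *torrcs[N_NODES] = {
            "test-net/node0/torrc", "test-net/node1/torrc",
            "test-net/node2/torrc",
        };
        pid_t pids[N_NODES];

        for (int i = 0; i < N_NODES; i++) {
            pids[i] = fork();
            if (pids[i] == 0) {
                execlp("tor", "tor", "-f", torrcs[i], (char *)NULL);
                _exit(127); /* exec failed */
            }
        }

        sleep(30); /* crude: wait for the network to bootstrap */

        /* ... run end-to-end checks here, e.g. the SOCKS probe
         * above, and record pass/fail for the exit status ... */
        int failures = 0;

        for (int i = 0; i < N_NODES; i++) { /* tear-down */
            kill(pids[i], SIGINT);
            waitpid(pids[i], NULL, 0);
        }
        return failures ? 1 : 0;
    }

With gcc/gcov, the make test-coverage target would then boil down to rebuilding with --coverage, running this harness plus the unit tests, and summarizing the per-file results.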
If you like the end-to-end localhost Tor network idea then I would be happy to collaborate on creating such a mechanism as a first step.
HTH, Simon
> yrs,
> Nick