I'm moving this to tor-dev as it's relevant to other developers.
My chutney repository is a little messy at the moment, and I've merged draft throughput-testing code into my master branch. You're welcome to test the draft code by merging
The throughput-testing functionality works, but it needs to be tidied up, and the hard-coded constants need to be turned into command-line options. So it's not ready for release, but it is ready for an early review to check for functionality gaps and bugs.
You might find throughput metrics useful to confirm that any multithreading changes are actually improving performance.
Testing
Once tor is built, you can test it using:
# basic tests
make test
# extended tests
make check
# benchmarks
./src/test/bench
# verify tor connectivity using a local test network - requires chutney
make test-network
# verify that the core functionality of tor works, requires chutney and IPv6 on localhost
src/test/test-network.sh --flavour bridges+ipv6+hs
# verify that the core functionality of tor works, using chutney, but without IPv6
src/test/test-network.sh --flavour bridges
src/test/test-network.sh --flavour hs
There are also various other testing tools:
* the Shadow Tor network simulator,
* static analysers, such as Coverity and clang-scan,
* dynamic sanitizers: Undefined Behaviour (UBSan), Address (ASAN), …
* fuzzing (I'm working on some tor-specific harnesses for fuzzing, but they're not ready),
and I'm sure there are others which I've missed.
Generally, I'm happy with code once I know it:
* compiles with no warnings,
* passes the extended unit tests (make check), including any tests written for new functionality, and
* passes all the connectivity tests in a comprehensive test network (bridges+ipv6+hs).
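The three checks above could be chained in a small script. This is only a rough sketch: the function names and the build.log file are made up for illustration, it assumes you run it from the top of a configured tor source tree with chutney set up, and it assumes your compiler prints diagnostics in the usual "warning:" form.

```shell
#!/bin/sh
# Sketch only: the function names and the build.log file are hypothetical.
# Assumes the current directory is the top of a configured tor source tree.

# Succeed only if the given build log exists and has no compiler warnings.
# gcc and clang both print diagnostics as "file:line:col: warning: ...".
has_no_warnings() {
    [ -r "$1" ] || return 1
    ! grep -q 'warning:' "$1"
}

# Print a label, run the remaining arguments as a command, report failure.
run_step() {
    label=$1
    shift
    echo "== $label"
    "$@" || { echo "FAILED: $label" >&2; return 1; }
}

# Run the three checks in order, stopping at the first failure.
pre_merge_checks() {
    log=${1:-build.log}
    echo "== build"
    # A full rebuild is needed, or already-compiled files hide their warnings.
    make clean >/dev/null 2>&1
    make >"$log" 2>&1 || { echo "FAILED: build (see $log)" >&2; return 1; }
    run_step "no compiler warnings" has_no_warnings "$log" || return 1
    run_step "extended unit tests" make check || return 1
    run_step "connectivity tests" \
        src/test/test-network.sh --flavour bridges+ipv6+hs || return 1
}
```

To use it, source the file and call pre_merge_checks (optionally passing a different log file name). Grepping the log is a blunt way to detect warnings; making warnings fatal at compile time would be stricter, if your build supports it.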
For bonus points, you can compile tor using UBSan and ASAN. If you're using clang, you'll find that tor/contrib/clang helps with setting this up. It lists known undefined behaviour in the Tor codebase, so it may be useful if you want to do something similar with gcc. (However, getting sanitizers to work can be incredibly fiddly and a total time-suck, too. So if it's just not working, skip it.)
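If you want to try a sanitizer build by hand, something like the following might be a starting point. It's a sketch under assumptions: the flags are the generic clang/gcc sanitizer options rather than whatever tor/contrib/clang sets up, the variable and function names are hypothetical, and it doesn't use the list of known undefined behaviour, so expect some UBSan noise.

```shell
#!/bin/sh
# Assumption: generic clang/gcc sanitizer flags, not taken from tor/contrib/clang.
SANITIZER_CFLAGS="-g -O1 -fno-omit-frame-pointer -fsanitize=address,undefined"
SANITIZER_LDFLAGS="-fsanitize=address,undefined"

# Hypothetical helper: configure and rebuild the current tree with sanitizers.
# Uses clang unless CC is already set in the environment.
build_with_sanitizers() {
    ./configure CC="${CC:-clang}" \
        CFLAGS="$SANITIZER_CFLAGS" LDFLAGS="$SANITIZER_LDFLAGS" &&
    make clean &&
    make
}
```

After sourcing this and running build_with_sanitizers, run make check as usual; any sanitizer reports are printed to stderr by the instrumented binaries.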
I'll also occasionally run code through a static analyser, to find subtle bugs which haven't been uncovered using UBSan/ASAN. Again, this isn't something everyone needs to do.
Others may have advice on testing multithreading code in particular.
Tim