On Wed, Dec 19, 2012 at 5:20 AM, George Kadianakis <desnacked@riseup.net> wrote:
Philipp Winter <identity.function@gmail.com> writes:
Hi there,
Deliverable 6 for sponsor Z says:
- Start a tool that a censored developer can run to discover why their Tor
  is failing to connect: brainstorm a list of "things to check", and sort
  them by how useful they'd be to check / how hard they'd be to build. (#7137)
The deliverable is due on Feb. 28, 2013 so we should get started.
Some background about the deliverable: the reason for this project is that debugging possible censorship events is tedious right now. We often have no access to machines in censoring countries, so we depend on users creating packet dumps for us. This tool should speed up and automate that process to some extent: censored users run it, the tool collects data, and that data should then somehow reach us.
I created the following wiki page which should contain all the necessary information: https://censorshipwiki.torproject.org/TorCensorshipAnalyzer
Please add/modify stuff and share your opinion. Since there is quite some overlap with OONI, it would be great if the OONI people could give feedback.
One thing I consider important in such a tool is unit and integration testing. Ideally, it should be possible to run unit tests on all of its features, to test whether they would work in a real environment and whether any of them are trivially broken.
Unfortunately, designing and writing such unit tests is not easy, since you have to emulate a censored network. While developing daphne, Arturo and I considered doing that either by using iptables or by monkey-patching the networking methods of Python/Twisted with methods that censor outgoing traffic. Neither approach would fully emulate a censored network, but implemented correctly, either would give you an idea of whether a test will work in Real Life or not.
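To make the monkey-patching idea concrete, here is a minimal sketch in Python, assuming the tool under test uses plain sockets rather than Twisted's reactor; the names BLOCKED_PORTS and censoring_create_connection are made up for illustration:

    import errno
    import socket

    # Pretend the censor blocks connections to these ports. This set and the
    # wrapper below are illustrative assumptions, not part of any real tool.
    BLOCKED_PORTS = frozenset([443, 9001])

    _real_create_connection = socket.create_connection

    def censoring_create_connection(address, *args, **kwargs):
        host, port = address
        if port in BLOCKED_PORTS:
            # Emulate a censor that rejects the connection attempt.
            raise socket.error(errno.ECONNREFUSED,
                               "connection refused (emulated censor)")
        return _real_create_connection(address, *args, **kwargs)

    # Monkey-patch: any code calling socket.create_connection now sees
    # "censorship" on the blocked ports.
    socket.create_connection = censoring_create_connection

A unit test could then assert that the tool correctly diagnoses the failure, without ever needing access to a real censored network.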
I'm mentioning this because I noticed that you don't have testability in your feature list, and that might bite you in the long term: either you will have to spend lots of unscheduled time writing tests, or you won't have the time to write any tests at all (and your features will break frequently, as in OONI).
Maybe there is no automated testing for any Tor project? At least a quick search on the wiki only found [1], which lists possible ways to test (but was created 7 months ago and has apparently been collecting dust since), and [2], which describes a manual test procedure for TBB. However, tor-0.2.3.25.tar.gz does contain some test files, although the ratio of production code to test code is not inspiring at first glance:
$ find src/ -type f | egrep ".c" | egrep -v "/test/" | xargs wc -l
  3721 src/or/connection_edge.c
   ...
  4553 src/common/util.c
117674 total
$ find src/ -type f | egrep ".c" | egrep "/test/" | xargs wc -l
  143 src/test/test_pt.c
  ...
 3134 src/test/test_util.c
10328 total
I tried ./configure && make && make test and got the following output:

...
config/addressmap: OK
89 tests ok. (1 skipped)
That's one test for every 1,322 (== 117,674 / 89) LOC.
To measure code coverage, I then added '-fprofile-arcs -ftest-coverage' to the CFLAGS in the Makefiles and ran make clean && make && make test to rebuild and re-run the tests. Then, to see the code coverage in e.g. src/or/*, I ran the following Perl snippet, which runs gcov and totals everything up:
$ gcov *.c | perl -lane '
    if (m~File (.*)~) { $file = $1; next; }
    if (m~Lines executed:([\d.]+)% of (\d+)~) {
        next if ($file =~ m~(/|.h)~);
        ($pc, $loc) = ($1, $2);
        $tloc += $loc;
        $tlocc += int($loc * $pc / 100);
        $t++;
        printf qq[Lines executed:%6s%% of %5u LOC in %s\n], $pc, $loc, $file;
    }
    sub END {
        printf qq[Lines executed:%6.2f%% of %5u LOC in src/or/*.c or %u lines covered in $t c source files\n], $tlocc / $tloc * 100, $tloc, $tlocc;
    }'
Lines executed: 44.73% of   825 LOC in 'buffers.c'
Lines executed: 16.04% of  2300 LOC in 'circuitbuild.c'
Lines executed:  0.00% of   626 LOC in 'circuitlist.c'
Lines executed:  0.00% of   739 LOC in 'circuituse.c'
Lines executed:  0.00% of   528 LOC in 'command.c'
Lines executed: 12.40% of  2855 LOC in 'config.c'
Lines executed:  0.00% of     2 LOC in 'config_codedigest.c'
Lines executed:  0.00% of  1552 LOC in 'connection.c'
Lines executed:  8.19% of  1441 LOC in 'connection_edge.c'
Lines executed:  0.00% of   821 LOC in 'connection_or.c'
Lines executed:  1.44% of  2008 LOC in 'control.c'
Lines executed:  0.00% of   187 LOC in 'cpuworker.c'
Lines executed:  4.59% of  1633 LOC in 'directory.c'
Lines executed:  6.91% of  1592 LOC in 'dirserv.c'
Lines executed: 44.72% of  1648 LOC in 'dirvote.c'
Lines executed:  0.00% of   646 LOC in 'dns.c'
Lines executed:  0.00% of   141 LOC in 'dnsserv.c'
Lines executed: 57.39% of   582 LOC in 'geoip.c'
Lines executed:  2.07% of   387 LOC in 'hibernate.c'
Lines executed:  0.00% of   943 LOC in 'main.c'
Lines executed: 66.46% of   328 LOC in 'microdesc.c'
Lines executed: 11.78% of  1053 LOC in 'networkstatus.c'
Lines executed: 17.71% of   350 LOC in 'nodelist.c'
Lines executed: 31.74% of   167 LOC in 'onion.c'
Lines executed: 63.45% of   632 LOC in 'policies.c'
Lines executed:  0.00% of   140 LOC in 'reasons.c'
Lines executed:  0.00% of  1057 LOC in 'relay.c'
Lines executed:  0.00% of   474 LOC in 'rendclient.c'
Lines executed: 25.60% of   629 LOC in 'rendcommon.c'
Lines executed:  0.00% of   123 LOC in 'rendmid.c'
Lines executed:  0.29% of  1045 LOC in 'rendservice.c'
Lines executed: 23.14% of  1223 LOC in 'rephist.c'
Lines executed: 10.75% of  1088 LOC in 'router.c'
Lines executed:  9.03% of  2513 LOC in 'routerlist.c'
Lines executed: 51.81% of  2297 LOC in 'routerparse.c'
Lines executed:  0.00% of    44 LOC in 'status.c'
Lines executed: 25.69% of   436 LOC in 'transports.c'
Lines executed: 15.57% of 35184 LOC in 'transports.c'
Lines executed: 15.55% of 70239 LOC in src/or/*.c or 10924 lines covered in 38 c source files
Code coverage in src/common/* is somewhat better although still poor:
$ gcov *.c | perl -lane '
    if (m~File (.*)~) { $file = $1; next; }
    if (m~Lines executed:([\d.]+)% of (\d+)~) {
        next if ($file =~ m~(/|.h)~);
        ($pc, $loc) = ($1, $2);
        $tloc += $loc;
        $tlocc += int($loc * $pc / 100);
        $t++;
        printf qq[Lines executed:%6s%% of %5u LOC in %s\n], $pc, $loc, $file;
    }
    sub END {
        printf qq[Lines executed:%6.2f%% of %5u LOC in src/common/*.c or %u lines covered in $t c source files\n], $tlocc / $tloc * 100, $tloc, $tlocc;
    }'
Lines executed: 69.04% of   604 LOC in 'address.c'
Lines executed:100.00% of    23 LOC in 'aes.c'
Lines executed: 45.64% of   642 LOC in 'compat.c'
Lines executed:100.00% of    17 LOC in 'strlcat.c'
Lines executed:100.00% of    13 LOC in 'strlcpy.c'
Lines executed:  0.00% of   143 LOC in 'compat_libevent.c'
Lines executed: 91.76% of   534 LOC in 'container.c'
Lines executed: 70.94% of  1091 LOC in 'crypto.c'
Lines executed:100.00% of    21 LOC in 'di_ops.c'
Lines executed:  8.92% of   426 LOC in 'log.c'
Lines executed: 80.53% of   113 LOC in 'memarea.c'
Lines executed: 85.29% of   238 LOC in 'mempool.c'
Lines executed:  0.00% of    42 LOC in 'procmon.c'
Lines executed: 59.90% of   192 LOC in 'torgzip.c'
Lines executed:  0.00% of   918 LOC in 'tortls.c'
Lines executed: 76.27% of  1412 LOC in 'util.c'
Lines executed:  0.00% of     2 LOC in 'util_codedigest.c'
Lines executed: 55.59% of  6507 LOC in 'util_codedigest.c'
Lines executed: 55.52% of 12938 LOC in src/common/*.c or 7183 lines covered in 18 c source files
Overall, gcc sees 70,239 + 12,938 == 83,177 LOC in src/or/*.c and src/common/*.c combined, and 10,924 + 7,183 == 18,107 of those lines are executed by make test. That's a grand total of 21.77% line coverage via make test. Better than no tests, but still very poor :-(
An interesting paper about the effects of automated testing, production-to-test LOC ratios, and code coverage can be found at [3].
Tor seems to have good planning compared to most open source projects, so I would be interested in hearing why testing is apparently falling through the cracks. Why isn't there simply 10 times more test LOC? What about implementing a new policy immediately: any new production LOC committed must be covered by tests, or else be peer-reviewed and democratically exempted?
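To make that policy concrete, here is a hedged sketch of one way it could be enforced: a small Python script, run from the Makefile after make test, that parses the gcov summaries (the same lines the one-liners above parse) and fails the build when total line coverage falls below an agreed threshold. The threshold value and the gcov invocation are assumptions for illustration:

    import re
    import subprocess
    import sys

    THRESHOLD = 50.0  # percent; illustrative, whatever the project agrees on

    # Run gcov over the instrumented sources and capture its summary lines.
    out = subprocess.check_output("gcov *.c", shell=True,
                                  universal_newlines=True)

    covered = total = 0
    for pc, loc in re.findall(r"Lines executed:([\d.]+)% of (\d+)", out):
        total += int(loc)
        covered += int(int(loc) * float(pc) / 100)

    coverage = 100.0 * covered / total if total else 0.0
    print("coverage: %.2f%% (%d of %d lines)" % (coverage, covered, total))

    # Fail the build when coverage is below the agreed threshold.
    sys.exit(0 if coverage >= THRESHOLD else 1)

With a gate like this in place, a commit that adds production code without tests fails the build visibly, so exempting it becomes an explicit decision rather than the default.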
[1] https://trac.torproject.org/projects/tor/wiki/doc/Testing
[2] https://trac.torproject.org/projects/tor/wiki/doc/Testing/TBBSmokeTest
[3] http://research.microsoft.com/en-us/groups/ese/nagappan_tdd.pdf