Hi there,
Deliverable 6 for sponsor Z says:
- Start a tool that a censored developer can run to discover why their Tor is
failing to connect: brainstorm a list of "things to check", and sort them by how useful they'd be to check / how hard they'd be to build. (#7137)
The deliverable is due on Feb. 28, 2013 so we should get started.
Some background about the deliverable: The reason for this project is that debugging possible censorship events is tedious right now. We often have no access to machines in censoring countries and we are dependent on users creating packet dumps for us. This tool should speed up and automate this process to some extent. Censored users should run it and the tool should then collect data which should then somehow reach us.
I created the following wiki page which should contain all the necessary information: https://censorshipwiki.torproject.org/TorCensorshipAnalyzer
Please add/modify stuff and share your opinion. Since there is quite some overlap with OONI, it would be great if the OONI people could give feedback.
Cheers, Philipp
On Wed, Dec 19, 2012 at 5:20 AM, George Kadianakis desnacked@riseup.net wrote:
One thing I consider important in such a tool is unit and integration testing. Ideally, it should be possible to run unit tests on all of its features, to test whether they would work in a real environment and whether any of them are trivially broken.
Unfortunately, designing and writing such unit tests is not easy since you have to emulate a censored network. While developing daphne, Arturo and I considered doing that by using iptables or by monkey-patching the networking methods of Python/Twisted with methods that censor outgoing traffic. Neither of those ideas would fully emulate a censored network, but if developed correctly they would give you an idea of whether a test will work in Real Life or not.
I'm mentioning this because I noticed that you don't have testability included in your feature list, and that might bite you in the long run, either because you will have to spend lots of unscheduled time writing tests, or because you won't have the time to write any tests (and your features will break frequently, like in OONI).
Maybe there is no automated testing for any Tor projects? At least, a quick search on the wiki only found [1], which lists possible ways to test (but was created 7 months ago and has apparently been collecting dust since), and [2], which discusses a manual test procedure for TBB. However, tor-0.2.3.25.tar.gz does contain some test files, although the ratio of production code to test code is not inspiring at first glance:
$ find src/ -type f | egrep ".c" | egrep -v "/test/" | xargs wc -l
3721 src/or/connection_edge.c
...
4553 src/common/util.c
117674 total
$ find src/ -type f | egrep ".c" | egrep "/test/" | xargs wc -l
143 src/test/test_pt.c
...
3134 src/test/test_util.c
10328 total
I tried ./configure && make && make test and got the following output:
...
config/addressmap: OK
89 tests ok. (1 skipped)
That's one test for every 1,322 (== 117,674 / 89) LOC.
To measure code coverage, I added '-fprofile-arcs -ftest-coverage' to the CFLAGS in the Makefiles and ran make clean && make && make test to rebuild and test. Next, to see the code coverage in e.g. src/or/*, I ran the following Perl one-liner, which runs gcov and totals everything up:
$ gcov *.c | perl -lane 'if(m~File (.*)~){$file=$1;next;} if(m~Lines executed:([\d.]+)% of (\d+)~){next if($file=~m~(/|.h)~); ($pc,$loc)=($1,$2); $tloc+=$loc; $tlocc+=int($loc*$pc/100); $t++; printf qq[Lines executed:%6s%% of %5u LOC in %s\n], $pc, $loc, $file;} sub END{printf qq[Lines executed:%6.2f%% of %5u LOC in src/or/*.c or %u lines covered in $t c source files\n], $tlocc/$tloc*100, $tloc, $tlocc;}'
Lines executed: 44.73% of   825 LOC in 'buffers.c'
Lines executed: 16.04% of  2300 LOC in 'circuitbuild.c'
Lines executed:  0.00% of   626 LOC in 'circuitlist.c'
Lines executed:  0.00% of   739 LOC in 'circuituse.c'
Lines executed:  0.00% of   528 LOC in 'command.c'
Lines executed: 12.40% of  2855 LOC in 'config.c'
Lines executed:  0.00% of     2 LOC in 'config_codedigest.c'
Lines executed:  0.00% of  1552 LOC in 'connection.c'
Lines executed:  8.19% of  1441 LOC in 'connection_edge.c'
Lines executed:  0.00% of   821 LOC in 'connection_or.c'
Lines executed:  1.44% of  2008 LOC in 'control.c'
Lines executed:  0.00% of   187 LOC in 'cpuworker.c'
Lines executed:  4.59% of  1633 LOC in 'directory.c'
Lines executed:  6.91% of  1592 LOC in 'dirserv.c'
Lines executed: 44.72% of  1648 LOC in 'dirvote.c'
Lines executed:  0.00% of   646 LOC in 'dns.c'
Lines executed:  0.00% of   141 LOC in 'dnsserv.c'
Lines executed: 57.39% of   582 LOC in 'geoip.c'
Lines executed:  2.07% of   387 LOC in 'hibernate.c'
Lines executed:  0.00% of   943 LOC in 'main.c'
Lines executed: 66.46% of   328 LOC in 'microdesc.c'
Lines executed: 11.78% of  1053 LOC in 'networkstatus.c'
Lines executed: 17.71% of   350 LOC in 'nodelist.c'
Lines executed: 31.74% of   167 LOC in 'onion.c'
Lines executed: 63.45% of   632 LOC in 'policies.c'
Lines executed:  0.00% of   140 LOC in 'reasons.c'
Lines executed:  0.00% of  1057 LOC in 'relay.c'
Lines executed:  0.00% of   474 LOC in 'rendclient.c'
Lines executed: 25.60% of   629 LOC in 'rendcommon.c'
Lines executed:  0.00% of   123 LOC in 'rendmid.c'
Lines executed:  0.29% of  1045 LOC in 'rendservice.c'
Lines executed: 23.14% of  1223 LOC in 'rephist.c'
Lines executed: 10.75% of  1088 LOC in 'router.c'
Lines executed:  9.03% of  2513 LOC in 'routerlist.c'
Lines executed: 51.81% of  2297 LOC in 'routerparse.c'
Lines executed:  0.00% of    44 LOC in 'status.c'
Lines executed: 25.69% of   436 LOC in 'transports.c'
Lines executed: 15.57% of 35184 LOC in 'transports.c'
Lines executed: 15.55% of 70239 LOC in src/or/*.c or 10924 lines covered in 38 c source files
Code coverage in src/common/* is somewhat better although still poor:
$ gcov *.c | perl -lane 'if(m~File (.*)~){$file=$1;next;} if(m~Lines executed:([\d.]+)% of (\d+)~){next if($file=~m~(/|.h)~); ($pc,$loc)=($1,$2); $tloc+=$loc; $tlocc+=int($loc*$pc/100); $t++; printf qq[Lines executed:%6s%% of %5u LOC in %s\n], $pc, $loc, $file;} sub END{printf qq[Lines executed:%6.2f%% of %5u LOC in src/common/*.c or %u lines covered in $t c source files\n], $tlocc/$tloc*100, $tloc, $tlocc;}'
Lines executed: 69.04% of   604 LOC in 'address.c'
Lines executed:100.00% of    23 LOC in 'aes.c'
Lines executed: 45.64% of   642 LOC in 'compat.c'
Lines executed:100.00% of    17 LOC in 'strlcat.c'
Lines executed:100.00% of    13 LOC in 'strlcpy.c'
Lines executed:  0.00% of   143 LOC in 'compat_libevent.c'
Lines executed: 91.76% of   534 LOC in 'container.c'
Lines executed: 70.94% of  1091 LOC in 'crypto.c'
Lines executed:100.00% of    21 LOC in 'di_ops.c'
Lines executed:  8.92% of   426 LOC in 'log.c'
Lines executed: 80.53% of   113 LOC in 'memarea.c'
Lines executed: 85.29% of   238 LOC in 'mempool.c'
Lines executed:  0.00% of    42 LOC in 'procmon.c'
Lines executed: 59.90% of   192 LOC in 'torgzip.c'
Lines executed:  0.00% of   918 LOC in 'tortls.c'
Lines executed: 76.27% of  1412 LOC in 'util.c'
Lines executed:  0.00% of     2 LOC in 'util_codedigest.c'
Lines executed: 55.59% of  6507 LOC in 'util_codedigest.c'
Lines executed: 55.52% of 12938 LOC in src/common/*.c or 7183 lines covered in 18 c source files
Overall, gcov sees 70,239 + 12,938 == 83,177 LOC in total for src/or/*.c and src/common/*.c, of which 10,924 + 7,183 == 18,107 lines are executed by make test. That's an overall line coverage of 21.77%. Better than no tests but still very poor :-(
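For anyone who wants to repeat the measurement, the tallying could also be done in Python, roughly as sketched below. The function name and regular expressions are my own. One caveat about the listings above: gcov appears to print an overall "Lines executed" line after the per-file ones, and the one-liner credits it to the last file seen (which would explain why 'transports.c' and 'util_codedigest.c' each show up twice). If so, the LOC totals are roughly doubled, although the overall percentage barely changes. The sketch forgets the file name once it has been used, so a trailing summary line is ignored.

```python
import re

FILE_RE = re.compile(r"^File '(.*)'$")
LINES_RE = re.compile(r"^Lines executed:\s*([\d.]+)% of (\d+)$")


def tally(gcov_output):
    """Sum per-file line coverage from gcov's stdout.

    Returns (rows, total_loc, covered_loc), where each row is
    (file name, percent executed, LOC, covered LOC).
    """
    current = None
    rows = []
    for line in gcov_output.splitlines():
        line = line.strip()
        m = FILE_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = LINES_RE.match(line)
        if m and current is not None:
            name = current
            current = None  # use each file name once; skips gcov's final summary
            if "/" in name or name.endswith(".h"):
                continue  # ignore headers and files outside this directory
            pct, loc = float(m.group(1)), int(m.group(2))
            rows.append((name, pct, loc, int(loc * pct / 100)))
    total_loc = sum(r[2] for r in rows)
    covered_loc = sum(r[3] for r in rows)
    return rows, total_loc, covered_loc
```

It could be fed with the captured output of gcov *.c, for example via subprocess.run(..., capture_output=True, text=True).stdout.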
An interesting paper about the effects of automated testing, production to test LOC ratios, and code coverage can be found here [3].
Tor seems to have good planning compared to most open source projects, so I would be interested in hearing why testing is apparently falling through the cracks. Why isn't there 10 times more test LOC? What about implementing a new policy immediately: any new production LOC committed must be covered by tests, or be peer reviewed and democratically excluded?
[1] https://trac.torproject.org/projects/tor/wiki/doc/Testing
[2] https://trac.torproject.org/projects/tor/wiki/doc/Testing/TBBSmokeTest
[3] http://research.microsoft.com/en-us/groups/ese/nagappan_tdd.pdf
On Tue, Dec 18, 2012 at 7:07 PM, Philipp Winter identity.function@gmail.com wrote:
Hi there,
Hi Philipp,
Thanks for starting this! I have updated the page with a few extra things.
On Dec 18, 2012, at 8:07 PM, Philipp Winter identity.function@gmail.com wrote:
Hi there,
I believe you should be using ooniprobe to build the tests you are interested in, or you may at least want to look at our code to see how to do the things you are interested in doing.
The main points where ooniprobe would be of use to you (now) are:
# Standard reporting format
All ooniprobe tests share a common base format depending on the test template your test is based on.
I recommend you look at the Test Writing tutorial to get an idea of what this looks like: https://ooni.torproject.org/docs/writing_tests.html
# Collection of packet captures
When you run an ooniprobe test with "includepcap: true" set in your ooniprobe.conf file, you will collect a full pcap of what happened on the probe's network during the test run.
Note: This requires the test to be run as root and will include *all* the network traffic during the testing session (i.e. if the user is looking at their favorite kitten website while running the test, such data will be in the pcap).
# Collection of packet captures specific to the sent and received packets
When you run an ooniprobe test that inherits from the scapy test template (https://ooni.torproject.org/docs/api/ooni.templates.html#module-ooni.templat...) the packets sent and received (i.e. those that are answers to the packet(s) sent) will be captured.
When configured not to include the probe IP address, the source IP of sent packets and the destination IP of received packets are replaced with 127.0.0.1. (Warning: if the IP address of the probe is present in some other part of the packet, it will not get stripped, for example if it's present in the ICMP citation.)
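To illustrate the warning, here is a small sketch using only the Python standard library. It is not ooniprobe's scapy-based code; the helper names and the documentation-range addresses are made up, and the header checksum is left at zero for brevity. It rewrites only the outer IPv4 header, so a copy of the original header quoted inside an ICMP error (which is how I read "ICMP citation") still carries the probe's address.

```python
import socket
import struct

LOOPBACK = socket.inet_aton("127.0.0.1")


def ipv4_header(src, dst, payload_len=0, proto=1):
    """Build a minimal 20-byte IPv4 header (checksum left at zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + payload_len,  # version/IHL, TOS, total length
        0, 0,                       # identification, flags/fragment offset
        64, proto, 0,               # TTL, protocol (1 = ICMP), checksum
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def scrub(packet, probe_ip, sent):
    """Replace the probe address in the outer IPv4 header only.

    For sent packets the source field (bytes 12-15) is rewritten, for
    received packets the destination field (bytes 16-19).
    """
    probe = socket.inet_aton(probe_ip)
    offset = 12 if sent else 16
    if packet[offset:offset + 4] != probe:
        return packet
    return packet[:offset] + LOOPBACK + packet[offset + 4:]
```

Scrubbing a received ICMP "destination unreachable" packet this way leaves the probe address in the quoted inner header, so a tool that wants stronger guarantees would need to parse such payloads as well.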
# Reporting system
Currently we only support collection of YAML formatted reports (that means not .pcap files) and only via Tor Hidden Services.
Extending it to support reporting via HTTP(s) should be trivial and is a feature that we have already received a request for.
Adding support for collecting .pcaps as well probably would not take much time and is something that will happen in the near future.
# Things to come
ooniprobe will soon expose an HTTP-based API that binds to localhost and can then be (optionally) exposed as a Tor Hidden Service. Such an API will allow researchers to connect to a probe and run some tests, and will allow us to build a JS/HTML5 client interface that lets users select which tests to run and monitor the status of running tests.
More details here: https://ooni.torproject.org/docs/architecture.html#ooniprobe-api
For a bird's-eye view of the project see: https://ooni.torproject.org/docs/architecture.html
Even if you do not end up using ooniprobe for developing your system today, I highly encourage you to use the libraries that we are using so that in the future we can find a way to integrate code from each other's projects.
The main libraries that we are using are:
* Twisted http://twistedmatrix.com
* Scapy http://www.secdev.org/projects/scapy/
* txtorcon https://github.com/meejah/txtorcon
~ Art.
First of all thanks a lot for summing all of that up in such great detail, Arturo. Comments inline.
On Fri, Dec 21, 2012 at 04:16:32PM +0100, Arturo Filastò wrote:
# Collection of packet captures specific to the sent and received packets
When you run an ooniprobe test that inherits from the scapy test template (https://ooni.torproject.org/docs/api/ooni.templates.html#module-ooni.templat...) the packets sent and received (i.e. those that are answers to the packet(s) sent) will be captured.
When configured not to include the probe IP address, the source IP of sent packets and the destination IP of received packets are replaced with 127.0.0.1. (Warning: if the IP address of the probe is present in some other part of the packet, it will not get stripped, for example if it's present in the ICMP citation.)
Sounds like a good thing to have.
# Reporting system
Currently we only support collection of YAML formatted reports (that means not .pcap files) and only via Tor Hidden Services.
Extending it to support reporting via HTTP(s) should be trivial and is a feature that we have already received a request for.
Adding support for collecting .pcaps as well probably would not take much time and is something that will happen in the near future.
That sounds good. Hidden services will not be useful in this case because Tor is expected to be unavailable but HTTPS could work.
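As a rough idea of what an HTTPS submission could look like on the client side, here is a sketch using only the Python standard library. The collector URL, the content type and the function names are assumptions for illustration; this is not ooniprobe's existing reporting code or protocol.

```python
import urllib.request

# Hypothetical collector endpoint; not a real service.
COLLECTOR_URL = "https://collector.example.org/report"


def build_report_request(report_yaml, url=COLLECTOR_URL):
    """Wrap a YAML report in an HTTPS POST request (nothing is sent yet)."""
    if not url.startswith("https://"):
        raise ValueError("refusing to send a report over a non-HTTPS URL")
    return urllib.request.Request(
        url,
        data=report_yaml.encode("utf-8"),
        # Assumed content type for YAML; a real collector may expect another.
        headers={"Content-Type": "application/x-yaml"},
        method="POST",
    )


def submit(report_yaml, url=COLLECTOR_URL, timeout=30):
    """Send the report and return the HTTP status code."""
    request = build_report_request(report_yaml, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes the first part easy to unit test without network access. Whether the HTTPS endpoint itself is reachable from a censored network is a separate question.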
# Things to come
ooniprobe will soon expose an HTTP-based API that binds to localhost and can then be (optionally) exposed as a Tor Hidden Service. Such an API will allow researchers to connect to a probe and run some tests, and will allow us to build a JS/HTML5 client interface that lets users select which tests to run and monitor the status of running tests.
Hmm, what's the use case here? To provide an "anonymous" ooniprobe which can be controlled remotely by people I trust? I guess it won't be possible to hide the probe's IP address since I can just run a test which makes it connect to an IP address under my control?
More details here: https://ooni.torproject.org/docs/architecture.html#ooniprobe-api
For a bird's-eye view of the project see: https://ooni.torproject.org/docs/architecture.html
Thanks. On a more general note, a core requirement is to make the analysis tool easy to use since we can't expect users to mess around with configuration. How easy do you think it would be to package ooniprobe with our analysis tests in a self-contained executable which can then simply be run by users?
Even if you do not end up using ooniprobe for developing your system today, I highly encourage you to use the libraries that we are using so that in the future we can find a way to integrate code from each other's projects.
Yes, agreed.
Cheers, Philipp