Brainstorming a Tor censorship analysis tool

Philipp Winter

18 Dec 2012

7:07 p.m.

Hi there,

Deliverable 6 for sponsor Z says:

...

Start a tool that a censored developer can run to discover why their Tor is failing to connect: brainstorm a list of "things to check", and sort them by how useful they'd be to check / how hard they'd be to build. (#7137)

The deliverable is due on Feb. 28, 2013 so we should get started.

Some background about the deliverable: The reason for this project is that debugging possible censorship events is tedious right now. We often have no access to machines in censoring countries and we are dependent on users creating packet dumps for us. This tool should speed up and automate this process to some extent. Censored users should run it and the tool should then collect data which should then somehow reach us.

I created the following wiki page which should contain all the necessary information: https://censorshipwiki.torproject.org/TorCensorshipAnalyzer

Please add/modify stuff and share your opinion. Since there is quite some overlap with OONI, it would be great if the OONI people could give feedback.

Cheers, Philipp

George Kadianakis

19 Dec 2012

1:20 p.m.

One thing I consider important in such a tool is unit and integration testing. Ideally, it should be possible to run unit tests on all of its features, to test whether they would work in a real environment and whether any of them are trivially broken.

Unfortunately, designing and writing such unit tests is not easy since you have to emulate a censored network. While developing daphne, Arturo and I considered doing that by using iptables or by monkey-patching the networking methods of Python/Twisted with methods that censor outgoing traffic. Neither idea would fully emulate a censored network, but if developed correctly they would give you an idea of whether a test will work in Real Life or not.

I'm mentioning this because I noticed that you don't have testability included in your feature list, and that might bite you in the long term: either because you will have to spend lots of unscheduled time writing tests, or because you won't have the time to write any tests (and your features will break frequently, like in OONI).
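A minimal sketch of the monkey-patching idea, assuming plain standard-library sockets rather than Twisted. It is not code from daphne or OONI; the names BLOCKED_ENDPOINTS, fake_create_connection, censored_network and can_reach are hypothetical and exist only for illustration.

```python
import socket
from unittest import mock

# Hypothetical endpoints that the simulated censor blocks (documentation IPs).
BLOCKED_ENDPOINTS = {("198.51.100.1", 9001)}


def fake_create_connection(address, *args, **kwargs):
    """Stand-in for socket.create_connection that never touches the network."""
    if tuple(address) in BLOCKED_ENDPOINTS:
        raise ConnectionRefusedError("simulated censorship of %s:%d" % tuple(address))
    return mock.MagicMock()  # pretend the TCP handshake succeeded


def censored_network():
    """Context manager that swaps in the fake for the duration of a test."""
    return mock.patch("socket.create_connection", fake_create_connection)


def can_reach(host, port, timeout=5.0):
    """Hypothetical check under test: True if a TCP connection succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    conn.close()
    return True


with censored_network():
    print(can_reach("198.51.100.1", 9001))  # blocked endpoint
    print(can_reach("198.51.100.2", 80))    # any other endpoint
```

A fake like this only exercises the tool's error handling. It cannot reproduce behaviour such as injected RST segments or throttling, so it is only a partial emulation of a censored network.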

Simon

7:29 p.m.

Maybe there is no automated testing for any Tor projects? At least a quick search on the wiki only found [1], which lists possible ways to test (but was created 7 months ago, apparently has not been updated since, and is collecting dust), and [2], which discusses a manual test procedure for TBB. However, tor-0.2.3.25.tar.gz does contain some test files, but the ratio of production code to test code is not inspiring at first glance:

$ find src/ -type f | egrep ".c" | egrep -v "/test/" | xargs wc -l
3721 src/or/connection_edge.c
...
4553 src/common/util.c
117674 total

$ find src/ -type f | egrep ".c" | egrep "/test/" | xargs wc -l
143 src/test/test_pt.c
...
3134 src/test/test_util.c
10328 total

I tried ./configure && make && make test and got the following output:
...
config/addressmap: OK
89 tests ok. (1 skipped)

That's one test for every 1,322 (== 117,674 / 89) LOC.

To measure code coverage, I added '-fprofile-arcs -ftest-coverage' to the CFLAGS in the Makefiles and ran make clean && make && make test to rebuild and test. Next, to see the code coverage in e.g. src/or/*, I ran the following Perl one-liner, which runs gcov and totals everything up:

$ gcov *.c | perl -lane 'if(m~File (.*)~){$file=$1;next;} if(m~Lines executed:([\d.]+)% of (\d+)~){next if($file=~m~(/|.h)~); ($pc,$loc)=($1,$2); $tloc+=$loc; $tlocc+=int($loc*$pc/100); $t++; printf qq[Lines executed:%6s%% of %5u LOC in %s\n], $pc, $loc, $file;} sub END{printf qq[Lines executed:%6.2f%% of %5u LOC in src/or/*.c or %u lines covered in $t c source files\n], $tlocc/$tloc*100, $tloc, $tlocc;}'
Lines executed: 44.73% of 825 LOC in 'buffers.c'
Lines executed: 16.04% of 2300 LOC in 'circuitbuild.c'
Lines executed: 0.00% of 626 LOC in 'circuitlist.c'
Lines executed: 0.00% of 739 LOC in 'circuituse.c'
Lines executed: 0.00% of 528 LOC in 'command.c'
Lines executed: 12.40% of 2855 LOC in 'config.c'
Lines executed: 0.00% of 2 LOC in 'config_codedigest.c'
Lines executed: 0.00% of 1552 LOC in 'connection.c'
Lines executed: 8.19% of 1441 LOC in 'connection_edge.c'
Lines executed: 0.00% of 821 LOC in 'connection_or.c'
Lines executed: 1.44% of 2008 LOC in 'control.c'
Lines executed: 0.00% of 187 LOC in 'cpuworker.c'
Lines executed: 4.59% of 1633 LOC in 'directory.c'
Lines executed: 6.91% of 1592 LOC in 'dirserv.c'
Lines executed: 44.72% of 1648 LOC in 'dirvote.c'
Lines executed: 0.00% of 646 LOC in 'dns.c'
Lines executed: 0.00% of 141 LOC in 'dnsserv.c'
Lines executed: 57.39% of 582 LOC in 'geoip.c'
Lines executed: 2.07% of 387 LOC in 'hibernate.c'
Lines executed: 0.00% of 943 LOC in 'main.c'
Lines executed: 66.46% of 328 LOC in 'microdesc.c'
Lines executed: 11.78% of 1053 LOC in 'networkstatus.c'
Lines executed: 17.71% of 350 LOC in 'nodelist.c'
Lines executed: 31.74% of 167 LOC in 'onion.c'
Lines executed: 63.45% of 632 LOC in 'policies.c'
Lines executed: 0.00% of 140 LOC in 'reasons.c'
Lines executed: 0.00% of 1057 LOC in 'relay.c'
Lines executed: 0.00% of 474 LOC in 'rendclient.c'
Lines executed: 25.60% of 629 LOC in 'rendcommon.c'
Lines executed: 0.00% of 123 LOC in 'rendmid.c'
Lines executed: 0.29% of 1045 LOC in 'rendservice.c'
Lines executed: 23.14% of 1223 LOC in 'rephist.c'
Lines executed: 10.75% of 1088 LOC in 'router.c'
Lines executed: 9.03% of 2513 LOC in 'routerlist.c'
Lines executed: 51.81% of 2297 LOC in 'routerparse.c'
Lines executed: 0.00% of 44 LOC in 'status.c'
Lines executed: 25.69% of 436 LOC in 'transports.c'
Lines executed: 15.57% of 35184 LOC in 'transports.c'
Lines executed: 15.55% of 70239 LOC in src/or/*.c or 10924 lines covered in 38 c source files
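For readers who prefer Python, here is a rough equivalent of the aggregation step, sketched under the assumption that gcov prints a "File '...'" line followed by a "Lines executed:N% of M" line for each file. The function name summarize_gcov and the sample text are hypothetical. Unlike the one-liner, the sketch forgets the current file name after each match, so a trailing summary line with no File header of its own is not attributed to the previous file (this may be what the second 'transports.c' row above is).

```python
import re

FILE_RE = re.compile(r"^File '(.*)'$")
LINES_RE = re.compile(r"^Lines executed:([\d.]+)% of (\d+)$")


def summarize_gcov(output):
    """Return (per_file, total_loc, covered_loc) parsed from gcov's stdout text."""
    per_file = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        match = FILE_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = LINES_RE.match(line)
        if match and current is not None:
            # Skip headers and files outside the current directory.
            if "/" not in current and not current.endswith(".h"):
                pct, loc = float(match.group(1)), int(match.group(2))
                per_file.append((current, pct, loc, int(loc * pct / 100)))
            current = None  # a later line without a File header is ignored
    total_loc = sum(loc for _, _, loc, _ in per_file)
    covered_loc = sum(covered for _, _, _, covered in per_file)
    return per_file, total_loc, covered_loc


# Made-up sample in the assumed gcov format, ending with an unattributed summary.
SAMPLE = """File 'buffers.c'
Lines executed:44.73% of 825
File 'circuitlist.c'
Lines executed:0.00% of 626
File '/usr/include/x.h'
Lines executed:50.00% of 10
Lines executed:25.00% of 1461
"""

files, total, covered = summarize_gcov(SAMPLE)
for name, pct, loc, _ in files:
    print("Lines executed:%6.2f%% of %5d LOC in %s" % (pct, loc, name))
print("%d of %d lines covered in %d files" % (covered, total, len(files)))
```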

Code coverage in src/common/* is somewhat better although still poor:

Overall gcc sees 70,239 + 12,938 == 83,177 LOC total for src/or/*.c and src/common/*.c, and sees 10,924 + 7,183 == 18,107 of these lines executed after running make test. That's a grand total code coverage of 21.77% of lines covered via make test. Better than no tests but still very poor :-(
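The arithmetic behind these figures can be re-derived in a few lines. This is only a sketch that takes the numbers quoted in this message at face value; the variable names are made up for illustration.

```python
# Figures as quoted in the message above.
production_loc = 117_674   # wc -l total outside src/test/
test_loc = 10_328          # wc -l total inside src/test/
tests_run = 89             # "89 tests ok."

or_total, or_covered = 70_239, 10_924          # src/or/*.c
common_total, common_covered = 12_938, 7_183   # src/common/*.c

loc_per_test = production_loc / tests_run
prod_to_test_ratio = production_loc / test_loc

total_loc = or_total + common_total
covered_loc = or_covered + common_covered
coverage_pct = 100 * covered_loc / total_loc

print("LOC per test: %.0f" % loc_per_test)
print("production:test LOC ratio: %.1f" % prod_to_test_ratio)
print("overall coverage: %.2f%% (%d of %d lines)" % (coverage_pct, covered_loc, total_loc))
```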

An interesting paper about the effects of automated testing, production to test LOC ratios, and code coverage can be found here [3].

Tor seems to have good planning compared to most open source projects. So I would be interested in hearing why testing is apparently 'falling between the cracks'. Why isn't there just 10 times more test LOC? What about implementing a new policy immediately: Any new production LOC committed must be covered by tests, or peer reviewed and democratically excluded?

[1] https://trac.torproject.org/projects/tor/wiki/doc/Testing
[2] https://trac.torproject.org/projects/tor/wiki/doc/Testing/TBBSmokeTest
[3] http://research.microsoft.com/en-us/groups/ese/nagappan_tdd.pdf

Runa A. Sandvik

10:20 p.m.

Hi Philipp,

Thanks for starting this! I have updated the page with a few extra things.

-- Runa A. Sandvik

Arturo Filastò

21 Dec 2012

3:16 p.m.

I believe you should be using ooniprobe to build the tests you are interested in building, or you may at least be interested in looking at our code to see how to do the things you are interested in doing.

The main points where ooniprobe would be of use to you (now) are:

# Standard reporting format

All ooniprobe tests share a common base format depending on the test template your test is based on.

I recommend you look at the Test Writing tutorial to get an idea of what this looks like: https://ooni.torproject.org/docs/writing_tests.html

# Collection of packet captures

When you run an ooniprobe test and you have set "includepcap: true" in your ooniprobe.conf file, you will collect a full pcap of what has happened on the probe's network during the test run.

Note: This requires the test to be run as root and will include *all* the network traffic during the testing session (i.e. if the user is looking at their favorite kitten website while running the test, such data will be in the pcap).

# Collection of packet captures specific to the sent and received packets

When you run an ooniprobe test that inherits from the scapy test template (https://ooni.torproject.org/docs/api/ooni.templates.html#module-ooni.templat...) the packets sent and received (i.e. that are answers to the packet(s) sent) will be captured.

When configured to not include the probe IP address, the source IP of sent packets and the destination IP of received packets are replaced with 127.0.0.1. (Warning: if the IP address of the probe is present in some other part of the packet, it will not get stripped, for example if it's present in the ICMP citation.)
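To illustrate why header-only scrubbing leaves the address elsewhere in the packet, here is a sketch that works on raw IPv4 bytes using only the standard library. It is not ooniprobe's implementation (which builds on Scapy). The names scrub_probe_ip and ipv4_checksum are hypothetical, and transport-layer checksums, which depend on the IP addresses, are deliberately not recomputed.

```python
import socket
import struct

LOOPBACK = socket.inet_aton("127.0.0.1")


def ipv4_checksum(header):
    """Ones'-complement checksum over the IPv4 header bytes (RFC 1071)."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def scrub_probe_ip(packet, probe_ip):
    """Replace the probe's address in the IPv4 src/dst fields only."""
    probe = socket.inet_aton(probe_ip)
    ihl = (packet[0] & 0x0F) * 4           # header length in bytes
    header = bytearray(packet[:ihl])
    if header[12:16] == probe:             # source address field
        header[12:16] = LOOPBACK
    if header[16:20] == probe:             # destination address field
        header[16:20] = LOOPBACK
    header[10:12] = b"\x00\x00"            # zero the checksum before recomputing
    header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))
    return bytes(header) + packet[ihl:]    # payload is left untouched
```

If the payload itself quotes the probe's address, as the original IP header embedded in an ICMP error message does, those bytes survive the scrubbing. That is the case the warning above describes.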

# Reporting system

Currently we only support collection of YAML formatted reports (that means not .pcap files) and only via Tor Hidden Services.

Extending it to support reporting via HTTP(s) should be trivial and is a feature that we have already received a request for.

Adding support for collecting .pcaps as well probably does not require much time and is something that will happen in the near future.

# Things to come

ooniprobe will soon expose an HTTP-based API that binds to localhost and can then (optionally) be exposed as a Tor Hidden Service. Such an API will allow researchers to connect to a probe and run some tests, and will allow us to build a JS/HTML5 client interface that lets users select which tests to run and monitor the status of running tests.

More details here: https://ooni.torproject.org/docs/architecture.html#ooniprobe-api

For a bird's-eye view of the project see: https://ooni.torproject.org/docs/architecture.html

Even if you do not end up using ooniprobe for developing your system today, I highly encourage you to use the libraries that we are using so that in the future we can find a way to integrate code from each other's projects.

The main libraries that we are using are:

* Twisted http://twistedmatrix.com
* Scapy http://www.secdev.org/projects/scapy/
* txtorcon https://github.com/meejah/txtorcon

~ Art.

Philipp Winter

26 Dec 2012

9:52 p.m.

First of all thanks a lot for summing all of that up in such great detail, Arturo. Comments inline.

On Fri, Dec 21, 2012 at 04:16:32PM +0100, Arturo Filastò wrote:

...

# Collection of packet captures specific to the sent and received packets

When you run an ooniprobe test that inherits from the scapy test template (https://ooni.torproject.org/docs/api/ooni.templates.html#module-ooni.templat...) the packets sent and received (i.e. that are answers to the packet(s) sent) will be captured.

When configured to not include the probe IP address, the source IP of sent packets and the destination IP of received packets are replaced with 127.0.0.1. (Warning: if the IP address of the probe is present in some other part of the packet, it will not get stripped, for example if it's present in the ICMP citation.)

Sounds like a good thing to have.

...

# Reporting system

Currently we only support collection of YAML formatted reports (that means not .pcap files) and only via Tor Hidden Services.

Extending it to support reporting via HTTP(s) should be trivial and is a feature that we have already received a request for.

Adding support for collecting .pcaps as well probably does not require much time and is something that will happen in the near future.

That sounds good. Hidden services will not be useful in this case because Tor is expected to be unavailable, but HTTPS could work.

...

# Things to come

ooniprobe will soon expose an HTTP-based API that binds to localhost and can then (optionally) be exposed as a Tor Hidden Service. Such an API will allow researchers to connect to a probe and run some tests, and will allow us to build a JS/HTML5 client interface that lets users select which tests to run and monitor the status of running tests.

Hmm, what's the use case here? To provide an "anonymous" ooniprobe which can be controlled remotely by people I trust? I guess it won't be possible to hide the probe's IP address since I can just run a test which makes it connect to an IP address under my control?

...

More details here: https://ooni.torproject.org/docs/architecture.html#ooniprobe-api

For a bird's-eye view of the project see: https://ooni.torproject.org/docs/architecture.html

Thanks. On a more general note, a core requirement is to make the analysis tool easy to use, since we can't expect users to mess around with configuration. How easy do you think it would be to package an ooniprobe with our analysis tests in a self-contained executable which can then simply be run by users?

...

Even if you do not end up using ooniprobe for developing your system today, I highly encourage you to use the libraries that we are using so that in the future we can find a way to integrate code from each other's projects.

Yes, agreed.

Cheers, Philipp

tor-dev@lists.torproject.org

5 comments

5 participants

tags (0)

participants (5)

Arturo Filastò
George Kadianakis
Philipp Winter
Runa A. Sandvik
Simon