Greetings from India,
So I've been testing networks in Bangalore and I've noticed a few odd quirks with using a test deck.
Here is my ooniprobe.conf:
% cat ooniprobe.conf
# This is the configuration file for OONIProbe
# This file follows the YAML markup format: http://yaml.org/spec/1.2/spec.html
# Keep in mind that indentation matters.

basic:
    # Where OONIProbe should be writing it's log file
    logfile: ooniprobe-bangalore.log
privacy:
    # Should we include the IP address of the probe in the report?
    includeip: true
    # Should we include the ASN of the probe in the report?
    includeasn: true
    # Should we include the country as reported by GeoIP in the report?
    includecountry: true
    # Should we include the city as reported by GeoIP in the report?
    includecity: true
    # Should we collect a full packet capture on the client?
    includepcap: false
reports:
    # This is a packet capture file (.pcap) to load as a test:
    pcap: Null
advanced:
    # XXX change this to point to the directory where you have stored the GeoIP
    # database file. This should be the directory in which OONI is installed
    # /path/to/ooni-probe/data/
    #geoip_data_dir: /usr/share/GeoIP/
    geoip_data_dir: /home/a/ooni-probe/data/
    debug: true
    # tor_binary: '/usr/sbin/tor'
    # For auto detection
    interface: auto
    # Of specify a specific interface
    #interface: wlan0
    # If you do not specify start_tor, you will have to have Tor running and
    # explicitly set the control port and SOCKS port
    start_tor: true
    # After how many seconds we should give up on a particular measurement
    measurement_timeout: 30
    # After how many retries we should give up on a measurement
    measurement_retries: 2
    # How many measurments to perform concurrently
    measurement_concurrency: 10
    # After how may seconds we should give up reporting
    reporting_timeout: 30
    # After how many retries to give up on reporting
    reporting_retries: 6
    # How many reports to perform concurrently
    reporting_concurrency: 10
tor:
    socks_port: 9250
    control_port: 9251
    # Specify the absolute path to the Tor bridges to use for testing
    #bridges: bridges.list
    # Specify path of the tor datadirectory.
    # This should be set to something to avoid having Tor download each time
    # the descriptors and consensus data.
    data_dir: ~/.tor/
Here is the test deck:
% cat decks/india-full.deck
- options:
    collector: null
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-t, '192.168.1.1', -f, 'inputs/india-uniq-hosts-with-alexa-top-1000.txt']
    test_file: nettests/blocking/dnsconsistency.py
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-b, 'http://93.95.227.200']
    test_file: nettests/manipulation/http_header_field_manipulation.py
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-b, 'http://93.95.227.200']
    test_file: nettests/manipulation/http_invalid_request_line.py
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-b, 'http://93.95.227.200', -f, 'inputs/india-uniq-urls-with-alexa-top-1000.txt']
    test_file: nettests/manipulation/http_host.py
A few things happen when I attempt to use this deck.
Tor fails to return my IP:
2013-06-01 00:44:15+0530 [TorControlProtocol,client] [D] 100%: Done
2013-06-01 00:44:15+0530 [TorControlProtocol,client] [D] Building a TorState
2013-06-01 00:44:16+0530 [TorControlProtocol,client] Successfully bootstrapped Tor
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] We now have the following circuits:
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 1 BUILT [194.132.32.43 165.225.132.54 46.165.221.166] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 2 EXTENDED [194.132.32.43] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 3 EXTENDED [] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 4 EXTENDED [] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] Obtained our IP address from a Tor Relay None
2013-06-01 00:44:16+0530 [TorControlProtocol,client] Unhandled Error
    Traceback (most recent call last):
    Failure: txtorcon.torcontrolprotocol.TorProtocolError: 551 Address unknown

2013-06-01 00:44:16+0530 [TorControlProtocol,client] Unable to lookup the probe IP via Tor.
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] Cannot determine the probe IP address with a traceroute, becase of insufficient priviledges
2013-06-01 00:44:16+0530 [TorControlProtocol,client] Looking up your IP address via maxmind
Then things get a little strange: http_host.py is never executed. Also, http_header_field_manipulation.py runs and the log file shows everything, but the yamloo file shows only this:
% cat report-http_header_field_manipulation-2013-05-31T191417Z.yamloo
###########################################
# OONI Probe Report for http_header_field_manipulation (0.1.3)
# Sat Jun 1 00:57:40 2013
###########################################
---
options: [-b, 'http://93.95.227.200']
probe_asn: AS24560
probe_cc: IN
probe_ip: 122.167.211.176
software_name: ooniprobe
software_version: 0.0.11
start_time: 1370027657.776991
test_name: http_header_field_manipulation
test_version: 0.1.3
...
The debug log shows the headers being sent and the data being returned with an issue at the collector:
2013-06-01 00:57:40+0530 [SOCKS5Client,client] Creating report with OONIB Reporter. Please be patient.
2013-06-01 00:57:40+0530 [SOCKS5Client,client] This may take up to 1-2 minutes...
2013-06-01 00:57:40+0530 [SOCKS5Client,client] [D] Successfully performed report <ooni.tasks.ReportEntry object at 0x588c190>
2013-06-01 00:57:40+0530 [SOCKS5Client,client] [D] None
2013-06-01 00:57:40+0530 [Uninitialized] [!] Failed to connect to reporter backend
2013-06-01 00:57:40+0530 [Uninitialized] Traceback (most recent call last):
2013-06-01 00:57:40+0530 [Uninitialized] File "/home/io/Documents/backup/git/tor/ooni-probe/ooni/reporter.py", line 323, in createReport
2013-06-01 00:57:40+0530 [Uninitialized] bodyProducer=bodyProducer)
2013-06-01 00:57:40+0530 [Uninitialized] ConnectError: An error occurred while connecting: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion: Connection lost.
2013-06-01 00:57:40+0530 [Uninitialized] ].
2013-06-01 00:57:40+0530 [Uninitialized] [!] Failed to open <ooni.reporter.OONIBReporter object at 0x461d710> reporter, giving up...
2013-06-01 00:57:40+0530 [Uninitialized] [!] Reporter <ooni.reporter.OONIBReporter object at 0x461d710> failed, removing from report...
2013-06-01 00:57:40+0530 [Uninitialized] [D] Starting this task <generator object generateMeasurements at 0x51906e0>
2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'> test_put
2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
2013-06-01 00:57:40+0530 [Uninitialized] [D] Performing request http://93.95.227.200 PUT {'Accept-Language': ['en-US,en;q=0.8'], 'Accept-Encoding': ['gzip,deflate,sdch'], 'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 'User-Agent': ['Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6'], 'Accept-Charset': ['ISO-8859-1,utf-8;q=0.7,*;q=0.3'], 'Host': ['XAxlpMzUMfI5Vvi.com']}
2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'> test_get_random_capitalization
2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
2013-06-01 00:57:40+0530 [Uninitialized] [D] Performing request http://93.95.227.200 gET {'accePt-lanGuAGe': ['en-US,en;q=0.8'], 'accEpT-eNcoDING': ['gzip,deflate,sdch'], 'ACCepT': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 'USeR-aGEnT': ['Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7'], 'aCcEPt-chaRseT': ['ISO-8859-1,utf-8;q=0.7,*;q=0.3'], 'HoSt': ['l5tHomKVddWW1A4.com']}
2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'> test_post_random_capitalization
2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
In the end, I didn't have any yamloo files from the nettests/manipulation/http_invalid_request_line.py test. I had three files that updated and had some data which was basically:
report-dns_consistency-2013-05-31T191417Z.yamloo
report-http_header_field_manipulation-2013-05-31T191417Z.yamloo
ooniprobe-bangalore.log
I expected a few different things - one is that each test in the deck should produce a yamloo file. If the reporting back end takes the report, I suppose I might find it alright to not have the file but in the event of a failure, I really hope the data will be logged to a local .yamloo file.
When I run the following deck:
% cat decks/india.deck
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: http_host_india_bangalore_justa_hotel.log
    pcapfile: null
    reportfile: http_host_india_cis.yamloo
    subargs: [-b, 'http://93.95.227.200', -f, 'inputs/india-uniq-urls-with-alexa-top-1000.txt']
    test_file: nettests/manipulation/http_host.py
I have the proper output for http_host.py:
% head report-http_host-2013-05-31T193306Z.yamloo
###########################################
# OONI Probe Report for http_host (0.2.3)
# Sat Jun 1 01:03:06 2013
###########################################
---
options: [-b, 'http://93.95.227.200', -f, inputs/india-uniq-urls-with-alexa-top-1000.txt]
probe_asn: AS24560
probe_cc: IN
probe_ip: 122.167.211.176
software_name: ooniprobe

% tail report-http_host-2013-05-31T193306Z.yamloo
    url: http://93.95.227.200
  response:
    body: '{"headers_dict": {"Connection": ["close"], "Host": ["zustmovies.com"]}, "request_line": "\nGET / HTTP/1.1", "request_headers": [["Connection", "close"], ["Host", "zustmovies.com"]]}'
    code: 200
    headers: []
socksproxy: null
transparent_http_proxy: false
...
Note that the yamloo file is created not as http_host_india_bangalore_justa_hotel.log but as report-http_host-2013-05-31T193306Z.yamloo...
It seems that perhaps test decks are too experimental for actual use with these issues - or did I do something horribly wrong?
Thoughts?
All the best, Jacob
On Fri, May 31, 2013 at 8:30 PM, Jacob Appelbaum jacob@appelbaum.net wrote:
Greetings from India,
So I've been testing networks in Bangalore and I've noticed a few odd quirks with using a test deck.
Here is my ooniprobe.conf:
% cat ooniprobe.conf
# This is the configuration file for OONIProbe
# This file follows the YAML markup format: http://yaml.org/spec/1.2/spec.html
# Keep in mind that indentation matters.

basic:
    # Where OONIProbe should be writing it's log file
    logfile: ooniprobe-bangalore.log
privacy:
    # Should we include the IP address of the probe in the report?
    includeip: true
    # Should we include the ASN of the probe in the report?
    includeasn: true
    # Should we include the country as reported by GeoIP in the report?
    includecountry: true
    # Should we include the city as reported by GeoIP in the report?
    includecity: true
    # Should we collect a full packet capture on the client?
    includepcap: false
reports:
    # This is a packet capture file (.pcap) to load as a test:
    pcap: Null
advanced:
    # XXX change this to point to the directory where you have stored the GeoIP
    # database file. This should be the directory in which OONI is installed
    # /path/to/ooni-probe/data/
    #geoip_data_dir: /usr/share/GeoIP/
    geoip_data_dir: /home/a/ooni-probe/data/
    debug: true
    # tor_binary: '/usr/sbin/tor'
    # For auto detection
    interface: auto
    # Of specify a specific interface
    #interface: wlan0
    # If you do not specify start_tor, you will have to have Tor running and
    # explicitly set the control port and SOCKS port
    start_tor: true
    # After how many seconds we should give up on a particular measurement
    measurement_timeout: 30
    # After how many retries we should give up on a measurement
    measurement_retries: 2
    # How many measurments to perform concurrently
    measurement_concurrency: 10
    # After how may seconds we should give up reporting
    reporting_timeout: 30
    # After how many retries to give up on reporting
    reporting_retries: 6
    # How many reports to perform concurrently
    reporting_concurrency: 10
tor:
    socks_port: 9250
    control_port: 9251
    # Specify the absolute path to the Tor bridges to use for testing
    #bridges: bridges.list
    # Specify path of the tor datadirectory.
    # This should be set to something to avoid having Tor download each time
    # the descriptors and consensus data.
    data_dir: ~/.tor/
Here is the test deck:
% cat decks/india-full.deck
- options:
    collector: null
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-t, '192.168.1.1', -f, 'inputs/india-uniq-hosts-with-alexa-top-1000.txt']
    test_file: nettests/blocking/dnsconsistency.py
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-b, 'http://93.95.227.200']
    test_file: nettests/manipulation/http_header_field_manipulation.py
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-b, 'http://93.95.227.200']
    test_file: nettests/manipulation/http_invalid_request_line.py
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: null
    pcapfile: null
    reportfile: null
    subargs: [-b, 'http://93.95.227.200', -f, 'inputs/india-uniq-urls-with-alexa-top-1000.txt']
    test_file: nettests/manipulation/http_host.py
A few things happen when I attempt to use this deck.
Tor fails to return my IP:
2013-06-01 00:44:15+0530 [TorControlProtocol,client] [D] 100%: Done
2013-06-01 00:44:15+0530 [TorControlProtocol,client] [D] Building a TorState
2013-06-01 00:44:16+0530 [TorControlProtocol,client] Successfully bootstrapped Tor
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] We now have the following circuits:
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 1 BUILT [194.132.32.43 165.225.132.54 46.165.221.166] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 2 EXTENDED [194.132.32.43] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 3 EXTENDED [] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] * <Circuit 4 EXTENDED [] for GENERAL>
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] Obtained our IP address from a Tor Relay None
2013-06-01 00:44:16+0530 [TorControlProtocol,client] Unhandled Error
    Traceback (most recent call last):
    Failure: txtorcon.torcontrolprotocol.TorProtocolError: 551 Address unknown
This is a known issue with resolving the probe IP via Tor before any descriptors have been fetched.
2013-06-01 00:44:16+0530 [TorControlProtocol,client] Unable to lookup the probe IP via Tor.
2013-06-01 00:44:16+0530 [TorControlProtocol,client] [D] Cannot determine the probe IP address with a traceroute, becase of insufficient priviledges
2013-06-01 00:44:16+0530 [TorControlProtocol,client] Looking up your IP address via maxmind
Does the log end here? You should see some noise about a report being created at least, because the file header was written to disk.
Then things get a little strange: http_host.py is never executed. Also, http_header_field_manipulation.py runs and the log file shows everything, but the yamloo file shows only this:
% cat report-http_header_field_manipulation-2013-05-31T191417Z.yamloo
###########################################
# OONI Probe Report for http_header_field_manipulation (0.1.3)
# Sat Jun 1 00:57:40 2013
###########################################
---
options: [-b, 'http://93.95.227.200']
probe_asn: AS24560
probe_cc: IN
probe_ip: 122.167.211.176
software_name: ooniprobe
software_version: 0.0.11
start_time: 1370027657.776991
test_name: http_header_field_manipulation
test_version: 0.1.3
...
The debug log shows the headers being sent and the data being returned with an issue at the collector:
2013-06-01 00:57:40+0530 [SOCKS5Client,client] Creating report with OONIB Reporter. Please be patient.
2013-06-01 00:57:40+0530 [SOCKS5Client,client] This may take up to 1-2 minutes...
2013-06-01 00:57:40+0530 [SOCKS5Client,client] [D] Successfully performed report <ooni.tasks.ReportEntry object at 0x588c190>
2013-06-01 00:57:40+0530 [SOCKS5Client,client] [D] None
2013-06-01 00:57:40+0530 [Uninitialized] [!] Failed to connect to reporter backend
2013-06-01 00:57:40+0530 [Uninitialized] Traceback (most recent call last):
2013-06-01 00:57:40+0530 [Uninitialized] File "/home/io/Documents/backup/git/tor/ooni-probe/ooni/reporter.py", line 323, in createReport
2013-06-01 00:57:40+0530 [Uninitialized] bodyProducer=bodyProducer)
2013-06-01 00:57:40+0530 [Uninitialized] ConnectError: An error occurred while connecting: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion: Connection lost.
2013-06-01 00:57:40+0530 [Uninitialized] ].
2013-06-01 00:57:40+0530 [Uninitialized] [!] Failed to open <ooni.reporter.OONIBReporter object at 0x461d710> reporter, giving up...
2013-06-01 00:57:40+0530 [Uninitialized] [!] Reporter <ooni.reporter.OONIBReporter object at 0x461d710> failed, removing from report...
2013-06-01 00:57:40+0530 [Uninitialized] [D] Starting this task <generator object generateMeasurements at 0x51906e0>
2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'> test_put
2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
2013-06-01 00:57:40+0530 [Uninitialized] [D] Performing request http://93.95.227.200 PUT {'Accept-Language': ['en-US,en;q=0.8'], 'Accept-Encoding': ['gzip,deflate,sdch'], 'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 'User-Agent': ['Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6'], 'Accept-Charset': ['ISO-8859-1,utf-8;q=0.7,*;q=0.3'], 'Host': ['XAxlpMzUMfI5Vvi.com']}
2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'> test_get_random_capitalization
2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
2013-06-01 00:57:40+0530 [Uninitialized] [D] Performing request http://93.95.227.200 gET {'accePt-lanGuAGe': ['en-US,en;q=0.8'], 'accEpT-eNcoDING': ['gzip,deflate,sdch'], 'ACCepT': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'], 'USeR-aGEnT': ['Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7'], 'aCcEPt-chaRseT': ['ISO-8859-1,utf-8;q=0.7,*;q=0.3'], 'HoSt': ['l5tHomKVddWW1A4.com']}
2013-06-01 00:57:40+0530 [Uninitialized] [D] Running <class 'nettests.manipulation.http_header_field_manipulation.HTTPHeaderFieldManipulation'> test_post_random_capitalization
2013-06-01 00:57:40+0530 [Uninitialized] [D] Finished test setup
In the end, I didn't have any yamloo files from the nettests/manipulation/http_invalid_request_line.py test. I had three files that updated and had some data which was basically:
report-dns_consistency-2013-05-31T191417Z.yamloo
report-http_header_field_manipulation-2013-05-31T191417Z.yamloo
ooniprobe-bangalore.log
I expected a few different things - one is that each test in the deck should produce a yamloo file. If the reporting back end takes the report, I suppose I might find it alright to not have the file but in the event of a failure, I really hope the data will be logged to a local .yamloo file.
The data should always be logged to a local yamloo file. If the test fails to run, it won't write anything other than the report header (this happens before the test is started).
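One quick way to spot a report that contains only the header is to count the YAML documents in it. The following is a rough sketch, not part of ooniprobe. It assumes each document in a .yamloo file starts with a line that is exactly `---`, as in the header of the http_header_field_manipulation report quoted in this thread, and `count_report_entries` and `is_header_only` are hypothetical helper names.

```python
def count_report_entries(yamloo_text):
    """Return how many YAML documents follow the report header.

    ASSUMPTION: every document in the report starts with a line that is
    exactly '---' in column 0. This is a heuristic, not a real YAML parse.
    """
    documents = sum(
        1 for line in yamloo_text.splitlines() if line.rstrip() == "---"
    )
    # The first document is the header (options, probe_asn, test_name, ...),
    # so anything beyond it is counted as a measurement entry.
    return max(documents - 1, 0)


def is_header_only(yamloo_text):
    """True when the report holds the header but no measurement entries."""
    return count_report_entries(yamloo_text) == 0
```

Reading each report-*.yamloo file after a deck run and passing its contents to `is_header_only` would flag tests that wrote a header and then never produced any measurements.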
When I run the following deck:
% cat decks/india.deck
- options:
    collector: httpo://nkvphnp3p6agi5qq.onion
    help: 0
    logfile: http_host_india_bangalore_justa_hotel.log
    pcapfile: null
    reportfile: http_host_india_cis.yamloo
    subargs: [-b, 'http://93.95.227.200', -f, 'inputs/india-uniq-urls-with-alexa-top-1000.txt']
    test_file: nettests/manipulation/http_host.py
I have the proper output for http_host.py:
% head report-http_host-2013-05-31T193306Z.yamloo
###########################################
# OONI Probe Report for http_host (0.2.3)
# Sat Jun 1 01:03:06 2013
###########################################
---
options: [-b, 'http://93.95.227.200', -f, inputs/india-uniq-urls-with-alexa-top-1000.txt]
probe_asn: AS24560
probe_cc: IN
probe_ip: 122.167.211.176
software_name: ooniprobe

% tail report-http_host-2013-05-31T193306Z.yamloo
    url: http://93.95.227.200
  response:
    body: '{"headers_dict": {"Connection": ["close"], "Host": ["zustmovies.com"]}, "request_line": "\nGET / HTTP/1.1", "request_headers": [["Connection", "close"], ["Host", "zustmovies.com"]]}'
    code: 200
    headers: []
socksproxy: null
transparent_http_proxy: false
...
Note that the yamloo file is created not as http_host_india_bangalore_justa_hotel.log but as report-http_host-2013-05-31T193306Z.yamloo...
This is a bug. I opened an issue at: https://github.com/TheTorProject/ooni-probe/issues/123
It seems that perhaps test decks are too experimental for actual use with these issues - or did I do something horribly wrong?
They do need better testing. Another painful failure I discovered is that if a test fails explosively the remainder of the deck will not be run. I worked around this issue with a janky shell script and just commented out tests that had already run.
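The same workaround could probably be done without hand-editing the deck: run each deck entry as its own ooniprobe process, so that one crashing test cannot take the rest down. Below is a minimal Python sketch of that idea, not the shell script mentioned above. Assumptions: the deck has already been loaded into a list of dicts (for example with a YAML parser), the deck keys collector, logfile, reportfile and pcapfile are also accepted as long command-line options of the same name, and `build_command` and `run_deck` are hypothetical helper names. Check `ooniprobe --help` before relying on any of it.

```python
import subprocess


def build_command(entry, ooniprobe="ooniprobe"):
    """Turn one deck entry into an argument list for a single test run."""
    options = entry["options"]
    command = [ooniprobe]
    # ASSUMPTION: deck keys double as long command-line option names.
    for key in ("collector", "logfile", "reportfile", "pcapfile"):
        value = options.get(key)
        if value is not None:
            command += ["--" + key, str(value)]
    # The test file comes next, followed by the test's own arguments.
    command.append(options["test_file"])
    command += [str(arg) for arg in options.get("subargs", [])]
    return command


def run_deck(entries, runner=subprocess.call):
    """Run every entry in its own process; return the ones that failed."""
    failed = []
    for entry in entries:
        command = build_command(entry)
        try:
            status = runner(command)
        except OSError as error:
            # For example, the ooniprobe executable could not be found.
            status = error
        if status != 0:
            failed.append((entry["options"]["test_file"], status))
    return failed
```

Because every test gets a fresh process, a failure in one entry is recorded in the returned list and the loop simply moves on to the next entry. The `runner` argument exists so the logic can be tried with a stand-in function before pointing it at a real ooniprobe install.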
Thoughts?
We had some issues with the collector being hammered to the point it ran out of file descriptors. In general, if you know you will be doing tests from remote areas with poor connectivity without much up-front notice, it would be helpful to do one of the following:
1. Set up and run a new collector on a spare machine or Amazon instance for your tests.
2. Ask someone in advance to set up a backup collector.
3. Familiarize yourself with oonib operation and troubleshooting.
Sadly, things are still a little fragile, but if you know your tests, input lists, and collectors all run cleanly before heading into the field, you alleviate a lot of stress.
p.s. iirc you do have access to the tpo collector; is that still the case?
--Aaron
All the best, Jacob