Hello,
Over the next year, the OONI team will be building native OONI Probe
apps for Windows and macOS. Hopefully this will make it easier to
install and run OONI Probe, and to engage others with censorship
measurement research.
To implement these apps, OONI started by researching which technology
stack to use and what the high-level architecture should be. The team
experimented with a variety of libraries and approaches and documented
their advantages and disadvantages. As a result of this research,
OONI's Arturo published a blog post outlining the architecture and
design considerations behind the implementation of the OONI Probe
desktop apps.
This document, outlining the rationale behind our choices for the apps,
is available here:
https://ooni.torproject.org/post/writing-a-modern-cross-platform-desktop-ap…
We're always open to your feedback and suggestions.
~ The OONI team.
--
Maria Xynou
Research and Partnerships Coordinator
Open Observatory of Network Interference (OONI)
https://ooni.torproject.org/
PGP Key Fingerprint: 2DC8 AFB6 CA11 B552 1081 FBDE 2131 B3BE 70CA 417E
Hi Oonitarians!
We are glad to announce the release of the stable version of OONI's new measurement API, which is available here: https://api.ooni.io/.
The OONI API is based on the stable release of our data processing pipeline, which has been re-engineered over the last year to detect censorship events around the world faster and more accurately.
We added the following to the OONI API:
* API endpoints for listing and filtering anomalous measurements
* API endpoints for downloading full reports
We changed the following:
* Reversed sorting in the `by_date` view and hid measurements from time travellers
* Better API documentation, thanks to redoc, based on OpenAPI 2.0
* Better request validation, thanks to connexion, based on OpenAPI
* Oonified the UI
* Better testing
Some of the fun things you can do with this new API include easily searching through OONI measurements to find those containing anomalies.
As an example, with the following one-liner (assuming you have curl and jq), you can get a sample of censored websites in India:
$ curl 'https://api.ooni.io/api/v1/measurements?probe_cc=IN&confirmed=true' | jq -r '.results[].input' | sort | uniq | head -n 5
http://0bin.net
http://archive.org
http://hastebin.com
http://mp3.com/
http://pastebin.com
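If you'd rather do this from Python than shell, the query-building and filtering steps can be sketched with just the standard library. The response body below is a canned example of the same shape, standing in for a real fetch of the URL:

```python
import json
from urllib.parse import urlencode

# Build the same query as the curl example: confirmed anomalies from India.
BASE = "https://api.ooni.io/api/v1/measurements"
params = {"probe_cc": "IN", "confirmed": "true"}
url = BASE + "?" + urlencode(params)

# In a real run you would fetch `url` (e.g. with urllib.request.urlopen)
# and parse the JSON body; here we parse a canned response of the same
# shape to show the extraction step the jq filter performs.
body = json.loads(
    '{"results": [{"input": "http://pastebin.com"},'
    ' {"input": "http://archive.org"},'
    ' {"input": "http://pastebin.com"}]}'
)

# Equivalent of `jq -r '.results[].input' | sort | uniq`:
inputs = sorted({r["input"] for r in body["results"]})
for i in inputs:
    print(i)
```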
Have fun!
~ Arturo
Hi list,
I'm working on an experimental nettest to collect addresses of other
peer probes participating in a given P2P-like experiment. We have a
[helper][] which listens for probe connections on a TCP port and a
[nettest][] which connects to it. The nettest starts an HTTP server
process on a given port, and if the process doesn't exit within a few
seconds, it assumes that running the process was successful. In either
case, it then connects to the helper and reports the HTTP server port
(or none); the helper saves it into a file and replies with another
entry from the file, which the nettest saves into a local file.
[helper]: https://github.com/equalitie/ooni-backend/blob/eq-testbed/oonib/testhelpers…
[nettest]: https://github.com/equalitie/ooni-probe/blob/eq-testbed/ooni/nettests/exper…
The nettest derives from ``TCPTest``, and its main test method returns a
subprocess deferred. A timeout function is set up to cancel the
deferred (assuming that the subprocess kept on running), in which case
the subprocess errback catches the cancellation error and proceeds to
normal callback. The normal callback (which also handles subprocess
exit codes) either retries running the subprocess from the beginning, or
if the subprocess was successful, proceeds to start the TCP exchange
with the helper by sending it some payload and setting handlers for
error and reply.
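For readers unfamiliar with the pattern, here is a simplified sketch of the control flow described above, using asyncio rather than Twisted (so it is not the actual nettest code); the names and timings are illustrative:

```python
import asyncio

# Analogous pattern in asyncio (NOT the actual Twisted nettest): treat a
# subprocess that is still running after a grace period as "successfully
# started", and a quick exit as a failure to retry.
async def start_server(cmd_runtime: float, grace: float = 0.05) -> bool:
    # Stand-in for the HTTP-server subprocess; a real nettest would use
    # asyncio.create_subprocess_exec() here.
    proc = asyncio.create_task(asyncio.sleep(cmd_runtime))
    try:
        await asyncio.wait_for(asyncio.shield(proc), grace)
    except asyncio.TimeoutError:
        # The grace period elapsed and the "process" is still running:
        # this is the success case, analogous to catching the
        # cancellation error and proceeding to the normal callback.
        return True
    # The "process" exited before the grace period: treat as failure,
    # which would trigger a retry in the real nettest.
    return False

long_running = asyncio.run(start_server(cmd_runtime=1.0))
quick_exit = asyncio.run(start_server(cmd_runtime=0.0))
print(long_running, quick_exit)
```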
The nettest has an odd behaviour: although the TCP exchange completes in
less than a second (including closing both ends of the connection), the
nettest's TCP response handler only runs after the test times out
(``TCPTest.timeout``). On some occasions (esp. when running the
subprocess is retried), regardless of TCP exchange success, the
``Measurement`` task itself times out, the TCP response handler doesn't
run at all, and the test is cancelled and run again.
I'd like to ask for some advice on how to handle this situation, ideally
so that the test can finish as soon as the TCP exchange completes and
the timeouts don't trigger. If the explanations are unclear I can try
to provide simplified versions of the code. Please note that my
experience with Twisted/OONI development is very limited, so I'm trying
my best. ;)
Thank you very much for your help!
--
Ivan Vilata i Balaguer
This is an email I sent to someone at the Internet Archive who wanted to
know about blocking of archive.org. The URLs "http://archive.org" and
"https://archive.org/web/" are in test-lists, so they are being tested
by OONI. See the README for notes on how I do analysis using ooni-sync,
jq, and R.
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709/RE…
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709/bl…
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709.zip
Here is a description of some basic analysis using OONI to check for
blocking of archive.org. It's based on 2,080 reports covering 59
countries, dated between 2017-07-01 and 2017-07-06. I'm attaching the
source code and a graph that it produces. There are anomalous
measurements found in China, Russia, Venezuela, Mexico, Brazil, and
France. Of these, the ones in China and Russia are clearly the result of
censorship, while the others are ambiguous, and might be random
measurement error or very localized blocking. For a clearer view, you
would want to use reports from a longer time period.
Here is a summary of the countries with anomalous measurements, showing
how many anomalous measurements there were out of how many total.
   country  anomalous  total  percent_anomalous
1:      CN          1      1             100.0%
2:      RU         19     54              35.2%
3:      VE          1      4              25.0%
4:      MX          1     10              10.0%
5:      BR          3     42               7.1%
6:      FR          1    100               1.0%
The process of making the graph is basically (1) download OONI reports,
(2) filter them for archive.org measurements, and (3) process the data
using another script. The longest part of the process is downloading the
report files, because they include tests of many domains other than
archive.org (typically about a thousand). Currently it's necessary to
download the full report files and filter them locally. However, OONI
plans to soon deploy a system that will make it possible to download
measurements for just one domain at a time.
== China ==
The one test from China shows blocking by DNS injection (this type of
blocking is characteristic and well documented for the Great Firewall).
In this case, the false DNS response for archive.org that they injected
was the IP address 31.13.69.228, which actually belongs to Facebook.
https://explorer.ooni.torproject.org/measurement/20170701T065636Z_AS4808_oh…
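A check for this kind of tampering can be sketched by testing whether the answer falls inside the netblocks the site is known to use; the expected netblock below is illustrative, not an authoritative list:

```python
import ipaddress

# Netblock archive.org was observed to use (illustrative; a real
# analysis should use a current, authoritative list of its addresses).
EXPECTED = [ipaddress.ip_network("207.241.224.0/20")]

def looks_injected(answer: str) -> bool:
    """True if the resolved address falls outside the expected netblocks."""
    ip = ipaddress.ip_address(answer)
    return not any(ip in net for net in EXPECTED)

# The injected answer from the China measurement (a Facebook address)
# versus the genuine archive.org address seen in the Russia measurement.
print(looks_injected("31.13.69.228"))
print(looks_injected("207.241.224.2"))
```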
== Russia ==
About 35% of tests in Russia were blocked, which is not surprising given
that a block of archive.org was ordered in 2015.
https://arstechnica.com/tech-policy/2015/06/wayback-machines-485-billion-we…
It's not unusual for a site to be available in some places, even when
ordered blocked, when enforcement of the block is left to individual
ISPs, as seems to be the case here.
The blocked tests came from AS41661 and AS21378. The unblocked tests
came from AS3239, AS8369, AS8427, AS12389, AS16345, AS21127, AS41661,
and AS42668.
The blocks from AS41661 were by DNS injection, affecting both HTTP and
HTTPS. The false IP address returned was 92.255.241.100, whose reverse
DNS is law.filter.ertelecom.ru. The web server at
http://law.filter.ertelecom.ru/ serves a block page in Russian.
https://explorer.ooni.torproject.org/measurement/20170701T190029Z_AS41661_E…
The block from AS21378 was by TCP blocking: the DNS request gave the
correct response 207.241.224.2 and the client was able to establish a
TCP connection to the server, but the firewall did not permit the HTTP
response to arrive.
https://explorer.ooni.torproject.org/measurement/20170701T135420Z_AS21378_c…
== Venezuela ==
One test from AS8048 did not get a response to its DNS request. However,
it may just be a random failure (not blocking), because there were two
other successful tests from AS8048, and one successful test from AS6306.
https://explorer.ooni.torproject.org/measurement/20170705T141354Z_AS8048_Ku…
== Mexico ==
As in the Venezuela case, there was one test from AS8151 that didn't get
a DNS response; however there were 9 other successful tests, including
others from AS8151.
https://explorer.ooni.torproject.org/measurement/20170703T060009Z_AS8151_GO…
== Brazil ==
Of the five Brazilian ASes present in the sample of reports, only one
shows anomalies: AS1916, Rede Nacional de Ensino e Pesquisa (National
Education and Research Network). In this network, requests for
http://archive.org (which redirects to https://archive.org) succeed,
while those directly requesting https://archive.org/web/ consistently
time out. I don't have a good explanation for this. Certain kinds of
stateful firewall could plausibly cause such behavior.
== France ==
A single measurement (out of 100) in France timed out requesting
http://archive.org. It was in AS197422 and there were no other reports
in the sample from that AS, so it's hard to say whether it's due to a
block or a random failure.
https://explorer.ooni.torproject.org/measurement/20170705T232621Z_AS197422_…
Currently you can do queries with order_by=test_start_time,
order_by=probe_cc, etc., but you cannot do order_by=index.
https://measurements.ooni.torproject.org/api/v1/files?limit=1&order_by=index
{
"error_code": 400,
"error_message": "Invalid order_by"
}
As I understand it, the difference between index and test_start_time is
that index is always increasing over time (newly uploaded reports always
get a higher index than existing reports), while newly uploaded reports
can have a test_start_time that is in the past (if the probe was not
able to upload for a time, for example).
The ability to order_by=index would allow a slight robustness
enhancement in ooni-sync, in the case when a new report is uploaded
while ooni-sync is running. Currently ooni-sync always does
order=asc&order_by=test_start_time&limit=1000
That is, starting with the oldest reports, get a page of 1000 reports at
a time. The issue is what happens when a report from the past is
uploaded while ooni-sync is downloading. In this case ooni-sync will not
notice the new report right away. Here is an example with made-up
indexes and dates:
ooni-sync starts downloading page 0 from index=5000 (2016-01-01) to index=5999 (2016-03-31)
new report with index=9999 (2016-02-01) appears, gets inserted into page 0
ooni-sync finishes downloading page 0
ooni-sync starts downloading page 1 from index=5999 (2016-03-31) to index=6998 (2016-04-05)
ooni-sync finishes downloading page 1
In this example, ooni-sync never downloads the report with index=9999.
Also, it sees index=5999 twice, because index=9999 pushed index=5999
from page 0 to page 1.
An order_by=index option would prevent newly uploaded reports from
unaligning the pages like that (at least when order=asc is used).
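The race can be reproduced with a small simulation, in the style of the made-up indexes and dates above:

```python
# Simulate the paging race with made-up (index, date) reports.
# Paging by date: a new report whose date falls inside an already-fetched
# page is never seen; paging by index avoids this, because new reports
# always sort after everything already fetched.

def fetch_pages(reports, key, page_size, inject_after_page=0, new=None):
    """Yield all pages sorted by `key`, injecting `new` after one page."""
    seen = []
    page_no = 0
    while True:
        ordered = sorted(reports, key=key)
        page = ordered[page_no * page_size:(page_no + 1) * page_size]
        if not page:
            return seen
        seen.extend(page)
        if new is not None and page_no == inject_after_page:
            reports = reports + [new]  # report uploaded mid-download
            new = None
        page_no += 1

reports = [(5000 + i, f"2016-{i + 1:02d}-01") for i in range(4)]
new = (9999, "2016-01-15")  # new report dated in the past

by_date = fetch_pages(reports, key=lambda r: r[1], page_size=2, new=new)
by_index = fetch_pages(reports, key=lambda r: r[0], page_size=2, new=new)

print((9999, "2016-01-15") in by_date)   # the by-date pass misses it
print((9999, "2016-01-15") in by_index)  # the by-index pass sees it
```

The by-date pass also fetches the report at the old page boundary twice, just as index=5999 is seen twice in the example above.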
The reasons why this is minor minor minor and hardly worth mentioning:
* index=9999 will get downloaded the next time you run ooni-sync
* it can't cause ooni-sync to skip any already uploaded reports (it
would, with order=desc, but that's why ooni-sync uses order=asc)
* ooni-sync will see but won't actually download index=5999 twice
* newly uploaded reports are likely to be on the last page anyway
I wrote a program that uses the OONI API to download reports and keep a
local directory of reports up to date. It's much faster than the Wget
loop I used to use and it finishes quickly when there is nothing new to
download.
git clone https://www.bamsoftware.com/git/ooni-sync.git
For example, lately I've had to download a lot of tcp_connect reports. I
run it like this:
ooni-sync -xz -directory reports.tcp_connect/ test_name=tcp_connect
This command downloads the index of tcp_connect reports and only
downloads the ones that are not already downloaded. It compresses the
downloaded files with xz. The next time I need to update, I run the same
command again, and it only downloads reports that are new since the last
time.
You can use other query parameters supported by the API, like probe_cc,
probe_asn, since, and until. For example:
ooni-sync -xz -directory reports.is/ probe_cc=IS since=2017-01-01
ooni-sync -xz -directory reports.as25/ probe_asn=AS25
ooni-sync -xz -directory reports.tor-turkey/ test_name=vanilla_tor probe_cc=TR
ooni-sync -xz -directory reports.web_connectivity/ test_name=web_connectivity since=2017-01-01 until=2017-01-02
I prefer to keep all the reports compressed on disk, so I always use the
-xz option, but by default reports are saved unmodified.