# What we did in April 2015
A lot of the work done in April was related to a funny problem we encountered. Basically the machine previously used for the OONI data pipeline had run out of disk space (1 TB) and the daily batch processing task was taking more than 6 hours to run. During this time the database and other important services on the data pipeline were unresponsive, so we concluded that we had to start looking into ways of scaling our infrastructure “horizontally”.
Although the amount of data that we are currently ingesting (on average 1-2 GB per day) does not necessarily require a big-data-style solution, we expect this value to increase by at least 1 or 2 orders of magnitude (20-200 GB per day). Given that we had only recently had to move to a more powerful machine, we concluded it was best to tackle this problem with the future in mind.
On this matter we:
* Experimented with various big data solutions and implemented some patches for the existing tools:
https://github.com/mumrah/kafka-python/pull/376
https://github.com/spotify/luigi/pull/910
https://github.com/Parsely/streamparse/pull/142
* We got in contact with various vendors of big data cloud and bare metal solutions in order to evaluate their offerings and see if it would be possible to receive sponsorship from them.
* We started working on a Hadoop-based pipeline implementation:
https://github.com/hellais/ooni-pipeline-ng
Moreover:
* We worked on organising the OONI hackathon and concluded that it would be ideal to postpone it to the end of summer (probably around September).
* We finished implementing an alpha prototype of libight for iOS that allows the user to run 3 basic OONI tests:
https://github.com/TheTorProject/libight-ios
* Updated the bouncer to point to another mlab-ns server
* Released ooniprobe 1.3.1 and included it in Debian stretch
* Published the new OONI website: https://ooni.torproject.org/
* We implemented OONI tests for some censorship circumvention tools and analysed how they work:
meek: https://github.com/TheTorProject/ooni-probe/pull/387 and https://github.com/TheTorProject/ooni-spec/pull/38
lantern: https://github.com/TheTorProject/ooni-probe/pull/388 and https://github.com/TheTorProject/ooni-spec/pull/40
~ Arturo
# What we did in March 2015
* Worked on making Citizen Lab's test lists more community-oriented.
In particular we added support for adding inputs via a command line tool: https://github.com/citizenlab/test-lists/pull/4
We also added support for adding lists relevant to services: https://github.com/citizenlab/test-lists/pull/9
* Made progress on the libight iOS sample app
* Attended a workshop at Oxford University on the ethics of network measurement.
* We analysed a variety of different censorship circumvention tools with the goal of implementing them as OONI tests.
* Added the test helper IP and port to the reports
* Added unique identifiers for reports
* Made progress on releasing the latest ooniprobe version to Debian with the cronjob
~ Arturo
Hello Oonitarians,
This is a reminder that today there will be the weekly OONI meeting.
It will happen as usual on the #ooni channel on irc.oftc.net at 17:00
UTC (19:00 CEST, 13:00 EDT, 10:00 PDT).
Everybody is welcome to join us and bring their questions and feedback.
See you later,
~ Arturo
--
Arturo Filastò
@hellais
jabber: hellais(a)jabber.ccc.de
Hi all,
I just discovered this project and am studying it.
I'm the creator of the non-profit service http://www.neumon.org .
It's a project similar to OONI, but focused only on DNS and HTTP.
Basically, I detect DNS servers around the world (using the same technique we rebuilt
here: http://ipleak.net/ ).
NeuMon browses the collected DNS servers and checks whether they can be queried (open
and recursive). This is because most ISP DNS servers are recursive only for
their customers' subnets.
We maintain a huge list of domains to check (a mix of known blocked
websites, top Alexa sites, etc.).
Every DNS server is queried for each domain; we collect the results, compare
them against a known good value and discover custom injection (generally
pointing to a blocking page; I published some
examples here: http://tinyurl.com/pl8znb4 ).
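The comparison step described above can be illustrated with a minimal Python sketch. This is not NeuMon's actual code: the function names, the data layout, and the `min_domains` threshold are assumptions made for illustration, and the resolvers, domains and IP addresses are made-up values from the documentation ranges. It works on answers that have already been collected, so it performs no DNS queries itself.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical data layout: (resolver IP, domain) -> set of answer IPs.
Observations = Dict[Tuple[str, str], Set[str]]


def find_injected_answers(
    observed: Observations, known_good: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Map each unexpected answer IP to the domains that resolved to it."""
    redirects: Dict[str, Set[str]] = defaultdict(set)
    for (_resolver, domain), answers in observed.items():
        good = known_good.get(domain)
        if not good:
            continue  # no baseline for this domain, so nothing to compare
        for ip in answers - good:
            redirects[ip].add(domain)
    return dict(redirects)


def likely_block_pages(
    redirects: Dict[str, Set[str]], min_domains: int = 2
) -> Set[str]:
    """Keep answer IPs that several different domains were redirected to."""
    return {ip for ip, domains in redirects.items() if len(domains) >= min_domains}


# Made-up example data (documentation address ranges, example domains).
observed = {
    ("198.51.100.53", "blocked-a.example"): {"203.0.113.80"},
    ("198.51.100.53", "blocked-b.example"): {"203.0.113.80"},
    ("198.51.100.53", "fine.example"): {"192.0.2.10"},
}
known_good = {
    "blocked-a.example": {"192.0.2.1"},
    "blocked-b.example": {"192.0.2.2"},
    "fine.example": {"192.0.2.10"},
}

redirects = find_injected_answers(observed, known_good)
print(likely_block_pages(redirects))  # {'203.0.113.80'}
```

A mismatch alone is weak evidence, since CDNs and geo-balanced services legitimately return different addresses to different resolvers. Requiring several unrelated domains to share one unexpected address is one possible way to reduce such false positives.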
So, I have:
- a huge list of DNS servers, with country geolocation.
- lists of blocked domains, per country. Not exhaustive.
- many IP addresses that are the destination of DNS redirection,
typically IPs of servers that show HTML blocking pages, and the DNS servers
of ISPs that redirect to these addresses.
An example of the data I have:
# dig www.sex.com @203.146.237.237 +short
answer: http://203.146.43.133/
I know hundreds of domains that redirect to the same answer.
I never made my results public, because I never met anyone interested
in that data, only in the lists of censored domains. But I never
published those because they contain child pornography domains.
We also built probe software, to allow other activists connected
directly to the ISP to run it and detect censorship not based on DNS.
But nobody wants to run software that also fetches child pornography
domains, so nobody wants to run our probe.
In general, the NeuMon project has never attracted anyone's interest,
and it is currently abandoned and no longer maintained.
I'm here to understand whether I (or my system and
collected data) can help the OONI project.
Ciao!
Fabrizio - Clodo