Re: [ooni-dev] Let's come up with the roadmap for the future of OONI

19 Feb 2015

Thank you all for the feedback and very useful comments!
I will quote replies from another thread and comment on some of the
topics there.
At the end, based on the feedback received, I will list the main areas
of focus for the next 6 months, some of the relevant tickets, and the
new tickets that should be created.
If you have ideas on some specific topic/issue/feature that you believe
should be tackled, please do append to this list.
On 2/17/15 5:28 PM, Nick Feamster wrote:
...
There are many interesting ideas in here.  Thanks!
Off the top of my head, one could get a better global picture of censorship using data from the “global censorship measurement” tools (Sam’s Encore tool, Roya and Ben’s “Censored Planet” project) to trigger measurements from OONI, and vice versa.
Tools such as Encore and CP can get a signal of filtering from a larger and more diverse set of clients than an OONI deployment could (and, at lower risk); the drawback is a lack of detailed information (typically the information is a binary “yes/no” about whether filtering is taking place in TCP/IP, DNS, or HTTP, but not much else).
I could imagine an OONI deployment using information about observed filtering from Encore or CP to trigger a more extensive and detailed set of tests from the OONI nodes.  Likewise, blocking observed at one OONI node at one layer could be fed back into Encore or CP to see whether any observable filtering behavior is observed across a broader range of sites.
I view this as a specific “research use case” for the integration, orchestration, and data analysis points that Arturo lists below.
I think Roya and Ben are probably not quite ready for this with CP (I’d want to ask them), but Encore is very stable and we could look into good ways of passing information back and forth, either to OONI or to a common visualization engine (or both).
Some of the other efforts look interesting, but the research payoff isn’t as readily apparent to me.  (I’m suspicious of monthly reports without broader, more global baseline coverage like Encore or CP could provide, but we are also quite interested in visualization tools and reports for that data, so integrating all of that into a single “dashboard” would be something that could be very useful, based on what political scientists and others seem to be asking for.)
This is indeed a very interesting concept. With respect to using
ooniprobe data to trigger other tests, this will be easier once we
finish implementing the API to the pipeline.
We now have all the data collected with ooniprobe inside of a database
and will be working (mainly to provide analytics and visualization
tools) on exposing access and query functionality.
Work on this has only recently started, but you can find code on it
here: https://github.com/hellais/ooni-app.
With respect to triggering ooniprobe measurements based on data
collected by Encore, Censored Planet, etc., I think we would need to
support https://trac.torproject.org/projects/tor/ticket/12551, unless we
consider it acceptable to have to wait some time before all deployed
probes run the measurement.
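To make the triggering idea a bit more concrete, here is a minimal sketch. It assumes a hypothetical feed of coarse signals (country, URL, layer) from a tool like Encore and a hypothetical mapping from layer to more detailed tests. None of these names come from ooniprobe or Encore, and the test names are placeholders.

```python
from collections import namedtuple

# Hypothetical coarse signal, e.g. from Encore: "something looks filtered".
Signal = namedtuple("Signal", ["country", "url", "layer"])

# Hypothetical mapping from the layer of a coarse signal to the more
# detailed tests a probe could run. The test names are placeholders.
DETAILED_TESTS = {
    "dns": ["dns_consistency_check"],
    "http": ["http_request_check", "header_manipulation_check"],
    "tcp": ["tcp_connect_check"],
}


def plan_measurements(signals):
    """Turn coarse signals into (country, url, test) jobs, without duplicates."""
    jobs = []
    seen = set()
    for signal in signals:
        # Signals for layers we have no detailed test for produce no jobs.
        for test in DETAILED_TESTS.get(signal.layer, []):
            job = (signal.country, signal.url, test)
            if job not in seen:
                seen.add(job)
                jobs.append(job)
    return jobs


signals = [
    Signal("XX", "http://example.com/", "dns"),
    Signal("XX", "http://example.com/", "dns"),  # duplicate report
    Signal("YY", "http://example.org/", "http"),
]
for job in plan_measurements(signals):
    print(job)
```

How the resulting jobs would actually reach the probes is exactly what the ticket above is about, so that part is left out here.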
On 2/17/15 5:36 PM, Phillipa Gill wrote:
...
I would say the daily measurements from stable nodes are less interesting for us since ICLab is already 99% of the way there on those (i.e., running baseline tests from VPN + deployed endpoints).
I think the idea of detecting different censorship circumvention tools could be interesting and if the tests are well specified this could be something that is ported/run in ICLab as well.
Yes we plan on openly specifying all the tests that we will be deploying
in a study in a specific country.
I am not sure how many details I can disclose at the moment on this, but
will be sure to update the list when more is known.
On 2/17/15 8:53 PM, Meredith Whittaker wrote:
...
The biggest issues that I see for OONI, and all efforts in this nascent
space are around:

*Getting consistent measurements at scale, over time, across broad geographies.*
      - Consistent = from a stable set of tests that, *if they do change,
      are clearly documented as changed*, and this versioning is reflected
      in collected data and elsewhere. The less change, the better (which isn't
      to say that new tests are bad, at all, but that there needs to be a set
      "canon" of core tests that work to set a baseline. Without this, much of
      the research etc. is much, much less valuable.)
      - at scale = a lot -- a significant number of representative links,
      mapping diurnal patterns, etc. (omg obvs, but still)
      - over time = for as long as possible -- letting us compare the
      results of Stable Test X for Place Y in 2015 vs 2017, etc.
      - Across broad geographies = how *are* UK censorship techniques
      different from those used in Liberia? Etc.

*Dealing with issues around user consent and risk.*
This is huge. We all know that. For now, the more OONI probes and
measurement points can be decoupled from individuals (i.e. can be deployed
without requiring a user to download, install, flash, access), the better.
Impersonal Pis, or similar, seem like the best option for this currently.
But, even these pose risk if they can be linked to an individual
(scapegoat) in a hostile country. The bigger OONI's impact, the bigger this
problem.
I would advocate that any work undertaken focus on these two goals -- which
I believe to be fundamental to many of the great other goals that have been
put forward here.
With that declaration, my specific comments on the individual ideas in
Arturo's thoughtful email:

*Getting daily OONI measurements from 50 countries*

This is clearly a laudable goal. I am concerned with the means suggested
   for achieving this. How can it be done without placing users in danger? How
   can an answer to that question be obtained for 50 countries (and orders of
   magnitude more regions, factions, climates within each country)? This is
   also ambitious practically/logistically -- this is an increase in vantage
   points that will likely require localization, maintenance, support,
   troubleshooting, and consistent and well-documented updates, such that
   apples can be compared to apples and data can stand as "proof." A roadmap
   that narrows the focus, from 50 to "a pilot of 5," and that addresses the
   above issues, would be welcome.

*Tests of circumvention tools*

This seems cool, but could as easily be titled "The How Well is My
   Censorship Working Index." Answering the question "whom does this serve, and
   how?" would seem to be the next step to assessing the value of this
   proposition.

*Orchestrating OONI probes*

This seems HUGELY problematic WRT privacy, user consent, and security. I
   am also concerned with how this butts up against the need for consistent,
   and durational testing (which, IMO, is more important as a research tool
   and a means to understand censorship than novel tests run a couple times
   during a given month).

*Data analytics and visualizations*

Is there enough consistent available data (and a roadmap that would
   guarantee a consistent pipeline of consistent data over time) to make this
   useful and worthwhile currently? If/when yes, I would suggest bringing in
   people who have professional experience with data visualizations and
   analysis. With M-Lab this goal has been a continual challenge -- there
   aren't accepted statistical ways of working with network data (more on that
   if you want in another thread), and visualizations need to be gorgeous,
   need to be written in a browser-friendly language, need to be maintained
   and updated, need to not have [too many] gaps, etc.. You are, in this
   effort, producing a user-facing product. All of the forever-work that
   attends a product attends this effort.

*Pub system*

This seems potentially useful (I don't know enough to be concrete here),
   but again, my question is, Whose needs does this serve, specifically, and
   how does serving those needs further a longer-term OONI strategy? More
   generally, shared storage and transport mechanisms for measured data are
   something this space could use, for sure. Would this potentially help build
   those systems?

*Production-quality OONI Pis.*

I like this, and I like the idea of partnering with CI lab (hey!) or
   others. Deployment and maintenance is expensive. The more work these Pis
   can do once they're deployed, before they die, the better. I would be more
   enthusiastic about this if it involved a deployment partner, as I don't see
   a huge value in spending dev time to ensure that the handful of people
   who'd flash a Pi can.

*OONI on mobile*

I vote to have a more stable OONI before launching a mobile test. We at
   M-Lab have explored mobile at length, and it's tricky (I'm not sure tests
   written for a non-mobile environment would be as relevant to a mobile
   environment), deployment is hard (marketing is key! and, who wants to use
   their data cap on something that isn't whatsapp? (etc.).)

*Research based on OONI*

If there's enough consistent data, it could be interesting. But, not
   sure that there is (?)

*Monthly reports*

As above, I think this is premature. Getting good data, and getting
   enough of it, should come first.

*Adopt an OONI probe*

I worry about consent and permissions here. Once those are figured out,
   I would suggest getting a big donor to adopt (deploy) a bunch of OONIs,
   instead of a smaller campaign.

*Integration with other censorship measurement projects*

Really like the sound of this! The more resources can be shared, the
   better. I'll let y'all discuss...

*Reaching out to communities inside censored regions (like the UK?)*

I'm all for this, but I think this should be led by groups like Citizen
   Lab, maybe Amnesty, and others that have experience in qualitative user
   studies and access to networks on the ground.

*Redesign the OONI website*

Definitely necessary. Not clear on its priority. I would suggest that
   any redesign minimize the focus on "censorship" (and the use of the term).
   For all the reasons we've discussed. And that a technical writer and
   someone with some communications training be employed in drafting and
   tinkering.

*Internet censorship conference*

What would the goals be? How would things be better/different when these
   goals were achieved? Without a concrete motive, I worry that this could be
   another "fly the same people somewhere new and pretend we're innovating"
   model.

*Implement a GUI for OONI*

I would prioritize this after the backend is stable, and the consent and
   permissions issues have been worked through. (I also think this is
   something that should engage designers and UX experts outside of the OONI
   core team, because all the reasons
I agree fully with all that you say. In particular I think, although I
had not put it amongst the options to vote on, that we should dedicate
quite a bit of time working on sorting out the informed consent issue.
On 2/19/15 12:08 AM, Jed Crandall wrote:
...
Sorry for chiming in late, have been a bit under the weather this week.
I'll just second Philipp's comment:
https://lists.torproject.org/pipermail/ooni-dev/2015-February/000253.html
I.e., a "me too" for giving "Implement data analytics and visualization
for OONI tests" a 5.  I don't mind writing a bit of code to start
looking at some data, but there's so much data out there and I'm more
likely to start looking at data (or suggest that a research or
networking class student do so) if I already have some idea of what's
there.
I'll also add that a "wish list" of problems you'd like solved could be
helpful, like:
https://research.torproject.org/ideas.html
https://www.torproject.org/getinvolved/volunteer.html.en#Research
For example, I've found IP geolocation services like MaxMind to be
pretty bad in certain places.  It says something is in a country and
it's not, which makes debugging why that data point doesn't behave like
others that are supposed to be in that country a pain.  Is that a
problem that would help OONI if solved?  If not, is something else?
FWIW, this is a step in the direction of doing IP geolocation better in
parts of the world other than the U.S. and Europe:
http://www.cs.unm.edu/~crandall/infocom2015rtt.pdf
Lastly, in terms of baselines, even very specific baselines like "Tor
bridge reachability in Country X" can be very illuminating if the amount
of space and time the data is taken over is broad enough.  Measuring
everything everywhere all the time would be nice, of course, but like
Salvador Dali said, "Have no fear of perfection - you'll never reach
it."
The suggestion for the wish list of problems we would like solved
research wise is very good.
There are a few of those and I think it would be a good idea to write
them down somewhere.
The MaxMind issue is indeed a problem of ours. Luckily we will also
collect the ASN by default, and in most cases you will be able to identify
the inconsistency manually by looking up the details of the ASN (in a
whois database or similar).
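As a rough sketch of that manual check, the snippet below flags measurements whose GeoIP country disagrees with the country associated with their ASN. The field names (`asn`, `geoip_country`, `report_id`) and the lookup table are assumptions made for illustration, not the ooniprobe report format. The ASNs are taken from the range reserved for documentation and the country codes are placeholders.

```python
# Hypothetical ASN -> country table, as one might build by hand from
# whois lookups. The ASNs are from the documentation range (RFC 5398)
# and the country codes are placeholders.
ASN_COUNTRY = {"AS64496": "XX", "AS64497": "YY"}


def flag_geoip_mismatches(measurements, asn_country=ASN_COUNTRY):
    """Return measurements whose GeoIP country disagrees with the ASN's country."""
    suspicious = []
    for m in measurements:
        expected = asn_country.get(m["asn"])
        # Unknown ASNs are skipped rather than flagged.
        if expected is not None and expected != m["geoip_country"]:
            suspicious.append(m)
    return suspicious


measurements = [
    {"report_id": "r1", "asn": "AS64496", "geoip_country": "XX"},
    {"report_id": "r2", "asn": "AS64497", "geoip_country": "XX"},  # mismatch
    {"report_id": "r3", "asn": "AS64500", "geoip_country": "ZZ"},  # unknown ASN
]
print([m["report_id"] for m in flag_geoip_mismatches(measurements)])
```

A flagged entry is only a candidate for manual review, since the country an ASN is registered in is not always the country where the network actually operates.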
Regarding reachability of Tor bridges, we have at this point about 3-4
months of data (daily measurements) for obfs2/obfs3/scramblesuit/fte
bridges in Iran, China, Russia, Ukraine.
The latest data is not yet public, because it needs to be scrubbed of
the metadata and I have not yet finished re-setting up the pipeline
(after our other machine ran out of disk space).
We also have a visualization for it that still needs to be completed, and
if somebody is interested in hacking on this I could give them access to
the data and code.
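For anybody wondering what a first pass over such data could look like, here is a small sketch that computes the fraction of successful connections per country and transport. The record fields (`country`, `transport`, `success`) are assumptions for illustration and do not reflect the actual bridge reachability report format.

```python
from collections import defaultdict


def reachability_rates(measurements):
    """Fraction of successful connections per (country, transport)."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for m in measurements:
        key = (m["country"], m["transport"])
        totals[key] += 1
        if m["success"]:
            successes[key] += 1
    return {key: successes[key] / totals[key] for key in totals}


# Placeholder records; country codes are not real measurements.
measurements = [
    {"country": "XX", "transport": "obfs3", "success": True},
    {"country": "XX", "transport": "obfs3", "success": False},
    {"country": "XX", "transport": "fte", "success": True},
    {"country": "YY", "transport": "obfs3", "success": False},
]
for key, rate in sorted(reachability_rates(measurements).items()):
    print(key, rate)
```

Grouping by day as well would give the kind of baseline over time that Jed mentions.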
....
And now onto the result of the voting session:

Score  Item
4.75   Implement data analytics and visualization for OONI tests
4.25   Reach production quality ooni raspberry-pi (beagle-board) images
4      Develop OONI tests for censorship circumvention tools
4      Develop scheme for orchestrating ooni-probes
4      Promote and further develop OONI on mobile (Android, iOS)
3.75   Get daily OONI measurements from 50 countries
3.75   Publish monthly reports about the status of internet censorship in a country
3.75   Implement a GUI for ooniprobes
3.67   Do research based on OONI
3.5    Implement pub-sub system for ooni collectors
3.5    Integration with other censorship measurement projects
3.5    Reaching out to communities inside of censored regions
3.25   Run "adopt an ooni-probe" campaign
2.75   Redesign the website for ooni
2.25   Hold an international internet censorship conference

To me these results are not surprising at all: they reflect more or
less what have already been the main areas of work for the past couple
of months.
I think therefore we should continue in this direction and focus
on the three main areas of "Implement data analytics and
visualization for OONI tests", "Reach production quality ooni
raspberry-pi (beagle-board) images", and "Informed consent research".
We will also be working on "Develop OONI tests for censorship
circumvention tools" for a project in a specific country.
Here is a list of the existing tickets in these areas and some
potentially new tickets to be created.
# Implement data analytics and visualization for OONI tests
## Existing tickets
Add generation of reports index to the export task of the pipeline
https://trac.torproject.org/projects/tor/ticket/13842
Migrating OONI data-pipeline containers and server configuration on a
different server
https://trac.torproject.org/projects/tor/ticket/13825
Better and more efficient database schema
https://trac.torproject.org/projects/tor/ticket/13803
Mongodb queries for the nettest visualization
https://trac.torproject.org/projects/tor/ticket/13759
Brainstorm ideas for possible visualisations
https://trac.torproject.org/projects/tor/ticket/13731
Investigate possible performance improvements to the ooni-pipeline
https://trac.torproject.org/projects/tor/ticket/13720
Align the dates in the visual timeline
https://trac.torproject.org/projects/tor/ticket/13639
Better tokening in the output json data format for bridge reachability
visualisation
https://trac.torproject.org/projects/tor/ticket/13638
## New tickets
### Design and implement OONI reports explorer
This will allow users of OONI to explore the data that we have so far
collected, by filtering and searching it.
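A minimal sketch of what "filtering and searching" could mean in practice, assuming each report is summarised by a small metadata record. The field names (`country`, `test_name`, `input`) and the function itself are hypothetical and not part of the pipeline code.

```python
def filter_reports(reports, country=None, test_name=None, text=None):
    """Filter report metadata by exact fields and a case-insensitive text search."""
    results = []
    for report in reports:
        if country is not None and report["country"] != country:
            continue
        if test_name is not None and report["test_name"] != test_name:
            continue
        # The free-text search only looks at the measured input here.
        if text is not None and text.lower() not in report["input"].lower():
            continue
        results.append(report)
    return results


# Placeholder metadata records; values are not real reports.
reports = [
    {"country": "XX", "test_name": "http_check", "input": "http://example.com/"},
    {"country": "XX", "test_name": "dns_check", "input": "example.org"},
    {"country": "YY", "test_name": "http_check", "input": "http://Example.org/"},
]
print(filter_reports(reports, country="XX"))
print(filter_reports(reports, text="example.org"))
```

In the real explorer these filters would presumably be expressed as database queries rather than in-memory loops.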
# Reach production quality ooni raspberry-pi (beagle-board) images
## Existing tickets
OONI on Raspberry Pi
https://trac.torproject.org/projects/tor/ticket/13870
## New tickets
### Embedded device configuration wizard
Set up an OONI wifi network on the Raspberry Pi to configure the device at
first start. This will allow the user to configure how ooni-probe should
connect to the internet and what measurements should be run.
It would also be useful to provide an informed consent information page.
# Informed consent research
## Existing tickets
Write documentation of benefits for running ooniprobe
https://trac.torproject.org/projects/tor/ticket/14760
Brainstorm on possible ways of minimizing the risks involved with
running ooniprobe while keeping the benefits
https://trac.torproject.org/projects/tor/ticket/14761
Redesign how we inform the user of the risks of running ooniprobe and
get informed consent from them
https://trac.torproject.org/projects/tor/ticket/14762
## New tickets
Get legal feedback for the risks of running ooniprobe in a set of specific
countries
Thanks for taking the time to read this.
I will soon send out an email to schedule next week's IRC meeting, since
we have skipped it this week.
Have fun!
~ Arturo
