I had created a PGP key for the ooni-dev mailing list and always forget to disable encryption when sending to the list.
I should create a rule for that...
Anyway, here it goes:
On 12/28/14 6:50 AM, royaen wrote:
Hi Arturo and all,
Hi Roya,
Thanks for taking the time to chime in on this discussion.
When it comes to ethics of soliciting measurements and informed consent, I have a different take which has been my research topic over the past years. There are many reasons why I think that directly measuring censorship is scary. First of all, you need to acquire reliable vantage points to run your measurements. Volunteering one’s machine to foreign researchers, or operating a device on their behalf, might be viewed by the government as espionage. Besides, many regions, especially places where we don’t have good infrastructure, have a limited number of companies/volunteers (if any) that allow foreigners to rent computers inside the country. All the current direct approaches, such as RIPE Atlas [1] or other distributed platforms or volunteers running Raspberry Pis are often easy to spot and data collected from them may not be reliable. For example, regarding China, we showed [2] that censorship is different in CERNET (China Education and Research Network) compared to other ISPs.
The reason all these projects rely on vantage points inside the network is that this gives the most accurate measurements, and in a lot of cases it is the only way to measure that particular kind of censorship.
Measuring from the vantage point of the censored user allows you to fully emulate what that user would be doing when accessing the censored site.
When it comes to measuring connectivity, I believe that it is better to involve the whole country in doing the measurements rather than volunteers whose safety is at stake. Therefore, I have developed effective methods for remotely measuring Internet censorship around the world, without requiring access to any of the machines whose connectivity is tested to or from. These techniques are based on novel network inference channels, a.k.a idle scans. That is, given two arbitrary IP addresses on the Internet that meet some simple requirements such as global IPID behaviour, our proposed technique can discover packet drops (e.g., due to censorship) between the two remote machines, as well as infer in which direction the packet drops are occurring. Here are more references to read [3,4]. Basically, for one of the idle scans (hybrid idle scan), we only create unsolicited packets (a bunch of SYNACK and RST segments) between two remote IPs, and look at the changes in the global IPID variable to infer whether censorship is happening and if so, in which direction packets are dropped.
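To make the IPID inference described above a bit more concrete, here is a rough sketch in Python. It is not the code from the papers [3,4], and it sends no packets. It only shows how a change in a global IPID counter could be turned into a verdict. The three-way interpretation (no increase, an increase of one, an increase of more than one) is a simplifying assumption, and the names `ipid_delta`, `classify` and the `background` parameter are made up for illustration.

```python
IPID_MODULUS = 1 << 16  # the IPv4 identification field is 16 bits wide


def ipid_delta(before: int, after: int) -> int:
    """Number of packets the host sent between two IPID samples.

    Assumes a single global counter that increments by one per packet,
    and handles wraparound from 65535 back to 0.
    """
    return (after - before) % IPID_MODULUS


def classify(before: int, after: int, background: int = 0) -> str:
    """Guess the direction of packet drops from one IPID observation.

    `background` is a hypothetical estimate of packets the host sent for
    unrelated reasons during the measurement window.
    """
    extra = ipid_delta(before, after) - background
    if extra <= 0:
        # The client sent no RST: the SYN/ACKs likely never reached it.
        return "server-to-client drop"
    if extra == 1:
        # One RST was sent and, presumably, reached the server.
        return "no drop"
    # Several RSTs: the server kept retransmitting SYN/ACKs, which
    # suggests the client's RSTs were dropped on the way back.
    return "client-to-server drop"


if __name__ == "__main__":
    print(classify(100, 100))
    print(classify(65535, 0))
    print(classify(100, 104))
```

A real measurement would need many repeated samples and a statistical test to separate the signal from background traffic; a single delta like this is far too noisy on its own.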
Your research is very interesting and I believe very important for getting more data when it would just be too risky to have a network vantage point. I do think, though, that we can't rely only on these sorts of measurements. They complement and extend what we measure from the network vantage point of the user, but may not work as reliably against all censorship systems and only give you a subset of the information we are interested in acquiring.
For example, one thing we are interested in gathering with ooniprobe is fingerprints of the censorship equipment being used. This is something that I don't think can be measured as accurately with indirect scans.
Back to my main point, why I am trying so hard to convince you that we also need to use side channels and how this relates to ethics, well, here is the story: The discussion you brought up has been discussed heavily in academia in the past six months after two papers got rejected from the IMC conference because of ethics. One of them was my paper [2] after having received good reviews on the technical contribution. Here is the link to the reviews:
https://imc2014.cs.wisc.edu/hotcrp/paper/243?cap=0243a2kWYrwVqbv0
Yes, I perfectly agree that we should also be collecting measurements gathered with these sorts of techniques using ooniprobe. It would be epic if you or somebody else were to implement ooniprobe measurements for them.
I would however like to make the point that with the OONI project our main goal is not that of publishing academic papers. If that comes as part of the process then it's great, but our top priority is finding the right balance between the safety of users and the impact we can have by exposing facts about internet censorship. This is something very tough, but I think that by not being directly affiliated with a university (hence not having to jump through the various hoops you folks have to before doing your research), we have a slight advantage. We don't have to get approval from IRBs or publish a certain number of papers per year. The only people we are accountable to are our users.
I personally just got an email with above link from IMC, and because of having had a single-entry visa, I couldn’t attend IMC or the Citizen Lab workshop where a lot of the discussions about ethics were taking place. The ethical issues that usually come up are two: First, using idle scans, no consent from users is collected. Second, censors could mistakenly assume that two machines measured by us are deliberately communicating with each other. This could have negative consequences if a censor believes that a user is communicating with a sensitive or forbidden IP address.
In response to the latter argument, it is unlikely that a censor would come to such a conclusion as only RST segments are created from a client inside a country to a server and only SYN/ACK segments are sent from a server to a client inside the censoring country. An adversary would not witness a full TCP handshake, let alone any actual data transfer.
One mitigation technique that I have been focusing on is to use routers instead of end points for the side channel measurements.
I think that the censor would have a pretty hard job proving in a just court of law that such a user was engaging in censorship measurements (assuming they consider censorship measurements to be illegal). Unfortunately, in some countries where we measure, the courts of law are not just and we have to make all sorts of crazy assumptions about how they will interpret what we are doing. Using routers instead of real users when doing the scans could be a safer move if it does not affect your measurement.
If you or anyone else is interested in using these techniques, I am more than happy to help.
I will keep your experiments in mind if somebody comes wanting to hack on something interesting and point them to you.
I think the best thing to do would be to create one or more tickets for implementing your tests under the Ooni component at: https://trac.torproject.org/
~ Arturo
Roya
[2] http://arxiv.org/abs/1410.0735
[3] http://arxiv.org/pdf/1312.5739v1.pdf
[4] http://www.usenix.org/event/sec10/tech/full_papers/Ensafi.pdf