Hi Arturo, thanks for providing these answers.
On 29/07/15 10:04, Arturo Filastò wrote:
On Jul 19, 2015, at 23:00, Daniel Ramsay daniel@dretzq.org.uk wrote:
Hi,
As part of our work incorporating ooniprobe into the blocked.org.uk scheduling system, we now have a final piece of code that relays ooniprobe results from the scheduling system back to the main OONI collectors. We've currently got about a million individual test results stored, covering 7 ISPs.
Hi Daniel,
Thanks for reaching out with these questions.
I will summarise what I told Richard on IRC during this week's dev gathering so that it's recorded here.
I have a few questions though:
- Is it possible to submit results to the OONI collector over HTTP/HTTPS instead of Tor, and if so, is there a public DNS name used for the collector? For the volume of results we've got, it could be a lot more bandwidth-efficient and faster to run.
Currently we only support reporting via Tor Hidden Service or via vanilla HTTP. Reporting via HTTP is not supported by the canonical OONI collector, though, so to use it you would have to run your own collector that supports HTTP and peer with the OONI data collection pipeline to submit the results it gathers.
We're already emulating a collector in the blocked.org.uk API, so perhaps we can go directly to peering with the pipeline. Is there any reference information that I can read on how to go about getting this set up (protocols, hostnames, etc)?
Since we have received many requests to support HTTPS collectors, we plan to add support for it in the near future. It should be much easier now, since the twisted API for doing HTTPS has improved since version 14.0.
I did add some minimal support for HTTPS collector URLs in the patch set. It's still being worked on for upstream submission. The HTTPS support probably doesn't go as far as you'd like though.
Still, I would like to preserve the property of having URLs be self-authenticating, and I designed a scheme to extend HTTPS URIs to support something similar to certificate pinning here: https://github.com/hellais/sslpin. That code is just a POC and is based on an old version of twisted, from when it was harder to do cert validation. I think supporting this in recent versions of twisted should be much easier.
Newer versions of twisted and python will do certificate verification using the operating system's certificate store, but as you point out, that doesn't provide a way of ensuring that the only certificate that can be used is from the official CA rather than any of the others.
It may be possible to force a twisted agent to use only a bundled CA certificate for verification, rather than relying on the system-installed CA list. The Python requests library supports this usage, but I'm not sure about twisted.
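To illustrate what I mean by verifying against a bundled CA only, here is a minimal sketch using just the Python standard library (Python 3) rather than twisted or requests, so it shows the idea and is not a patch for ooniprobe. The function names and the CA file path are placeholders of mine, not anything that exists in OONI. With requests, the equivalent is passing the bundle path as the `verify` argument.

```python
import ssl
import urllib.request


def bundled_ca_context(ca_bundle_path):
    """Build a TLS client context that trusts only the CAs in ca_bundle_path.

    ca_bundle_path is a hypothetical PEM file shipped with the client.
    """
    # A bare SSLContext starts with an empty trust store: nothing from the
    # system CA list is loaded unless load_default_certs() is called.
    # PROTOCOL_TLS_CLIENT also turns on hostname checking and CERT_REQUIRED.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(cafile=ca_bundle_path)
    return context


def fetch(url, ca_bundle_path):
    """GET url, accepting only certificates that chain to the bundled CA."""
    context = bundled_ca_context(ca_bundle_path)
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()
```

A server certificate issued by any other CA, including ones in the operating system's store, would then fail the handshake with a verification error. This restricts the trusted issuer but does not pin a specific certificate the way the sslpin scheme aims to.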
Thanks again,
Daniel.