This is an email I sent to someone at the Internet Archive who wanted to know about blocking of archive.org. The URLs "http://archive.org" and "https://archive.org/web/" are in test-lists, so they are being tested by OONI. See the README for notes on how I do analysis using ooni-sync, jq, and R.
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709/REA...
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709/blo...
https://people.torproject.org/~dcf/graphs/archive.org-anomalies-20170709.zip
Here is a description of some basic analysis using OONI to check for blocking of archive.org. It's based on 2,080 reports covering 59 countries, dated between 2017-07-01 and 2017-07-06. I'm attaching the source code and a graph that it produces. Anomalous measurements were found in China, Russia, Venezuela, Mexico, Brazil, and France. Of these, the ones in China and Russia are clearly the result of censorship, while the others are ambiguous and might be random measurement error or very localized blocking. For a clearer view, you would want to use reports from a longer time period.
Here is a summary of the countries with anomalous measurements, showing how many anomalous measurements there were out of how many total.

    country  anomalous  total  percent_anomalous
1:  CN               1      1             100.0%
2:  RU              19     54              35.2%
3:  VE               1      4              25.0%
4:  MX               1     10              10.0%
5:  BR               3     42               7.1%
6:  FR               1    100               1.0%
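As an illustration of how a summary like the one above can be derived, here is a minimal Python sketch. It is not the attached source code (which uses jq and R); the function name `summarize` and the (country, is_anomalous) input shape are assumptions made for this example. The counts are the ones from the summary, expanded to one pair per measurement.

```python
from collections import Counter

def summarize(measurements):
    """Tally (country_code, is_anomalous) pairs into per-country rows.

    Returns a list of (country, anomalous, total, percent_anomalous),
    restricted to countries with at least one anomaly and sorted by
    descending percentage.
    """
    totals = Counter()
    anomalous = Counter()
    for cc, is_anomalous in measurements:
        totals[cc] += 1
        if is_anomalous:
            anomalous[cc] += 1
    rows = [
        (cc, anomalous[cc], totals[cc], 100.0 * anomalous[cc] / totals[cc])
        for cc in totals
        if anomalous[cc] > 0
    ]
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows

# (anomalous, total) per country, taken from the summary above.
counts = {"CN": (1, 1), "RU": (19, 54), "VE": (1, 4),
          "MX": (1, 10), "BR": (3, 42), "FR": (1, 100)}

# Expand the counts into one (country, is_anomalous) pair per measurement.
sample = []
for cc, (bad, total) in counts.items():
    sample += [(cc, True)] * bad + [(cc, False)] * (total - bad)

for cc, bad, total, pct in summarize(sample):
    print(f"{cc}  {bad:>3}  {total:>5}  {pct:5.1f}%")
```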
The process of making the graph is basically (1) download OONI reports, (2) filter them for archive.org measurements, and (3) process the data using another script. The slowest step is downloading the report files, because each report includes tests of many domains other than archive.org (typically about a thousand domains). Currently it's necessary to download the full report files and filter them locally. However, OONI plans to deploy a system soon that will make it possible to download measurements for just one domain at a time.
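Step (2), the local filtering, could look roughly like the following Python sketch. It assumes that each report is a text file with one JSON measurement per line and that the tested URL is stored under an "input" key; both are assumptions for this example, and the actual pipeline does this step with jq. The example records are hypothetical minimal stand-ins, not real measurements.

```python
import json

# The two test-list URLs mentioned at the top of this message.
TARGET_INPUTS = {"http://archive.org", "https://archive.org/web/"}

def filter_measurements(lines, targets=TARGET_INPUTS):
    """Yield parsed measurements whose "input" is one of the target URLs.

    `lines` is any iterable of JSON strings, e.g. an open report file.
    The "input" key is an assumption about the report layout.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        measurement = json.loads(line)
        if measurement.get("input") in targets:
            yield measurement

# Hypothetical minimal records standing in for lines of a report file.
example_report = [
    json.dumps({"input": "http://archive.org"}),
    json.dumps({"input": "http://example.com/"}),
    json.dumps({"input": "https://archive.org/web/"}),
]

kept = list(filter_measurements(example_report))
print(len(kept), "of", len(example_report), "measurements kept")
```

With a real report on disk, the same function can be fed an open file object, since iterating over a file yields its lines.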
== China ==
The one test from China shows blocking by DNS injection, a type of blocking that is characteristic of the Great Firewall and well documented. In this case, the injected false DNS response for archive.org was the IP address 31.13.69.228, which actually belongs to Facebook.
https://explorer.ooni.torproject.org/measurement/20170701T065636Z_AS4808_ohP...
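One simple way to flag an answer like this is to compare the resolved addresses against addresses known to be legitimate. The Python sketch below does that; the function name `unexpected_answers` is made up for this example, and the known-good list contains only 207.241.224.2, the one correct address that appears in this message, which is an assumption and certainly incomplete. A real analysis would more likely compare against a control resolution made from an uncensored network than against a hard-coded list.

```python
import ipaddress

# Assumption: only the one correct address mentioned in this message.
KNOWN_GOOD = ["207.241.224.2/32"]

def unexpected_answers(answers, expected_networks=KNOWN_GOOD):
    """Return the DNS answers that fall outside every expected network."""
    networks = [ipaddress.ip_network(n) for n in expected_networks]
    suspicious = []
    for answer in answers:
        address = ipaddress.ip_address(answer)
        if not any(address in network for network in networks):
            suspicious.append(answer)
    return suspicious

# The injected answer from the China measurement is flagged;
# the correct answer is not.
print(unexpected_answers(["31.13.69.228"]))
print(unexpected_answers(["207.241.224.2"]))
```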
== Russia ==
About 35% of tests in Russia were blocked, which is not surprising given that a block of archive.org was ordered in 2015.
https://arstechnica.com/tech-policy/2015/06/wayback-machines-485-billion-web...
It's not unusual for a site to be available in some places, even when ordered blocked, when enforcement of the block is left to individual ISPs, as seems to be the case here.
The blocked tests came from AS41661 and AS21378. The unblocked tests came from AS3239, AS8369, AS8427, AS12389, AS16345, AS21127, AS41661, and AS42668.
The blocks from AS41661 were by DNS injection, affecting both HTTP and HTTPS. The false IP address returned was 92.255.241.100, whose reverse DNS is law.filter.ertelecom.ru. The web server at http://law.filter.ertelecom.ru/ serves a block page in Russian.
https://explorer.ooni.torproject.org/measurement/20170701T190029Z_AS41661_EZ...
The block from AS21378 was by TCP blocking: the DNS request gave the correct response 207.241.224.2 and the client was able to establish a TCP connection to the server, but the firewall did not permit the HTTP response to arrive.
https://explorer.ooni.torproject.org/measurement/20170701T135420Z_AS21378_c2...
== Venezuela ==
One test from AS8048 did not get a response to its DNS request. However, it may just be a random failure (not blocking), because there were two other successful tests from AS8048, and one successful test from AS6306.
https://explorer.ooni.torproject.org/measurement/20170705T141354Z_AS8048_KuY...
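Breaking the results down by AS is what makes this kind of judgment possible: an AS with a mix of successes and failures looks more like noise than one that fails every time. Here is a minimal Python sketch of such a tally, using the four Venezuelan tests described above. The function name `tally_by_asn` and the (asn, is_anomalous) input shape are assumptions made for this example.

```python
from collections import defaultdict

def tally_by_asn(measurements):
    """Count successful and anomalous tests per AS.

    `measurements` is an iterable of (asn, is_anomalous) pairs.
    """
    tally = defaultdict(lambda: {"ok": 0, "anomalous": 0})
    for asn, is_anomalous in measurements:
        tally[asn]["anomalous" if is_anomalous else "ok"] += 1
    return dict(tally)

# The four Venezuelan tests: one DNS failure and two successes from
# AS8048, and one success from AS6306.
venezuela = [
    ("AS8048", True),
    ("AS8048", False),
    ("AS8048", False),
    ("AS6306", False),
]

for asn, result in sorted(tally_by_asn(venezuela).items()):
    print(asn, result)
```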
== Mexico ==
As in the Venezuela case, there was one test from AS8151 that didn't get a DNS response; however, there were 9 other successful tests, including others from AS8151.
https://explorer.ooni.torproject.org/measurement/20170703T060009Z_AS8151_GO6...
== Brazil ==
Of the five Brazilian ASes present in the sample of reports, only one shows anomalies: AS1916, Rede Nacional de Ensino e Pesquisa (National Education and Research Network). In this network, requests for http://archive.org (which redirects to https://archive.org) succeed, while those directly requesting https://archive.org/web/ consistently time out. I don't have a good explanation for this. Certain kinds of stateful firewall could plausibly cause such behavior.
== France ==
A single measurement (out of 100) in France timed out requesting http://archive.org. It was in AS197422 and there were no other reports in the sample from that AS, so it's hard to say whether it's due to a block or a random failure.
https://explorer.ooni.torproject.org/measurement/20170705T232621Z_AS197422_p...