Sam Burnett sam.burnett@gatech.edu writes:
Hi,
I'd like to help improve the Tor Censorship Detector. I've read some background material and think I understand the basics of George Danezis' detection algorithm [1, 2].
Is anyone still working on this? Two tickets from a year ago talk about experimenting with various detection algorithms and turning one of them into a standalone utility [3, 4]. Has anything happened since then?
My background: I'm a graduate student at Georgia Tech studying network censorship circumvention and measurement. Although I've met Tor developers on various occasions, I haven't directly contributed to the project; I'd like to change that.
Hi there,
this project has indeed been pretty stale lately.
Some random thoughts:
* George Danezis started designing another censorship anomaly detection algorithm, but he never implemented it.
He did not like how the first detection algorithm triggered multiple times after a censorship event was reported (since it used a 7-day delta, every day following the event still deviated from the week before). Furthermore, he did not like that the algorithm only compared against client numbers from a week earlier; a day earlier might be a better model in some cases.
I think I have an email with his ideas on the new model; I will ask George whether I can send it to you.
* IIRC, the current model compares the rate of change in the number of clients of a jurisdiction (between t_i and t_{i-1}) with the same value in other ("stable") jurisdictions.
This means that the current model only cares about the trends of a jurisdiction against the "global trends". While this makes sense, it might also be useful to compare the current parameters of a jurisdiction with past (old) parameters from the same jurisdiction (for example, see how this week's rate of change in the number of clients compares with that of two months ago).
* You can find daily results of the algorithm here: https://lists.torproject.org/pipermail/tor-censorship-events/
As you can see, the signal-to-noise ratio is pretty low. Maybe there are heuristics that we could use to weed out useless events?
* If we get better at anomaly detection, it would be fun to use our algorithms to find anomalies in various properties of the Tor network. For example, finding anomalies in the values of the relay flag thresholds might help us detect attacks on the network. Related Tor tickets: https://trac.torproject.org/projects/tor/ticket/8164 https://trac.torproject.org/projects/tor/ticket/8151
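
To make the points about the delta and the repeated triggers a bit more concrete, here is a rough Python sketch. It is not George's algorithm: it only assumes that we compare a jurisdiction's client ratio (today versus `delta` days ago) against the median ratio of a few "stable" jurisdictions, with a fixed tolerance. The function names (`ratios`, `flag_anomalies`, `collapse_runs`), the `tolerance` parameter and the median-based comparison are all made up for illustration.

```python
from statistics import median


def ratios(series, delta):
    """Map day index -> series[day] / series[day - delta], skipping zero baselines."""
    return {
        i: series[i] / series[i - delta]
        for i in range(delta, len(series))
        if series[i - delta] > 0
    }


def flag_anomalies(target, stable, delta=7, tolerance=0.5):
    """Return the days on which the target's ratio departs from the stable median.

    `target` is a list of daily client counts for one jurisdiction and
    `stable` is a list of such lists for the reference jurisdictions.
    The fixed relative `tolerance` is an assumption of this sketch.
    """
    target_ratios = ratios(target, delta)
    stable_ratios = [ratios(s, delta) for s in stable]
    flagged = []
    for day, ratio in sorted(target_ratios.items()):
        reference = [sr[day] for sr in stable_ratios if day in sr]
        if not reference:
            continue  # no stable jurisdiction has data for this day
        expected = median(reference)
        if abs(ratio - expected) > tolerance * expected:
            flagged.append(day)
    return flagged


def collapse_runs(days):
    """Keep only the first day of each run of consecutive flagged days."""
    kept = []
    previous = None
    for day in days:
        if previous is None or day != previous + 1:
            kept.append(day)
        previous = day
    return kept
```

With two flat stable series and a target that drops from 100 to 20 clients on day 10, `delta=7` flags days 10 through 16 (the repeated triggering described above), while `delta=1` flags only day 10. `collapse_runs` is one possible heuristic for reporting such a run as a single event.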