On 12/22/20 7:58 PM, The23rd Raccoon wrote:
> Recent advances in traffic analysis defenses have finally proved that my controversial but revolutionary theories[5,6] are valid, overturning decades of theories about traffic analysis attacks against anonymity networks!
Ok ok, so I might have made a mistake in the math[7]. But I'm just a Raccoon who reads discarded academic research papers in a dumpster. While I have been highly educated through my dumpster schooling, one can't expect raccoons to do math correctly. Such math is best left to others who can properly express my theory in terms of equations.
I am glad you liked the papers!
Others like Panchenko, Pulls, Danezis, Kadianakis, et al; and maybe Perry. (But probably not Perry.)
Hey, I can MATH!
As Tor's Research Janitor, I confirm that your bulletin contains valid novel ideas, and they are very testable (see below). This is insanely great! I wish I thought of it!
In fact, unification of correlation and fingerprinting, along with the unification and combination of defenses, is an entire research area, with many possible paper topics.
Recently, Panchenko et al. found traffic splitting to be highly effective against state-of-the-art Website Traffic Fingerprinting attacks based on deep learning[24].
Concurrently, Tobias Pulls used Perry and Kadianakis's padding machines in an optimization problem, using a Genetic Algorithm to evolve optimal padding machines against deep learning classifiers, for use in defense against Website Traffic Fingerprinting[25]. With this result, we have finally entered the age of the machines versus the machines. Raccoon math, while groundbreaking, is no longer necessary.
Both of these defenses were highly successful on their own.
Pulls's methodology in your reference[25] was exemplary. Using the circpad simulator and the circpad frameworks allows us to rapidly and directly deploy exact research solutions on the Tor network, as-is.
In fact, we could deploy the GA-generated machine specifications in his paper on live Tor relays today.
We will need to re-tune everything once congestion control and conflux are deployed, and when timing is involved, so I think the best plan is to have another round or two of research into optimizing and tuning for that scenario.
However, with the combination of traffic splitting and cover traffic defenses, Tor will be on the CUTTING EDGE of making a PARADIGM SHIFT in its threat model, to tackle the hardest problem of all: END-TO-END TRAFFIC CORRELATION.
I also agree that the combination should require less overhead and perform better than either one by itself. Obviously, testing this is a very promising research area. I encourage full collaboration between Pulls, Panchenko, Tor, wild raccoons, and others, in this area.
For those who are considering studying this, see: https://gitlab.torproject.org/mikeperry/torspec/-/blob/ticket40202_01/propos...
We are optimizing that using congestion control, to achieve high-speed low-latency traffic splitting, to exit relays and onion services. We will likely only use 2 circuits, to reduce exposure to guard relays with respect to other potential attacks, so some padding overhead will likely still be necessary.
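To make the two-circuit point concrete, here is a minimal sketch of what splitting does to a trace. Everything in it is an assumption for illustration: the batch-wise weighted random scheduler, the function names, and the parameters are hypothetical and are not what the proposal draft specifies. It only shows that each guard-side observer sees a partial trace, while the far end can still reassemble the full sequence.

```python
import random
from typing import List, Sequence, Tuple

# A cell is tagged with its position in the original trace so the
# receiving end can restore the order after splitting.
TaggedCell = Tuple[int, int]


def split_trace(cells: Sequence[int],
                weights: Sequence[float] = (0.5, 0.5),
                batch: int = 3,
                seed: int = 0) -> List[List[TaggedCell]]:
    """Assign consecutive batches of cells to circuits at random.

    Hypothetical scheduler: the batch size and the weighted random choice
    are illustrative assumptions, not the behavior of any deployed code.
    """
    rng = random.Random(seed)
    circuits: List[List[TaggedCell]] = [[] for _ in weights]
    for start in range(0, len(cells), batch):
        idx = rng.choices(range(len(weights)), weights=weights)[0]
        for pos in range(start, min(start + batch, len(cells))):
            circuits[idx].append((pos, cells[pos]))
    return circuits


def merge_trace(circuits: Sequence[Sequence[TaggedCell]]) -> List[int]:
    """Reassemble the original cell sequence from all circuits."""
    tagged = sorted(cell for circuit in circuits for cell in circuit)
    return [value for _, value in tagged]


# Signed cell directions: +1 is outgoing, -1 is incoming.
trace = [1, -1, -1, -1, 1, -1, -1, 1, -1, -1, -1, -1]
parts = split_trace(trace)
```

Each entry of `parts` is what a single guard would observe. With only two circuits, each partial trace can still carry a fair amount of the site's pattern, which is why some padding on top is likely still needed.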
The combination could also be tuned to help reduce the overhead needed by padding, in an optimization problem context, like Pulls's GA.
I will be updating that draft with more information as the proposal solidifies.
Note to those from the future: this proposal draft link will eventually be merged to the torspec repo. Check for the final version here: https://gitlab.torproject.org/tpo/core/torspec/-/tree/master/proposals
Indeed, once time is included as a feature, deep learning based Website Traffic Fingerprinting attacks will effectively be correlating the timing and traffic patterns of websites to their representations in its neural model. This model comparison is extremely similar to how end-to-end correlation compares the timing and traffic patterns of Tor entrance traffic to Tor exit traffic. In fact, deep learning classifiers have already shown success in correlating end-to-end traffic on Tor[28].
While you have offered no specific testable predictions for this theory, presumably to score more crackpot points, allow me to provide a reduction proof sketch, as well as an easily testable result.
To see that Deep Fingerprinting reduces to Deep Correlation, consider the construction where the correlator function from DeepCorr is used to correlate pairs of raw test traces to the raw training traces that were used to train the Deep Fingerprinting classifier. The correlated pairs would be constructed from the monitored set's test and training examples. This means that instead of correlating client traffic to Exit traffic, DeepCorr is correlating "live" client traces directly to the raw fingerprinting training model, as you said.
This gets us "closed world" fingerprinting results. For "open world" results, include the unmonitored set as input that does not contain matches (to represent partial network observation that results in unmatched pairs).
If the accuracy from this DeepCorr Fingerprinting construction is better than Deep Fingerprinting for closed and open world scenarios, one can conclude that Deep Fingerprinting reduces to DeepCorr, in a computational complexity and information-theoretic sense. This is testable.
If the accuracy is worse, then Deep Fingerprinting is actually a more powerful attack than DeepCorr, and thus defenses against Deep Fingerprinting should perform even better against DeepCorr, for web traffic. This is also testable.
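A minimal sketch of that construction, for anyone who wants to run the experiment. The correlator here is a hypothetical stand-in (cosine similarity over the overlapping prefix), and it is an assumption of this sketch, not the DeepCorr CNN; a real test would drop the trained DeepCorr model into its place. The function names and the threshold parameter are also mine. Each test trace is paired with every raw training trace, the best-scoring pair decides the label, and an optional threshold lets unmonitored traces go unmatched for the open world case.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Trace = Sequence[float]  # e.g. signed cell directions per position


def stand_in_correlator(a: Trace, b: Trace) -> float:
    """Placeholder for the DeepCorr correlator (assumption): cosine
    similarity over the overlapping prefix of the two traces."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    dot = sum(x * y for x, y in zip(a[:n], b[:n]))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(y * y for y in b[:n]))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def correlate_to_training(test_trace: Trace,
                          training: Dict[str, List[Trace]],
                          threshold: Optional[float] = None) -> Optional[str]:
    """Return the site label of the best-correlated training trace, or
    None when a threshold is set and no pair reaches it (open world)."""
    best_label: Optional[str] = None
    best_score = -math.inf
    for label, traces in training.items():
        for train_trace in traces:
            score = stand_in_correlator(test_trace, train_trace)
            if score > best_score:
                best_label, best_score = label, score
    if threshold is not None and best_score < threshold:
        return None
    return best_label


def accuracy(test_set: Sequence[Tuple[Trace, Optional[str]]],
             training: Dict[str, List[Trace]],
             threshold: Optional[float] = None) -> float:
    """Fraction of test traces labeled correctly. Unmonitored traces carry
    the label None and count as correct only when left unmatched."""
    correct = sum(
        1 for trace, label in test_set
        if correlate_to_training(trace, training, threshold) == label
    )
    return correct / len(test_set)
```

Comparing `accuracy(...)` from this construction against the Deep Fingerprinting classifier on the same monitored and unmonitored sets is the test described above.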
This reduction also makes sense intuitively. The most powerful correlation and fingerprinting attacks now use CNNs under the hood. So they should both have the same expressive power, and inference capability.
Interestingly, the dataset that Pulls used was significantly larger than what DeepCorr used, in terms of "pairs" that must be matched.
More interestingly, DeepCorr also found that truncating flows to the initial portion was still sufficient for high accuracy. Pulls's defenses also found that the beginning of website traces was most important to pad heavily.
Some say that Long Term Statistical Disclosure (LTSD) attacks will still always win the end-to-end correlation game against anonymity networks, in the fullness of time[29].
However, LTSD attacks are only a theory. And much like quantum mechanics, relativity, and LSD, these attacks also warp one's perception of reality, time, and space. All of these theories are fundamentally misguided.
LTSD attacks predict that over time, correlation gradually leaks enough information to fully deanonymize users of anonymity networks. But also much like quantum mechanics, they fail to fully define the mechanism.
Consider this thought experiment (feel free to use whatever mind expanding devices you have at hand to assist you): LTSD assumes that an adversary has complete high resolution information of all traffic that enters and exits an anonymity network. Additionally, LTSD assumes that an adversary has identifiers available to properly track traffic streams on *each* side of the correlation, over the full duration of observation and long-term correlation.
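For readers who have not seen one, here is a toy sketch of a statistical disclosure style estimator. It is a simplified difference-of-means variant written for illustration, with hypothetical names and a hypothetical data layout; it is not the exact estimator from the cited paper. Note where the two assumptions above enter: every round lists all senders and all recipients (complete observation), and the `target in senders` test only works if the target carries a stable identifier across all rounds.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set, Tuple

# One observed round: who sent into the network, and who received from it.
Round = Tuple[Set[str], List[str]]


def statistical_disclosure(rounds: Sequence[Round], target: str) -> Dict[str, float]:
    """Score each recipient by how much more often it receives traffic in
    rounds where the target sends than in rounds where the target is idle.
    Simplified difference-of-means estimator (assumption of this sketch)."""
    with_target: Counter = Counter()
    without_target: Counter = Counter()
    n_with = n_without = 0
    for senders, recipients in rounds:
        if target in senders:  # requires a persistent sender identifier
            with_target.update(recipients)
            n_with += 1
        else:
            without_target.update(recipients)
            n_without += 1
    scores: Dict[str, float] = {}
    for recipient in set(with_target) | set(without_target):
        rate_with = with_target[recipient] / n_with if n_with else 0.0
        rate_without = without_target[recipient] / n_without if n_without else 0.0
        scores[recipient] = rate_with - rate_without
    return scores


observed = [
    ({"alice", "carol"}, ["bob", "dave"]),
    ({"carol"}, ["dave"]),
    ({"alice", "erin"}, ["bob", "frank"]),
    ({"erin"}, ["frank"]),
]
scores = statistical_disclosure(observed, "alice")
likely_contact = max(scores, key=scores.get)
```

If either assumption fails, for example if rounds are only partially observed or the sender identifier does not persist, the counts feeding the estimator are wrong and the scores degrade.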
For a more modern treatment of LTSD-like correlation attack theory, see The Anonymity Trilemma: https://eprint.iacr.org/2017/954.pdf
Even so, all of the limitations you have identified still apply. Some have been incorporated into the theory and do indeed show a decrease in attack efficacy, but others have still not been accounted for!
As I said in the circpad framework documentation, I prefer an empirical approach to pure formalism, for this reason. I agree that it looks like we can do much better than today, for a realistic amount of overhead.
All of that said, anonymity is a complicated problem. As your earlier posts indicate: targeting, stylometry, and mailing list post timing can degrade anonymity in surprising ways. The Raccoon Effect only works if we have enough raccoons who behave and look alike, and are exceedingly careful about it. The machines can do much more than correlate traffic patterns, these days!
This by itself is a huge win. We can now say with certainty that The Raccoon Effect has thoroughly discromulated correlation attacks.
(Discromulation is my term to describe what this kind of defense does. Most interestingly, I am forced into winning this crackpot point. Because deep learning is an opaque machine generated attack, and because the GA-optimized defense is also machine generated, it is actually impossible to precisely describe the complete behaviors of either one, other than with the resulting model definitions themselves! Brave new world.)
This *is* interesting. Pulls also pointed this out in his paper. This is another reason why it seems better to rely on reproducible empirical methods, rather than pure formalism.
Now, what about alien intervention? Well, assuming we do not consider the AI that participated in this work to be alien: if aliens did intervene, none would argue the discromulating conflugruity of The Raccoon Effect. Unfortunately however, I can neither confirm nor deny these allegations[34], at this time[35].
The fact that Pulls's AI named itself 'Interspace' has me curious and eager to subscribe to your newsletter!
But that's not all! Since the circuit padding framework is implemented in Tor, this means that it is covered by Tor's bug bounty. While research papers that break padding defenses are not covered by the bounty (especially if those defenses are not actually deployed), there *is* in fact prize money for any flaws found in the framework that could lead to code execution, or deanonymization[36].
Unfortunately, when OTF lost funding due to the Trump administration's desire to fund closed source Internet Freedom tools, we also lost our OTF funding for this bug bounty, and had to temporarily suspend it while we look for a new sponsor.
However, to keep you honest (and preserve your crackpot points), I will personally honor the bounty for any bugs found in the circpad framework, as deployed in Tor, that lead to code execution or full deanonymization, as a result of that code (excluding correlation and fingerprinting attacks, until we deploy strong defenses). It is mostly my code anyway, and I doubt George Kadianakis made any mistakes.
If anyone wants to help support Tor's ability to make progress on these types of problems, please consider donating: https://donate.torproject.org/
P.P.P.S. At 1004 points on the crackpot index, I believe this post is now the highest scoring publication with a valid novel idea that has been written, to date[2].
If it helps to get a raccoon into the world record books: I again confirm this is a valid, novel idea. I have kept John Baez on Cc for this reason. We should probably take him off after this :).
P.P.P.P.S. Fucking bored as fuck during this fucking pandemic. Fuck![42]
I hear you. To help pass the time until the aliens reveal themselves, I've made a playlist: https://open.spotify.com/playlist/5iYQ0BZNEOaoRhf8Pydvqp
[1] https://math.ucr.edu/home/baez/crackpot.html
[2] https://www.reddit.com/r/math/comments/4r05wh/has_anyone_with_a_high_crackpo...
[3] https://en.m.wikipedia.org/wiki/Betteridge%27s_law_of_headlines
[4] http://www.stinkymeat.net/
[5] https://archives.seul.org/or/dev/Mar-2012/msg00019.html - Raccoon23 Post1
[6] https://archives.seul.org/or/dev/Sep-2008/msg00016.html - Raccoon23 Post2
[7] https://conspicuouschatter.wordpress.com/2008/09/30/the-base-rate-fallacy-an...
[8] https://fahrplan.events.ccc.de/congress/2006/Fahrplan/speakers/1242.en.html
[9] https://awards.acm.org/award_winners/syverson_5067587
[10] https://web.archive.org/web/20121130072122/http://www.foreignpolicy.com/arti...
[11] https://lists.torproject.org/pipermail/tor-dev/2008-September/002493.html
[12] https://en.wikipedia.org/wiki/Simulation_hypothesis#The_simulation_argument
[13] https://en.wikipedia.org/wiki/Alcubierre_drive
[14] https://www.bbc.com/news/world-europe-36173247 - Weasel takes down LHC
[15] https://www.slashgear.com/elon-musk-has-banned-hot-tub-talks-about-simulated...
[16] https://www.forbes.com/sites/janetwburns/2016/10/13/elon-musk-and-friends-ar...
[17] https://www.youtube.com/watch?v=qLcma0YyzhY - Elon Musk Flame Thrower
[18] https://www.yogonet.com/international/noticias/2020/12/07/55695-boring-compa...
[19] https://www.inverse.com/innovation/tesla-electric-jet-3-4-years-away
[20] https://blog.torproject.org/critique-website-traffic-fingerprinting-attacks
[21] https://github.com/torproject/tor/blob/master/doc/HACKING/CircuitPaddingDeve...
[22] https://arxiv.org/pdf/1801.02265.pdf - Deep Fingerprinting Tor
[23] https://www.youtube.com/watch?v=TvjMr6DU7C8 - Raccoon call
[24] https://www.comsys.rwth-aachen.de/fileadmin/papers/2020/2020-delacadena-traf...
[25] https://arxiv.org/abs/2011.13471 - Pulls GA Defense
[26] https://abcnews.go.com/blogs/headlines/2014/05/ex-nsa-chief-we-kill-people-b...
[27] https://www.full-thesis.net/fragments-of-energy-not-waves-or-particles-may-b...
[28] https://people.cs.umass.edu/~amir/papers/CCS18-DeepCorr.pdf
[29] https://www.freehaven.net/anonbib/cache/statistical-disclosure.pdf
[30] https://transparencyreport.google.com/https/overview?hl=en
[31] https://en.wikipedia.org/wiki/Accelerando#Characters
[32] https://fahrplan.events.ccc.de/congress/2006/Fahrplan/attachments/1167-Speak...
[33] https://www.youtube.com/watch?v=jSRfIMjvtFk - Raccoons and cats <3
[34] https://edition.cnn.com/2020/04/27/politics/pentagon-ufo-videos/index.html
[35] https://www.nbcnews.com/news/weird-news/former-israeli-space-security-chief-...
[36] https://hackerone.com/torproject
[37] https://www.msn.com/en-ie/news/coronavirus/during-a-pandemic-isaac-newton-ha...
[38] https://www.youtube.com/watch?v=Ofp26_oc4CA - Raccoons are Legion
[39] https://www.usenix.org/conference/usenixsecurity21/artifact-evaluation-infor...
[40] https://petsymposium.org/artifacts.php
[41] https://en.wikipedia.org/wiki/Liar_paradox
[42] https://www.youtube.com/watch?v=04_rIuVc_qM - WTF
This is an auspicious number of top-tier references!
For Karsten: https://cs5.livemaster.ru/storage/3a/1f/1449eb23f3c3b318ab4960815fn4--waterc...
It is comforting to know that Karsten had friends even among the raccoons. Probably among the aliens too.