Hello Mike,
I had a talk with Marc and Mohsen today about WTF-PAD. I now understand much more about WTF-PAD and how it works with regards to histograms. I think I might even understand enough to start some sort of conversation about it:
Here are some takeaways:
1) Marc and Mohsen think that WTF-PAD might not be the way forward because of its various drawbacks and its complexity. Apparently there are various attacks on WTF-PAD that Roger has discovered (SENDME cells side-channels?) and also the deep learning crowd has done some pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They also told me that achieving needed precision on the timings might be a PITA.
2) From what I understand you are also hoping to use WTF-PAD to protect against circuit fingerprinting and not just website fingerprinting. They told me that while this might be plausible, there is no current research on how well it can achieve that. Are we hoping to do that? And what research remains here? How can I help? Which parts of the Tor circuit protocol are we hoping to hide?
3) Marc and Mohsen suggested using application-layer defences because the application-layer has much better view of the actual structures that are sent on the wire, instead of the black box view that the network layer has.
In particular they were mainly concerned about onion services fingerprinting because they are part of a restricted closed world, whereas they were less concerned about the entire internet because of its vast size.
They suggested that we could investigate using the service-side "alpaca" library for onion services (e.g. as part of securedrop?) which should resolve the most pressing concern of HS identification.
4) They also told me of research by Tobias Pulls which eliminates the need for histograms in WTF-PAD and instead samples from the probability distribution directly. They think that this can simplify things somewhat. Any thoughts on this?
Let me know what you think. I still don't understand the entire space completely yet, so please be gentle. ;)
Cheers! :)
George Kadianakis:
Hello Mike,
I had a talk with Marc and Mohsen today about WTF-PAD. I now understand much more about WTF-PAD and how it works with regards to histograms. I think I might even understand enough to start some sort of conversation about it:
Here are some takeaways:
- Marc and Mohsen think that WTF-PAD might not be the way forward because of its various drawbacks and its complexity. Apparently there are various attacks on WTF-PAD that Roger has discovered (SENDME cells side-channels?) and also the deep learning crowd has done some pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They also told me that achieving needed precision on the timings might be a PITA.
Are there citations for any of this? Last I heard Matt Wright was working on a deep learning study but the results were mixed.
Furthermore, we need to do adversarial learning and other optimizations on these histograms to tune them. They are a generalized approach. Just like it is not a valid evaluation to train a classifier on a dataset and then add a new defense and show that it can't classify the defended traffic using the old model, it is similarly not accurate to develop an attack on WTF-PAD with a new classifier without also adversarially optimizing the WTF-PAD histograms under that classifier. When you do this, your results are not invalidating WTF-PAD, they are only invalidating the histograms that were tuned against the previous classifier/attack.
The same thing applies to the SENDME concern. The core piece of the SENDME issue is "Tor should never send more than 1000 cells without a SENDME. So *IF* I can tell which cells are SENDMEs, and *IF* I see more than 1000 cells between them, then AHA I know that some cells are actually padding and not real traffic".
Both of these are very big *IF*s, and even if they were shown to be valid assumptions (which AFAIK they have not been), that does not mean that it is actually useful for a classifier to know the percentage of padding after 1000 cells, and it also does not mean that there isn't a simple tweak to the histograms that encodes what looks like SENDME transmission to that classifier.
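To make the two *IF*s concrete, here is a minimal Python sketch of the check an adversary would have to run. It assumes the adversary can already label every observed cell as a SENDME or not, which is exactly the unproven first *IF*. The 1000-cell window is the figure quoted above, and the function name and labels are made up for illustration.

```python
from typing import Iterable, List

WINDOW = 1000  # cells allowed between SENDMEs, per the figure quoted above


def excess_cells_per_run(cells: Iterable[str], window: int = WINDOW) -> List[int]:
    """For each SENDME-delimited run, count the cells beyond the window.

    `cells` holds one label per observed cell: "sendme" or anything else.
    Producing these labels is an assumption, not something shown to be possible.
    """
    excess = []
    run = 0
    for label in cells:
        if label == "sendme":
            excess.append(max(0, run - window))
            run = 0
        else:
            run += 1
    if run:  # trailing run that was not closed by a SENDME
        excess.append(max(0, run - window))
    return excess


# Hypothetical trace: 1200 cells, a SENDME, 900 cells, a SENDME.
trace = ["data"] * 1200 + ["sendme"] + ["data"] * 900 + ["sendme"]
print(excess_cells_per_run(trace))  # [200, 0]
```

Even when both assumptions hold, the output is only a lower bound on the number of padding cells in a run. It does not say which cells were padding.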
- From what I understand you are also hoping to use WTF-PAD to protect against circuit fingerprinting and not just website fingerprinting. They told me that while this might be plausible, there is no current research on how well it can achieve that. Are we hoping to do that? And what research remains here? How can I help? Which parts of the Tor circuit protocol are we hoping to hide?
I am designing WTF-PAD to be a framework for deploying padding against arbitrary traffic analysis attacks. It is meant to allow us to define histograms on the fly (in the Tor consensus) as these are studied. The fact that they have not yet been studied is not super relevant to deploying the framework for it now.
Marc and Mohsen suggested using application-layer defences because the application-layer has much better view of the actual structures that are sent on the wire, instead of the black box view that the network layer has.
In particular they were mainly concerned about onion services fingerprinting because they are part of a restricted closed world, whereas they were less concerned about the entire internet because of its vast size.
They suggested that we could investigate using the service-side "alpaca" library for onion services (e.g. as part of securedrop?) which should resolve the most pressing concern of HS identification.
I mean yeah application-layer defenses are useful for website traffic fingerprinting, but that is a very narrow slice of the traffic analysis problems that I want this framework to solve.
WTF-PAD also doesn't rule out hidden service operators using alpaca, either.
- They also told me of research by Tobias Pulls which eliminates the need for histograms in WTF-PAD and instead samples from the probability distribution directly. They think that this can simplify things somewhat. Any thoughts on this?
Yes this is actually exactly what I want to do with the next iteration of WTF-PAD! The question is what form/model to use for these probability distributions. Right now we're encoding inter-burst and inter-packet timings with some weird geometric distribution determining how long these bursts should go on for, when it might be more natural to encode and sample from length-based distributions/histograms.
(Histograms vs distribution is not the problem -- it's what they encode and how they encode it that matters).
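As a rough illustration of why the choice between the two is mostly one of encoding, the following Python sketch draws an inter-packet delay both ways. The bin edges, token counts, and the log-normal parameters are invented for the example and are not taken from WTF-PAD or APE.

```python
import random
from typing import List, Tuple

# Hypothetical histogram: (low_us, high_us, tokens) for each bin.
Histogram = List[Tuple[float, float, int]]

EXAMPLE_HIST: Histogram = [
    (0.0, 100.0, 5),
    (100.0, 1000.0, 3),
    (1000.0, 10000.0, 1),
]


def sample_from_histogram(hist: Histogram, rng: random.Random) -> float:
    """Pick a bin with probability proportional to its tokens,
    then pick a delay uniformly inside that bin."""
    weights = [tokens for _, _, tokens in hist]
    low, high, _ = rng.choices(hist, weights=weights, k=1)[0]
    return rng.uniform(low, high)


def sample_from_distribution(mu: float, sigma: float, rng: random.Random) -> float:
    """Draw a delay from a parameterized distribution instead.
    Log-normal is an arbitrary stand-in, not a recommendation."""
    return rng.lognormvariate(mu, sigma)


rng = random.Random(2018)
print(sample_from_histogram(EXAMPLE_HIST, rng))
print(sample_from_distribution(5.0, 1.0, rng))
```

Both functions return a delay in microseconds. They differ in how the shape is specified: a list of bins versus two parameters.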
I don't see this paper on Tobias's website. Is it up anywhere yet?
Let me know what you think. I still don't understand the entire space completely yet, so please be gentle. ;)
I hope I was gentle enough. If there's anything that triggers rage mode in me more than someone being wrong on the internet, it's FUD and hand-wringing being spread on the internet. ;)
Mike Perry <mikeperry@torproject.org> writes:
George Kadianakis:
Hello Mike,
I had a talk with Marc and Mohsen today about WTF-PAD. I now understand much more about WTF-PAD and how it works with regards to histograms. I think I might even understand enough to start some sort of conversation about it:
Here are some takeaways:
- Marc and Mohsen think that WTF-PAD might not be the way forward because of its various drawbacks and its complexity. Apparently there are various attacks on WTF-PAD that Roger has discovered (SENDME cells side-channels?) and also the deep learning crowd has done some pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They also told me that achieving needed precision on the timings might be a PITA.
Are there citations for any of this? Last I heard Matt Wright was working on a deep learning study but the results were mixed.
I think this is the best we have in terms of public results: https://arxiv.org/abs/1801.02265
- From what I understand you are also hoping to use WTF-PAD to protect against circuit fingerprinting and not just website fingerprinting. They told me that while this might be plausible, there is no current research on how well it can achieve that. Are we hoping to do that? And what research remains here? How can I help? Which parts of the Tor circuit protocol are we hoping to hide?
I am designing WTF-PAD to be a framework for deploying padding against arbitrary traffic analysis attacks. It is meant to allow us to define histograms on the fly (in the Tor consensus) as these are studied. The fact that they have not yet been studied is not super relevant to deploying the framework for it now.
ACK.
What other traffic analysis attacks are we looking at addressing here?
I'm thinking of stuff like "circuit fingerprinting of onion services", but I wonder if histograms and random sampling are too crude to actually be able to help against sophisticated attacks. I don't have a suggestion for something better currently.
On that topic, is it decided whether the adaptive padding of WTF-PAD will also happen during circuit construction, or only after that?
Marc and Mohsen suggested using application-layer defences because the application-layer has much better view of the actual structures that are sent on the wire, instead of the black box view that the network layer has.
In particular they were mainly concerned about onion services fingerprinting because they are part of a restricted closed world, whereas they were less concerned about the entire internet because of its vast size.
They suggested that we could investigate using the service-side "alpaca" library for onion services (e.g. as part of securedrop?) which should resolve the most pressing concern of HS identification.
I mean yeah application-layer defenses are useful for website traffic fingerprinting, but that is a very narrow slice of the traffic analysis problems that I want this framework to solve.
WTF-PAD also doesn't rule out hidden service operators using alpaca, either.
Agreed.
- They also told me of research by Tobias Pulls which eliminates the need for histograms in WTF-PAD and instead samples from the probability distribution directly. They think that this can simplify things somewhat. Any thoughts on this?
Yes this is actually exactly what I want to do with the next iteration of WTF-PAD! The question is what form/model to use for these probability distributions. Right now we're encoding inter-burst and inter-packet timings with some weird geometric distribution determining how long these bursts should go on for, when it might be more natural to encode and sample from length-based distributions/histograms.
(Histograms vs distribution is not the problem -- it's what they encode and how they encode it that matters).
I don't see this paper on Tobias's website. Is it up anywhere yet?
Hmm. Looking at the README of wtfpad (see the APE section), I think this blog post is the best resource we have on this: https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
On 29/07/18 15:42, George Kadianakis wrote:
- They also told me of research by Tobias Pulls which eliminates the need for histograms in WTF-PAD and instead samples from the probability distribution directly. They think that this can simplify things somewhat. Any thoughts on this?
Yes this is actually exactly what I want to do with the next iteration of WTF-PAD! The question is what form/model to use for these probability distributions. Right now we're encoding inter-burst and inter-packet timings with some weird geometric distribution determining how long these bursts should go on for, when it might be more natural to encode and sample from length-based distributions/histograms.
(Histograms vs distribution is not the problem -- it's what they encode and how they encode it that matters).
I don't see this paper on Tobias's website. Is it up anywhere yet?
Hmm. Looking at the README of wtfpad (see the APE section), I think this blog post is the best resource we have on this: https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
Hi George and Mike,
You found the main writeup of the hasty work I did in this direction a while back, also some comments in the source [0]. Unfortunately my funding took me in other directions and I didn't want to publish any paper without spending more time on it. As written on the blog post it looks like a promising direction, but please also note that the attack implementation of Wa-kNN used has some rough edges, for example when it comes to time-based features (so robustness of the naive distributions when moving around the PT server is far from a given). If someone wants to collaborate on this I'd be more than happy to contribute, got funding to work on Tor-related things again starting August.
Best, Tobias
[0]: https://github.com/pylls/basket2/blob/master/padding_ape.go
Tobias Pulls:
Hi George and Mike,
You found the main writeup of the hasty work I did in this direction a while back, also some comments in the source [0]. Unfortunately my funding took me in other directions and I didn't want to publish any paper without spending more time on it. As written on the blog post it looks like a promising direction, but please also note that the attack implementation of Wa-kNN used has some rough edges, for example when it comes to time-based features (so robustness of the naive distributions when moving around the PT server is far from a given). If someone wants to collaborate on this I'd be more than happy to contribute, got funding to work on Tor-related things again starting August.
This is great! Sorry it took me so long to reply. I've been deep in it thinking about related traffic analysis issues with onion services.
I'm very much interested in this direction. This is the post, right: https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
Did you handle depleting the distributions when normal traffic is transmitted? Counting traffic that fits the target distribution as "already sent padding" (and thus sending less padding overall in that case) is a key piece of WTF-PAD that allows it to have better goodput. This is in fact why the original e2e defense was called "Adaptive Padding": its padding distributions adapt to observed traffic.
If we could alter the distribution in this same way, it may be a good way to go. However, histograms tend to be easier to do this with, and they also encode distributions (just perhaps more tediously and verbosely).
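A minimal Python sketch of that "already sent padding" accounting with a histogram might look like the following. It is a simplification for illustration only: the class name, bin edges, and refill rule are assumptions, and the real Adaptive Padding / WTF-PAD token-removal rules have more cases than shown here.

```python
import bisect
from typing import List


class TokenHistogram:
    """Histogram of inter-packet gaps where each token is one unit of padding budget."""

    def __init__(self, upper_edges: List[float], tokens: List[int]) -> None:
        assert len(upper_edges) == len(tokens)
        self.upper_edges = upper_edges  # sorted upper bound of each bin
        self.initial = list(tokens)     # kept so the histogram can be refilled
        self.tokens = list(tokens)

    def _bin_for(self, gap: float) -> int:
        # First bin whose upper edge is >= gap, clamped to the last bin.
        i = bisect.bisect_left(self.upper_edges, gap)
        return min(i, len(self.tokens) - 1)

    def observe_real_gap(self, gap: float) -> None:
        """Real traffic with this gap counts as padding that was already sent."""
        i = self._bin_for(gap)
        if self.tokens[i] > 0:
            self.tokens[i] -= 1
        if sum(self.tokens) == 0:
            self.tokens = list(self.initial)  # assumed refill rule

    def remaining(self) -> int:
        """Padding budget that is still left to send."""
        return sum(self.tokens)


hist = TokenHistogram([10.0, 100.0, 1000.0], [2, 3, 1])
hist.observe_real_gap(50.0)  # falls in the (10, 100] bin
print(hist.tokens, hist.remaining())  # [2, 2, 1] 5
```

Doing the same with a parameterized distribution would mean adjusting its parameters after each observation, which is less direct than decrementing a bin.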
One of the other things I want to try, that may overlap, is changing the type of information the distribution/histogram encodes. Inter-packet and inter-burst delay (encoded as two separate states in the state machines) is perhaps not as optimal or useful or easy to specify/optimize as something more naturally resembling web traffic, such as a distribution of request sizes and object sizes, and some way to simulate concurrent fetch (selection of overlap) of these object sizes, and subtract these object-size instances from the distribution when we see them.
What do you think about that? Does that make sense?
Do you think we should try to do this as a parameterized distribution, or as a histogram?
Are you interested in attempting to implement both/either?
Ooh nice! This is done as a PT implementation.
You might like: https://github.com/mikeperry-tor/vanguards/blob/master/README_SECURITY.md
In it, I recommend obfs4 with iat-mode=2 because it does some limited traffic packet size and timing obfuscation. Should we consider recommending basket2 also? Is anyone running bridges with it? Probably not, I guess :/.
On 08/02/2018 08:26 PM, Mike Perry wrote:
Should we consider recommending basket2 also?
No.
Is anyone running bridges with it? Probably not, I guess :/.
No one should be, it is incomplete, buggy, and needs a re-design.
As a side note, I question the utility of a PT that has the AGPL3 network interaction requirement, though there is an exception for bridges distributed via BridgeDB and those shipped with Tor Browser.
Regards,
Hi Yawning!
Yawning Angel:
On 08/02/2018 08:26 PM, Mike Perry wrote:
Should we consider recommending basket2 also?
No.
Is anyone running bridges with it? Probably not, I guess :/.
No one should be, it is incomplete, buggy, and needs a re-design.
Thanks for the heads up.
As a side note, I question the utility of a PT that has the AGPL3 network interaction requirement, though there is an exception for bridges distributed via BridgeDB and those shipped with Tor Browser.
Would you recommend anything else other than obfs4 at this time, as per that README_SECURITY doc? (https://github.com/mikeperry-tor/vanguards/blob/master/README_SECURITY.md)
On 29 Jul 2018, at 23:42, George Kadianakis <desnacked@riseup.net> wrote:
- From what I understand you are also hoping to use WTF-PAD to protect against circuit fingerprinting and not just website fingerprinting. They told me that while this might be plausible, there is no current research on how well it can achieve that. Are we hoping to do that? And what research remains here? How can I help? Which parts of the Tor circuit protocol are we hoping to hide?
I am designing WTF-PAD to be a framework for deploying padding against arbitrary traffic analysis attacks. It is meant to allow us to define histograms on the fly (in the Tor consensus) as these are studied. The fact that they have not yet been studied is not super relevant to deploying the framework for it now.
ACK.
What other traffic analysis attacks are we looking at addressing here?
I'm thinking of stuff like "circuit fingerprinting of onion services", but I wonder if histograms and random sampling are too crude to actually be able to help against sophisticated attacks. I don't have a suggestion for something better currently.
On that topic, is it decided whether the adaptive padding of WTF-PAD will also happen during circuit construction, or only after that?
Padding during circuit construction should work with VPADDING cells: https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n508
At least it did last time I checked: https://github.com/teor2345/endosome/blob/master/client-or-22929.py https://trac.torproject.org/projects/tor/ticket/22929
We should avoid using PADDING cells during the handshake, because Tor sometimes closes the connection: https://github.com/teor2345/endosome/blob/master/client-or-22934.py
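For reference, a VPADDING cell is a variable-length cell, so on the wire it is a circuit ID, a one-byte command (128), a two-byte length, and then the payload. Below is a minimal Python sketch of packing one. The function name is hypothetical, the zero-filled payload is an arbitrary choice, and the 2-byte default circuit ID length is an assumption about the state of the link before a newer link protocol has been negotiated. It is not taken from the endosome scripts linked above.

```python
import struct

VPADDING = 128  # command number for variable-length padding cells in tor-spec.txt


def pack_vpadding(payload_len: int, circ_id_len: int = 2) -> bytes:
    """Build a VPADDING cell: CircID | Command (1 byte) | Length (2 bytes) | Payload."""
    if circ_id_len not in (2, 4):
        raise ValueError("circuit ID length must be 2 or 4 bytes")
    if not 0 <= payload_len <= 0xFFFF:
        raise ValueError("payload length must fit in 16 bits")
    circ_id = (0).to_bytes(circ_id_len, "big")  # padding is not tied to a circuit
    header = struct.pack("!BH", VPADDING, payload_len)
    return circ_id + header + bytes(payload_len)  # zero-filled payload


cell = pack_vpadding(3)
print(cell.hex())  # 0000800003000000
```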
T
-- teor