Can a developer please explain to me why something like the following obfuscation of 'torified traffic' is exploitable?
Suppose a scenario where a collective of authorities is able to observe large parts of the web. Observing traffic correlation can then reveal a connection through the network.
But why can't we just alter the pattern inside the network, such that there is no correlation between 'incoming' and 'outgoing' data anymore?
Suppose I'm connected to a server and there is a lot of traffic from the server to me, through the Tor network of course. The data is encrypted, and the exploit measures the raw data stream's pattern.
But why not change the data stream inside the network? Suppose the server A is outside of the Tor network, i.e. it is not a hidden service. The data stream into our network is then out of our control; encrypted or not, we can't change it. Now it flows to node B (the exit node) and into Tor. Node B streams the data to node C, then node C streams it to node D, and node D exits the stream to me. (A simplification.)
OK, now node B got the data from the 'outside world'. B and C first make a handshake to establish a shared key for a private encryption protocol valid only for some time.
Now node B does not stream the data to node C directly, but obfuscates it. That means if there are n packets, it transforms them into m packets in some unpredictable way, and each new packet gets a small amount of additional random data. (The point is that the new stream will not look at all like the old one.)
Only node B knows the way to de-obfuscate this. But B and C did a handshake, and using this encryption B shares with C how to de-obfuscate the data.
Now C recovers the real data and then does another secret handshake with D (separate from the shared secret with B, of course). Then C obfuscates the data again, and only D will know how to recover the original data.
This repeats until I receive an obfuscated stream and my client can recover the original data.
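The n-to-m repacketization step described above could be sketched like this (all names, sizes and the 2-byte length header are hypothetical choices for illustration; the B-to-C key handshake that would protect the headers is not shown):

```python
import os
import random

def obfuscate(packets):
    """Re-split n packets into m packets of unpredictable sizes,
    appending a random amount of padding to each new packet."""
    data = b"".join(packets)  # original packet boundaries are discarded
    out = []
    i = 0
    while i < len(data):
        size = random.randint(1, 512)            # new, unrelated packet size
        chunk = data[i:i + size]
        i += len(chunk)
        pad = os.urandom(random.randint(0, 64))  # random filler bytes
        # 2-byte length header tells the peer where the padding starts
        out.append(len(chunk).to_bytes(2, "big") + chunk + pad)
    return out

def deobfuscate(packets):
    """Strip headers and padding, recovering the original byte stream."""
    data = b""
    for p in packets:
        n = int.from_bytes(p[:2], "big")
        data += p[2:2 + n]
    return data

original = [os.urandom(300) for _ in range(5)]
wire = obfuscate(original)
assert deobfuscate(wire) == b"".join(original)
```

Note that the total number of bytes still grows roughly in proportion to the input, which matters for the objection below.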
======
The point is that the patterns of the in-stream from server A aren't correlated to what streams from D to me anymore. Hence an observer isn't able to see correlations anymore: the number, size and pattern of the stream's packets are all different.
On top of this, one could add random zero-information streams between the network and its clients. That way an observer can't even be sure whether a client receives information at all...
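The zero-information cover-traffic idea can be sketched with fixed-size cells, where dummy and real cells are the same length on the wire (names and the 512-byte cell size are illustrative assumptions; a real design would also encrypt the cells and send them at a constant rate, which is the hard part):

```python
import os

CELL = 512  # fixed cell size, so every cell looks identical on the wire

def make_cell(payload=b""):
    """Pad every cell to CELL bytes; an empty payload is a dummy cell.
    A 2-byte header carries the real payload length (0 for dummies)."""
    assert len(payload) <= CELL - 2
    body = len(payload).to_bytes(2, "big") + payload
    return body + os.urandom(CELL - len(body))

def read_cell(cell):
    """Recover the payload; b"" means the cell carried no information."""
    n = int.from_bytes(cell[:2], "big")
    return cell[2:2 + n]

real = make_cell(b"hello")
dummy = make_cell()
assert len(real) == len(dummy) == CELL   # indistinguishable by size
assert read_cell(real) == b"hello" and read_cell(dummy) == b""
```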
======
Now can someone please talk me out of this approach?
best /jo
On 4 September 2013 20:09, josef.winger@email.de wrote:
Now node B does not stream the data to node C directly, but obfuscates it. That means if there are n packets, it transforms them into m packets in some unpredictable way, and each new packet gets a small amount of additional random data. (The point is that the new stream will not look at all like the old one.)
Only node B knows the way to de-obfuscate this. But B and C did a handshake, and using this encryption B shares with C how to de-obfuscate the data.
Node A sends 40 KB of data to Node B, in some particular distribution. Node B sends 60 KB of data (a 50% increase!) in a new distribution to Node C. Node C sends 40 KB of traffic to wherever.
An adversary watching Node B knows that it is passing the data from A to C. It's obvious. Now, it's _less_ obvious when Node B is receiving two streams of data, 40 KB from Node A and 50 KB from Node X, and sending two streams of 60 KB to Nodes Y and Z (which stream went where?) - but that only holds up for really small streams. For longer-lived streams in a low-latency network, where the packet sizes and frequency of the A->B and X->B streams diverge, the B->Y and B->Z streams will likewise diverge, and it's then easy to correlate them again.
-tom
On 2013-09-04, at 8:09 PM, josef.winger@email.de wrote:
Can a developer please explain to me why something like the following obfuscation of 'torified traffic' is exploitable?
Suppose a scenario where a collective of authorities is able to observe large parts of the web. Observing traffic correlation can then reveal a connection through the network.
But why can't we just alter the pattern inside the network, such that there is no correlation between 'incoming' and 'outgoing' data anymore?
Regardless of what goes on inside the network, the traffic must be in-order at the points of entrance and exit to the network (a property of TCP). Those are the points of interest to an observer doing traffic correlation.
Compounding that problem is the low latency of the network: the relative timing within any given stream is preserved.
The first problem might be mitigated with packet padding; the second problem might be mitigated with random packet delays. My understanding is that these two approaches are being studied at the moment.
Modifying the behaviour of traffic within the network does not help.
It has also been suggested that cover traffic is a solution, based on a Bayesian argument with (IMHO) incorrect assumptions. I think it will be proven wrong as attacks get better.