Web page with graphics: https://people.torproject.org/~dcf/obfs4-timing/
I was experimenting with ways to better obfuscate the timing signature of Tor-in-obfs4, while staying compatible with the obfs4 specification. I tried to make a constant bitrate mode that sends 500 bytes every 100 ms, in which the size and timing of obfs4 packets are independent of the TLS packets underneath. It turns out that obfs4 mostly makes this possible, with one exception: a gap in the client traffic while the client waits for the server's handshake response, during which time the client cannot send anything because it doesn't yet know the shared key.
Currently, the implementation of obfs4 sends data only when the underlying process (i.e., tor) has something to send. When tor is quiet, obfs4 is quiet; and when tor wants to send a packet, obfs4 sends a packet without delay. obfs4 does add a random amount of padding to its packets, which slightly alters the packet size signature but not the timing signature. Even the modes that add short interpacket delays (iat-mode=1 and iat-mode=2) only really have an effect on bulk upload/download—they don't have much of an effect on the initial handshake. See the first three rows of the attached graphic—the timing of the first dozen or so packets hardly varies across the three modes.
This design, where obfs4 only sends packets when driven by tor, is an implementation choice and isn't inherent in the protocol. obfs4's framing structure [spec §5] allows for frames that contain only padding:

    +------------+----------+--------+--------------+------------+------------+
    | 2 bytes    | 16 bytes | 1 byte | 2 bytes      | (optional) | (optional) |
    | Frame len. | Tag      | Type   | Payload len. | Payload    | Padding    |
    +------------+----------+--------+--------------+------------+------------+
     \_ Obfs. _/  \___________ NaCl secretbox (Poly1305/XSalsa20) ___________/

obfs4 could send padding frames (at whatever size and rate) during the times when tor is quiet. The current implementation, in pseudocode, works like this (transports/obfs4/obfs4.go obfs4Conn.Write):

    on recv(data) from tor:
        send(frame(data))

If it instead worked like this, then obfs4 could choose its own packet scheduling, independent of tor's:

    on recv(data) from tor:
        enqueue data on send_buffer
    func give_me_a_frame():  # never blocks
        if send_buffer is not empty:
            dequeue data from send_buffer
            return frame(data)
        else:
            return frame(padding)
    in a separate thread:
        buf = []
        while true:
            while length(buf) < 500:
                buf = buf + give_me_a_frame()
            chunk = buf[:500]
            buf = buf[500:]
            send(chunk)
            sleep(100 ms)

The key idea is that give_me_a_frame never blocks: if it doesn't have any application data immediately available, it returns a padding frame instead. The independent sending thread calls give_me_a_frame as often as necessary and obeys its own schedule. Note also that the boundaries of chunks sent by the sending thread are independent of frame boundaries.

Yawning points me to this code in basket2 that uses the same idea of independently sending padding according to a schedule:
https://git.schwanenlied.me/yawning/basket2/src/72f203e133c90a26f68f0cd33b0c...
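For concreteness, here is a rough Go sketch of the same idea (separate from the attached proof-of-concept patch, and not taken from obfs4proxy). The names shaper, giveMeAFrame, frame, framePadding, and maxFramePayload are hypothetical, and the real frame encoding (secretbox sealing, length obfuscation) is stubbed out; the point is only a non-blocking frame source feeding a fixed 500-byte / 100 ms send loop:

    package shaping

    import (
        "net"
        "sync"
        "time"
    )

    const (
        chunkSize       = 500                    // bytes sent per interval
        interval        = 100 * time.Millisecond // time between sends
        maxFramePayload = 1024                   // placeholder per-frame payload limit
    )

    // frame and framePadding stand in for obfs4's real frame encoding
    // (NaCl secretbox plus length obfuscation), which is omitted here.
    func frame(data []byte) []byte  { return data }
    func framePadding(n int) []byte { return make([]byte, n) }

    type shaper struct {
        mu      sync.Mutex
        pending []byte // application data queued by Write
        conn    net.Conn
    }

    // Write is called when tor has data to send; it only enqueues.
    func (s *shaper) Write(p []byte) (int, error) {
        s.mu.Lock()
        s.pending = append(s.pending, p...)
        s.mu.Unlock()
        return len(p), nil
    }

    // giveMeAFrame never blocks: it returns a data frame if application
    // data is queued, and a padding-only frame otherwise.
    func (s *shaper) giveMeAFrame() []byte {
        s.mu.Lock()
        defer s.mu.Unlock()
        if len(s.pending) > 0 {
            n := len(s.pending)
            if n > maxFramePayload {
                n = maxFramePayload
            }
            data := s.pending[:n]
            s.pending = s.pending[n:]
            return frame(data)
        }
        return framePadding(maxFramePayload) // padding size is arbitrary here
    }

    // sendLoop runs in its own goroutine and follows its own schedule,
    // independent of tor's writes. Chunk boundaries are independent of
    // frame boundaries: a chunk may end in the middle of a frame.
    func (s *shaper) sendLoop() {
        var buf []byte
        for {
            for len(buf) < chunkSize {
                buf = append(buf, s.giveMeAFrame()...)
            }
            if _, err := s.conn.Write(buf[:chunkSize]); err != nil {
                return
            }
            buf = buf[chunkSize:]
            time.Sleep(interval)
        }
    }

A real implementation would also need shutdown handling and configurable rates; the basket2 code linked above is a fuller treatment of the same scheduling idea.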
I attach a proof-of-concept patch for obfs4proxy that makes it operate in a constant bitrate mode. You can see its timing signature in the fourth row of the attached graphic. With this proof of concept, I'm not trying to claim that a constant bitrate is good for performance or for censorship resistance. It's just an example of shaping obfs4's traffic pattern in a way that is independent of the underlying stream. The fifth row of the graphic shows a more complicated sine wave pattern—it could be anything.
You will, however, notice an oddity in the fourth and fifth rows of the graphic: a gap in the stream of client packets. This is the gap I alluded to in the first paragraph: after the client has sent its handshake but before it has received the server's handshake, the client doesn't yet know the shared key. Because every frame sent after the handshake needs to begin with a tag that depends on the shared key, the client cannot send anything, not even padding, until it receives the server's response. During this time, the give_me_a_frame function has no choice but to block.
    k = ephemeral key
    p = random amount of padding
    m = MAC signifying end of padding
    a = authentication tag
    d = data frames

The obfs4 client handshake looks like this [spec §4]:

    k | p | m

The server doesn't reply until it has verified the client's MAC, which proves that the client knows the bridge's out-of-band secret (this is how obfs4 resists active probing). The server handshake reply is similar, with the addition of an authentication tag:

    k | a | p | m    (different values than in the client handshake)

This means that no matter how you schedule packet sending, the traffic will always have this form, with a ...... gap where the client is waiting for the server's handshake to come back:

    client kpppppm......ddddddddddddddddddddddddddd
    server .......kapppmddddddddddddddddddddddddddd
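In terms of the earlier Go sketch, this means the frame source cannot be made fully non-blocking: before the shared key is established it has nothing it can legitimately emit. A hypothetical gate (the keyReady channel here is not part of obfs4proxy) might look like:

    // Until keyReady is closed (after the server's handshake response has
    // been received and verified), no frame (data or padding) can be
    // sealed, so the send loop stalls; this is the client-side gap above.
    func (s *shaper) giveMeAFrameAfterHandshake(keyReady <-chan struct{}) []byte {
        <-keyReady
        return s.giveMeAFrame()
    }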
There isn't a way to completely remove the client gap in obfs4 and still follow the protocol. A future protocol could perhaps remove it (I say perhaps because I haven't thought about the crypto implications) by changing the client's handshake to have a second round of padding, which it could send while waiting for the server to reply:

    k | p | m1 | p | m2
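To illustrate, in the same notation as before (reading the first m as m1 and the second as m2), such a handshake would let the client fill its side of the gap with padding while the server's reply is in flight; this is only a sketch of the hypothetical protocol, not anything obfs4 supports today:

    client kpppppmppppppmdddddddddddddddddddddddddd
    server .......kapppmddddddddddddddddddddddddddd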
[spec] https://gitweb.torproject.org/pluggable-transports/obfs4.git/tree/doc/obfs4-...