Web page with graphics: https://people.torproject.org/~dcf/obfs4-timing/
I was experimenting with ways to better obfuscate the timing signature of Tor-in-obfs4, while staying compatible with the obfs4 specification. I tried to make a constant bitrate mode that sends 500 bytes every 100 ms, in which the size and timing of obfs4 packets are independent of the TLS packets underneath. It turns out that obfs4 mostly makes this possible, with one exception: a gap in the client traffic while the client waits for the server's handshake response, during which time the client cannot send anything because it doesn't yet know the shared key.
Currently, the implementation of obfs4 sends data only when the underlying process (i.e., tor) has something to send. When tor is quiet, obfs4 is quiet; and when tor wants to send a packet, obfs4 sends a packet without delay. obfs4 does add a random amount of padding to its packets, which slightly alters the packet size signature but not the timing signature. Even the modes that add short interpacket delays (iat-mode=1 and iat-mode=2) only really have an effect on bulk upload/download—they don't have much of an effect on the initial handshake. See the first three rows of the attached graphic—the timing of the first dozen or so packets hardly varies across the three modes.
This design, where obfs4 only sends packets when driven by tor, is an implementation choice and isn't inherent in the protocol. obfs4's framing structure [spec §5] allows for frames that contain only padding:

    +------------+----------+--------+--------------+------------+------------+
    | 2 bytes    | 16 bytes | 1 byte | 2 bytes      | (optional) | (optional) |
    | Frame len. | Tag      | Type   | Payload len. | Payload    | Padding    |
    +------------+----------+--------+--------------+------------+------------+
     \_ Obfs. _/  \___________ NaCl secretbox (Poly1305/XSalsa20) ___________/

obfs4 could send padding frames (at whatever size and rate) during the times when tor is quiet. The current implementation, in pseudocode, works like this (transports/obfs4/obfs4.go obfs4Conn.Write):

    on recv(data) from tor:
        send(frame(data))

If it instead worked like this, then obfs4 could choose its own packet scheduling, independent of tor's:

    on recv(data) from tor:
        enqueue data on send_buffer
    func give_me_a_frame():  # never blocks
        if send_buffer is not empty:
            dequeue data from send_buffer
            return frame(data)
        else:
            return frame(padding)
    in a separate thread:
        buf = []
        while true:
            while length(buf) < 500:
                buf = buf + give_me_a_frame()
            chunk = buf[:500]
            buf = buf[500:]
            send(chunk)
            sleep(100 ms)

The key idea is that give_me_a_frame never blocks: if it doesn't have any application data immediately available, it returns a padding frame instead. The independent sending thread calls give_me_a_frame as often as necessary and obeys its own schedule. Note also that the boundaries of chunks sent by the sending thread are independent of frame boundaries.

Yawning points me to this code in basket2 that uses the same idea of independently sending padding according to a schedule:
https://git.schwanenlied.me/yawning/basket2/src/72f203e133c90a26f68f0cd33b0c...
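For concreteness, here is a rough Go sketch of the same idea (separate from the attached proof-of-concept patch, and not taken from obfs4proxy). The names shaper, giveMeAFrame, frame, framePadding, and maxFramePayload are hypothetical, and the real frame encoding (secretbox sealing, length obfuscation) is stubbed out; the point is only a non-blocking frame source feeding a fixed 500-byte / 100 ms send loop:

    package shaping

    import (
        "net"
        "sync"
        "time"
    )

    const (
        chunkSize       = 500                    // bytes sent per interval
        interval        = 100 * time.Millisecond // time between sends
        maxFramePayload = 1024                   // placeholder per-frame payload limit
    )

    // frame and framePadding stand in for obfs4's real frame encoding
    // (NaCl secretbox plus length obfuscation), which is omitted here.
    func frame(data []byte) []byte  { return data }
    func framePadding(n int) []byte { return make([]byte, n) }

    type shaper struct {
        mu      sync.Mutex
        pending []byte // application data queued by Write
        conn    net.Conn
    }

    // Write is called when tor has data to send; it only enqueues.
    func (s *shaper) Write(p []byte) (int, error) {
        s.mu.Lock()
        s.pending = append(s.pending, p...)
        s.mu.Unlock()
        return len(p), nil
    }

    // giveMeAFrame never blocks: it returns a data frame if application
    // data is queued, and a padding-only frame otherwise.
    func (s *shaper) giveMeAFrame() []byte {
        s.mu.Lock()
        defer s.mu.Unlock()
        if len(s.pending) > 0 {
            n := len(s.pending)
            if n > maxFramePayload {
                n = maxFramePayload
            }
            data := s.pending[:n]
            s.pending = s.pending[n:]
            return frame(data)
        }
        return framePadding(maxFramePayload) // padding size is arbitrary here
    }

    // sendLoop runs in its own goroutine and follows its own schedule,
    // independent of tor's writes. Chunk boundaries are independent of
    // frame boundaries: a chunk may end in the middle of a frame.
    func (s *shaper) sendLoop() {
        var buf []byte
        for {
            for len(buf) < chunkSize {
                buf = append(buf, s.giveMeAFrame()...)
            }
            if _, err := s.conn.Write(buf[:chunkSize]); err != nil {
                return
            }
            buf = buf[chunkSize:]
            time.Sleep(interval)
        }
    }

A real implementation would also need shutdown handling and configurable rates; the basket2 code linked above is a fuller treatment of the same scheduling idea.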
I attach a proof-of-concept patch for obfs4proxy that makes it operate in a constant bitrate mode. You can see its timing signature in the fourth row of the attached graphic. With this proof of concept, I'm not trying to claim that a constant bitrate is good for performance or for censorship resistance. It's just an example of shaping obfs4's traffic pattern in a way that is independent of the underlying stream. The fifth row of the graphic shows a more complicated sine wave pattern—it could be anything.
You will, however, notice an oddity in the fourth and fifth rows of the graphic: a gap in the stream of client packets. This is the gap I alluded to in the first paragraph: after the client has sent its handshake but before it has received the server's handshake, the client doesn't yet know the shared key. Because every frame sent after the handshake needs to begin with a tag that depends on the shared key, the client cannot send anything, not even padding, until it receives the server's response. During this time, the give_me_a_frame function has no choice but to block.
    k = ephemeral key
    p = random amount of padding
    m = MAC signifying end of padding
    a = authentication tag
    d = data frames

The obfs4 client handshake looks like this [spec §4]:

    k | p | m

The server doesn't reply until it has verified the client's MAC, which proves that the client knows the bridge's out-of-band secret (this is how obfs4 resists active probing). The server handshake reply is similar, with the addition of an authentication tag:

    k | a | p | m    (different values than in the client handshake)

This means that no matter how you schedule packet sending, the traffic will always have this form, with a ...... gap where the client is waiting for the server's handshake to come back:

    client kpppppm......ddddddddddddddddddddddddddd
    server .......kapppmddddddddddddddddddddddddddd
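In terms of the earlier Go sketch, this means the frame source cannot be made fully non-blocking: before the shared key is established it has nothing it can legitimately emit. A hypothetical gate (the keyReady channel here is not part of obfs4proxy) might look like:

    // Until keyReady is closed (after the server's handshake response has
    // been received and verified), no frame (data or padding) can be
    // sealed, so the send loop stalls; this is the client-side gap above.
    func (s *shaper) giveMeAFrameAfterHandshake(keyReady <-chan struct{}) []byte {
        <-keyReady
        return s.giveMeAFrame()
    }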
There isn't a way to completely remove the client gap in obfs4 and still follow the protocol. A future protocol could perhaps remove it (I say perhaps because I haven't thought about the crypto implications) by changing the client's handshake to have a second round of padding, which it could send while waiting for the server to reply:

    k | p | m1 | p | m2
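To illustrate, in the same notation as before (reading the first m as m1 and the second as m2), such a handshake would let the client fill its side of the gap with padding while the server's reply is in flight; this is only a sketch of the hypothetical protocol, not anything obfs4 supports today:

    client kpppppmppppppmdddddddddddddddddddddddddd
    server .......kapppmddddddddddddddddddddddddddd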
[spec] https://gitweb.torproject.org/pluggable-transports/obfs4.git/tree/doc/obfs4-...