On Mon, Feb 12, 2024 at 10:34:21AM -0800, Micah Elizabeth Scott wrote:
> The "normal process" of sending traffic through tor does not directly involve TCP or TCP headers, nor are there boundaries preserved which would correspond to TCP segments. Individual streams are encapsulated within multiple other layers (tor streams and circuits, then TLS) before we encounter any real TCP segments.
Right -- and this is a feature in the sense that it removes a bunch of end-to-end information that would otherwise leak.
For example, if the exit relay or the destination server could look at the TCP headers that the client generates, they could examine how those headers are constructed and how the client's TCP stack responds to errors, and make a good guess about which operating system, and even which kernel version, the client is running.
And there are more esoteric attacks. For instance, the rate of clock skew change is a physical characteristic of the hardware clock, so an observer can use "change in skew" (read from the timestamp in each TCP header) as a cookie-like feature to distinguish users:
https://2019.www.torproject.org/docs/faq#RemotePhysicalDeviceFingerprinting
https://www.caida.org/catalog/papers/2005_fingerprinting/KohnoBroidoClaffy05...
(I bet there are many more papers published after that one, but our design means we have mercifully not needed to keep up with the remote device fingerprint literature.)
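
To make the "change in skew" idea concrete, here is a rough sketch of how an observer could turn a series of TCP timestamp values into a skew estimate. It is only an illustration: the function names, the 1000 Hz tick rate, and the synthetic data are all assumptions, and it uses an ordinary least-squares fit as a simple stand-in rather than the fitting method from the paper linked above. It also ignores timestamp wrap-around and network jitter.

```python
from typing import List, Sequence, Tuple


def estimate_skew_ppm(recv_times: Sequence[float],
                      ts_vals: Sequence[int],
                      hz: float) -> float:
    """Estimate remote clock skew in parts per million (ppm).

    recv_times: observer's local receive time of each packet, in seconds.
    ts_vals:    TCP timestamp value carried by each packet (remote ticks).
    hz:         assumed tick rate of the remote timestamp clock.

    Fits a least-squares line to (remote elapsed - local elapsed) against
    local elapsed time; the slope of that line is the skew.
    """
    if len(recv_times) != len(ts_vals) or len(recv_times) < 2:
        raise ValueError("need at least two (time, timestamp) samples")

    t0, v0 = recv_times[0], ts_vals[0]
    xs = [t - t0 for t in recv_times]                      # local elapsed seconds
    ys = [(v - v0) / hz - x for v, x in zip(ts_vals, xs)]  # observed offset

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all samples have the same receive time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx * 1e6


def synthetic_samples(skew_ppm: float, hz: float, duration_s: int,
                      step_s: int) -> Tuple[List[float], List[int]]:
    """Hypothetical helper: fake a remote clock running skew_ppm fast."""
    times = [float(t) for t in range(0, duration_s + 1, step_s)]
    ticks = [int(t * hz * (1 + skew_ppm * 1e-6)) for t in times]
    return times, ticks


if __name__ == "__main__":
    times, ticks = synthetic_samples(skew_ppm=50.0, hz=1000.0,
                                     duration_s=3600, step_s=10)
    print(f"estimated skew: {estimate_skew_ppm(times, ticks, 1000.0):.2f} ppm")
```

The point is that nothing here needs cooperation from the client: one sample per packet is enough, and the estimate gets steadier the longer the observer watches. Since tor never carries the client's TCP headers end to end, the exit relay and the destination never get these samples in the first place.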
--Roger