On Thu, 20 Apr 2023 at 17:16, Jansen, Robert G CIV USN NRL (5543) Washington DC (USA) via tor-project <tor-project@lists.torproject.org> wrote:
> The primary information being measured is the directionality of the first 5k cells sent on a measurement circuit, and a keyed-HMAC of the first domain name requested on the circuit.
I suppose this is kind of a non-question, since you wouldn't be doing it otherwise, but I am surprised that associating the traffic patterns with a single key, that of the first domain name, is sufficient. Every page or query made to that domain (e.g. duckduckgo) will have the same key, with potentially a lot of entirely disparate traffic patterns.
Obviously this is limited by what you can technically achieve in this scenario: you have the plaintext DNS requests, and everything else is going to be TLS-encrypted. The alternative would be to instrument a tor client/browser and find volunteers to opt in to data collection.
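
As a rough illustration of the single-key concern, here is a minimal Python sketch. The key, the choice of SHA-256, the normalisation step, the function name domain_label, and the direction traces are all assumptions made up for the example, not details of the actual measurement.

```python
import hashlib
import hmac

# Hypothetical secret; the real measurement key is not described in this thread.
KEY = b"example-measurement-key"


def domain_label(domain: str, key: bytes = KEY) -> str:
    """Return a keyed HMAC (SHA-256 assumed) of a normalised domain name."""
    normalised = domain.strip().lower().rstrip(".")
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up circuits: (first domain requested, cell directions),
# where +1 and -1 stand for the two directions a cell can travel.
circuits = [
    ("duckduckgo.com", [+1, -1, -1, -1, +1]),  # e.g. a search query
    ("duckduckgo.com", [+1, +1, -1, +1, -1]),  # e.g. a settings page
    ("example.org", [+1, -1, +1, -1, -1]),
]

# Group the direction traces by their HMAC label.
by_label = {}
for domain, trace in circuits:
    by_label.setdefault(domain_label(domain), []).append(trace)

for label, traces in by_label.items():
    print(label[:16], len(traces), "trace(s)")
```

Both duckduckgo.com circuits end up under one label even though their direction traces differ, which is the situation I'm asking about.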
-tom