On Thu, 20 Apr 2023 at 17:16, Jansen, Robert G CIV USN NRL (5543) Washington DC (USA) via tor-project <tor-project@lists.torproject.org> wrote:
The primary
information being measured is the directionality of the first 5k cells sent on a
measurement circuit, and a keyed-HMAC of the first domain name requested on the
circuit.


I suppose this is kind of a non-question, since you wouldn't be doing it otherwise, but I am surprised that associating the traffic patterns with a single key, that of the first domain name, is sufficient. Every page or query made to that domain (e.g. duckduckgo) will have the same key, with potentially a lot of entirely disparate traffic patterns.
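
To make the concern concrete, here is a minimal sketch of how I understand the labeling. It assumes HMAC-SHA256 and a made-up key, neither of which is stated in the announcement, and `domain_label` is a hypothetical helper of mine, not anything from the measurement code. It only shows that any two requests to the same domain collapse to the same label, whatever page is actually fetched.

```python
import hashlib
import hmac
from urllib.parse import urlsplit

# Hypothetical key: the real measurement key is not given in the thread.
KEY = b"example-measurement-key"


def domain_label(url: str, key: bytes = KEY) -> str:
    """Return a keyed HMAC of the URL's host name only.

    SHA-256 is an assumption; the path and query are deliberately ignored,
    mirroring a label derived from the first domain name on a circuit.
    """
    host = urlsplit(url).hostname or ""
    return hmac.new(key, host.encode("utf-8"), hashlib.sha256).hexdigest()


search = domain_label("https://duckduckgo.com/?q=tor")
settings = domain_label("https://duckduckgo.com/settings")
other = domain_label("https://example.org/")

# Two very different pages on one domain share a label; another domain does not.
print(search == settings)
print(search == other)
```

So a search results page and a settings page would both be filed under one label, even though their cell sequences could look nothing alike.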

Obviously this is limited by what you can technically achieve in this scenario: you have the plaintext DNS requests, and everything else is going to be TLS-encrypted. The alternative would be to instrument a Tor client/browser and find volunteers to opt in to having their data collected.

-tom