Re: [tor-relays] clarification on what Utah State University exit relays store ("360 gigs of log files")

12 Aug 2015


      Sharif Olorin:
...
...
I would expect most US universities to be logging netflow at the very
least. Even if the Tor operator isn't keeping logs, it seems safe to assume
the network operator is.

I'd be surprised if it was different for non-US universities - I'd
expect this to be the case for every university with its own AS, and
probably most without. It's not specific to universities either; it
would be a rare ISP that doesn't retain netflow for traffic accounting
purposes. It's often somewhat aggregated, but to varying degrees - the
last such system I worked on was designed to retain indefinitely at
sub-minute granularity for training/crossvalidation of network anomaly
detection.

Green & Sharif (& any others with direct netflow experience) -
At what resolution is this type of netflow data typically captured?
Are we talking about all connection 5-tuples, bidirectional/total
transfer byte totals, and open and close timestamps, or more (or less)
detail than this?
Are timestamps always included? Are bidirectional transfer bytecounts
always included? Are subsampled packet headers (or contents)
sometimes/often included?
What about UDP sessions? IPv6?
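
To make the fields in these questions concrete, here is a minimal Python sketch of a per-flow record keyed by the connection 5-tuple, with first/last timestamps and packet and byte counters. All names here (FiveTuple, Packet, FlowRecord, aggregate) are hypothetical and chosen for illustration only. The sketch does not describe what any particular university or ISP retains, which is exactly the open question. It assumes one record per direction, as in classic NetFlow-style unidirectional flows, so a bidirectional byte total would mean combining two records.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical flow key: the connection 5-tuple."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number: 6 = TCP, 17 = UDP


class Packet(NamedTuple):
    """Hypothetical observed packet: key, arrival time, size in bytes."""
    key: FiveTuple
    timestamp: float
    size: int


@dataclass
class FlowRecord:
    """One unidirectional flow: timestamps plus packet and byte counters."""
    key: FiveTuple
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


def aggregate(packets: Iterable[Packet]) -> Dict[FiveTuple, FlowRecord]:
    """Fold packets into one record per 5-tuple (each direction separately)."""
    flows: Dict[FiveTuple, FlowRecord] = {}
    for pkt in packets:
        rec = flows.get(pkt.key)
        if rec is None:
            # First packet of this flow: open a record at its timestamp.
            rec = FlowRecord(pkt.key, pkt.timestamp, pkt.timestamp)
            flows[pkt.key] = rec
        rec.first_seen = min(rec.first_seen, pkt.timestamp)
        rec.last_seen = max(rec.last_seen, pkt.timestamp)
        rec.packets += 1
        rec.octets += pkt.size
    return flows
```

Even a record this coarse exposes when a flow started and ended and how many bytes it carried in each direction, which is why the retained resolution matters for any padding design.
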
I think for various reasons (including this one), we're going to want
some degree of padding traffic on the Tor network relatively soon.
Having more information about what is typically recorded in these cases
would be very useful to inform how we might design padding and
connection usage against this and other issues.
Information about how UDP is treated would also be useful if/when we
manage to switch to a UDP transport protocol, independent of any
padding.
Thanks a bunch!
-- 
Mike Perry
