On Thu, Mar 17, 2016 at 02:54:49PM +0100, Arturo Filastò wrote:
> So I looked a bit into this matter and it seems like the root of the problem is actually in ooni-probe:
> The problem is this function here: https://github.com/TheTorProject/ooni-probe/blob/master/ooni/otime.py#L87
> that is called with the result of time.time(). Since time.time() should return seconds since the UTC epoch, the result of then converting it again with utcfromtimestamp() leads it to actually being converted back into the local timezone of the probe.
> This issue seems like something much harder to fix in historical data, but I am going to include a fix for this in version 1.4.0 of ooni-probe together with the rest of the JSON data format changes.
Thank you for checking it out. I agree that the historical timestamps are probably not easily recoverable. I will keep parsing the timestamps as if they are UTC.
I don't understand your explanation that "utcfromtimestamp() leads it to actually being converted back into the local timezone." That's what I would expect of fromtimestamp, not utcfromtimestamp. Consider these:
# fromtimestamp gives different values according to TZ:
$ TZ=America/Denver python -c 'import datetime; print str(datetime.datetime.fromtimestamp(1458239719))'
2016-03-17 12:35:19
$ TZ=Europe/Rome python -c 'import datetime; print str(datetime.datetime.fromtimestamp(1458239719))'
2016-03-17 19:35:19
$ TZ=UTC python -c 'import datetime; print str(datetime.datetime.fromtimestamp(1458239719))'
2016-03-17 18:35:19
# utcfromtimestamp gives the same value regardless of TZ:
$ TZ=America/Denver python -c 'import datetime; print str(datetime.datetime.utcfromtimestamp(1458239719))'
2016-03-17 18:35:19
$ TZ=Europe/Rome python -c 'import datetime; print str(datetime.datetime.utcfromtimestamp(1458239719))'
2016-03-17 18:35:19
$ TZ=UTC python -c 'import datetime; print str(datetime.datetime.utcfromtimestamp(1458239719))'
2016-03-17 18:35:19
However, defining a UTC tzinfo as you did is probably equivalent. I still think there should be no calls to fromtimestamp without a tzinfo; otherwise the output will differ depending on the time zone it is run in.
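
To make the suggestion concrete, here is a minimal Python 3 sketch of a conversion that always passes a tzinfo. The helper name utc_from_epoch is hypothetical and is not taken from ooni-probe; I am only assuming that the input is a plain seconds-since-epoch value such as time.time() returns.

```python
from datetime import datetime, timezone


def utc_from_epoch(seconds):
    """Convert seconds since the epoch to a timezone-aware UTC datetime.

    Hypothetical helper for illustration only. Because an explicit tzinfo
    is passed, the result does not depend on the TZ setting of the machine
    it runs on.
    """
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Same timestamp as in the shell examples above.
stamp = utc_from_epoch(1458239719)
print(stamp.strftime("%Y-%m-%d %H:%M:%S"))  # 2016-03-17 18:35:19
```

The returned value carries its UTC offset with it, so it cannot later be mistaken for a local time.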