-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
Hi all,
While working on peer-to-peer hidden services I've been wondering how long two clients need to wait before arriving at the same view of the Tor network. The answer, which I find surprising, is that this is never guaranteed to happen. Here's why.
In what follows I'll assume that the interval between network consensuses being published is one hour, but the conclusions remain the same if the interval is longer or shorter - just change the units. I'll use "1:00 consensus" to mean a consensus that becomes valid at 1:00, is fresh until 2:00, and is valid until 4:00.
According to dir-spec.txt, each directory cache downloads the 1:00 consensus at a randomly chosen time between 1:00 and 1:30. Each client replaces the 1:00 consensus with a newer consensus at a randomly chosen time between 2:45 and 3:51. (The precise endpoint is 3:50:37.5, due to a calculation described in section 5.1 of the spec. I'm curious to know the origin of that calculation.)
Some observations:
1. The period when clients are replacing each consensus slightly overlaps the period when they're replacing the previous consensus. For about five and a half minutes of every hour there are twice as many clients replacing their consensuses, so directory caches will experience greater load during this period.
How much greater? We can divide clients that are downloading the consensus into three groups:
A. Clients that have never downloaded a consensus.
B. Clients that are replacing a consensus that's no longer valid.
C. Clients that are replacing a valid consensus with a newer consensus.
Clients in group A bypass the directory caches and hit the authorities, so they don't contribute to load on the caches. Clients in group B have just started up and loaded an expired consensus from disk. These clients are equally likely to hit the caches at any point in the consensus interval (although the load will vary by time of day, day of week, etc). Clients in group C are twice as likely to hit the caches between 45 minutes and 51 minutes past the hour as at any other time, due to the overlap between replacement intervals. So unless group C is dwarfed by group B, we should see a noticeable increase in load on the caches between 45 and 51 minutes past the hour. (If group C is dwarfed by group B then the increase will be lost in the noise.)
If you run a directory cache, do your logs show that pattern?
2. The time a client chooses for replacing a newly downloaded consensus may be earlier than the time it chose for replacing the previous consensus. Again, this is due to the overlap between replacement intervals: a client may replace the 1:00 consensus at 3:48 and then choose 3:47 as the random time to replace the newly downloaded 2:00 consensus.
What happens in that case? I haven't looked at the code, but I'm going to speculate that it doesn't explode in flames or travel through time. A reasonable way to handle the situation would be to round up the replacement time to the present time - in other words, to replace the newly downloaded consensus immediately. If that guess is correct, the extra load during the overlap between replacement intervals will get squeezed towards the end of the overlap, as replacement times occasionally get rounded up but never get rounded down. Some clients will download two consensuses back-to-back during the overlap.
Again, if you run a directory cache I'd be interested to know whether you're seeing this pattern.
3. Clients often skip consensuses. The interval for clients to replace the 1:00 consensus runs from 2:45 to 3:51, and the interval for caches to download the 3:00 consensus runs from 3:00 to 3:30. So there's roughly a 46% chance that a client replacing the 1:00 consensus will get the 2:00 consensus, and roughly a 54% chance that it will get the 3:00 consensus.
This means it's possible for two clients to replace their consensuses regularly but never see the same consensus: one of them sees the consensuses published on odd hours, the other sees the consensuses published on even hours. This situation is unlikely to continue for long if each client makes a fresh random choice for each replacement time, but there's no time at which we can say the clients will definitely have seen the same consensus.
I can't see any partitioning attacks arising from this, but I thought I'd point it out anyway because I find the result surprising.
As I said before, I'm curious to know the origin of the calculation in section 5.1 that produces the overlap between replacement intervals. The replacement time is chosen "uniformly at random from the interval between the time 3/4 into the first interval after the consensus is no longer fresh, and 7/8 of the time remaining after that before the consensus is invalid". It's the 7/8 part that results in the odd figure of 50 minutes 37.5 seconds.
If the replacement time were to be chosen uniformly at random from the interval between the time 3/4 into the first interval after the consensus is no longer fresh, and the time one consensus interval after that, there would be no gap or overlap between replacement intervals.
This wouldn't affect the third observation, however - there would still be no guarantee of two clients ever seeing the same consensus.
Cheers, Michael