Aaron Johnson:
Hi Mike,
Here are some specific comments on the proposal, most of which I didn’t mention at the session yesterday.
Sec. 2: “When a hidden service picks its guard nodes, it also picks two additional sets of middle nodes `second_guard_set` and `third_guard_set` of size NUM_SECOND_GUARDS and NUM_THIRD_GUARDS respectively for each hidden service.” appears to conflict with “When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from that second hop's corresponding `third_guard_set` as third hops of the circuit”. I think the former statement should be amended to describe how the third sets are chosen for each guard in the second set.
Sec. 3.1: “A Sybil attack is noisy”. “Noisy” in the sense that it isn’t perfectly accurate or “noisy” in the sense that it is observable?
Sec. 3.1: “the Sybil success is almost immediately obvious to third layer guard, since it will now be returned as a rend point for circuits for the hidden service in question”. The third-layer guard isn’t the a rendezvous point. So how it is immediately obvious to the adversary when their relay is selected as a third-layer guard?
Ok, I have corrected these in https://gitweb.torproject.org/user/mikeperry/torspec.git/commit/proposals/24...
I have not yet cleaned up the proposal with the discussion from the dev meeting. It still is in there in the form of XXX's.
Sec. 3.2.3: I’m not sure that FullCDF() correctly computes the rotation CDF (even approximately). It should take into account the probability of observing a previous node that has selected the given rotation time (longer rotation times are more likely to be observed). So the probability that there are exactly t days until rotation of the observed node should be proportional to \sum_{i=t}^N Pr[X=i]. After normalization to make this a probability, the expression is (\sum_{i=t}^N Pr[X=i]) / (\sum_{i=1}^N Pr[X=i]*i).
I understand what you're talking about here, and there is indeed a mistake in my original CDF calculations. I forgot to account for the non-uniform time durations of the rotations (ie: weighting their selection probability proportionally by how long the rotation duration is).
However, I came up with a different expression than you did for doing that. I believe that if you pick a time point uniformly at random, then the time-weighted probability of the rotation duration being r is:
P(R==r) = ProbMinXX(N,r)*r / \sum_{i=1}^N ProbMinXX(X=i)*i
Here's the commit that updates the proposal with this explaination in more detail, and uses this new time-weighted P(R==r) to compute the CDF: https://gitweb.torproject.org/user/mikeperry/torspec.git/commit/proposals/24...
As you can see, this does reduce the CDF rate of increase, as you might expect from taking this into account. I'm not sure what this change means with respect to how we pick the constants. I still need to think about that (but that's probably best withheld until I review all of the notes from the dev meeting and take them into consideration first).
FWIW: Your P(R==r) expression actually causes the CDF to increase *faster* (so that the probabilities are higher than before at lower t values), so that makes me more certain that it's not doing the right thing.
Here's a quick table of the first few values of P(SECOND_ROTATION <= t):
t OldP(R <= t) YourP(R <= t) MyP(R <= t) 1 0.19095 0.24788 0.07701 2 0.32222 0.40624 0.15403 3 0.42456 0.52261 0.22829
Here's the python I used to implement your ProbR. You can replace my ProbR in the script in Appendix A of the proposal to reproduce those numbers (and also verify I understood your expression correctly):
def ProbR(N, r, ProbFunc=ProbMinXX): sum1 = 0.0 i=r+1 while i < N: sum1 += ProbFunc(N, i) i += 1 return sum1/ExpFn(N, ProbFunc)
Here's the current head of the full proposal, for convenience: https://gitweb.torproject.org/user/mikeperry/torspec.git/tree/proposals/247-...