Ola Bini obini@thoughtworks.com writes:
Hi,
Here is the next version of the algorithm - put it in a gist to make it easier to look at: https://gist.github.com/olabini/343da01de8e01491bf5c
Cheers
Thanks for this.
Here we go with another review!
The full algorithm is referred to as ALGO_CHOOSE_ENTRY_GUARD. It is divided into three components, such that the full algorithm is implemented by first invocing ALGO_CHOOSE_ENTRY_GUARD_START, then repeatedly calling ALGO_CHOOSE_ENTRY_GUARD_NEXT and finally calling ALGO_CHOOSE_ENTRY_GUARD_END.
That's helpful. Maybe we can also put the pseudocode that you posted in a previous mail in the appendix to provide more context?
ALGO_CHOOSE_ENTRY_GUARD keeps track of unreachable status for guards in state private to the algorithm - this is initialized every time ALGO_CHOOSE_ENTRY_GUARD_START is called.
Interesting. That seems like both a bug and a feature in some ways.
It's a security feature because we will try our guard list from the beginning more frequently.
It's a performance "bug" because we have to cycle through all the unreachable nodes everytime we restart the algorithm, because we forgot they were unreachable. If the first multiple guards in your USED_GUARDS are actually unreachable, then this will delay bootstrap by some time. Consider the case where you need to make three circuits to connect to a hidden service as a client (HSDir/IP/RP), so you have to call the algorithm three times in a row.
Of course, if a guard is really unreachable it _should_ be marked as bad within an hour because it won't be listed in the next consensus. While this makes sense, I wonder why my laptop guard list (in the state file) has a total of 24 guards, where 18 of them are marked as unreachable and only 6 of them are marked as bad. Maybe they were all marked unreachable when the internet was down. I wonder if this influences the performance of the algorithm.
It would be nice to know if the security/performance tradeoff here is acceptable. Simulations might help, or we will have to learn it the hard way when we implement the algorithm and try it out in various types of networks.
ALGO_CHOOSE_ENTRY_GUARD_START: (arguments: USED_GUARDS, EXCLUDE_NODES, N_PRIMARY_GUARDS) If selecting directory guards, GUARDS is all guards from the consensus with the V2Flag, otherwise GUARDS is all guards from the consensus Set UTOPIC_GUARDS to be all guards to use under utopic conditions from GUARDS Set DYSTOPIC_GUARDS to be all guards to use under dystopic conditions from GUARDS Set REMAINING_UTOPIC_GUARDS to be UTOPIC_GUARDS without EXCLUDE_NODES and USED_GUARDS Set REMAINING_DYSTOPIC_GUARDS to be DYSTOPIC_GUARDS without EXCLUDE_NODES and USED_GUARDS Create a list of PRIMARY_GUARDS that contain N_PRIMARY_GUARDS that are not bad by: Taking the next entry from USED_GUARDS If USED_GUARDS is empty: randomly select an entry from UTOPIC_GUARDS, weighted by bandwidth Set TRIED_GUARDS to be an empty set Set TRIED_DYSTOPIC_GUARDS to be an empty set Set state = STATE_PRIMARY_GUARDS ALGO_CHOOSE_ENTRY_GUARD_NEXT: If a new consensus has arrived: Update all guard profiles with new bad/non-bad information If any PRIMARY_GUARDS have become bad: re-add to the list of PRIMARY_GUARDS using the same procedure
And also remove the primary guard that has become bad, right?
If any USED_GUARDS have become non-bad: add it back to PRIMARY_GUARDS at the place it would have been if it was non-bad when running ALGO_CHOOSE_ENTRY_GUARD_START. If this results in PRIMARY_GUARDS being larger than N_PRIMARY_GUARDS, remove from the end of the list until the list is N_PRIMARY_GUARDS long
How do we know that a guard in USED_GUARDS that just became non-bad deserves to be a primary guard? Even if it's the first guard in the USED_GUARDS list, how do we know whether it has priority over the last guard of the PRIMARY_GUARDS list? Do we need to keep state of previous PRIMARY_GUARDS sets?
Ensure that UTOPIC_GUARDS and DYSTOPIC_GUARDS are updated with the changes from the consensus
Isn't this implicit in "Update all guard profiles with new bad/non-bad information"? Maybe these can be merged into a single step that explicitly mentions the guardlists that need to be updated.
If it was at least 3 minutes since we tried the primary guards and we are not in STATE_PRIMARY_GUARDS: save previous state set state = STATE_PRIMARY_GUARDS STATE_PRIMARY_GUARDS: return each entry in PRIMARY_GUARDS in turn for each entry, if algorithm doesn't terminate mark entry as "unreachable" add entry to TRIED_GUARDS if the number of entries in TRIED_GUARDS that were tried within GUARDS_TRY_THRESHOLD_TIME is larger than GUARDS_TRY_THRESHOLD, return failure from ALGO_CHOOSE_ENTRY_GUARD. restore previous state (or STATE_TRY_UTOPIC if no previous) STATE_TRY_UTOPIC: for each entry in TRIED_GUARDS that was marked as unreachable more than 20 minutes ago add it back to REMAINING_UTOPIC_GUARDS return each remaining entry from USED_GUARDS in turn
What does "remaining" means here? Is it the next USED_GUARDS entry that has not been marked as unreachable?
for each entry, if algorithm doesn't terminate mark entry as "unreachable" add entry to TRIED_GUARDS if the number of entries in TRIED_GUARDS that were tried within GUARDS_TRY_THRESHOLD_TIME is larger than GUARDS_TRY_THRESHOLD, return failure from ALGO_CHOOSE_ENTRY_GUARD. if the number of entries in TRIED_GUARDS is larger than a GUARDS_FAILOVER_THRESHOLD proportion of UTOPIC_GUARDS, set state = STATE_TRY_DYSTOPIC
I wonder if returning failure and exiting the algorithm is the right thing to do in this case.
Basically, we just connected to too many different guards and we want to stop exposing ourselves to more of the network. This is something that can happen during an attack, but also something that happens naturally all the time when the network is down. So we can't just have Tor exit().
Instead, we probably want to keep our guardlist and continue connecting to the guards we already have till the network comes up again.
If we return failure here and exit the algorithm, then when we restart the algorithm we will have forgetten the TRIED_GUARDS list and hence all the guards that we tried in our previous run. Then we will attempt to connect to GUARDS_TRY_THRESHOLD new guards again. Which is probably exactly what the adversary wants.
So maybe failure is not an option here, and instead we should massage our current guardlist and start from the beginning without adding more guards?
What do you think?
return each entry from REMAINING_UTOPIC_GUARDS randomly selected, weighted by bandwidth remove the returned entry from REMAINING_UTOPIC_GUARDS for each entry, if algorithm doesn't terminate mark entry as "unreachable" add entry to TRIED_GUARDS if the number of entries in TRIED_GUARDS that were tried within GUARDS_TRY_THRESHOLD_TIME is larger than GUARDS_TRY_THRESHOLD, return failure from ALGO_CHOOSE_ENTRY_GUARD. if the number of entries in TRIED_GUARDS is larger than a GUARDS_FAILOVER_THRESHOLD proportion of UTOPIC_GUARDS, set state = STATE_TRY_DYSTOPIC
Also, I'm not sure I understand the precise security properties that the above two checks provide us with. Are both checks needed? Could you explain them to me please?
STATE_TRY_DYSTOPIC: for each entry in TRIED_DYSTOPIC_GUARDS that was marked as unreachable more than 20 minutes ago add it back to REMAINING_DYSTOPIC_GUARDS return each remaining DYSTOPIC entry from USED_GUARDS in turn for each entry, if algorithm doesn't terminate mark entry as "unreachable" add entry to TRIED_DYSTOPIC_GUARDS if the number of entries in TRIED_GUARDS+TRIED_DYSTOPIC_GUARDS that were tried within GUARDS_TRY_THRESHOLD_TIME is larger than GUARDS_TRY_THRESHOLD, return failure from ALGO_CHOOSE_ENTRY_GUARD. if the number of entries in TRIED_DYSTOPIC_GUARDS is larger than a GUARDS_FAILOVER_THRESHOLD proportion of DYSTOPIC_GUARDS: mark all guards in PRIMARY_GUARDS, TRIED_GUARDS and TRIED_DYSTOPIC_GUARDS as not "unreachable" return failure from ALGO_CHOOSE_ENTRY_GUARD. return each entry from REMAINING_DYSTOPIC_GUARDS randomly selected, weighted by bandwidth remove the returned entry from REMAINING_DYSTOPIC_GUARDS for each entry, if algorithm doesn't terminate mark entry as "unreachable" add entry to TRIED_DYSTOPIC_GUARDS if the number of entries in TRIED_GUARDS+TRIED_DYSTOPIC_GUARDS that were tried within GUARDS_TRY_THRESHOLD_TIME is larger than GUARDS_TRY_THRESHOLD, return failure from ALGO_CHOOSE_ENTRY_GUARD. if the number of entries in TRIED_DYSTOPIC_GUARDS is larger than a GUARDS_FAILOVER_THRESHOLD proportion of DYSTOPIC_GUARDS: return failure from ALGO_CHOOSE_ENTRY_GUARD. ALGO_CHOOSE_ENTRY_GUARD_END: If circuit is set up correctly, let algorithm know Algorithm marks the guard chosen as used and makes sure it is in USED_GUARDS
Maybe instead of "makes sure it is in" we should say "adds it in"? It has not been added in a previous step from what I see.