Fan Jiang fanjiang@thoughtworks.com writes:
[ text/plain ] On Fri, Apr 15, 2016 at 5:37 AM, George Kadianakis desnacked@riseup.net wrote:
Fan Jiang fanjiang@thoughtworks.com writes:
[ text/plain ] Thanks for the insights.
It seems like the latest version of prop259 was posted a few weeks
ago:
https://lists.torproject.org/pipermail/tor-dev/2016-March/010625.html
<snip>
A few things:
a) Are there still proposal design decisions that need to be taken
and
we
are unclear on? I admit I'm a bit lost in the latest [tor-dev]
thread, so
maybe I can be of help somehow here?
There are still some issues, like for_directory may leads to maintain
two
sets of sampled_set independently, which is not yet defined clearly in
proposal.
Hm, how come this is happening?
I would think that for_directory would now be just another filter (like
the
ipv6 one etc.) that can be applied on top of the sampled list.
The problem here is for_directory is a parameter of function call, that means we can't do filter before the call happens. Now the filter action only happens at START
stage.
Maybe do a check before we return the selected guard (if not valid, then continue picking new one) can be a solution.
Hello,
hm, that's an interesting problem...
So we learn whether a circuit is for_directory when choose_random_entry_prop259() is called, but at that point the prop259 algorithm has already STARTed and its filtering parameters have been determined?
So if some part of tor calls choose_random_entry_prop259() quickly twice, first with for_directory set, and the second time with for_directory unset, the guard algorithm will proceed in both cases with for_directory being set? Because the context has been set in the first call to choose_random_entry_prop259()?
This seems like a problem to me, and I'm not sure how to solve it well...
One way could be to use the 'cpath_build_state_t *state' in choose_random_entry_prop259() to be able to understand whether each call is about a different circuit or not. Then you could start a separate algorithm invocation (so new START) for each new circuit you see.
But then I'm not sure how to do this without each separate algorithm call wrecking up the sampled_guards and used_guards lists... In the dev meeting in Valencia, we discussed with Ola and Reinaldo about using locking and blocking for this, but I'm not sure how much that would impact the performance...
Do you guys have any plans here?
We can have two pending_guard one for directory, one for any usage(which
will be picked by the NEXT algo, so they can be same node), and they can be checked with directory flag before return. Locking may not work because this algo should at least return sth before it continues to another one, saying Dir and non-Dir are now in main loop, Or if you have any ideas please let me know.
Hello Fan and team,
I think I'm not a big fan of the pending_guard and pending_dir_guard concept. To me it seems like a quick hack that tries to address fundamental issues with our algorithm that appeared when we tried to adapt the proposal to the tor codebase.
I think one of the main issues with the current algorithm structure is that _one run of the algorithm_ can be asked to _setup multiple circuits_, and each of those circuits has different requirements for guards. That is, since we do filtering on START based on the requirements of circuit #1, this means that any other circuits that appear before END is called, also have to adapt to the requirements of circuit #1. This is obvious in the code since we use guard_selection->for_directory throughout the whole algorithm run, even though for_directory was just the restriction of circuit #1.
Specifically about the pending_guard trick, I feel that it interacts in unpredictable ways with other features of the algorithm. For example, consider how it interacts with the primary guards heuristic. It could be time for the algorithm to reenter the primary guards state and retry the top guards in the list, but because of the pending_guard thing we actually return the 15th guard to the circuit.
IMO we should revisit the algorithm so that one run of the algorithm can accomodate multiple circuits by design and without the need for hacks. Here is an idea towards that direction:
I think one very important change that we can do to simplify things is to remove the need to filter guards based on whether they are dirguards, fast, or stable. My suggestion here would be to *only* consider guards that are dirguards _and_ fast _and_ stable. This way, any circuit that appears will be happy with all the guards in our list and there is no need to do the pending_dir_guard trick. See [0] on why I think that's safe to do.
This is easy to do in the current codebase. You just need to call entry_is_live() with need_uptime, need_capacity and for_directory all enabled (instead of flags being 0).
If you do the above, your sampled guard set will be able to accomodate any circuit that comes its way and that should simplify logic considerably.
Let me know if the above does not make sense.
Here are some more comments:
- So the above idea addresses a large part of the filtering logic that happens on START. The rest of the filtering logic has to do with ClientUsesIPv6, ReachableAddreses, etc. . I think it's fine to conduct that filtering on START as well.
- I tried to run the branch as of bb3237d, but it segfaulted. Here is where it crashed:
#1 0x000055555567eb25 in guards_update_state (next=0x5555559c3f40, next@entry=0x5555559c35e8, guards=guards@entry=0x5555559c4620, config_name=config_name@entry=0x55555570c47e "UsedGuard") at src/or/prop259.c:1137 1137 !strchr(e->chosen_by_version, ' ')) {
Let me know if you need more info here.
- There is a memleak on 'extended' in filter_set().
In general, I feel that logic in that function is a bit weird. The function is called filter_set() but it can actually end up adding guards to the list. Maybe it can be renamed?
- What's up with the each_remaining_by_bandwidth() function name?
---
[0]: I think that's OK to do and here is why:
All Guards are Fast. About 95% of Guards are Stable (this will become 100% with #18624) About 80% of Guards are V2Dir/dirguards (this will become 100% with #12538)
#12538 got merged in 0.2.8, so if prop259 gets merged in 0.2.9, by the time prop259 becomes active, almost all guards will be dirguards.
So I think it's fine to only consider guards that are dirguards && fast && stable now, since by the time prop259 gets deployed that will be the case for almost 100% of guards.