Re: [tor-dev] Proposal 271 - improvements

15 Oct 2019


Hello,
On 14/10/2019 13:29, Roger Dingledine wrote:
...
On Mon, Oct 14, 2019 at 07:56:29AM +0000, Florentin Rochet wrote:
...
We are suggesting a straightforward fix to the problem, which 
is, roughly speaking, to choose primary guards in the order in which 
they were sampled.
This looks like a good solution to the issue -- the ordering of the
guards as we select them is proportional to their weight, so let's just
use them in the order that we selected them.
One of the tricky features of the prop#271 guard selection design is that
it won't just keep on choosing guards if many are unreachable, but rather
it will stop after a while, so a bad ISP can't totally control what guard
you pick. I think that feature is left untouched by your design change,
since we're choosing from among only the same set as before, just in a
different order. But please think about whether that is true.
Yes, so I think this patch is actually improving this goal since the bad
ISP has first to exhaust the sampled list to have a chance to get in.
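
A minimal sketch of the idea discussed above, for illustration only. It assumes a weighted sample drawn without replacement and then picks primary guards strictly in sample order. The names `sample_guards`, `choose_primary`, and the `reachable` set are hypothetical and do not correspond to functions in the Tor source; the real prop#271 logic has more states and filters than this.

```python
import random

def sample_guards(weights, n_sample, rng):
    """Weighted sampling without replacement.

    `weights` maps relay name -> positive weight (hypothetical input).
    Returns relay names in the order they were sampled.
    """
    remaining = dict(weights)
    sampled = []
    while remaining and len(sampled) < n_sample:
        names = list(remaining)
        pick = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        sampled.append(pick)
        del remaining[pick]
    return sampled

def choose_primary(sampled, reachable, n_primary):
    """Take the first n_primary usable guards, preserving sample order."""
    return [g for g in sampled if g in reachable][:n_primary]

if __name__ == "__main__":
    rng = random.Random(2019)
    sampled = sample_guards({"A": 5.0, "B": 3.0, "C": 1.0, "D": 1.0}, 3, rng)
    # Pretend every sampled guard is reachable except the first one.
    print(sampled, choose_primary(sampled, set(sampled[1:]), 2))
```

Because `choose_primary` only ever reads from `sampled`, an adversary who blocks guards can push the client further down the list but cannot introduce a relay that was not already sampled.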
...
...
We have created a patch implementing this fix for the 
case affecting our experiments, which would improve the current 
situation. We are further suggesting that Tor apply the technique 
throughout the guard-selection logic.
Can you help us make sure we think of all the places you've already
thought of? :)
Sure! This is mostly code cleanup (as the confirmed ordering is no longer
meaningful), unit test cleanup, and a bit of spec rewriting (the
reverse order of steps might be preferable, though :)
I would be happy to review any final patch. I could also write the patch
myself after my current work is submitted.
...
<skip>
>  The design also reduces Tor's security by increasing the 
> number of clients that an adversary running small relays can observe. In 
> addition, an adversary has to wait less time than it should after it 
> starts a malicious guard to be chosen by a client. This weakness occurs 
> because the malicious guard only needs to enter the sampled list to have 
> a chance to be chosen as primary, rather than having to wait until all 
> previously-sampled guards have already expired.
This part makes me wonder about another angle to this problem: proper
load balancing when we choose our guards on one date but then make
decisions about them on a different date.
For example, if we sample all these guards on day 0, and then use
the first guard for a week, and then move to the second guard... but
the weights have changed in that time... what will that do to our
load balancing? One extreme case would be a relay that has a really
high weight for a while, and then later turns out to have much lower
bandwidth. It gets into a bunch of guard lists at first (but mostly not
#1 since that's how the probabilities work), and then slowly clients
shift load to it as their #1 guard goes away.
In an ideal world we would want to take into account current guard
weights, when we're shifting from one guard to the next, rather than
making that decision way earlier before we actually turn out to need
the guards. Maybe that argues for delaying more of the decisions?
Note that this question is about yet another improvement that could be
made to the guard part of path selection, and I think it's orthogonal
to the improvement you are proposing.
I guess the amount of work also depends on how far we want to go.
As you mention, there are still load-balancing problems when "we choose
our guards on one date but then make decisions about them on a different
date". It is possible to get this fixed by removing the sampled list
and, instead, keeping a history of previously sampled guards. When
moving to the next guard, we could consider *current* weights and make
the decision. The history should resist attacks that try to force
clients onto compromised guards, both by reusing relays from the
history if they're still available (in sample order) and by tracking
its size.
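
One possible reading of the history idea above, as a rough sketch rather than a worked-out design. It assumes the client stores an ordered `history` of previously sampled guards, reuses the earliest still-usable entry, and only samples a new guard (with current weights) while the history is below a cap. The function `next_guard`, the `reachable` set, and the `max_history` value are all hypothetical.

```python
import random

def next_guard(history, current_weights, reachable, rng, max_history=20):
    """Return the next guard to try, or None if the client should stop.

    history         -- list of previously sampled guards, in sample order
    current_weights -- relay name -> weight from the *current* consensus
    reachable       -- relays the client currently believes it can reach
    max_history     -- hypothetical cap that bounds how far an attacker
                       can push the client by blocking guards
    """
    # 1. Prefer guards we already sampled, in the order we sampled them.
    for guard in history:
        if guard in current_weights and guard in reachable:
            return guard
    # 2. Refuse to grow the history forever.
    if len(history) >= max_history:
        return None
    # 3. Otherwise sample a new guard using today's weights.
    candidates = [r for r, w in current_weights.items()
                  if w > 0 and r not in history]
    if not candidates:
        return None
    pick = rng.choices(candidates,
                       weights=[current_weights[r] for r in candidates], k=1)[0]
    history.append(pick)
    return pick
```

The new guard is drawn at the moment it is needed, so its selection probability reflects current weights, while the ordered history and the size cap preserve the property that blocking guards does not give unbounded influence over the choice.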
That second improvement seems to be a deeper refactoring of the proposal
and the code, but it could be interesting, especially when hundreds of
relays get an EoL network-reject. There might be some domino effect at
play here, because some % of clients got a rotation of their EoL guard.
Is there any argument not to do both?
1) Applying this patch and extending the technique throughout the guard
selection logic should be straightforward and solve most of the problem
(should be fast to get).
2) Deeper refactoring of the proposal as you mention (note, this
shouldn't make step 1 useless, as its logic should be reused here).
I've opened a trac ticket for 1)
https://trac.torproject.org/projects/tor/ticket/32088
Best,
Florentin
