Re: [tor-dev] [RFC] On new guard algorithms and data structures

20 Aug 2015

      -----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Also, we should choose a reasonable number of retry attempts at
reasonable time intervals for the Guards in primary_guard_set, for the
following reasons:
a) The network is not hostile and allows access just fine, but:
- - the user walked out of the signal coverage area of a wi-fi hotspot and
left Tor running;
or
- - the network is just down due to ISP related problems outside the
control of the user;
or
- - the monthly traffic limit was hit and the connection was frozen;
We shouldn't change Guards here, and we also shouldn't count failed
circuits as a path bias attack.
b) The user changed network / location and is subject to a different
gateway with other rules.
It's easier to cover these with a reasonable number of retries at
reasonable time intervals, as opposed to trying to find a way to get
the network status from the OS, etc.
We should retry each Guard at least 10 times, once every 20
minutes, before giving up and changing our table. If we know that
sometime in the past (during a GUARD_ROTATION period) we were able to
connect to a Guard, double or triple the number of retries (???)  --
these numbers need adjustment.
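
To make the retry idea concrete, here is a rough sketch in Python (for brevity, not Tor's C). The class and constant names are made up for illustration and are not from the Tor codebase or the pseudocode; the factor of 2 is just one pick from "double or triple".

```python
from dataclasses import dataclass
from typing import Optional

RETRY_INTERVAL_SECONDS = 20 * 60  # "once every 20 minutes"
BASE_RETRIES = 10                 # "at least 10 times"
PREVIOUSLY_REACHABLE_FACTOR = 2   # assumption: "double"; could be 3


@dataclass
class GuardRetryState:
    """Hypothetical per-Guard bookkeeping for the retry policy."""
    connected_before: bool = False      # reached during this rotation period?
    failed_attempts: int = 0
    last_attempt: Optional[float] = None  # timestamp of the last failed try

    def max_retries(self) -> int:
        # Guards we reached before get more patience.
        if self.connected_before:
            return BASE_RETRIES * PREVIOUSLY_REACHABLE_FACTOR
        return BASE_RETRIES

    def should_give_up(self) -> bool:
        return self.failed_attempts >= self.max_retries()

    def may_retry(self, now: float) -> bool:
        # Retry only if attempts remain and the interval has elapsed.
        if self.should_give_up():
            return False
        if self.last_attempt is None:
            return True
        return now - self.last_attempt >= RETRY_INTERVAL_SECONDS

    def record_failure(self, now: float) -> None:
        self.failed_attempts += 1
        self.last_attempt = now
```

Only when should_give_up() is true for a Guard would we change the table; failures before that point would not be counted as a path bias signal.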
On 8/20/2015 3:27 PM, s7r wrote:
...
Hi,
On 8/20/2015 2:28 PM, George Kadianakis wrote:
...
Hello there,
...
recently we've been busy specifying various important
improvements to entry guard security. For instance see proposals
250, 241 and ticket #16861.
...
Unfortunately, the current guard codebase is dusty and full of 
problems (see #12466, #12450). We believe that refactoring and 
cleaning up the entry guard code is essential before we proceed
to more advanced security improvements.
...
We've been working on new algorithms and data structures for
guard nodes as part of ticket #12595.
...
In this mail I include some pseudocode for this new algorithm
with the hope that it will act as a draft for implementing these
changes. You can find the pseudocode here:
...
https://gitweb.torproject.org/user/asn/tor.git/tree/src/or/guardlist.c?h=bug...
...
A short description of the algorithm is included on top, and then
...
various methods and functions are prototyped underneath to make 
the logic more concrete.
...
Apart from the comments and XXXs on the code, here are some more
 thoughts on this work:
...

This new design focuses on protecting against path bias
attacks, by slightly damaging our reachability.
...
Specifically, the old design is better at recovering in filtered
networks, because it will keep on adding new nodes till one
succeeds. In this new design, we will not try more than 80 relays
at a time. So if none of them gets through the filtered network, bad
luck: no Tor.
This number looks good to me. Could you make it dynamic, so in the
future we don't have to change this code? Being optimistic here
about Tor's scale in the future. E.g. calculate:
GUARDS_ATTEMPTED_THRESHOLD == 'total no. of Guards in a consensus' *
0.05 and update it in our 'State' every time we receive a
valid new consensus document which changes it. These should be slight
updates, like maybe 78, maybe 82, etc. If the result of the
above calculation is not a whole number, round it down
(e.g. if result = 81.6, set the limit to 81).
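
A minimal sketch of that calculation, in Python for brevity. The function names and the dict used as 'State' are assumptions for illustration only; integer arithmetic is used so that rounding down does not depend on floating point.

```python
GUARD_PERCENT = 5  # 5% of the Guards listed in the consensus (0.05)


def guards_attempted_threshold(total_guards: int) -> int:
    """Return 5% of the Guard count, rounded down (81.6 -> 81)."""
    if total_guards < 0:
        raise ValueError("total_guards must be non-negative")
    return total_guards * GUARD_PERCENT // 100


def update_state(state: dict, total_guards: int) -> bool:
    """Store the threshold in a hypothetical state dict.

    Returns True only if the new consensus actually changed the value.
    """
    new_value = guards_attempted_threshold(total_guards)
    if state.get("GUARDS_ATTEMPTED_THRESHOLD") == new_value:
        return False
    state["GUARDS_ATTEMPTED_THRESHOLD"] = new_value
    return True
```

For example, a consensus with 1632 Guards gives 81.6, which is stored as 81.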
...
While this failure mode should not happen much, it's bad news for
users behind FascistFirewalls, which are actually quite frequent.
A quick fix here would be to always add an 80/443 guard to our
list; however, as it stands only 30% of the guards are 80/443
guards, so this has bad anonymity consequences.
Bad idea for anonymity, and also not a very good idea regarding
load balancing (80/443 Guards might get hammered more). We do have
a torrc option for this, in case the user needs it: they can enable
it so Tor will only look for 80/443 Guards, or use bridges.
...

To improve our algorithm and make it more robust we need to
understand further what kind of path bias attacks are relevant
here. The adversary here is a network adversary (like a gateway)
 that can block our connections to certain guards. What nasty 
attacks can this adversary do?
...
If we can't find bad attacks here, then maybe we should stop 
worrying about those path bias attacks so much.
...
For example, a threat here with the old guard logic is that if we
used this evil gateway for just 10 minutes (in an airport), the
adversary could launch a path bias attack and force us to
connect to her guard node. Then even after we left that airport,
we would still stick to the evil guard node, which is bad.
That is why we have some primary guards which we retry for some
time, and not remove them from the list if we cannot connect to
them one or two times. Our network could be down or the Guard's
network could be down, etc.
...
Also, an adversary that manages to own our guard using path bias
attacks then has further possibilities for biasing the rest of
the circuit. What can this adversary do?
Would it make sense for Tor to change Guard if more than n circuits
fail within a given time? If the attacker owns our guard and
wants to path bias attack the rest of the circuit, then since the
client is the one who selects the path, this will cause a lot of
circuit failures on the client side. We should use this as a metric
to detect this possibility and defend against it.
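
As a sketch of what such a metric could look like (Python for brevity; the class name, n, and the window length are all placeholders, not values from Tor or the pseudocode):

```python
from collections import deque


class CircuitFailureMonitor:
    """Hypothetical sliding-window counter of circuit failures on one Guard."""

    def __init__(self, max_failures: int, window_seconds: float) -> None:
        self.max_failures = max_failures      # the "n" above
        self.window_seconds = window_seconds  # assumed meaning of "a given time"
        self._failures = deque()              # timestamps of recent failures

    def _expire(self, now: float) -> None:
        # Forget failures that fell out of the window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()

    def record_failure(self, now: float) -> None:
        self._failures.append(now)
        self._expire(now)

    def should_change_guard(self, now: float) -> bool:
        # True once more than n circuits failed inside the window.
        self._expire(now)
        return len(self._failures) > self.max_failures
```

Picking n and the window is the hard part: too low and ordinary flaky circuits trigger a Guard change, which is itself something an attacker could try to provoke.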
...

Notice that the pseudocode contains no logic about bridges. I'm
not sure how bridges should be handled here.
Prop#188 is very important for bridges. I'm not sure what algorithm we
could use here, since bridges are designed to be a little bit hard
to get in unlimited quantities, and they are manually fetched and
added to Tor.
...

I tried to keep the dirguard logic very simple, hoping that we
can eventually forget about dirguards entirely when #12538 is
done.
Indeed, this is not so important, particularly because a DirGuard
is way less dangerous than an Entry Guard. Just select 3 main
DirGuards and add more to the list until we get a valid consensus
document (which we verify ourselves anyway). After that, retry the
3 main DirGuards for some more time and eventually replace them with
the DirGuards we were able to connect to. I suggest retrying a
DirGuard 5 times, once every 20 minutes, before we replace it in the
primary DirGuard table.
We can remove this code when #12538 is done.
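
Roughly, the DirGuard handling I have in mind could look like the sketch below (Python for brevity). All names are hypothetical, try_fetch stands in for "fetch and verify a consensus from this DirGuard", and the 20 minute spacing between retries is left out to keep it short.

```python
DIRGUARD_MAX_RETRIES = 5  # "retrying a DirGuard 5 times"


def first_working_dirguard(primary, candidates, try_fetch):
    """Try the primary DirGuards first, then extra candidates in order.

    try_fetch(guard) is a caller-supplied callable returning True when a
    valid consensus was obtained through that DirGuard.
    """
    extras = [c for c in candidates if c not in primary]
    for guard in list(primary) + extras:
        if try_fetch(guard):
            return guard
    return None


def replace_exhausted_primaries(primary, failed_retries, working):
    """Swap primaries that used up their retries for DirGuards that worked.

    failed_retries maps a DirGuard to its number of failed retries;
    working lists DirGuards we were able to connect to.
    """
    spares = [g for g in working if g not in primary]
    result = []
    for guard in primary:
        if failed_retries.get(guard, 0) >= DIRGUARD_MAX_RETRIES and spares:
            result.append(spares.pop(0))
        else:
            result.append(guard)
    return result
```

A primary DirGuard with fewer than 5 failed retries stays in the table even if a spare is available.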
...
The main dirguard feature is that we assume that 
populate_live_entry_guards() and add_an_entry_guard() will return
 dirguards when the circuit is a directory circuit.
...
Maybe we should consider introducing the "primary dirguard" 
concept as well. And maybe also add some logic where Tor will
move on to the next dirguard if it failed to receive a document
from the current dirguard.
...

I used the ATTEMPTED_THRESHOLD concept of prop241, but did not
use the NET_THRESHOLD and CONNECTED_THRESHOLD ideas.
...
I removed NET_THRESHOLD because I increased the value of 
ATTEMPTED_THRESHOLD to the point that it can also be used as a 
network down indicator.
lgtm.
...
Also, I was not sure what CONNECTED_THRESHOLD was useful for, and
 there were certain engineering issues with it (Like, if that 
threshold is hit, we need a logic that will *only retry the 
successfully connected guards*, and not all guards).
...

There is no log message warning the user of path bias attacks
or bad network or anything.  That's because there is no way to
figure out what's the problem, and issuing an alarming log
message here would confuse and panic the user.
Good.
...
If we want to inform the user anyhow, maybe if the user is 
*actively* trying to visit a destination, and we've been cycling
 through our guard list for ages, maybe we should then issue a
log message telling the user that something is wrong with the
network.
If we are under an attack which tries to force us into using a
certain Guard, we need to exit after we try everything above and
log a message that there's something wrong with the network, Tor
cannot establish circuits.
...

In general, I tried to keep the number of heuristics and
kludges to the minimum to keep the logic simple. Unfortunately,
it seems that without a "network down" indicator (#16120) there
is no way to avoid edge cases and false positives here.
It's hard to tell the difference between the network being down (for
real) and a gateway that has a consensus document and drops packets
sent to all (or almost all) Guards. There is nothing to do but follow
our protocol, eliminate all the options and exit with a log message.
If restarted, start again but consider the same selected primary
guards and other state data and follow the algorithm again; maybe
the network is fixed. Exit again if not.
...
We should try to fix all problems here that can occur frequently 
or have security consequences, but there will always be scenarios
 where Tor will end up thinking there is no network while it's 
actually on a filternet. For this reason, we should give plenty
of testing to this feature before we ship it to real users!
As I said above, trying to detect a network down will make things a
lot more complicated for us, with little benefit, since this can be
trivially gamed. We are not an operating system; we don't care if the
network is down for real (for all destinations). If the network is
down for Tor (it cannot establish connections to any Guard, or to most
Guards [path bias attack]), then for us network is down for Tor
== network is down for real, period.
What is the difference from Tor's perspective if there is no link
on the internet interface or there is a link which only forwards
packets to xxx.xxx.xxx and yyy.yyy.yyy.yyy?
The consensus document (and the relays in the network) is public info.
This is just a limitation here, but not the end of the world.
...

Finally, all the constants & parameters in the pseudocode are
subject to change. I tried to motivate some of them, but others
are just arbitrary.
...
Feedback is very welcome and please let me know of any issues
with security or reachability that you find! Or of how the
pseudocode should be altered to make it more useful for
implementors.
...
Cheers!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQEcBAEBCAAGBQJV1djEAAoJEIN/pSyBJlsRLB4IAKxmG8DtfagYaUl1DL3V2dCZ
CO8S773mzTELWz8OfALk8dPrV75uLp0sq1WUD2VoFJHbvkXJV1TTGnlqFm3mKz2V
yNw9KtA5+5A2daA2Bj0oTHmN8bdJfN7QFBjHa9Pl2gNqJMtXRFyQ779a0bJtLgjs
DW2oruG1QhCvTkci0Xf9+N8POEiQWslNq5hQPxFiG7945xoswsUzhuP3/5V+oTR3
Ra5W3EXYWlTtPuDDKtNSJ5zK5bX1ZhPRMVNoWiw5NtaJZTSkHkjrmVfFykDU/tsX
AN2xDN/o1ELuEw0T42Ouu53Cy3rEmQik/65sV55fuDog5Y9s3/03nMHaId0YkrI=
=fkJ6
-----END PGP SIGNATURE-----
