-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
Hi isis,
I am also not sure that the DYSTOPIC_GUARDS and UTOPIC_GUARDS sets should be disjoint. That hurts the already fragile load balancing for Guards and puts a lighter load on FascistFirewall Guards (ports 80/443). I think users behind such firewalls usually know their situation and act accordingly (torrc option, bridges, etc.). I agree with you that we should automate this somehow for users who don't know, and make sure they try to connect to FascistFirewall Guards (80/443) before Tor gives up.
I suggest having a single guard list, created like this:
1. GUARDS_ATTEMPTED_THRESHOLD - consensus parameter containing the maximum number of guards we will attempt to connect to. Currently ~5% of the total number of Guards in the consensus: 80.
2. GUARD_LIST - the list of guards we will attempt to connect to. It will contain exactly GUARDS_ATTEMPTED_THRESHOLD guards.
We build this list as follows. We choose based on weighted bandwidth instead of number of routers, for better load balancing. All numbers are dynamic and calculated from the consensus. Adjust to whole numbers if a result contains decimals.
a) DYSTOPIC_GUARDLIST_FRACTION - calculate what percent of the Guard bandwidth (consensus weight) belongs to FascistFirewall Guards (ports 80/443). For a simple example, let's assume the total Guard bandwidth in the last consensus is 10 GB/s and the FascistFirewall Guard bandwidth is 2.8 GB/s, i.e. 28%.
b) UTOPIC_GUARDLIST_FRACTION - trivially determine the percent of the non-FascistFirewall Guard bandwidth: 100 - DYSTOPIC_GUARDLIST_FRACTION (28) = 72%.
c) Build final GUARD_LIST of a max length of GUARDS_ATTEMPTED_THRESHOLD:
- 25% totally random (20 routers). Tor will choose these Guard candidates randomly, without considering whether they are FascistFirewall or non-FascistFirewall Guards.
- the remaining 75% (60 routers):
  -> 28% DYSTOPIC_GUARDLIST (16 routers)
  -> 72% UTOPIC_GUARDLIST (44 routers)
The list cannot contain duplicates.
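
To make the arithmetic above concrete, here is a minimal sketch of the quota calculation. The function name and the rounding rule are assumptions, not part of the proposal: the text only says "adjust to whole numbers", and rounding the dystopic share down while giving the remainder to the utopic share is one rule that reproduces the 20/16/44 example. Bandwidths are passed as integers (here MB/s) to avoid floating-point surprises.

```python
def guard_list_quotas(threshold, total_guard_bw, dystopic_guard_bw, random_percent=25):
    """Split a guard list of `threshold` entries into three quotas.

    Returns (random_count, dystopic_count, utopic_count).
    Assumed rounding: floor the random and dystopic shares and let the
    utopic share absorb the remainder, so the three always sum to `threshold`.
    """
    random_count = threshold * random_percent // 100
    remaining = threshold - random_count
    # DYSTOPIC_GUARDLIST_FRACTION applied to the non-random part of the list
    dystopic_count = remaining * dystopic_guard_bw // total_guard_bw
    utopic_count = remaining - dystopic_count
    return random_count, dystopic_count, utopic_count


# Example from the text: 80 guards, 10 GB/s total, 2.8 GB/s on ports 80/443
print(guard_list_quotas(80, 10_000, 2_800))  # (20, 16, 44)
```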
So, we have a single guard list, and we try the guards in any order (hash ring, weighted by bandwidth).
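
A rough sketch of how such a single, duplicate-free list could be filled, given per-category quotas. Everything here is illustrative: the function names and the `guards` mapping are hypothetical, and drawing the "totally random" share weighted by bandwidth (rather than uniformly) is an assumption based on the statement that selection is bandwidth-weighted. It assumes every guard has a positive bandwidth.

```python
import random

def weighted_pick_distinct(pool, count, rng):
    """Draw up to `count` distinct names from {name: bandwidth}, weighted by bandwidth."""
    pool = dict(pool)  # work on a copy so the caller's mapping is untouched
    picks = []
    while pool and len(picks) < count:
        names = list(pool)
        pick = rng.choices(names, weights=[pool[n] for n in names], k=1)[0]
        picks.append(pick)
        del pool[pick]  # removing the winner guarantees no duplicates
    return picks

def build_guard_list(guards, quotas, rng):
    """guards: {name: (bandwidth, is_dystopic)}; quotas: (random, dystopic, utopic)."""
    random_count, dystopic_count, utopic_count = quotas
    # Random share: drawn from all guards, ignoring which ports they listen on.
    everyone = {name: bw for name, (bw, _) in guards.items()}
    guard_list = weighted_pick_distinct(everyone, random_count, rng)
    # Dystopic (80/443) share first, then utopic, skipping guards already chosen.
    for want_dystopic, count in ((True, dystopic_count), (False, utopic_count)):
        pool = {name: bw for name, (bw, dys) in guards.items()
                if dys == want_dystopic and name not in guard_list}
        guard_list += weighted_pick_distinct(pool, count, rng)
    return guard_list
```

With quotas of (20, 16, 44) and enough guards in each category, this returns 80 distinct guards, at least 16 of which listen on 80/443.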
For step 2, we also need a maximum retry amount. Something like:
- try once every 20 minutes, maximum 15 retries.
- after that, try once every 1 hour, maximum 7 retries.
- after that, try once every 6 hours, maximum 3 retries.
- try one last time after 24 hours. Remove the guard permanently from PRIMARY_GUARDS if still unavailable.
* Counters should reset after each successful connection and start from 0. If Tor was shut down and the timestamp of last retry is > 48 hours, reset counters to 0.
This will give us a bit over 2 days' worth of retries (5 + 7 + 18 + 24 = 54 hours). Increase the maximum retries if you think we should insist more.
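
The retry schedule can be written down as a small table plus a lookup. This is only a sketch under the numbers suggested above; `RETRY_SCHEDULE`, `next_retry_delay`, `total_retry_window` and `should_reset_counters` are hypothetical names, and the counter is assumed to count failed attempts since the last successful connection.

```python
from datetime import datetime, timedelta

# (interval between attempts, maximum retries at that interval)
RETRY_SCHEDULE = [
    (timedelta(minutes=20), 15),
    (timedelta(hours=1), 7),
    (timedelta(hours=6), 3),
    (timedelta(hours=24), 1),
]

def next_retry_delay(failed_attempts):
    """Delay before the next retry, or None once the guard should be removed."""
    remaining = failed_attempts
    for interval, max_retries in RETRY_SCHEDULE:
        if remaining < max_retries:
            return interval
        remaining -= max_retries
    return None  # schedule exhausted: drop the guard from PRIMARY_GUARDS

def total_retry_window():
    """Total time covered by the schedule if every attempt fails."""
    return sum((interval * n for interval, n in RETRY_SCHEDULE), timedelta())

def should_reset_counters(connected_ok, last_retry, now):
    """Reset on success, or if the last retry is more than 48 hours old."""
    return connected_ok or (now - last_retry) > timedelta(hours=48)

print(total_retry_window())  # 2 days, 6:00:00
```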
If a Guard has been offline for > 24 hours, it probably won't have the Guard flag when it comes back, so we need to make an exception here and still use it if it was our guard before. Should we get rid of it if the guard flag is not regained after reasonable uptime?
On 10/30/2015 6:12 PM, George Kadianakis wrote:
It's interesting that these two sets DYSTOPIC_GUARDS and UTOPIC_GUARDS are disjoint. This means that there will be no 80/443 relays on the UTOPIC guardlist. This means that 80/443 guards will only be used by people under FascistFirewall; which makes them a bit like bridges in a way, and also has significant effects on our load balancing.
Are we sure we want these two sets to be disjoint?
I could imagine an alternative design where we still have the two guard lists, but we also allow 80/443 relays to be in UTOPIC_GUARDS. If you were lucky and you had 80/443 guards in your UTOPIC guard list you don't need to go down to the DYSTOPIC guard list. Or something like this.
I don't entirely understand why we prefer a hash ring over a simple list here for sampling guards. I was imagining that when we need a new guard, we would just put all guards in a list, and sample a random guard weighted by bandwidth. I think this is what the current code is doing in smartlist_choose_node_by_bandwidth_weights() and it seems easy!
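
For comparison, the "simple list" approach amounts to something like the following generic sketch. It is not a transcription of Tor's C code in smartlist_choose_node_by_bandwidth_weights(); the function name and the (name, bandwidth) input format are assumptions, and bandwidths are taken to be non-negative integers with a positive total.

```python
import bisect
import itertools
import random

def choose_by_bandwidth(guards, rng):
    """Pick one guard from a list of (name, bandwidth) pairs, weighted by bandwidth."""
    # Running totals: guard i owns the integer range [cumulative[i-1], cumulative[i]).
    cumulative = list(itertools.accumulate(bw for _, bw in guards))
    point = rng.randrange(cumulative[-1])  # uniform integer in [0, total)
    index = bisect.bisect_right(cumulative, point)
    return guards[index][0]


rng = random.Random()
print(choose_by_bandwidth([("relayA", 100), ("relayB", 300)], rng))
```

A guard with three times the bandwidth owns three times the range, so it is picked three times as often; a guard with zero bandwidth owns an empty range and is never picked.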