George Kadianakis:
Hello Mike,
I'm finally getting out of the prop224/microdescriptor bug pile, and getting more time to start working on guard stuff like prop247 again.
I'm planning to spend a few days next week to regain knowledge on prop247. I'll check out the notes from the Wilmington hackfest, re-read my old simulator's code, etc.
I was not involved with prop271, so I am not deeply familiar with it. However, it has several things we do not need. In particular, the plan for prop247 is still to treat consensus information as the official notion of vanguard reachability, so there is no need to try to determine censorship, firewall, or local network reachability information. If a node is in the consensus, it stays in our vanguard set and does not get replaced until it actually leaves the consensus. This is consistent with how the consensus is currently used for interior hops, and mitigates path bias attacks.
I have not thought hard about what to do about nodes that leave the consensus while they are still in our vanguard sets and rejoin later. I am thinking that the simplest solution is to just pick a new node to replace them and not worry about it.
If it is dead-simple to use only the consensus uptime portions of prop271 without the reachability code, I could be convinced of that. But as it is, the rotation times do not need to be as long as those for guards, and the implementation simplification here is attractive. Plus, nodes that fall completely out of the consensus periodically like this are probably bad choices anyway.
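As a rough illustration of the replacement rule described above, here is a minimal Python sketch. It is not Tor's implementation: the function name, the use of plain strings as relay identifiers, and the uniform random choice of replacements are all assumptions made for the example (real selection would presumably be bandwidth-weighted).

```python
import random


def refresh_vanguards(vanguards, consensus, rng=random):
    """Keep every vanguard still listed in the consensus; replace the rest.

    `vanguards` is a list of relay identifiers and `consensus` is a set of
    the identifiers in the current consensus. Both are hypothetical
    stand-ins for Tor's real data structures.
    """
    # A vanguard stays for as long as the consensus lists it.
    kept = [v for v in vanguards if v in consensus]

    # Relays that left the consensus are simply replaced. If one of them
    # rejoins later, it is treated like any other candidate.
    candidates = sorted(consensus - set(kept))
    missing = len(vanguards) - len(kept)

    # Uniform sampling is a simplification made for this sketch.
    replacements = rng.sample(candidates, min(missing, len(candidates)))
    return kept + replacements
```

Note that no reachability probing appears anywhere in the sketch: membership in the consensus is the only input, which is the simplification argued for above.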
What do you think?
I know you have thought more about prop247 over the past months, and it would be great if you could brief me on any updates that I should know about. Specifically, I'm wondering if you have any new insights on how the proposed prop247 changes interact with Tor's guard algorithm (prop271)?
Also, is there anything else I should know about from your work on the performance simulator? Perhaps ideas about performance, topology or path restrictions?
Yes. I have decided to simplify everything as much as possible. I am going with a mesh topology for the prop247 performance tests (via https://bugs.torproject.org/13837, https://bugs.torproject.org/23101 and https://bugs.torproject.org/24487). That is the simplest option to implement and test for performance, and intuitively seems to have almost as good security properties as the bin version (unless your security simulator tells us otherwise).
I am also aiming for these high-level design goals, most important first:
0. All service-side circuits use 3 hops of vanguards.

1. Hidden services should avoid trivially disclosing their third vanguard to a non-network adversary (i.e. one that is not running nodes but that is watching either HSDESCS or connecting to the service). This means their paths look like this:

     S - G - L2 - L3 - HSDIR
     S - G - L2 - L3 - Intro
     S - G - L2 - L3 - M - Rend

2. Clients should avoid revealing their third vanguard hop to services and to nodes that have information about which service they are accessing. This means that their paths look like this:

     C - G - L2 - L3 - M - HSDIR
     C - G - L2 - L3 - M - Intro
     C - G - L2 - L3 - Rend

3. Clients use 3 hops of vanguards for all hidden service circuits.
If we do all of these, it will mean that we will have long path lengths (8-hop rends), but it also means that it is easy to reason about linkability and information disclosure. My thinking is that we should do the performance tests with the safest option first (i.e. all of these goals), see exactly how bad it is, and then make compromises if performance turns out to be much worse than the status quo.
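The 8-hop figure can be checked by joining the client-side and service-side rendezvous paths listed above. The following Python sketch does only that arithmetic; the function name and the list representation of paths are assumptions made for illustration.

```python
# Rendezvous paths as listed in the design goals (relays only, endpoints omitted).
SERVICE_REND_PATH = ["G", "L2", "L3", "M", "Rend"]
CLIENT_REND_PATH = ["G", "L2", "L3", "Rend"]


def joined_hops(client_path, service_path):
    """Count the relays between client and service on a joined circuit.

    Both paths must end at the same meeting relay, which is counted once.
    """
    if client_path[-1] != service_path[-1]:
        raise ValueError("paths do not end at the same relay")
    return len(client_path) + len(service_path) - 1


print(joined_hops(CLIENT_REND_PATH, SERVICE_REND_PATH))  # prints 8
```

The client contributes 4 relays and the service 5, with the rendezvous point shared between them, which gives 4 + 5 - 1 = 8.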
In the event of bad performance, I would alter property #3 before messing with property #2, and alter #2 before property #1, but I could be talked into a different strategy, or driven to one based on data.
In terms of pre-building and cannibalization (https://bugs.torproject.org/23101), for vanguard-enabled clients, I am going with the plan to create a special HS_GENERAL pre-built circuit set. HS_GENERAL circuits will be four hops long (3 vanguards plus a random middle), and will be used for all vanguard circuits except for service-side INTRO circuits (since those are already long-lived and pre-built, and don't need the extra middle). I have an implementation of this and have tested it lightly -- it seems to work.
One additional wrinkle is that we will need to reverse our path selection order, so that we do not leak information about earlier vanguards to later hops in the path. This is https://bugs.torproject.org/24487. For now, so I can have more of an apples-to-apples comparison in terms of vanguard set sizes, I simply allow the same vanguard to appear in multiple positions in the circuit, if the prototype is enabled. I do hope to get #24487 done for 0.3.3, though.
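To make the shape of an HS_GENERAL circuit concrete, here is a minimal Python sketch of picking its four hops under the mesh topology (any L2 may be paired with any L3). It is only an illustration, not the prototype's code: the function name, the arguments, and the uniform random choice are assumptions. Like the prototype as described above, it applies no exclusion between positions, so the same relay may appear more than once.

```python
import random


def build_hs_general_path(guard, l2_set, l3_set, middles, rng=random):
    """Return a 4-hop path: entry guard, one L2, one L3, and a random middle.

    All arguments are hypothetical stand-ins: `guard` is a relay
    identifier, the other three are non-empty lists of identifiers.
    """
    # Mesh topology: L2 and L3 are drawn independently from their sets.
    l2 = rng.choice(l2_set)
    l3 = rng.choice(l3_set)

    # The extra middle hop is drawn from the general relay pool.
    # No check prevents it from matching an earlier hop in this sketch.
    middle = rng.choice(middles)
    return [guard, l2, l3, middle]
```

A service-side INTRO circuit would skip the final middle hop, since it is exempt from HS_GENERAL as noted above.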
I have not written up the set of performance experiments I intend to run yet, but at a high level I want to measure two things for a few different L2 and L3 guard set sizes:
A. How does the average performance compare to existing onionperf data at https://metrics.torproject.org/torperf.html?

B. What is the variance over time in performance with a fixed entry guard, as the L2 and L3 guards rotate? Is the variance measurably different than what happens on onionperf?
#A here will tell us if our paths are too long and seriously impact average performance, meaning we have to revisit goals #0-3.
#B will tell us how much a really bad L2 or L3 set can impact performance, and how often that happens. I expect that as we increase L2 and L3 sizes, variance in performance will go down, until we hit diminishing returns. The goal is to find the sweet spot: L2 and L3 sets as small as possible while keeping variance in performance as low as possible.
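One way the sweet-spot search could be framed once measurements exist is sketched below in Python. Everything here is an assumption made for illustration: the function names, the input format (a dict mapping an `(l2_size, l3_size)` pair to a list of measured times), the use of population variance, and ranking "smallest" by the sum of the two set sizes.

```python
from statistics import mean, pvariance


def summarize(samples_by_size):
    """Map each (l2_size, l3_size) pair to the (mean, variance) of its samples."""
    return {size: (mean(s), pvariance(s)) for size, s in samples_by_size.items()}


def smallest_within(summary, max_variance):
    """Return the smallest set sizes whose variance is at most max_variance.

    "Smallest" means the lowest l2_size + l3_size, with ties broken by the
    pair itself. Returns None if no configuration qualifies.
    """
    ok = [size for size, (_, var) in summary.items() if var <= max_variance]
    if not ok:
        return None
    return min(ok, key=lambda size: (size[0] + size[1], size))
```

The variance threshold itself is the open question: it would have to come from comparing against the onionperf baseline in #B rather than being picked in advance.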
It would be great if your security simulator can tell us which L2 and L3 values are worth considering, so I can gather more useful (and more detailed) performance data with fewer experiments.
I think that is it for now. As far as implementation goes, I am doing my best to keep https://trac.torproject.org/projects/tor/wiki/org/sponsors/SponsorV up to date and stick with that timetable.
This means I want to merge all of the torrc options needed for the performance tests into 0.3.3 (by mid January), so that hidden service operators have the option of using the performance test controller to get vanguard behavior if they want. My assumption here is that we basically can all agree on the high level approach, and all agree it is an improvement over the status quo, but we will want the extra time to actually make specific parameter choices and decide if we need to or want to live with shorter paths for some scenarios.