Hi Neel,

Thanks for this proposal.

On 26 Jun 2019, at 11:15, neel@neelc.org wrote:

I have a new proposal: A Tor Implementation of IPv6 Happy Eyeballs

This is to implement Tor IPv6 Happy Eyeballs and acts as an alternative to Prop299 as requested here: https://trac.torproject.org/projects/tor/ticket/29801

The GitHub pull request is here: https://github.com/torproject/torspec/pull/87

Here's the proposal content, with my comments:

Filename: 306-ipv6-happy-eyeballs.txt

Title: A Tor Implementation of IPv6 Happy Eyeballs

Author: Neel Chauhan

Created: 25-Jun-2019

Supersedes: 299

Status: Open

Ticket: https://trac.torproject.org/projects/tor/ticket/29801


1. Introduction


   As IPv4 address space becomes scarce, ISPs and organizations will deploy

   IPv6 in their networks. Right now, Tor clients connect to guards using

   IPv4 connectivity by default.


   When networks first transition to IPv6, both IPv4 and IPv6 will be enabled

   on most networks in a so-called "dual-stack" configuration. This avoids

   breaking existing IPv4-only applications while enabling IPv6 connectivity.

   However, IPv6 connectivity may be unreliable and clients should be able

   to connect to the guard using the most reliable technology, whether IPv4

   or IPv6.


   In ticket #27490, we introduced the option ClientAutoIPv6ORPort which

   lets a client randomly choose between IPv4 or IPv6. However, this

   random decision does not take into account unreliable connectivity

   or falling back to the competing IP version should one be unreliable

   or unavailable.


   One way to select between IPv4 and IPv6 on a dual-stack network is a

   so-called "Happy Eyeballs" algorithm as per RFC 8305. In one, a client

   attempts an IP family, whether IPv4 or IPv6. Should it work, the client

   sticks with the working IP family. Otherwise, the client attempts the

   opposing version. This means if a dual-stack client has both IPv4 and

   IPv6, and IPv6 is unreliable, the client uses IPv4, and vice versa.


   In Proposal 299, we attempted an IP fallback mechanism using failure

   counters and preferring IPv4 and IPv6 based on the state of the counters.

   However, Prop299 was not standard Happy Eyeballs and an alternative,

   standards-compliant proposal was requested in [P299-TRAC] to avoid issues

   from complexity caused by randomness.


   This proposal describes a Tor implementation of Happy Eyeballs and is

   intended as a successor to Proposal 299.


2. Address Selection


   To be able to handle Happy Eyeballs in Tor, we will need to modify the

   data structures used for connections to guards, namely the extend info

   structure.


   The extend info structure should contain both an IPv4 and an IPv6 address.

   This will allow us to try the IPv4 and the IPv6 addresses if both are

   available on a relay and the client is dual-stack.


   When parsing relay descriptors and filling in the extend info data

   structure, we need to fill in both the IPv4 and IPv6 address if they both

   are available. If only one family is available for a relay (IPv4 or IPv6),

   we should fill in the address for the available family and leave the opposing

   family null.

When we implement this feature in tor, it would be a good idea to call the
two addresses the "preferred" and "alternate" addresses. With this design,
the low-level connection code doesn't have to know about reachable
addresses, or IPv4/IPv6 preferences. It just has to try them in order.
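
To make the "preferred" and "alternate" idea concrete, here is a rough sketch in Python rather than tor's C. The names ExtendInfo and addresses_to_try, and the prefer/allow parameters, are made up for this example and are not tor's actual structures or options. It only shows how the policy decisions could be folded into an ordered list, so the connection code just walks the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExtendInfo:
    """Hypothetical stand-in for an extend info: one optional address per family."""
    ipv4: Optional[str] = None
    ipv6: Optional[str] = None


def addresses_to_try(info: ExtendInfo, prefer_ipv6: bool = False,
                     allow_ipv4: bool = True,
                     allow_ipv6: bool = True) -> List[str]:
    """Return [preferred, alternate], skipping missing or disallowed families."""
    v4 = info.ipv4 if allow_ipv4 else None
    v6 = info.ipv6 if allow_ipv6 else None
    ordered = [v6, v4] if prefer_ipv6 else [v4, v6]
    # A family that the relay lacks, or that the client may not use, is
    # simply left out, so the caller never sees a null entry.
    return [addr for addr in ordered if addr is not None]
```

With this shape, a relay that only has one address, or a client that only allows one family, yields a one-element list and no fallback timer is needed.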

3. Connecting To A Relay


   When a client connects to a guard using an extend info data structure, we

   should first check if there is an existing authenticated connection. If

   there is, we should use it.

Tor's code already does this check: we won't need to change it.

   If there is no existing authenticated connection for an extend info, we

   should attempt to connect using the first available, allowed, and preferred

   address. At the time of writing, this is IPv4.

That's not quite true: most clients use IPv4 by default, but they can be
configured to prefer IPv6, or only allow certain addresses. And bridge clients
automatically use IPv6 if they are configured with an IPv6 bridge.

   We should also schedule a timer for connecting using the other address

   should one be available and allowed, and the first attempted version

   fails. This timer should be longer than most clients' successful TLS

   authentication time. I propose that the timer is 15 seconds. The reason

   for this is to accommodate high-latency connections such as dial-up and

   satellite.

In the worst-case scenario, users see Tor Browser hang for 15 seconds
before it makes a successful connection. That's not acceptable.

Depending on their location, most tor clients authenticate to the first
hop within 0.5-1.5 seconds. So I suggest we use a 1.5 second delay:
https://metrics.torproject.org/onionperf-buildtimes.html

In RFC 8305, the default delay is 250 milliseconds, and the maximum
delay is 2 seconds. So 1.5 seconds is reasonable for TLS and tor link
authentication.
https://tools.ietf.org/html/rfc8305#section-8

(This delay will mainly affect initial bootstrap, because all of Tor's
other connections are pre-emptive, or re-used.)

A small number of clients may do wasted authentication.
That's ok. Tor already does multiple bootstrap and guard connections.
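
To illustrate the effect of the delay, here is a small back-of-the-envelope model in Python. It is only a sketch: the function time_to_connect and its outcome encoding are invented for this email, and it assumes the alternate attempt starts when the timer fires or when the preferred attempt fails, whichever comes first.

```python
from typing import Optional, Tuple

# (succeeds, seconds until the attempt finishes). None models a blackholed
# attempt that never completes in the time we care about.
Outcome = Optional[Tuple[bool, float]]


def time_to_connect(preferred: Outcome, alternate: Outcome,
                    delay: float) -> Optional[float]:
    """Seconds until the first successful authentication, or None if none succeeds."""
    # The alternate attempt starts when the timer fires, or earlier if the
    # preferred attempt has already failed.
    alt_start = delay
    if preferred is not None and not preferred[0]:
        alt_start = min(delay, preferred[1])

    finish_times = []
    if preferred is not None and preferred[0]:
        finish_times.append(preferred[1])
    if alternate is not None and alternate[0]:
        finish_times.append(alt_start + alternate[1])
    return min(finish_times) if finish_times else None
```

With a blackholed preferred address and an alternate that authenticates in 1.0 seconds, time_to_connect(None, (True, 1.0), 15.0) gives 16.0, while a 1.5 second delay gives 2.5. If the preferred attempt fails immediately in the kernel, the delay does not matter and the result is 1.0 either way.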

We have talked about this design in the team over the last few months.
Our key insights are that:
* TCP connections are cheap, but TLS is expensive
* most failed TCP connections fail immediately in the kernel, some
  fail quickly with a response from the router, and others are blackholed
  and time out
* it's unlikely that a client will fail to authenticate to a relay over one
  IP version, but succeed over the other IP version, because the directory
  authorities authenticate to each relay when they check reachability
* some censorship systems only break authentication over IPv4,
  but they are rare

So here are some alternative designs:

1. Tor connects to the preferred address and tries to authenticate.
   On failure, or after a 1.5 second delay, it connects to the alternate address
   and tries to authenticate.
   On the first successful authentication, it closes the other connection.

This design places the least connection load on the network, but might add
a bit of extra TLS load.

2. Tor connects via TCP to the preferred address.
   On failure, or after a 250 ms delay, it connects via TCP to the alternate
   address.
   On the first TCP success, tor attempts to authenticate immediately.
   On authentication failure, or after a 1.5 s delay, tor attempts to
   authenticate over the second TCP connection.
   On the first successful authentication, it closes the other connection.

This design is the most reliable for clients, but it also puts a bit more
connection load on dual-stack guards and authorities.

3. Tor connects via TCP to the preferred address.
   On failure, or after a 250ms delay, it connects via TCP to the alternate
   address.
   On the first TCP success, tor attempts to authenticate, and closes the
   other connection.

This design looks similar to a web browser's implementation of Happy
Eyeballs, because it closely follows the RFC. That might help hide tor
from censors. It adds some extra connection load, but no extra TLS load.

I suggest that we put all 3 alternative designs in the proposal, but start
by implementing and testing alternative 1.
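
For discussion, alternative 1 could be sketched roughly like this, using Python's asyncio instead of tor's event loop. Everything here is an assumption for illustration: connect_design_1 is a made-up name, and the connect callable stands for "open a connection and authenticate", returning a connection object or raising on failure. It is not how tor's C code is structured.

```python
import asyncio
from typing import Any, Callable, Coroutine, Optional, Sequence

# Hypothetical: connects *and* authenticates to one address. Any exception
# raised by it counts as a failed attempt.
Connect = Callable[[str], Coroutine[Any, Any, Any]]


async def connect_design_1(addresses: Sequence[str], connect: Connect,
                           delay: float = 1.5) -> Optional[Any]:
    """Try addresses in order, starting the next on failure or after `delay`.

    Returns the first authenticated connection, or None if all attempts fail.
    """
    remaining = list(addresses)
    pending = set()
    try:
        while remaining or pending:
            if remaining:
                pending.add(asyncio.create_task(connect(remaining.pop(0))))
            # Only run the fallback timer while there is another address
            # left to start; otherwise just wait for the attempts in flight.
            timeout = delay if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout,
                return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
        # Every allowed address failed: the caller can move to another guard.
        return None
    finally:
        # On success (or on error), close the other in-progress attempt.
        for task in pending:
            task.cancel()
```

Given a list with the preferred address first, the caller either gets a connection back or learns that the relay failed on all addresses. Alternatives 2 and 3 would need separate TCP and authentication steps, which this sketch does not model.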

When we implement this code, let's put the happy eyeballs part in a
separate module, as much as possible. That helps us review the code,
and make sure it has good test coverage. It also stops existing files and
functions getting too big.

4. Handling Connection Successes And Failures


   Should a connection to a guard succeed and be authenticated via TLS, we

   can then use the connection. In this case, we should cancel all other

   connection timers and in-progress connections. Cancelling the timers

   ensures we don't attempt unnecessary new connections once our existing

   connection is successful, which avoids denial-of-service risks.


   However, if we fail all available and allowed connections, we should tell

   the rest of Tor that the connection has failed. This is so we can attempt

   another guard relay.


5. Acknowledgments


   Thank you so much to teor for the discussion of the happy eyeballs proposal.

   I wouldn't have been able to do this had it not been for your help.


6. Appendix

   [P299-TRAC]: https://trac.torproject.org/projects/tor/ticket/29801

