Hi Neel,
Thanks for this proposal.
On 26 Jun 2019, at 11:15, neel@neelc.org wrote:
I have a new proposal: A Tor Implementation of IPv6 Happy Eyeballs
This is to implement Tor IPv6 Happy Eyeballs and acts as an alternative to Prop299 as requested here: https://trac.torproject.org/projects/tor/ticket/29801
The GitHub pull request is here: https://github.com/torproject/torspec/pull/87
Here's the proposal content, with my comments:
Filename: 306-ipv6-happy-eyeballs.txt
Title: A Tor Implementation of IPv6 Happy Eyeballs
Author: Neel Chauhan
Created: 25-Jun-2019
Supersedes: 299
Status: Open
Ticket: https://trac.torproject.org/projects/tor/ticket/29801
Introduction
As IPv4 address space becomes scarce, ISPs and organizations will deploy
IPv6 in their networks. Right now, Tor clients connect to guards using
IPv4 connectivity by default.
When networks first transition to IPv6, both IPv4 and IPv6 will be enabled
on most networks in a so-called "dual-stack" configuration. This is to not
break existing IPv4-only applications while enabling IPv6 connectivity.
However, IPv6 connectivity may be unreliable and clients should be able
to connect to the guard using the most reliable technology, whether IPv4
or IPv6.
In ticket #27490, we introduced the option ClientAutoIPv6ORPort which
lets a client randomly choose between IPv4 or IPv6. However, this
random decision does not take into account unreliable connectivity
or falling back to the competing IP version should one be unreliable
or unavailable.
One way to select between IPv4 and IPv6 on a dual-stack network is a
so-called "Happy Eyeballs" algorithm as per RFC 8305. In one, a client
attempts an IP family, whether IPv4 or IPv6. Should it work, the client
sticks with the working IP family. Otherwise, the client attempts the
opposing version. This means if a dual-stack client has both IPv4 and
IPv6, and IPv6 is unreliable, the client uses IPv4, and vice versa.
In Proposal 299, we attempted an IP fallback mechanism using failure
counters and preferring IPv4 and IPv6 based on the state of the counters.
However, Prop299 was not standard Happy Eyeballs and an alternative,
standards-compliant proposal was requested in [P299-TRAC] to avoid issues
from complexity caused by randomness.
This proposal describes a Tor implementation of Happy Eyeballs and is
intended as a successor to Proposal 299.
Address Selection
To be able to handle Happy Eyeballs in Tor, we will need to modify the
data structures used for connections to guards, namely the extend info
structure.
The extend info structure should contain both an IPv4 and an IPv6 address.
This will allow us to try both the IPv4 and the IPv6 address if both are
available on a relay and the client is dual-stack.
When parsing relay descriptors and filling in the extend info data
structure, we need to fill in both the IPv4 and IPv6 address if they both
are available. If only one family is available for a relay (IPv4 or IPv6),
we should fill in the address for the available family and leave the opposing
family null.
When we implement this feature in tor, it would be a good idea to call the two addresses the "preferred" and "alternate" addresses. With this design, the low-level connection code doesn't have to know about reachable addresses, or IPv4/IPv6 preferences. It just has to try them in order.
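Here is a minimal sketch of that idea, in Python rather than tor's C. The names ExtendInfoSketch, make_extend_info, and prefer_ipv6 are hypothetical and are not tor's real structures or options. It only shows how the preference decision can be made once, when the structure is filled in, so that the connection code just walks the addresses in order.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address, ip_address
from typing import List, Optional, Union

Address = Union[IPv4Address, IPv6Address]


@dataclass
class ExtendInfoSketch:
    """Hypothetical stand-in for tor's extend info; not the real struct."""
    preferred: Optional[Address]
    alternate: Optional[Address]

    def addresses_in_order(self) -> List[Address]:
        # The low-level connection code only needs this ordered list.
        return [a for a in (self.preferred, self.alternate) if a is not None]


def make_extend_info(ipv4: Optional[str], ipv6: Optional[str],
                     prefer_ipv6: bool = False) -> ExtendInfoSketch:
    """Order a relay's addresses by client preference (assumed policy)."""
    v4 = ip_address(ipv4) if ipv4 else None
    v6 = ip_address(ipv6) if ipv6 else None
    first, second = (v6, v4) if prefer_ipv6 else (v4, v6)
    if first is None:
        # Only one family is available: it becomes the preferred address,
        # and the alternate is left empty.
        first, second = second, None
    return ExtendInfoSketch(preferred=first, alternate=second)
```

A real implementation would also have to apply the client's reachable-address restrictions before filling in either field. This sketch leaves that out.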
Connecting To A Relay
When a client connects to a guard using an extend info data structure, we
should first check if there is an existing authenticated connection. If
there is, we should use it.
Tor's code already does this check: we won't need to change it.
If there is no existing authenticated connection for an extend info, we
should attempt to connect using the first available, allowed, and preferred
address. At the time of writing, this is IPv4.
That's not quite true: most clients use IPv4 by default, but they can be configured to prefer IPv6, or only allow certain addresses. And bridge clients automatically use IPv6 if they are configured with an IPv6 bridge.
We should also schedule a timer for connecting using the other address
should one be available and allowed, and the first attempted version
fails. This should be higher than most clients' successful TLS
authentication time. I propose that the timer is 15 seconds. The reason
for this is to accommodate high-latency connections such as dial-up and
satellite.
In the worst case scenario, users see Tor Browser hang for 15 seconds before it makes a successful connection. That's not acceptable.
Depending on their location, most tor clients authenticate to the first hop within 0.5-1.5 seconds. So I suggest we use a 1.5 second delay: https://metrics.torproject.org/onionperf-buildtimes.html
In RFC 8305, the default delay is 250 milliseconds, and the maximum delay is 2 seconds. So 1.5 seconds is reasonable for TLS and tor link authentication. https://tools.ietf.org/html/rfc8305#section-8
(This delay will mainly affect initial bootstrap, because all of Tor's other connections are pre-emptive, or re-used.)
A small number of clients may do wasted authentication. That's ok. Tor already does multiple bootstrap and guard connections.
We have talked about this design in the team over the last few months. Our key insights are that:
* TCP connections are cheap, but TLS is expensive
* most failed TCP connections fail immediately in the kernel, some fail quickly with a response from the router, and others are blackholed and time out
* it's unlikely that a client will fail to authenticate to a relay over one IP version, but succeed over the other IP version, because the directory authorities authenticate to each relay when they check reachability
* some censorship systems only break authentication over IPv4, but they are rare
So here are some alternative designs:
1. Tor connects to the preferred address and tries to authenticate. On failure, or after a 1.5 second delay, it connects to the alternate address and tries to authenticate. On the first successful authentication, it closes the other connection.
This design places the least connection load on the network, but might add a bit of extra TLS load.
2. Tor connects via TCP to the preferred address. On failure, or after a 250 ms delay, it connects via TCP to the alternate address. On the first TCP success, tor attempts to authenticate immediately. On authentication failure, or after a 1.5 s delay, tor attempts to authenticate over the second TCP connection. On the first successful authentication, it closes the other connection.
This design is the most reliable for clients, but it also puts a bit more connection load on dual-stack guards and authorities.
3. Tor connects via TCP to the preferred address. On failure, or after a 250 ms delay, it connects via TCP to the alternate address. On the first TCP success, tor attempts to authenticate, and closes the other connection.
This design looks similar to a web browser's implementation of Happy Eyeballs, because it closely follows the RFC. That might help hide tor from censors. It adds some extra connection load, but no extra TLS load.
I suggest that we put all 3 alternative designs in the proposal, but start by implementing and testing alternative 1.
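To make alternative 1 concrete, here is a rough sketch using Python's asyncio. It is not tor code: happy_eyeballs and connect_and_auth are hypothetical names, and connect_and_auth stands in for "open a TCP connection and complete TLS and link authentication", raising an exception on failure. The 1.5 second default is the delay suggested above.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

ATTEMPT_DELAY = 1.5  # seconds; the delay suggested for alternative 1


async def happy_eyeballs(
    addresses: Sequence[str],
    connect_and_auth: Callable[[str], Awaitable[object]],
    delay: float = ATTEMPT_DELAY,
) -> object:
    """Try addresses in order, staggered by `delay`; first success wins."""
    remaining = list(addresses)
    pending = set()
    last_error: Optional[BaseException] = None
    try:
        while remaining or pending:
            if remaining:
                # Start the next attempt (preferred first, then alternate).
                attempt = connect_and_auth(remaining.pop(0))
                pending.add(asyncio.ensure_future(attempt))
            # Only use a timeout while there is another address left to try.
            timeout = delay if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout,
                return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
                last_error = task.exception()
        # Every available address failed: report the failure to the caller.
        raise ConnectionError("all addresses failed") from last_error
    finally:
        # Cancel any attempt that is still in progress.
        for task in pending:
            task.cancel()
```

The caller passes the addresses in preferred-then-alternate order. If both attempts happen to succeed at the same instant, this sketch returns one and does not close the other; real code would need to close it explicitly.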
When we implement this code, let's put the happy eyeballs part in a separate module, as much as possible. That helps us review the code, and make sure it has good test coverage. It also stops existing files and functions getting too big.
Handling Connection Successes And Failures
Should a connection to a guard succeed and be authenticated via TLS, we
can then use the connection. In this case, we should cancel all other
connection timers and in-progress connections. Cancelling the timers is
so we don't attempt new unnecessary connections when our existing
connection is successful, preventing denial-of-service risks.
However, if we fail all available and allowed connections, we should tell
the rest of Tor that the connection has failed. This is so we can attempt
another guard relay.
Acknowledgments
Thank you so much to teor for the discussion of the happy eyeballs proposal.
I wouldn't have been able to do this had it not been for your help.
Appendix
T