Hello list,
there has been lots of discussions about improving onion service availability under DoS conditions. Many approaches have been proposed [OOO] but only a few have been tried and even fewer show any real improvements to the availability of the service.
An approach that we've been considering is the use of anonymous credentials as a way to prioritize good clients from bad clients. The idea is that the service gives tokens to clients it believes to be good, and prioritizes client with tokens over clients without tokens whenever possible. This is a post to start a discussion of how such approaches could work and whether they are worth pursuing futher.
== Preliminaries ==
=== When should the access control take place? ===
Doing DoS defenses with anon credentials is all about enforcing access control at the right point of the protocol so that the amplification factor of evil clients gets cut as early as possible.
Very roughly the phases of the onion service protocol are: descriptor fetch phase, intro phase, rendezvous phase. Let's see how those look like for the purposes of access control:
- Doing the access control during the descriptor fetch stage is something worth considering because it's the first phase of the protocol and hence the earliest and best place to soak up any damage from evil clients. There is already a form of optional access control implemented here called "client authorization" and it's worth thinking of what's lacking to make it useful against DoS attackers. I'm gonna address this in section [CLIENTAUTH].
- Doing the access control during the introduction phase is another fruitful approach. Blocking bad clients during introduction means that they dont get to force the service to create a costly rendezvous circuit, and since services have a long-term circuit open towards the intro points it makes it easier for services to pass access control related data to the intro point. This is actually the approach we are gonna be talking most about in this post.
- Finally, doing the access control during the rendezvous phase is way too late since by that time the onion service has already spent lots of resources catering the evil client, so let's ignore that.
=== Entities of an anonymous credential system ===
Anonymous credential systems traditionally have three entities that concern us:
- The Issuer: the entity who issues the credentials/tokens - The Prover: the entity who collects tokens and uses them to get access - The Verifier: the entity who verifies that tokens are legit and grants/restricts access
In the world of onion services, the Issuer is naturally the onion service, and the Prover is the onion service client. The Verifier could either be the onion service itself or its introduction points. We will see below how this could work and the relevant tradeoffs.
+--------+ +------------+ +--------------------+ | Client |<-+-+-+-->|Intro point |<--+---+-->|Onion service | |(Prover)| |(Verifier?) | |(Issuer)(Verifier?) | +--------+ +------------+ +--------------------+
=== How do tokens get around? ===
A main question here is "How do good clients end up with tokens?". For the purposes of this post, we will assume that clients get these tokens in an out of band fashion. For example, a journalist can give tokens to her sources over Signal so they can use them with Securedrop. Or a forum operator can give tokens to old-time members of the forum to be used during a DoS.
A natural chicken-and-egg problem occurs here since how is an onion service supposed to give tokens to its users if it's unreachable because of a DoS? We realize this is a big problem and we are not sure exactly how to solve it. This problem naturally limits the use of anonymous credential solutions, and sorta makes them a second-layer of defense since it assumes a first-layer of defense that allows operators to pass tokens to the good people. A first-layer approach here could perhaps look like PrivacyPass where users get tokens by solving CAPTCHAs.
== Anonymous credentials ==
By surveying the anonymous credential literature we have found various types of anonymous credential schemes that are relevant for us:
- Discrete-logarithm-based credentials based on blind signatures:
This is a class of anon credential schemes that allow us to separate the verifier from the issuer. In particular this means that we can have the service issue the tokens, but the introduction point being the verifier.
They are usually based on blind signatures like in the case of Microsoft's U-Prove system [UUU].
- Discrete-logarithm-based credentials based on OPRF:
Another approach here is to use OPRF constructions based on the discrete logarithm problem to create an anonymous credential scheme like in the case of PrivacyPass [PPP]. The downside, IIUC, is that in PrivacyPass you can't have a different issuer and verifier so it's the onion service that needs to do the token verification restricting the damage soaking potential.
- KVAC (Keyed-Verification Anonymous Credentials):
KVAC-based credentails have comparable performance to Dlog-based credentials, and are also easier to create security proofs for due to them being based on symmetric primitives like algebraic MAC constructions [FFF]. The downside is that they assume that the verifier and the issuer is the same entity (the onion service). KVACs are used by Signal for protecting their group chat metadata [GGG].
Which construction and scheme we choose depends on our constraints and the parameters we are trying to minmax. So let's try to explore these constraints and parameters some more.
== Space constraints ==
Let's assume that a client found some anonymous credentials for a service. How will clients present their credentials to the verifier, being either the service or the intro point?
The natural approach here is for the client to include their token as part of the INTRODUCE1 cell.
=== How much space is available in INTRODUCE1 cells? [SPACE_CONSTRAINTS] ===
From a brief experiment, I see that the INTRODUCE1 cell payload total size is
498 bytes, and we are already using 310 bytes from it, so that leaves about 188 bytes for use. The cell is extensible using cell extensions that can go either to the intro point [III] or to the service directly [EEE].
=== How big are anonymous credentials redemptions? ===
From looking at the literature it seems unlikely that those anonymous
credentials can be presented given the size limitations from above. In particular here is the size needed for presenting/redeeming an anonymous credential in the various schemes:
- Privacy Pass [PPP]: Credential redemption takes 396 bytes - Signal [GGG]: Their paper does not say exactly, but by peeking into their code it seems like presenting the credential takes about 620 bytes [CCC].
So if I'm not mistaken, it seems like we will need another way to present anonymous credentials to the service or intro point.
=== How can we trasmit these credentials then? ===
The unfortunate fact here is that presenting those credentials does not fit into the remaining space of an INTRODUCE1 cell, but it also wouldn't even fit if we created an entirely new relay cell for this purpose (say INTRO_REDEEM) since the relay cell payload is about 498 bytes and all the credentials above, except from PrivacyPass (!), are bigger than that.
This means that we will either need to use a space-compact anonymous credential scheme, or to implement wide relay cells as suggested in ticket #33650 (which seems like lots of work!).
== Computational constraints [COMPUTATIONAL_CONSTRAINTS] ==
Another big topic for consideration here is the computational resources required for the various operations of the scheme. In particular, how hard it is to issue such tokens, present such tokens and to verify such tokens. Different schemes have different computational profiles. I'm not gonna attempt to summarize the literature here because it will take me more time than I have right now.
However, it's clear that our main priority here is a lightweight verification procedure, so that services or intro points spend as little energy as possible validating (potentially evil) tokens. It's imperative that verifying tokens is orders of magnitude more lightweight than the regular job that a service would do if credentials did not exist (i.e. doing path selection, creating a rendezvous circuit, doing the crypto, etc.)
Our second priority here is a lighweight token issuance procedure, so that services don't spend too much time issuing tokens, especially if this is done in an automatic real-time manner that can be abused by an adversary.
And finally, we don't particularly care about the token presentation procedure being lightweight, since this is a procedure that the attacker will also have to do and hence we want to incur as much costs to them as possible.
== Discussion ==
==== What's lacking from HSv3 client authorization [CLIENTAUTH] ====
It's really important to understand what's lacking from descriptor-level client authorization [QQQ] with regards to DoS defences before we spend time and energy implementing any anonymous credential approach.
To use the descriptor-level client authorization feature for DoS protection the onion service operator creates a mirror of the website that is only reachable using descriptor-level client auth and then they pass client auth tokens to the clients they trust to be good natured. The above system can also be automated and turned into a web service.
However this system has a bunch of drawbacks:
- Given the implicit and anonymous nature of the descriptor-level client authorization system, if an evil actor gets their hand in a client auth token and they start DoSing the service, then the operator has no means to distinguish that client from any other client. Hence there is no way to identify individual bad clients and revoke them. This is the biggest issue with descriptor-based client authorization for the purposes of DoS defense.
- Managing the access list (adding and removing clients) in the descriptor-level client authorization system is an expensive process because it requires uploading a new set of descriptors. Doing this too often (in an automated way) can cause lots of network traffic and also perhaps race conditions with clients who fetch descriptors.
- Because the authentication happens by adding new lines on the descriptor, there is a fundamental limit on how many authed clients it can support. In particular, with some back-of-the-envelope calculations [ZZZ] an onion service can support a max of 250 authed clients this way. However, there is nothing stopping the operator from passing the same token to multiple clients and managing them as groups.
The above issues would be addressed with intro-level client authorization since the authentication is explicit and done in real-time.
It's worth thinking of the above drawbacks more and how we can bypass them to do something useful with the current system before we jump into new complexities and tradeoffs.
---
Thanks for reading and looking forward to your feedback :)
[OOO]: https://trac.torproject.org/projects/tor/ticket/31223 [ZZZ]: https://lists.torproject.org/pipermail/tor-dev/2016-November/011658.html [QQQ]: https://eprint.iacr.org/2013/516.pdf [PPP]: https://www.petsymposium.org/2018/files/papers/issue3/popets-2018-0026.pdf [UUU]: https://www.microsoft.com/en-us/research/project/u-prove/ [FFF]: https://eprint.iacr.org/2013/516.pdf [GGG]: https://signal.org/blog/signal-private-group-system/ [III]: https://github.com/torproject/torspec/blob/f81b1e6cc53b91abb3ae206807bc371fa... [EEE]: https://github.com/torproject/torspec/blob/f81b1e6cc53b91abb3ae206807bc371fa... [CCC]: https://github.com/signalapp/zkgroup/blob/4294e428216113d81f5c0cb50f578797d1...