Hello list,
there has been a lot of discussion about improving onion service availability under DoS conditions. Many approaches have been proposed [OOO], but only a few have been tried, and even fewer have shown any real improvement to the availability of the service.
An approach that we've been considering is the use of anonymous credentials as a way to distinguish good clients from bad ones. The idea is that the service gives tokens to clients it believes to be good, and prioritizes clients with tokens over clients without tokens whenever possible. This is a post to start a discussion of how such approaches could work and whether they are worth pursuing further.
== Preliminaries ==
=== When should the access control take place? ===
Doing DoS defenses with anon credentials is all about enforcing access control at the right point of the protocol so that the amplification factor of evil clients gets cut as early as possible.
Very roughly, the phases of the onion service protocol are: descriptor fetch phase, intro phase, rendezvous phase. Let's see how those look for the purposes of access control:
- Doing the access control during the descriptor fetch stage is something worth considering because it's the first phase of the protocol and hence the earliest and best place to soak up any damage from evil clients. There is already a form of optional access control implemented here called "client authorization" and it's worth thinking of what's lacking to make it useful against DoS attackers. I'm gonna address this in section [CLIENTAUTH].
- Doing the access control during the introduction phase is another fruitful approach. Blocking bad clients during introduction means that they don't get to force the service to create a costly rendezvous circuit, and since services have a long-term circuit open towards the intro points, it's easier for services to pass access-control data to the intro point. This is actually the approach we are gonna be talking most about in this post.
- Finally, doing the access control during the rendezvous phase is way too late, since by that time the onion service has already spent lots of resources catering to the evil client, so let's ignore that.
=== Entities of an anonymous credential system ===
Anonymous credential systems traditionally have three entities that concern us:
- The Issuer: the entity who issues the credentials/tokens
- The Prover: the entity who collects tokens and uses them to get access
- The Verifier: the entity who verifies that tokens are legit and grants/restricts access
In the world of onion services, the Issuer is naturally the onion service, and the Prover is the onion service client. The Verifier could either be the onion service itself or its introduction points. We will see below how this could work and the relevant tradeoffs.
+--------+          +------------+           +--------------------+
| Client |<-+-+-+-->|Intro point |<--+---+-->|Onion service       |
|(Prover)|          |(Verifier?) |           |(Issuer)(Verifier?) |
+--------+          +------------+           +--------------------+
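To make the roles concrete, here is a minimal keyed-verification sketch in Python. This is a toy, not an anonymous credential: tokens are linkable, and the Verifier must share the Issuer's key (the same limitation the KVAC schemes discussed further down have).

```python
# Toy keyed-verification token flow (NOT anonymous, NOT for production):
# Issuer mints tokens, Verifier checks them; here they share one key.
import hashlib
import hmac
import secrets

key = secrets.token_bytes(32)  # shared by Issuer and Verifier

def issue(key: bytes) -> bytes:
    """Issuer (onion service): mint a fresh token for a trusted client."""
    nonce = secrets.token_bytes(16)
    tag = hmac.new(key, nonce, hashlib.sha256).digest()
    return nonce + tag

def verify(key: bytes, token: bytes) -> bool:
    """Verifier (service or intro point): check that a token is legit."""
    nonce, tag = token[:16], token[16:]
    expected = hmac.new(key, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

token = issue(key)          # Prover (client) receives this out of band
assert verify(key, token)   # good token gets access
assert not verify(key, bytes(48))  # junk token gets rejected
```

Separating the Verifier from the Issuer (so the intro point can verify without holding the service's secret) is exactly what the asymmetric schemes below buy us.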
=== How do tokens get around? ===
A main question here is "How do good clients end up with tokens?". For the purposes of this post, we will assume that clients get these tokens in an out of band fashion. For example, a journalist can give tokens to her sources over Signal so they can use them with Securedrop. Or a forum operator can give tokens to old-time members of the forum to be used during a DoS.
A natural chicken-and-egg problem occurs here: how is an onion service supposed to give tokens to its users if it's unreachable because of a DoS? We realize this is a big problem and we are not sure exactly how to solve it. This problem naturally limits the use of anonymous credential solutions, and effectively makes them a second layer of defense, since it assumes a first layer of defense that allows operators to pass tokens to the good people. A first-layer approach here could perhaps look like PrivacyPass, where users get tokens by solving CAPTCHAs.
== Anonymous credentials ==
By surveying the anonymous credential literature we have found various types of anonymous credential schemes that are relevant for us:
- Discrete-logarithm-based credentials based on blind signatures:
This is a class of anon credential schemes that allow us to separate the verifier from the issuer. In particular this means that we can have the service issue the tokens, but the introduction point being the verifier.
They are usually based on blind signatures like in the case of Microsoft's U-Prove system [UUU].
- Discrete-logarithm-based credentials based on OPRF:
Another approach here is to use OPRF constructions based on the discrete logarithm problem to create an anonymous credential scheme, as in PrivacyPass [PPP]. The downside, IIUC, is that in PrivacyPass you can't have different issuer and verifier, so it's the onion service that needs to do the token verification, which restricts the damage-soaking potential.
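To illustrate why, here is a toy Python sketch of one PrivacyPass-style OPRF round over a schoolbook group mod a safe prime (real deployments use elliptic curves; the group parameters and hash-to-group below are illustrative only). Note that the final redemption check requires the secret key k, which is why the verifier can't be separated from the issuer:

```python
# Toy PrivacyPass-style OPRF round (illustrative parameters, NOT secure).
import hashlib
import secrets

p = 1019  # safe prime: p = 2q + 1
q = 509   # prime order of the quadratic-residue subgroup
g = 4     # generator of that subgroup

def hash_to_group(x: bytes) -> int:
    # Toy hash-to-group; real schemes use proper hash-to-curve methods.
    e = int.from_bytes(hashlib.sha256(x).digest(), "big") % q
    return pow(g, e, p)

k = secrets.randbelow(q - 1) + 1   # issuer's OPRF secret key

# Client: blind T = H(x) with a random r before sending it.
token_seed = b"client-token-seed"
r = secrets.randbelow(q - 1) + 1
T = hash_to_group(token_seed)
blinded = pow(T, r, p)

# Issuer (onion service): evaluate the PRF on the blinded point.
evaluated = pow(blinded, k, p)

# Client: unblind to obtain N = H(x)^k; the issuer never saw x.
N = pow(evaluated, pow(r, -1, q), p)

# Redemption: the verifier must recompute H(x)^k, which needs k itself --
# so whoever verifies must hold the issuer's secret key.
assert N == pow(hash_to_group(token_seed), k, p)
```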
- KVAC (Keyed-Verification Anonymous Credentials):
KVAC-based credentials have comparable performance to Dlog-based credentials, and are also easier to create security proofs for because they are based on symmetric primitives like algebraic MAC constructions [FFF]. The downside is that they assume that the verifier and the issuer are the same entity (the onion service). KVACs are used by Signal for protecting their group chat metadata [GGG].
Which construction and scheme we choose depends on our constraints and the parameters we are trying to minmax. So let's try to explore these constraints and parameters some more.
== Space constraints ==
Let's assume that a client found some anonymous credentials for a service. How will clients present their credentials to the verifier, being either the service or the intro point?
The natural approach here is for the client to include their token as part of the INTRODUCE1 cell.
=== How much space is available in INTRODUCE1 cells? [SPACE_CONSTRAINTS] ===
From a brief experiment, I see that the INTRODUCE1 cell payload total size is 498 bytes, and we are already using 310 bytes of it, which leaves about 188 bytes free. The cell is extensible using cell extensions that can go either to the intro point [III] or to the service directly [EEE].
=== How big are anonymous credentials redemptions? ===
From looking at the literature, it seems unlikely that these anonymous credentials can be presented given the size limitations above. In particular, here is the size needed for presenting/redeeming an anonymous credential in the various schemes:
- Privacy Pass [PPP]: credential redemption takes 396 bytes.
- Signal [GGG]: their paper does not say exactly, but by peeking into their code it seems like presenting the credential takes about 620 bytes [CCC].
So if I'm not mistaken, it seems like we will need another way to present anonymous credentials to the service or intro point.
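A quick sanity check of the arithmetic, using the cell figures and redemption sizes quoted above:

```python
# Compare the space left in an INTRODUCE1 cell (and in a hypothetical new
# relay cell) against the redemption sizes quoted above.
CELL_PAYLOAD = 498   # total INTRODUCE1 / relay cell payload, bytes
USED = 310           # bytes already consumed in INTRODUCE1
available = CELL_PAYLOAD - USED  # 188 bytes of slack

redemption_sizes = {"PrivacyPass": 396, "Signal KVAC": 620}

for scheme, size in redemption_sizes.items():
    fits_intro1 = size <= available       # fits in current INTRODUCE1 slack?
    fits_new_cell = size <= CELL_PAYLOAD  # fits in a whole fresh relay cell?
    print(f"{scheme}: {size} bytes, INTRODUCE1={fits_intro1}, new cell={fits_new_cell}")
# Neither redemption fits the INTRODUCE1 slack; only PrivacyPass would fit
# an entirely new relay cell.
```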
=== How can we transmit these credentials then? ===
The unfortunate fact here is that presenting those credentials does not fit into the remaining space of an INTRODUCE1 cell, but it also wouldn't fit even if we created an entirely new relay cell for this purpose (say INTRO_REDEEM), since the relay cell payload is about 498 bytes and all the credentials above, except for PrivacyPass (!), are bigger than that.
This means that we will either need to use a space-compact anonymous credential scheme, or to implement wide relay cells as suggested in ticket #33650 (which seems like lots of work!).
== Computational constraints [COMPUTATIONAL_CONSTRAINTS] ==
Another big topic for consideration here is the computational resources required for the various operations of the scheme; in particular, how hard it is to issue, present, and verify such tokens. Different schemes have different computational profiles. I'm not gonna attempt to summarize the literature here because it will take me more time than I have right now.
However, it's clear that our main priority here is a lightweight verification procedure, so that services or intro points spend as little energy as possible validating (potentially evil) tokens. It's imperative that verifying tokens is orders of magnitude more lightweight than the regular job that a service would do if credentials did not exist (i.e. doing path selection, creating a rendezvous circuit, doing the crypto, etc.)
Our second priority here is a lightweight token issuance procedure, so that services don't spend too much time issuing tokens, especially if this is done in an automatic real-time manner that can be abused by an adversary.
And finally, we don't particularly care about the token presentation procedure being lightweight, since this is a procedure that the attacker will also have to perform, and hence we want to impose as much cost on them as possible.
== Discussion ==
==== What's lacking from HSv3 client authorization [CLIENTAUTH] ====
It's really important to understand what's lacking from descriptor-level client authorization [QQQ] with regards to DoS defences before we spend time and energy implementing any anonymous credential approach.
To use the descriptor-level client authorization feature for DoS protection, the onion service operator creates a mirror of the site that is reachable only with descriptor-level client auth, and then passes client-auth tokens to the clients they trust to be good-natured. This system can also be automated and turned into a web service.
However this system has a bunch of drawbacks:
- Given the implicit and anonymous nature of the descriptor-level client authorization system, if an evil actor gets their hands on a client-auth token and starts DoSing the service, the operator has no means to distinguish that client from any other client. Hence there is no way to identify individual bad clients and revoke them. This is the biggest issue with descriptor-based client authorization for the purposes of DoS defense.
- Managing the access list (adding and removing clients) in the descriptor-level client authorization system is an expensive process because it requires uploading a new set of descriptors. Doing this too often (in an automated way) can cause lots of network traffic and also perhaps race conditions with clients who fetch descriptors.
- Because the authentication happens by adding new lines on the descriptor, there is a fundamental limit on how many authed clients it can support. In particular, with some back-of-the-envelope calculations [ZZZ] an onion service can support a max of 250 authed clients this way. However, there is nothing stopping the operator from passing the same token to multiple clients and managing them as groups.
The above issues would be addressed with intro-level client authorization since the authentication is explicit and done in real-time.
It's worth thinking of the above drawbacks more and how we can bypass them to do something useful with the current system before we jump into new complexities and tradeoffs.
---
Thanks for reading and looking forward to your feedback :)
[OOO]: https://trac.torproject.org/projects/tor/ticket/31223
[ZZZ]: https://lists.torproject.org/pipermail/tor-dev/2016-November/011658.html
[QQQ]: https://eprint.iacr.org/2013/516.pdf
[PPP]: https://www.petsymposium.org/2018/files/papers/issue3/popets-2018-0026.pdf
[UUU]: https://www.microsoft.com/en-us/research/project/u-prove/
[FFF]: https://eprint.iacr.org/2013/516.pdf
[GGG]: https://signal.org/blog/signal-private-group-system/
[III]: https://github.com/torproject/torspec/blob/f81b1e6cc53b91abb3ae206807bc371fa...
[EEE]: https://github.com/torproject/torspec/blob/f81b1e6cc53b91abb3ae206807bc371fa...
[CCC]: https://github.com/signalapp/zkgroup/blob/4294e428216113d81f5c0cb50f578797d1...
There is another component of the design space:
Do you want credentials to be movable from one introduction point to another?
If so, you can allow or forbid this with both blind signatures and OPRFs by enabling or restricting their malleability properties, but probably not with anything symmetric. Movable tokens would encourage users to use multiple introduction points, which sounds unlikely in practice, and worse, would give DoS attackers parallel access to introduction points. I'd suppose no for this reason, but maybe it's worth considering in the future.
On 23 Mar 2020, at 14:23, George Kadianakis desnacked@riseup.net wrote:
Discrete-logarithm-based credentials based on blind signatures:
This is a class of anon credential schemes that allow us to separate the verifier from the issuer. In particular this means that we can have the service issue the tokens, but the introduction point being the verifier.
They are usually based on blind signatures like in the case of Microsoft's U-Prove system [UUU].
We should mention that Fuchsbauer, et al. recently addressed the forgeability problem for blind Schnorr signatures in https://eprint.iacr.org/2019/877.pdf, which should improve performance, though it still costs more round trips than slower blind-signature variants. I think the attacks were never relevant for DoS protection anyway.
You need 64 bytes for the Schnorr signature plus whatever you require to identify the signing key, so 80-96 bytes total.
Discrete-logarithm-based credentials based on OPRF:
Another approach here is to use OPRF constructions based on the discrete logarithm problem to create an anonymous credential scheme like in the case of PrivacyPass [PPP]. The downside, IIUC, is that in PrivacyPass you can't have a different issuer and verifier so it's the onion service that needs to do the token verification restricting the damage soaking potential.
Issuer and verifier must share secret key material, which is not exactly the same as being the same entity. With blind signatures you only need to share public key material.
I believe redemption could cost 64-96 bytes: a 32-byte curve point, 16-32 bytes identifying the issuing key, and a 16-32 byte seed that gets hashed to the curve.
Jeff
George Kadianakis desnacked@riseup.net writes:
Discrete-logarithm-based credentials based on blind signatures:
This is a class of anon credential schemes that allow us to separate the verifier from the issuer. In particular this means that we can have the service issue the tokens, but the introduction point being the verifier.
They are usually based on blind signatures like in the case of Microsoft's U-Prove system [UUU].
After more discussions and investigations, it does seem that blind-signature-based anonymous credentials are what we are looking for here. In particular, the ability to separate the issuer and the verifier is essential to us because it gives us more flexibility, allowing us to:
- Do the verification at either the intro point or the service.
- Allow for third-party token issuers that issue tokens that can later be verified by the service or intro points. These third-party token issuers can even be on the clearnet, which gives us more options when it comes to DoS defences.
For this reason I had a call with Michele Orrù today who helped me understand our options and constraints (Michele feel free to correct me wherever I'm wrong). In particular, it seems like we have the following options and none of them is perfect and none of them is ultra-terrible:
1) Blinded RSA signatures
Anon credentials schemes that use blinded RSA signatures are the most old-school approach we have in our arsenal: https://en.wikipedia.org/wiki/Blind_signature#Blind_RSA_signatures
They are the fastest option we have (in terms of verification performance -- our main performance concern), and the verification protocol can be completed in two messages, which fit nicely into our introduction protocol.
The problem is that their tokens are the biggest. In particular, using tokens of this kind means that we will have to pass 2048-bit or 4096-bit keys around, plus their padding, plus potentially other stuff.
We would need to look more into whether we can fit these schemes into our constraints. It seems like Facebook has recently started using RSA blind signatures in their own anonymous credentials system: https://github.com/siyengar/private-fraud-prevention
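For intuition, textbook RSA blinding can be sketched in a few lines of Python (toy key size, no padding, purely illustrative -- real deployments need proper padding and 2048+ bit moduli):

```python
# Toy blind RSA signature flow (textbook scheme, NOT for production).
import hashlib

# Tiny demo RSA key; a real deployment would use 2048+ bit primes.
p, q = 1000003, 1000033
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))

def H(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

# Client (Prover): blind the token preimage with a random factor r.
token = b"one-time-intro-token"
r = 123456789  # in practice: freshly random, coprime to n
m = H(token)
blinded = (m * pow(r, e, n)) % n

# Issuer (service): sign the blinded value without ever seeing `token`.
blind_sig = pow(blinded, d, n)

# Client: unblind, yielding an ordinary RSA signature on H(token).
sig = (blind_sig * pow(r, -1, n)) % n

# Verifier (e.g. intro point, holding only the public key (n, e)):
assert pow(sig, e, n) == H(token)
```

Because verification needs only the public key, the intro point can do it without any secret material from the service -- exactly the issuer/verifier separation we want.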
2) Blinded Schnorr signatures
Another approach would be to use blinded schnorr signatures over elliptic curves: https://eprint.iacr.org/2019/877.pdf
This results in a scheme with smaller tokens than RSA but worse performance. As Jeff mentioned, a token of this kind would be about 100 bytes, where 32 bytes would go to an identifier t, and another 64 bytes to an ed25519 signature. This is the scheme that Microsoft's UProve is using so perhaps we can use their benchmarks to measure its performance.
The main problem here is that this is a 3-message verification protocol where the first message starts from the verifier. Furthermore, this first message needs to include a fresh nonce, so we could not just use the descriptor for this first message. More research required here to see if we can still fit it in somehow.
Another spicy thing here is that there is an attack called "ROS" that applies to these schemes. This obscure attack allows the adversary to obtain an extra issued token by opening multiple simultaneous issuance connections to the issuer. There is a fix (in section 5 of the 877.pdf paper above) with an equally weird countermeasure where users during token issuance execute two parallel runs of the issuance protocol but only finish one of them at random... As Jeff mentioned, it's likely that this attack won't apply or affect our DoS threat model, but it's something we should definitely keep in mind.
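For reference, the issuance side of blind Schnorr can be sketched roughly as follows. This is a toy Python version over a tiny schoolbook group (real schemes use elliptic curves), and it shows only the 3-message issuance flow plus the final Schnorr check, without the ROS countermeasure:

```python
# Toy blind Schnorr issuance (3 messages) over a schoolbook group.
# Illustrative only: tiny parameters, no ROS countermeasure.
import hashlib
import secrets

p = 1019  # safe prime: p = 2q + 1
q = 509   # prime subgroup order
g = 4     # generator of the order-q subgroup (a quadratic residue)

x = secrets.randbelow(q)  # issuer (service) secret key
X = pow(g, x, p)          # issuer public key

def Hc(R_prime: int, msg: bytes) -> int:
    h = hashlib.sha256(str(R_prime).encode() + msg).digest()
    return int.from_bytes(h, "big") % q

msg = b"intro-token"

# Message 1 (issuer -> client): commitment R = g^k.
k = secrets.randbelow(q)
R = pow(g, k, p)

# Client: blind the commitment and derive the blinded challenge.
alpha, beta = secrets.randbelow(q), secrets.randbelow(q)
R_prime = (R * pow(g, alpha, p) * pow(X, beta, p)) % p
c_prime = Hc(R_prime, msg)
c = (c_prime + beta) % q

# Message 2 (client -> issuer): the blinded challenge c.
# Message 3 (issuer -> client): response s = k + c*x.
s = (k + c * x) % q

# Client: unblind; (R_prime, s_prime) is a plain Schnorr signature on msg,
# unlinkable to this issuance session.
s_prime = (s + alpha) % q

# Verifier (intro point, knowing only X): standard Schnorr verification.
assert pow(g, s_prime, p) == (R_prime * pow(X, Hc(R_prime, msg), p)) % p
```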
3) Blinded BLS signatures
The final approach here is doing blinded signatures using the BLS signature scheme: https://en.wikipedia.org/wiki/Boneh%E2%80%93Lynn%E2%80%93Shacham https://crypto.stackexchange.com/a/12832 (yes that's an SE link...) https://hackmd.io/@benjaminion/bls12-381
This results in a 2-message verification protocol with small tokens (about the size of blinded schnorr sigs), but worse performance than all of the above schemes (about 25ms eyeballed but we should look at https://crypto.stanford.edu/pbc/ for more information).
Furthermore, this is a pairing-based cryptosystem, so it comes with the bleeding-edge-wow factor of pairings. It seems like various groups (including zcash and ethereum) have been using BLS signatures, but none of them have used it for blind signatures, so we would be in for a ride.
All in all, these are the options we seem to have. There are tradeoffs to all of them, and potential blockers with some of them (e.g. the 3-message flow of Schnorr sigs). We should dig deeper into all these schemes, and also talk with more people in case there are schemes we are missing or things I misunderstood about these ones.
Finally, anonymous credentials based on the above schemes are lightweight "use-token-and-forget-it" token schemes. It's not as easy to encode complicated attributes into them, or make them accumulate reputation, or blacklist them. To do revocation/blacklisting here you basically have to stop issuing new tokens to the bad parties, but they still get to use their old tokens, since those are indistinguishable from any other issued tokens. It might still be possible to achieve some of that by splitting users into different groups, using different keys for each group, so that additional information can be encoded in the choice of group.
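The group-splitting idea could look roughly like this (a toy Python sketch using keyed tokens and a hypothetical issue/verify API; a real deployment would use one blind-signing key per group rather than HMAC):

```python
# Sketch of per-group issuing keys: revoke a whole group at once by
# deleting its key. Toy keyed tokens, hypothetical API, NOT anonymous.
import hashlib
import hmac
import secrets

group_keys = {
    "old-members": secrets.token_bytes(32),
    "new-members": secrets.token_bytes(32),
}

def issue(group: str) -> bytes:
    """Mint a token under the named group's key."""
    nonce = secrets.token_bytes(16)
    return nonce + hmac.new(group_keys[group], nonce, hashlib.sha256).digest()

def verify(token: bytes):
    """Return the non-revoked group a token belongs to, or None."""
    nonce, tag = token[:16], token[16:]
    for group, key in group_keys.items():
        if hmac.compare_digest(tag, hmac.new(key, nonce, hashlib.sha256).digest()):
            return group
    return None

t = issue("new-members")
assert verify(t) == "new-members"
del group_keys["new-members"]  # revoke the whole group at once
assert verify(t) is None       # old tokens from that group now fail
```

Within a group, individual tokens remain indistinguishable, so this only allows coarse-grained revocation, matching the limitation described above.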
That's it for now.
Cheers!