Hello nice people of the Tor project!
I'm very interested in using the Google Summer of Code stipend to integrate the CONIKS key verification protocol into Tor Messenger.
So I wanted to say 'Hi!' and introduce myself: I'm a computer science student at Humboldt Universität zu Berlin in Berlin, Germany. The main focus of my studies is security, computer networks (such as peer-to-peer networks), and privacy-enhancing technologies. In recent years I have mostly worked with C/C++, but these days I'm learning Erlang – mostly for its benefits in concurrent and scalable network programming, but also to learn 'something different'.
This brings me to some questions regarding the project: If I understand correctly (after reading [1]), there are three parts that should be implemented in the course of the project:
- A server component which stores the tamper-resistant database and would be run by the identity providers.
- An auditor module which tracks the states of the server and publishes its view, so users can check that theirs is consistent with it.
- And a client-side plugin for Tor Messenger, written in JavaScript.
Concerning the first two parts: What would be the requirements concerning the language of the server? The Projects page lists JavaScript and C as required languages, but would you also consider a server component written in Erlang? I could do that in C/C++, but since I'm experimenting with Erlang I thought I'd ask, especially because I could imagine that the auditor functionality could be implemented in an XMPP server such as ejabberd or Prosody. So, while the CONIKS provider would be more or less centralised for Tor Messenger, third parties like XMPP server operators could act as auditors by just loading a plugin into their XMPP server. This idea is based on the Q&A found in the ticket [1]. Do you think this would be a viable way to roll out the auditor software?
This should be it for my first questions. I'll study the CONIKS paper more in depth over the next few days and will come back to you if more questions come up concerning the project idea – if that's okay with you.
Best Wishes!
Elias Rohrer / _tnull @ irc
On Mar 5, 2016, at 8:25 AM, Elias Rohrer ml@tnull.de wrote:
Hello nice people of the Tor project!
I'm very interested in using the Google Summer of Code stipend to integrate the CONIKS key verification protocol into Tor Messenger.
So I wanted to say 'Hi!' and introduce myself: I'm a computer science student at Humboldt Universität zu Berlin in Berlin, Germany. The main focus of my studies is security, computer networks (such as peer-to-peer networks), and privacy-enhancing technologies. In recent years I have mostly worked with C/C++, but these days I'm learning Erlang – mostly for its benefits in concurrent and scalable network programming, but also to learn 'something different'.
Hello Elias. Nice to meet you and thanks for your interest.
This brings me to some questions regarding the project: If I understand correctly (after reading [1]), there are three parts that should be implemented in the course of the project:
- A server component which stores the tamper-resistant database and would be run by the identity providers.
- An auditor module which tracks the states of the server and publishes its view, so users can check that theirs is consistent with it.
- And a client-side plugin for Tor Messenger, written in JavaScript.
That's correct, though perhaps the project description should emphasize that the priority is on the server and client components.
Concerning the first two parts: What would be the requirements concerning the language of the server? The Projects page lists JavaScript and C as required languages, but would you also consider a server component written in Erlang?
Unfortunately, I don't think there's a lot of Erlang expertise in the Tor community. In order to maintain the server going forward (regardless of any intention you may have of staying on the project), I think it'd be best if we went with a language more developers here are accustomed to, such as golang.
I could do that in C/C++, but since I'm experimenting with Erlang I thought I'd ask, especially because I could imagine that the auditor functionality could be implemented in an XMPP server such as ejabberd or Prosody.
Prosody is in Lua, no?
So, while the CONIKS provider would be more or less centralised for Tor Messenger, third parties like XMPP server operators could act as auditors by just loading a plugin into their XMPP server. This idea is based on the Q&A found in the ticket [1]. Do you think this would be a viable way to roll out the auditor software?
While I agree that would probably ease deployment for those running ejabberd, it's not very helpful in the general case. We'd like anyone to be able to run an auditor. And, again, I should stress that the other components are the priority and should make up the bulk of the work.
This should be it for my first questions. I'll study the CONIKS paper more in depth over the next few days and will come back to you if more questions come up concerning the project idea – if that's okay with you.
Sounds great. Again, thanks for your interest in the project. Feel free to ping Marcela (masomel) and me (arlolra) on IRC in tor-dev.
Best Wishes!
Elias Rohrer / _tnull @ irc
Hello!
On 7 Mar 2016, at 23:06, Arlo Breault wrote:
On Mar 5, 2016, at 8:25 AM, Elias Rohrer ml@tnull.de wrote:
Hello nice people of the Tor project!
I'm very interested in using the Google Summer of Code stipend to integrate the CONIKS key verification protocol into Tor Messenger.
So I wanted to say 'Hi!' and introduce myself: I'm a computer science student at Humboldt Universität zu Berlin in Berlin, Germany. The main focus of my studies is security, computer networks (such as peer-to-peer networks), and privacy-enhancing technologies. In recent years I have mostly worked with C/C++, but these days I'm learning Erlang – mostly for its benefits in concurrent and scalable network programming, but also to learn 'something different'.
Hello Elias. Nice to meet you and thanks for your interest.
This brings me to some questions regarding the project: If I understand correctly (after reading [1]), there are three parts that should be implemented in the course of the project:
- A server component which stores the tamper-resistant database and would be run by the identity providers.
- An auditor module which tracks the states of the server and publishes its view, so users can check that theirs is consistent with it.
- And a client-side plugin for Tor Messenger, written in JavaScript.
That's correct, though perhaps the project description should emphasize that the priority is on the server and client components.
Ok. The CONIKS paper proposes that an identity provider A publishes its commitments to others, and the user can then query multiple identity providers to check whether they all have the same picture of A's commitment. So the functionality of auditing commitments would be a subset of the server anyhow; I guess the auditors could then just use the server component configured to act only as an auditor?
Concerning the first two parts: What would be the requirements concerning the language of the server? The Projects page lists JavaScript and C as required languages, but would you also consider a server component written in Erlang?
Unfortunately, I don't think there's a lot of Erlang expertise in the Tor community. In order to maintain the server going forward (regardless of any intention you may have of staying on the project), I think it'd be best if we went with a language more developers here are accustomed to, such as golang.
Ok, then I would probably fall back to implementing the server in C/C++. However, there is still some time until GSoC actually starts, so maybe I can have a look at golang before then. (Which, from what I've heard, could also spare one some of the pitfalls of C/C++.)
I could do that in C/C++, but since I'm experimenting with Erlang I thought I'd ask, especially because I could imagine that the auditor functionality could be implemented in an XMPP server such as ejabberd or Prosody.
Prosody is in Lua, no?
Yup
So, while the CONIKS provider would be more or less centralised for Tor Messenger, third parties like XMPP server operators could act as auditors by just loading a plugin into their XMPP server. This idea is based on the Q&A found in the ticket [1]. Do you think this would be a viable way to roll out the auditor software?
While I agree that would probably ease deployment for those running ejabberd, it's not very helpful in the general case. We'd like anyone to be able to run an auditor. And, again, I should stress that the other components are the priority and should make up the bulk of the work.
I see your point. As stated above, I'd try to design the server so it can be configured to only act as an auditor.
This should be it for my first questions. I'll study the CONIKS paper more in depth over the next few days and will come back to you if more questions come up concerning the project idea – if that's okay with you.
Sounds great. Again, thanks for your interest in the project. Feel free to ping Marcela (masomel) and me (arlolra) on IRC in tor-dev.
Ok, thanks a lot, I will! Two questions came up over the last few days:
1. How would the CONIKS client code integrate with the ctypes-otr add-on? It would probably be part of it, right? So should I also have a look at implementing client functionality in a C library and then making the calls via ctypes?
2. While reading the paper I had some confusion about a minor point, namely the way CONIKS calculates the index of a user identity in the Merkle prefix tree. I know you wrote neither the paper nor the reference implementation, but maybe you can shed some light on this issue, even if it is probably a very small one.
On Page 4 the authors state: "We can implement a VUF using any deterministic, existentially unforgeable signature scheme [42]. The signature scheme must be deterministic or else (in our application) the identity provider could insert multiple bindings for a user at different locations each with a valid authentication path. In practice, this could be a classic RSA signature [51] (using a deterministic padding scheme such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], which provide the best space efficiency. Many common discrete-log based signature schemes such as DSA [30] are not immediately applicable as they require a random nonce. In CONIKS, we generate the index for a user u as: i = h (sign_K (u)) where h() is a one-way hash function and sign() is a deterministic, existentially unforgeable signature scheme. The extra hash function is added because some bits of sign_K (u) might otherwise leak through binding validity proofs, and signature schemes are generally not unforgeable given some bits of the valid signature. The hash function prevents this by ensuring the final lookup index leaks no information about the value of sign_K (u)."
I understand that using the signature of the user identity ensures that no information indicating its existence on the server is leaked. But why are they hashing the signature again? RSA signatures normally already consist of a signed hash value. Why would it be bad to leak bits of the signature? Also, if I see this correctly, they don't do it in their reference implementation (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java); instead they just use SHA256withRSA... So, am I misreading the paper? Or do they actually mean sign_K(h(u)), which would be the normal 'hash and sign' operation of RSA? Or does it make sense to rehash the signature, and their own implementation is not consistent with it? This is probably not an important point; however, I want to make sure I'm not missing a piece concerning the security of the user identities.
Thank you & Best Wishes
Elias
Best Wishes!
Elias Rohrer / _tnull @ irc
Hi Elias!
Nice to meet you and thanks for your interest in this GSOC project! I’m Marcela, a mentor on this project as well. I was the main author on the CONIKS paper, so I can answer all of your paper-related questions. From some of the terminology you’re using and from some of the text you’ve quoted, it seems to me that you’ve been reading an older version of the paper, so I would refer you to our most recent and peer-reviewed paper published at USENIX Security 2015: https://www.cs.princeton.edu/~melara/static/pubs/sec15-paper-melara.pdf
I hope my explanations make sense, and please let us know if you have more questions!
Best, Marcela
That's correct, though perhaps the project description should emphasize that the priority is on the server and client components.
Ok. The CONIKS paper proposes that an identity provider A publishes its commitments to others, and the user can then query multiple identity providers to check whether they all have the same picture of A's commitment. So the functionality of auditing commitments would be a subset of the server anyhow; I guess the auditors could then just use the server component configured to act only as an auditor?
Yes, we propose that the identity providers can act as auditors and cross-verify each other in this way, and clients may query any auditor when they are checking for equivocation. However, I should note that we intended this to be the *ideal deployment* (and I apologize if this doesn’t come through clearly enough in the paper). CONIKS also allows third-party auditors whose sole job it is to check that the chain of signed tree roots is linear. In practice, we expect that auditors and identity providers will be separate in most cases (at least until there is a standard for CONIKS protocols and data formats, but we’re not there yet).
It’s not accurate to conflate auditing with being a key server, or to say that auditing is a subset of the key server’s functionality. Auditors only need to keep records of identity providers’ signed tree roots, and they never download or record any key directory contents, so it doesn’t really make sense to equip auditors with key server functionality and “turn it off” when you’re not acting as a key server. Key servers have more complex and richer functionality, needing to maintain significantly larger data structures and perform more cryptographic computations. This means that key servers have very different space and availability requirements than auditors do.
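To make the auditor's task concrete, here is a minimal sketch in Go (one of the languages Arlo suggested) of the linearity check over a chain of signed tree roots. The STR fields and serialization here are illustrative assumptions, not the CONIKS wire format, and verifying the provider's signature is omitted:

    package auditor

    import (
        "bytes"
        "crypto/sha256"
        "encoding/binary"
        "errors"
    )

    // STR is a hypothetical signed tree root record; field names and
    // layout are assumptions for illustration only.
    type STR struct {
        Epoch    uint64
        PrevHash []byte // hash of the previous epoch's STR
        TreeHash []byte // root hash of this epoch's Merkle prefix tree
        Sig      []byte // provider's signature (verification omitted)
    }

    // hashSTR hashes the fields an STR commits to.
    func hashSTR(s *STR) []byte {
        h := sha256.New()
        binary.Write(h, binary.BigEndian, s.Epoch)
        h.Write(s.PrevHash)
        h.Write(s.TreeHash)
        return h.Sum(nil)
    }

    // VerifyChain checks the one property an auditor must track: the
    // STR chain is linear, i.e. every STR commits to its predecessor.
    func VerifyChain(chain []*STR) error {
        for i := 1; i < len(chain); i++ {
            if chain[i].Epoch != chain[i-1].Epoch+1 {
                return errors.New("non-consecutive epochs")
            }
            if !bytes.Equal(chain[i].PrevHash, hashSTR(chain[i-1])) {
                return errors.New("hash chain broken")
            }
        }
        return nil
    }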
While I agree that would probably ease deployment for those running ejabberd, it's not very helpful in the general case. We'd like anyone to be able to run an auditor. And, again, I should stress that the other components are the priority and should make up the bulk of the work.
I see your point. As stated above, I'd try to design the server so it can be configured to only act as an auditor.
See my point above. Auditing and maintaining keys isn’t just about configuring a server differently.
Sounds great. Again, thanks for your interest in the project. Feel free to ping Marcela (masomel) and me (arlolra) on IRC in tor-dev.
Ok, thanks a lot, I will! Two questions came up over the last few days:
- How would the CONIKS client code integrate with the ctypes-otr add-on? It would probably be part of it, right? So should I also have a look at implementing client functionality in a C library and then making the calls via ctypes?
Arlo may have a better answer to this question.
- While reading the paper I had some confusion about a minor point, namely the way CONIKS calculates the index of a user identity in the Merkle prefix tree. I know you wrote neither the paper nor the reference implementation, but maybe you can shed some light on this issue, even if it is probably a very small one.
How the lookup index is calculated is actually not a minor point. It’s pretty important that this is done in a very specific way to ensure that authentication paths in the CONIKS Merkle tree can’t allow an adversary to easily determine which other usernames have key mappings in the tree. This is one of the main privacy features of CONIKS.
On Page 4 the authors state: "We can implement a VUF using any deterministic, existentially unforgeable signature scheme [42]. The signature scheme must be deterministic or else (in our application) the identity provider could insert multiple bindings for a user at different locations each with a valid authentication path. In practice, this could be a classic RSA signature [51] (using a deterministic padding scheme such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], which provide the best space efficiency. Many common discrete-log based signature schemes such as DSA [30] are not immediately applicable as they require a random nonce. In CONIKS, we generate the index for a user u as: i = h (sign_K (u)) where h() is a one-way hash function and sign() is a deterministic, existentially unforgeable signature scheme. The extra hash function is added because some bits of sign_K (u) might otherwise leak through binding validity proofs, and signature schemes are generally not unforgeable given some bits of the valid signature. The hash function prevents this by ensuring the final lookup index leaks no information about the value of sign_K (u).”
I understand that using the signature of the user identity ensures that no information indicating its existence on the server is leaked. But why are they hashing the signature again? RSA signatures normally already consist of a signed hash value. Why would it be bad to leak bits of the signature?
The answer to this question is in the portion of the paper you quote: "signature schemes are generally not unforgeable given some bits of the valid signature.” There’s some crypto jargon here, which you may not be familiar with if you haven't taken a crypto course. What this sentence is trying to say is that, given some bits of a valid signature, an adversary might be able to find some message or data and a corresponding digital signature which is valid and which has not been generated by the legitimate signer (in this case the identity provider). This is called an existential forgery, and it means that an adversary can create forged lookup indices if the signature isn’t hashed. The hash on top of the signature also ensures that the lookup index can’t be easily computed from the VUF.
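In code, the construction from that older paper version might look like this minimal Go sketch, assuming deterministic RSA-PKCS#1 v1.5 as the signature scheme (the newer paper replaces RSA, as noted later in this thread):

    package index

    import (
        "crypto"
        "crypto/rsa"
        "crypto/sha256"
    )

    // LookupIndex computes i = h(sign_K(u)): a deterministic RSA
    // PKCS#1 v1.5 signature over the username, hashed once more so the
    // published index leaks no bits of the signature itself.
    func LookupIndex(key *rsa.PrivateKey, username string) ([]byte, error) {
        // RSA hash-and-sign: Go's API signs a precomputed digest of u.
        digest := sha256.Sum256([]byte(username))
        sig, err := rsa.SignPKCS1v15(nil, key, crypto.SHA256, digest[:])
        if err != nil {
            return nil, err
        }
        // The outer hash is the crucial step: without it, leaked
        // signature bits could let an adversary work with the raw
        // signature (see the privacy discussion later in this thread).
        idx := sha256.Sum256(sig)
        return idx[:], nil
    }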
Also, if I see this correctly, they don't do it in their reference implementation (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java); instead they just use SHA256withRSA... So, am I misreading the paper? Or do they actually mean sign_K(h(u)), which would be the normal 'hash and sign' operation of RSA? Or does it make sense to rehash the signature, and their own implementation is not consistent with it? This is probably not an important point; however, I want to make sure I'm not missing a piece concerning the security of the user identities.
Unfortunately, the reference implementation is still a work in progress (although I’ve had to take some time off from working on it due to other research projects), and is missing several features described in the paper. The SignatureOps class in the reference implementation simply implements wrappers around Java signature functions, so that’s not where the user lookup index calculation happens. The lookup index calculation happens in the unameToIndex() function in coniks_server.ServerUtils. Now, I will note that the reference implementation currently *isn’t* doing the proper index calculation, as it currently only hashes the username, so thank you for making me go back and double-check my code! But it would be relatively straightforward to change this and to add the digital signature computation that needs to be done before the hashing.
Hello Marcela! Nice to meet you, too, and thank you for that interesting paper! It seems I made a lot of wrong assumptions in my last mail; ugh, that's kind of embarrassing...
On 8 Mar 2016, at 20:57, Marcela Melara wrote:
Hi Elias!
Nice to meet you and thanks for your interest in this GSOC project! I’m Marcela, a mentor on this project as well. I was the main author on the CONIKS paper, so I can answer all of your paper-related questions. From some of the terminology you’re using and from some of the text you’ve quoted, it seems to me that you’ve been reading an older version of the paper, so I would refer you to our most recent and peer-reviewed paper published at USENIX Security 2015: https://www.cs.princeton.edu/~melara/static/pubs/sec15-paper-melara.pdf
I was indeed reading another version (never trust paper search engines in lieu of direct links...). Thanks for the link, I'll re-read the current version!
I hope my explanations make sense, and please let us know if you have more questions!
Best, Marcela
That's correct, though perhaps the project description should emphasize that the priority is on the server and client components.
Ok. The CONIKS paper proposes that an identity provider A publishes its commitments to others, and the user can then query multiple identity providers to check whether they all have the same picture of A's commitment. So the functionality of auditing commitments would be a subset of the server anyhow; I guess the auditors could then just use the server component configured to act only as an auditor?
Yes, we propose that the identity providers can act as auditors and cross-verify each other in this way, and clients may query any auditor when they are checking for equivocation. However, I should note that we intended this to be the *ideal deployment* (and I apologize if this doesn’t come through clearly enough in the paper). CONIKS also allows third-party auditors whose sole job it is to check that the chain of signed tree roots is linear. In practice, we expect that auditors and identity providers will be separate in most cases (at least until there is a standard for CONIKS protocols and data formats, but we’re not there yet).
It’s not accurate to conflate auditing with being a key server, or to say that auditing is a subset of the key server’s functionality. Auditors only need to keep records of identity providers’ signed tree roots, and they never download or record any key directory contents, so it doesn’t really make sense to equip auditors with key server functionality and “turn it off” when you’re not acting as a key server. Key servers have more complex and richer functionality, needing to maintain significantly larger data structures and perform more cryptographic computations. This means that key servers have very different space and availability requirements than auditors do.
Ah, yes, that is true. Just turning off features of the server to get an auditor might indeed be a bad idea. But wouldn't there be a lot of overlapping functionality in the server, auditor, and even the clients? E.g. the crypto primitives for building and checking the tree? Maybe it would be viable to create a libCONIKS which could be used by all three components?
While I agree that would probably ease deployment for those running ejabberd, it's not very helpful in the general case. We'd like anyone to be able to run an auditor. And, again, I should stress that the other components are the priority and should make up the bulk of the work.
I see your point. As stated above, I'd try to design the server so it can be configured to only act as an auditor.
See my point above. Auditing and maintaining keys isn’t just about configuring a server differently.
Sounds great. Again, thanks for your interest in the project. Feel free to ping Marcela (masomel) and me (arlolra) on IRC in tor-dev.
Ok, thanks a lot, I will! Two questions came up over the last few days:
- How would the CONIKS client code integrate with the ctypes-otr add-on? It would probably be part of it, right? So should I also have a look at implementing client functionality in a C library and then making the calls via ctypes?
Arlo may have a better answer to this question.
- While reading the paper I had some confusion about a minor point, namely the way CONIKS calculates the index of a user identity in the Merkle prefix tree. I know you wrote neither the paper nor the reference implementation, but maybe you can shed some light on this issue, even if it is probably a very small one.
How the lookup index is calculated is actually not a minor point. It’s pretty important that this is done in a very specific way to ensure that authentication paths in the CONIKS Merkle tree can’t allow an adversary to easily determine which other usernames have key mappings in the tree. This is one of the main privacy features of CONIKS.
Yes, you're of course completely right. I thought my misunderstanding was no big deal, but of course the details matter. I'm glad I asked.
On Page 4 the authors state: "We can implement a VUF using any deterministic, existentially unforgeable signature scheme [42]. The signature scheme must be deterministic or else (in our application) the identity provider could insert multiple bindings for a user at different locations each with a valid authentication path. In practice, this could be a classic RSA signature [51] (using a deterministic padding scheme such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], which provide the best space efficiency. Many common discrete-log based signature schemes such as DSA [30] are not immediately applicable as they require a random nonce. In CONIKS, we generate the index for a user u as: i = h (sign_K (u)) where h() is a one-way hash function and sign() is a deterministic, existentially unforgeable signature scheme. The extra hash function is added because some bits of sign_K (u) might otherwise leak through binding validity proofs, and signature schemes are generally not unforgeable given some bits of the valid signature. The hash function prevents this by ensuring the final lookup index leaks no information about the value of sign_K (u).”
I understand that using the signature of the user identity ensures that no information indicating its existence on the server is leaked. But why are they hashing the signature again? RSA signatures normally already consist of a signed hash value. Why would it be bad to leak bits of the signature?
The answer to this question is in the portion of the paper you quote: "signature schemes are generally not unforgeable given some bits of the valid signature.” There’s some crypto jargon here, which you may not be familiar with if you haven't taken a crypto course. What this sentence is trying to say is that, given some bits of a valid signature, an adversary might be able to find some message or data and a corresponding digital signature which is valid and which has not been generated by the legitimate signer (in this case the identity provider). This is called an existential forgery, and it means that an adversary can create forged lookup indices if the signature isn’t hashed. The hash on top of the signature also ensures that the lookup index can’t be easily computed from the VUF.
I actually had a crypto course some time ago, so crypto jargon should be fine... However, it seems that part of my brain needs some more training to get reactivated ;) I want to get this into my brain, so here are some more (probably pretty naive) questions and thoughts on the topic:
- When bits of a valid signature leak, an external attacker could forge a colliding index for a different user? This would allow an attacker to register the same public key to a different username they control? So that's why a hash is used?
- What still confuses me: if I recall correctly, 'existential forgery' is the lowest level of forgery, since the attacker may not have control over the contents of the signed message, but may forge a valid signature for that message. You propose to use an existentially unforgeable signature scheme for your VUF. But:
- As you explicitly use an existentially unforgeable signature scheme, why can the index then be forged if bits of this (existentially unforgeable) signature are leaked to an attacker? Is the hash needed when an external attacker has no knowledge of the private key used to sign the index?
- You later propose RSA for the VUF, but isn't RSA *the* example for which existential forgery can be done?
- So am I right to assume that the attacker model shifts from an external one to a malicious identity provider? As he knows his private key, the signed data and the signature, what is the hash function hiding here?
Also, if I see this correctly, they don't do it in their reference implementation (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java); instead they just use SHA256withRSA... So, am I misreading the paper? Or do they actually mean sign_K(h(u)), which would be the normal 'hash and sign' operation of RSA? Or does it make sense to rehash the signature, and their own implementation is not consistent with it? This is probably not an important point; however, I want to make sure I'm not missing a piece concerning the security of the user identities.
Unfortunately, the reference implementation is still a work in progress (although I’ve had to take some time off from working on it due to other research projects), and is missing several features described in the paper. The SignatureOps class in the reference implementation simply implements wrappers around Java signature functions, so that’s not where the user lookup index calculation happens. The lookup index calculation happens in the unameToIndex() function in coniks_server.ServerUtils. Now, I will note that the reference implementation currently *isn’t* doing the proper index calculation, as it currently only hashes the username, so thank you for making me go back and double-check my code! But it would be relatively straightforward to change this and to add the digital signature computation that needs to be done before the hashing.
Ok, good to know that, I'll keep it in mind when reading the reference code! However, this is embarrassing: I only had a glance at the reference implementation, sorry for linking to an unrelated spot in the code and making assumptions...
I think my next steps will be to read the peer-reviewed version of your paper and to refresh my crypto knowledge a bit. I'd appreciate it if you had some pointers to resources relevant to this project for me (other than classic textbooks). I may even try to have a look at the Micali/Rabin paper on VRFs.
And I'll try to find my way into the ctypes-otr code over the next few days.
Best Wishes!
Elias
Hi Elias,
On Mar 9, 2016, at 5:31 AM, Elias Rohrer ml@tnull.de wrote:
Hello Marcela! Nice to meet you, too, and thank you for that interesting paper! It seems I made a lot of wrong assumptions in my last mail; ugh, that's kind of embarrassing...
I really appreciate your interest in our paper and CONIKS, thanks :) No worries, there are a lot of subtle aspects about CONIKS, so it can take some time to fully see and understand them.
Yes, we propose that the identity providers can act as auditors and cross-verify each other in this way, and clients may query any auditor when they are checking for equivocation. However, I should note that we intended this to be the *ideal deployment* (and I apologize if this doesn’t come through clearly enough in the paper). CONIKS also allows third-party auditors whose sole job it is to check that the chain of signed tree roots is linear. In practice, we expect that auditors and identity providers will be separate in most cases (at least until there is a standard for CONIKS protocols and data formats, but we’re not there yet).
It’s not accurate to conflate auditing with being a key server, or to say that auditing is a subset of the key server’s functionality. Auditors only need to keep records of identity providers’ signed tree roots, and they never download or record any key directory contents, so it doesn’t really make sense to equip auditors with key server functionality and “turn it off” when you’re not acting as a key server. Key servers have more complex and richer functionality, needing to maintain significantly larger data structures and perform more cryptographic computations. This means that key servers have very different space and availability requirements than auditors do.
Ah, yes, that is true. Just turning off features of the server to get an auditor might indeed be a bad idea. But wouldn't there be a lot of overlapping functionality in the server, auditor, and even the clients? E.g. the crypto primitives for building and checking the tree? Maybe it would be viable to create a libCONIKS which could be used by all three components?
So it might seem at first glance that there’s a lot of overlapping functionality between the server, auditor and clients, but once you start implementing each one, it’s not really the case. Think about the different tasks of each component: the server is generating a bunch of digital signatures (e.g. signed tree roots, lookup indices) and hashes (Merkle tree, lookup indices, signed tree roots) to build the Merkle trees and hash chain of signed tree roots. Clients, on the other hand, do a bunch of signature and hash verifications (which, to be fair, requires hashing data as well, but for different reasons) to verify individual authentication paths and signed tree roots. So there’s really no overlap in functionality with the server. Similarly, the auditors need to verify the signed tree roots, so there’s some signature and hash verification as well, but in a different way than clients need to do this.
While generation and verification of these various crypto primitives go hand in hand, I would caution against putting them all in a single library if different components of the system will need different portions. If you look at the reference implementation, the only common code is the definition of the data formats. When I first started writing the reference implementation, I actually thought that it would be useful (and possible) to build a single libCONIKS, but sharing underlying crypto primitives isn’t reason enough to put all of the functionality in one place since the application of those primitives is what really matters. So I really think it makes more sense to make separate modules for server, client and auditor.
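As a purely hypothetical Go sketch of that separation (all names made up for illustration), the three components would share little beyond the data-format definitions:

    package coniks

    // Shared data-format definitions (placeholders for illustration).
    type STR struct{}      // a signed tree root
    type AuthPath struct{} // a Merkle authentication path

    // Server builds: it extends the tree and signs a new root per epoch.
    type Server interface {
        Register(username string, key []byte) error
        CurrentSTR() *STR
    }

    // Client verifies individual bindings against a signed root.
    type Client interface {
        VerifyLookup(username string, path *AuthPath, str *STR) error
    }

    // Auditor only records and cross-checks the chain of signed roots.
    type Auditor interface {
        ObserveSTR(str *STR) error
    }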
On Page 4 the authors state: "We can implement a VUF using any deterministic, existentially unforgeable signature scheme [42]. The signature scheme must be deterministic or else (in our application) the identity provider could insert multiple bindings for a user at different locations each with a valid authentication path. In practice, this could be a classic RSA signature [51] (using a deterministic padding scheme such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], which provide the best space efficiency. Many common discrete-log based signature schemes such as DSA [30] are not immediately applicable as they require a random nonce. In CONIKS, we generate the index for a user u as: i = h (sign_K (u)) where h() is a one-way hash function and sign() is a deterministic, existentially unforgeable signature scheme. The extra hash function is added because some bits of sign_K (u) might otherwise leak through binding validity proofs, and signature schemes are generally not unforgeable given some bits of the valid signature. The hash function prevents this by ensuring the final lookup index leaks no information about the value of sign_K (u).”
I understand that using the signature of the user identity ensures that no information indicating its existence on the server is leaked. But why are they hashing the signature again? RSA signatures normally already consist of a signed hash value. Why would it be bad to leak bits of the signature?
The answer to this question is in the portion of the paper you quote: "signature schemes are generally not unforgeable given some bits of the valid signature.” There’s some crypto jargon here, which you may not be familiar with if you haven't taken a crypto course. What this sentence is trying to say is that, given some bits of a valid signature, an adversary might be able to find some message or data and a corresponding digital signature which is valid and which has not been generated by the legitimate signer (in this case the identity provider). This is called an existential forgery, and it means that an adversary can create forged lookup indices if the signature isn’t hashed. The hash on top of the signature also ensures that the lookup index can’t be easily computed from the VUF.
I actually had a crypto course some time ago, so crypto jargon should be fine... However, it seems that part of my brain needs some more training to get reactivated ;) I want to get this into my brain, so here are some more (probably pretty naive) questions and thoughts on the topic:
- When bits of a valid signature leak, an external attacker could forge a colliding index for a different user? This would allow an attacker to register the same public key to a different username they control? So that's why a hash is used?
So the problem with leaked bits isn’t about an attacker forging an index. The problem is actually about privacy. Let’s go back and think about why the server transforms usernames into these lookup indices. The main purpose is so that the proofs of inclusion (i.e. the Merkle authentication paths) for one user don’t leak information about *any other* users in the tree. So if I receive the proof of inclusion for Alice, I should not be able to figure out who else might be in the tree. Remember that the proof of inclusion gives us the prefix of Alice’s lookup index; in other words, we want to make sure that an attacker can’t find a name that results in an index sharing a prefix with Alice’s, since this would be a privacy violation. If you use just a hash function, an attacker can feasibly hash a large set of usernames until they find one whose index shares its prefix with Alice's index, and then determine whether this user exists by querying the server for the corresponding key.
Using only the digital signature doesn’t solve this problem, since there are attacks that allow you to recover the rest of the signature bits given some of the first bits (I don’t know which specific attacks do this; this is the explanation I received from my collaborator). So given Alice’s index prefix, an attacker could try to guess whether the index for a specific user (e.g. Bob) shares the same prefix, recover the remaining bits of the signature, and verify, using the provider's public index verification key, whether this is a valid signature for Bob’s name. This is also a privacy violation. So by hashing the signature, anyone who receives the authentication path for Alice cannot determine, based on this path, which other usernames exist in the tree, since this would require finding the signatures whose hash shares a prefix with Alice’s index, a much more difficult problem.
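To illustrate the first attack, here is a small Go sketch assuming, for contrast, a broken design where indices are bare hashes of usernames (the function name is made up):

    package attack

    import (
        "bytes"
        "crypto/sha256"
    )

    // FindPrefixMatch models the enumeration attack: given the index
    // prefix revealed by Alice's authentication path, an attacker can
    // test candidate usernames offline until one shares the prefix,
    // then confirm that user's existence with a single server query.
    // With i = h(sign_K(u)) this fails, because computing sign_K(u)
    // requires the provider's private VUF key.
    func FindPrefixMatch(candidates []string, prefix []byte) (string, bool) {
        for _, u := range candidates {
            idx := sha256.Sum256([]byte(u))
            if bytes.HasPrefix(idx[:], prefix) {
                return u, true
            }
        }
        return "", false
    }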
- What still confuses me: if I recall correctly, 'existential forgery' is the lowest level of forgery, since the attacker may not have control over the contents of the signed message, but may forge a valid signature for that message. You propose to use an existentially unforgeable signature scheme for your VUF. But:
- As you explicitly use an existentially unforgeable signature scheme, why can the index then be forged if bits of this (existentially unforgeable) signature are leaked to an attacker? Is the hash needed when an external attacker has no knowledge of the private key used to sign the index?
- You later propose RSA for the VUF, but isn't RSA *the* example for which existential forgery can be done?
We realized the potential dangers of using RSA, so we designed a different VUF scheme. I think it’s best if you read the new paper to see the description.
- So am I right to assume that the attacker model shifts from an external one to a malicious identity provider? As he knows his private key, the signed data and the signature, what is the hash function hiding here?
See my explanation of the VUF above. CONIKS tries to protect against external *and* internal attackers. As I explained above, the hash function in the private lookup index is used to protect users’ privacy from external attackers. Like you said, the key server can know who has name-to-key mappings in its own key directory. CONIKS deals with internal attackers (i.e. a malicious or compromised key server) using the tamper-evident data structure and signed tree roots to leave evidence of fake keys and equivocation.
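Reusing the illustrative STR type and hashSTR helper sketched earlier in the thread, the equivocation check a client or auditor performs boils down to comparing roots for the same epoch:

    // Equivocated reports whether the provider showed one party a
    // different tree root for an epoch than another party observed.
    // STR and hashSTR are the illustrative versions sketched above.
    func Equivocated(fromProvider, fromAuditor *STR) bool {
        return fromProvider.Epoch == fromAuditor.Epoch &&
            !bytes.Equal(hashSTR(fromProvider), hashSTR(fromAuditor))
    }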
Also, if I see this correctly, they don't do it in their reference implementation (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java); instead they just use SHA256withRSA... So, am I misreading the paper? Or do they actually mean sign_K(h(u)), which would be the normal 'hash and sign' operation of RSA? Or does it make sense to rehash the signature, and their own implementation is not consistent with it? This is probably not an important point; however, I want to make sure I'm not missing a piece concerning the security of the user identities.
Unfortunately, the reference implementation is still a work in progress (although I’ve had to take some time off from working on it due to other research projects), and is missing several features described in the paper. The SignatureOps class in the reference implementation simply implements wrappers around Java signature functions, so that’s not where the user lookup index calculation happens. The lookup index calculation happens in the unameToIndex() function in coniks_server.ServerUtils. Now, I will note that the reference implementation currently *isn’t* doing the proper index calculation, as it currently only hashes the username, so thank you for making me go back and double-check my code! But it would be relatively straightforward to change this and to add the digital signature computation that needs to be done before the hashing.
Ok, good to know that, I'll keep it in mind when reading the reference code! However, this is embarrassing: I only had a glance at the reference implementation, sorry for linking to an unrelated spot in the code and making assumptions...
I think my next steps will be to read the peer-reviewed version of your paper and to refresh my crypto knowledge a bit. I'd appreciate it if you had some pointers to resources relevant to this project for me (other than classic textbooks). I may even try to have a look at the Micali/Rabin paper on VRFs.
As for good pointers for this project, I’m not sure that the Micali/Rabin paper on VRFs specifically is really going to be all that helpful for implementing CONIKS, because the paper describes VRFs at a level of detail that isn’t needed to understand why it may be better to use an EC-based VUF construction than RSA. I apologize for the confusion arising from the older version of our paper.
If you’re interested in the very formal crypto details of VRFs, you should read the paper, but for this project it’s more important to understand the security and privacy problems that CONIKS tries to solve and how it achieves those goals, and to understand how CONIKS fits into the workflow of the existing Tor Messenger service. Hopefully, our USENIX paper will explain most of the security and privacy design decisions, but definitely feel free to ask more questions! Another good resource might be to read about equivocation and fork consistency (David Mazieres’ paper on SUNDR first introduced this concept), and certificate transparency (there’s a good ACM Queue article) for more background on the problems CONIKS tries to solve and the design decisions we made. Goldberg et al.’s paper on preventing DNS zone enumeration is a fairly recent paper that also shows how to use VUFs to prevent enumeration of sensitive information, so that might be another good resource. For figuring out how CONIKS can be integrated into Tor Messenger, I would look at the portions of the Tor Messenger code that currently handle encryption keys (e.g. key generation, registration, key lookups), and come up with a plan of how CONIKS will affect the existing workflow.
I hope this helps! Best, Marcela
Hello Marcela!
On 11 Mar 2016, at 2:55, Marcela Melara wrote:
Hi Elias,
On Mar 9, 2016, at 5:31 AM, Elias Rohrer ml@tnull.de wrote:
Hello Marcela! Nice to meet you, too, and thank you for that interesting paper! It seems I made a lot of wrong assumptions in my last mail; ugh, that's kind of embarrassing...
I really appreciate your interest in our paper and CONIKS, thanks :) No worries, there are a lot of subtle aspects about CONIKS, so it can take some time to fully see and understand them.
Yes, we propose that the identity providers can act as auditors and cross-verify each other in this way, and clients may query any auditor when they are checking for equivocation. However, I should note that we intended this to be the *ideal deployment* (and I apologize if this doesn’t come through clearly enough in the paper). CONIKS also allows third-party auditors whose sole job it is to check that the chain of signed tree roots is linear. In practice, we expect that auditors and identity providers will be separate in most cases (at least until there is a standard for CONIKS protocols and data formats, but we’re not there yet).
It’s not accurate to conflate auditing with being a key server, or to say that auditing is a subset of the key server’s functionality. Auditors only need to keep records of identity providers’ signed tree roots, and they never download or record any key directory contents, so it doesn’t really make sense to equip auditors with key server functionality and “turn it off” when you’re not acting as a key server. Key servers have more complex and richer functionality, needing to maintain significantly larger data structures and perform more cryptographic computations. This means that key servers have very different space and availability requirements than auditors do.
Ah, yes, that is true. Just turning off features of the server to get an auditor might indeed be a bad idea. But wouldn't there be a lot of overlapping functionality in the server, auditor, and even the clients? E.g. the crypto primitives for building and checking the tree? Maybe it would be viable to create a libCONIKS which could be used by all three components?
So it might seem at first glance that there’s a lot of overlapping functionality between the server, auditor and clients, but once you start implementing each one, it’s not really the case. Think about the different tasks of each component: the server is generating a bunch of digital signatures (e.g. signed tree roots, lookup indices) and hashes (Merkle tree, lookup indices, signed tree roots) to build the Merkle trees and hash chain of signed tree roots. Clients, on the other hand, do a bunch of signature and hash verifications (which, to be fair, requires hashing data as well, but for different reasons) to verify individual authentication paths and signed tree roots. So there’s really no overlap in functionality with the server. Similarly, the auditors need to verify the signed tree roots, so there’s some signature and hash verification as well, but in a different way than clients need to do this.
While generation and verification of these various crypto primitives go hand in hand, I would caution against putting them all in a single library if different components of the system will need different portions. If you look at the reference implementation, the only common code is the definition of the data formats. When I first started writing the reference implementation, I actually thought that it would be useful (and possible) to build a single libCONIKS, but sharing underlying crypto primitives isn’t reason enough to put all of the functionality in one place since the application of those primitives is what really matters. So I really think it makes more sense to make separate modules for server, client and auditor.
On Page 4 the authors state: "We can implement a VUF using any deterministic, existentially unforgeable signature scheme [42]. The signature scheme must be deterministic or else (in our application) the identity provider could insert multiple bindings for a user at different locations each with a valid authentication path. In practice, this could be a classic RSA signature [51] (using a deterministic padding scheme such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], which provide the best space efficiency. Many common discrete-log based signature schemes such as DSA [30] are not immediately applicable as they require a random nonce. In CONIKS, we generate the index for a user u as: i = h (sign_K (u)) where h() is a one-way hash function and sign() is a deterministic, existentially unforgeable signature scheme. The extra hash function is added because some bits of sign_K (u) might otherwise leak through binding validity proofs, and signature schemes are generally not unforgeable given some bits of the valid signature. The hash function prevents this by ensuring the final lookup index leaks no information about the value of sign_K (u).”
I understand that using the signature of the user identity ensures that no information indicating its existence on the server is leaked. But why are they hashing the signature again? RSA signatures normally already consist of a signed hash value. Why would it be bad to leak bits of the signature?
The answer to this question is in the portion of the paper you quote: "signature schemes are generally not unforgeable given some bits of the valid signature.” There’s some crypto jargon here, which you may not be familiar with if you haven't taken a crypto course. What this sentence is trying to say is that, given some bits of a valid signature, an adversary might be able to find some message or data and a corresponding digital signature which is valid and which has not been generated by the legitimate signer (in this case the identity provider). This is called an existential forgery, and it means that an adversary can create forged lookup indices if the signature isn’t hashed. The hash on top of the signature also ensures that the lookup index can’t be easily computed from the VUF.
I actually had a crypto course some time ago, so crypto jargon should be fine... However, it seems that part of my brain needs some more training to get reactivated ;) I want to get this into my brain, so here are some more (probably pretty naive) questions and thoughts on the topic:
- When bits of a valid signature leak, an external attacker could forge a colliding index for a different user? This would allow an attacker to register the same public key to a different username they control? So that's why a hash is used?
So the problem with leaked bits isn’t about an attacker forging an index. The problem is actually about privacy. Let’s go back and think about why the server transforms usernames into these lookup indices. The main purpose is so that the proofs of inclusion (i.e. the Merkle authentication paths) for one user don’t leak information about *any other* users in the tree. So if I receive the proof of inclusion for Alice, I should not be able to figure out who else might be in the tree. Remember that the proof of inclusion gives us the prefix of Alice’s lookup index; in other words, we want to make sure that an attacker can’t find a name that results in an index sharing a prefix with Alice’s, since this would be a privacy violation. If you use just a hash function, an attacker can feasibly hash a large set of usernames until they find one whose index shares its prefix with Alice's index, and then determine whether this user exists by querying the server for the corresponding key.
Using only the digital signature doesn’t solve this problem, since there are attacks that allow you to recover the rest of the signature bits given some of the first bits (I don’t know which specific attacks do this; this is the explanation I received from my collaborator). So given Alice’s index prefix, an attacker could try to guess whether the index for a specific user (e.g. Bob) shares the same prefix, recover the remaining bits of the signature, and verify, using the provider's public index verification key, whether this is a valid signature for Bob’s name. This is also a privacy violation. So by hashing the signature, anyone who receives the authentication path for Alice cannot determine, based on this path, which other usernames exist in the tree, since this would require finding the signatures whose hash shares a prefix with Alice’s index, a much more difficult problem.
Ok, thanks for your elaborations! I think I understand the problem and how you solved it now. I'm still a bit curious, though, which specific forgery attacks would be able to recover the signature. ;)
- What still confuses me: if I recall correctly, 'existential forgery' is the lowest level of forgery, since the attacker may not have control over the contents of the signed message, but may forge a valid signature for that message. You propose to use an existentially unforgeable signature scheme for your VUF. But:
- As you explicitly use an existentially unforgeable signature scheme, why can the index then be forged if bits of this (existentially unforgeable) signature are leaked to an attacker? Is the hash needed when an external attacker has no knowledge of the private key used to sign the index?
- You later propose RSA for the VUF, but isn't RSA *the* example for which existential forgery can be done?
We realized the potential dangers of using RSA, so we designed a different VUF scheme. I think it’s best if you read the new paper to see the description.
- So am I right to assume that the attacker model shifts from an external one to a malicious identity provider? As he knows his private key, the signed data and the signature, what is the hash function hiding here?
See my explanation of the VUF above. CONIKS tries to protect against external *and* internal attackers. As I explained above, the hash function in the private lookup index is used to protect users’ privacy from external attackers. Like you said, the key server can know who has name-to-key mappings in its own key directory. CONIKS deals with internal attackers (i.e. a malicious or compromised key server) using the tamper-evident data structure and signed tree roots to leave evidence of fake keys and equivocation.
Also, if I see this correctly, they don't do it in their reference implementation (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java); instead they just use SHA256withRSA... So, am I misreading the paper? Or do they actually mean sign_K(h(u)), which would be the normal 'hash and sign' operation of RSA? Or does it make sense to rehash the signature, and their own implementation is not consistent with it? This is probably not an important point; however, I want to make sure I'm not missing a piece concerning the security of the user identities.
Unfortunately, the reference implementation is still a work in progress (although I’ve had to take some time off from working on it due to other research projects), and is missing several features described in the paper. The SignatureOps class in the reference implementation simply implements wrappers around Java signature functions, so that’s not where the user lookup index calculation happens. The lookup index calculation happens in the unameToIndex() function in coniks_server.ServerUtils. Now, I will note that the reference implementation currently *isn’t* doing the proper index calculation, as it currently only hashes the username, so thank you for making me go back and double-check my code! But it would be relatively straightforward to change this and to add the digital signature computation that needs to be done before the hashing.
Ok, good to know that, I'll keep it in mind when reading the reference code! However, this is embarrassing: I only had a glance at the reference implementation, sorry for linking to an unrelated spot in the code and making assumptions...
I think my next steps will be to read the peer-reviewed version of your paper and to refresh my crypto knowledge a bit. I'd appreciate it if you had some pointers to resources relevant to this project for me (other than classic textbooks). I may even try to have a look at the Micali/Rabin paper on VRFs.
As for good pointers for this project, I’m not sure that the Micali/Rabin paper on VRFs specifically is really going to be all that helpful for implementing CONIKS, because the paper describes VRFs at a level of detail that isn’t needed to understand why it may be better to use an EC-based VUF construction than RSA. I apologize for the confusion arising from the older version of our paper.
If you’re interested in the very formal crypto details of VRFs, you should read the paper, but for this project it’s more important to understand the security and privacy problems that CONIKS tries to solve and how it achieves those goals, and to understand how CONIKS fits into the workflow of the existing Tor Messenger service. Hopefully, our USENIX paper will explain most of the security and privacy design decisions, but definitely feel free to ask more questions! Another good resource might be to read about equivocation and fork consistency (David Mazieres’ paper on SUNDR first introduced this concept), and certificate transparency (there’s a good ACM Queue article) for more background on the problems CONIKS tries to solve and the design decisions we made. Goldberg et al.’s paper on preventing DNS zone enumeration is a fairly recent paper that also shows how to use VUFs to prevent enumeration of sensitive information, so that might be another good resource. For figuring out how CONIKS can be integrated into Tor Messenger, I would look at the portions of the Tor Messenger code that currently handle encryption keys (e.g. key generation, registration, key lookups), and come up with a plan of how CONIKS will affect the existing workflow.
I hope this helps!
Yes, thank you, your elaborations helped me very much! I'll send in my proposal for this project on Monday and will have a look at the promising resources you gave me over the next few weeks.
Best Wishes!
Elias
Best, Marcela
Hi Elias,
Ok, thanks for your elaborations! I think I understand the problem and how you solved it now. I'm still a bit curious, though, which specific forgery attacks would be able to recover the signature. ;)
Great, I’m glad I was able to answer your questions!
Yes, thank you, your elaborations helped me very much! I'll send in my proposal for this project on Monday and will have a look at the promising resources you gave me over the next few weeks.
I look forward to reading your application! Please let me know if you have any questions about the other resources you look at.
Best, Marcela
On Mar 8, 2016, at 7:28 AM, Elias Rohrer ml@tnull.de wrote:
- How would the CONIKS client code integrate with the ctypes-otr add-on? It would probably be part of it, right? So should I also have a look at implementing client functionality in a C library and then making the calls via ctypes?
Yes, the client code will be part of the add-on but, ideally, written in JavaScript. There will, however, be some interaction with libotr through the ctypes bindings, in order to update fingerprint verification status and whatnot.