This used to be an appendix in my drafts for proposal 224, but I split it into its own document since it's orthogonal to the rest of 224.
See section 1 for appropriate provisos and disclaimers. I'm writing this as a proposal mainly so other researchers can write better proposals for this kind of thing.
Filename: 225-strawman-shared-rand.txt
Title: Strawman proposal: commit-and-reveal shared rng
Author: Nick Mathewson
Created: 2013-11-29
Status: Draft
1. Introduction
This is a strawman proposal: I don't think we should actually build it. It's just a simple writeup of the more trivial commit-then-reveal protocol for generating a shared random value. It's insecure to the extent that an adversary who controls b of the authorities gets to choose among 2^b outcomes for the result of the protocol.
See proposal 224, section HASHRING for some motivation of why we want one of these in Tor.
Let's do better!
[TODO: Are we really stuck with Tor's nasty metaformat here?]
2. The protocol
Here's a protocol for producing a shared random value. It should run less frequently than the directory consensus algorithm. It runs in these phases:

   1. COMMITMENT
   2. REVEAL
   3. COMPUTE SHARED RANDOM
It should be implemented by software other than Tor, which should be okay for authorities.
Note: This is not a great protocol. It has a number of failure modes. Better protocols seem hard to implement, though, and it ought to be possible to drop in a replacement here, if we do it right.
At the start of phase 1, each participating authority publishes a statement of the form:
   shared-random 1
   shared-random-type commit
   signing-key-certification
    (certification here; see proposal 220)
   commitment-key-certification
    (certification here; see proposal 220)
   published YYYY-MM-DD HH:MM:SS
   period-start YYYY-MM-DD HH:MM:SS
   attempt INT
   commitment sha512 C
   signature
    (made with commitment key; see proposal 220)
The signing key is the one used for consensus votes, signed by the directory authority identity key. The commitment key is used for this protocol only. The signature is made with the commitment key. The period-start value is the start of the period for which the shared random value should be in use. The attempt value starts at 1, and increments by 1 for each time that the protocol fails.
The other fields should be self-explanatory.
The commitment value C is a base64-encoded SHA-512 hash of a 256-bit random value R.
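For illustration, here is a minimal Python sketch of deriving C from R. It assumes C is the base64 encoding of the raw 64-byte SHA-512 digest; the function name and that encoding detail are assumptions, not part of this draft:

   import base64
   import hashlib
   import os

   def make_commitment():
       # Generate a 256-bit random value R and commit to it.
       r = os.urandom(32)                   # 256-bit random value R
       digest = hashlib.sha512(r).digest()  # SHA-512 hash of R
       c = base64.b64encode(digest).decode("ascii")  # assumed encoding
       return r, c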
During the rest of phase 1, every authority collects the commitments from other authorities, and publishes them to other authorities, as they do today with directory votes.
At the start of phase 2, each participating authority publishes:
   shared-random 1
   shared-random-type reveal
   signing-key-certification
    (certification here; see proposal 220)
   commitment-key-certification
    (certification here; see proposal 220)
   received-commitment ID sig
   received-commitment ID sig
   published YYYY-MM-DD HH:MM:SS
   period-start YYYY-MM-DD HH:MM:SS
   attempt INT
   commitment sha512 C
   reveal R
   signature
    (made with commitment key; see proposal 220)
The R value is the one used to generate C. The received-commitment lines are the signatures on the documents from other authorities in phase 1. All other fields are as in the commitments.
During the rest of phase 2, every authority collects the reveals from other authorities, as above with commitments.
At the start of phase 3, each participating authority either has a reveal from every authority that it received a commitment from, or it does not. Each participating authority then says
   shared-random 1
   shared-random-type finish
   signing-key-certification
    (certification here; see proposal 220)
   commitment-key-certification
    (certification here; see proposal 220)
   received-commitment ID sig R
   received-commitment ID sig R
   ...
   published YYYY-MM-DD HH:MM:SS
   period-start YYYY-MM-DD HH:MM:SS
   attempt INT
   consensus C
   signature
    (made with commitment key; see proposal 220)
where C = SHA256(ID | R | ID | R | ID | R | ...), with the ID values in ascending order and each R value immediately after its corresponding ID value.
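A sketch of that final computation, assuming each ID and R is a byte string and that sorting the (ID, R) pairs by ID gives the ascending order described above (the byte-level framing of the concatenation is an assumption):

   import hashlib

   def compute_shared_random(pairs):
       # pairs: list of (ID, R) byte-string tuples, one per authority
       # that completed both phases. Sort by ID ascending, then hash
       # the concatenation ID | R | ID | R | ...
       data = b"".join(i + r for i, r in sorted(pairs))
       return hashlib.sha256(data).hexdigest()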
See [SHAREDRANDOM-REFS] for more discussion here.
(TODO: should this be its own spec? If so, does it have to use our regular metaformat or can it use something less sucky?)
30.11.2013 02:28, Nick Mathewson:
[...]
Trimming to shorten the length.
(Still haven't read any paper linked from the "bitcoin hash thread".)
By definition, each subverted authority will try to be the last one to reveal. It would try to calculate whether it is better to reveal or to fake an error, based on the result it already knows.
How about specifying some sort of order for revealing? Say 8 authorities are split into 7 in group A and 1 as B. Authority B may only reveal its commitment after all of the A's have revealed theirs, with some timeout after which its vote still counts if others are missing. Each time they get together for voting, a different authority takes the role of B.
With one authority under my control I could only try to game it once per rotation, instead of trying to do this all the time. This works with a fixed pattern, where authorities 1 through 8 rotate each time the string is to be decided. I'm not sure whether it would be an improvement to make that more random, so nobody can predict too far ahead which authority will be allowed to be the last one.
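A rough sketch of the fixed rotation being described, assuming authorities are identified by sortable IDs and rounds are numbered (the names are illustrative only):

   def last_revealer(authority_ids, round_number):
       # Fixed round-robin: sort the IDs so everyone derives the same
       # order, then rotate which authority takes the "B" role (the
       # one allowed to reveal last) each round.
       ordered = sorted(authority_ids)
       return ordered[round_number % len(ordered)]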
Regards, Sebastian G. (bastik)
Relative to what's in the draft it's an improvement with no obvious downside.
If you were going to do this, it would probably be best to use a full commitment order instead of only selecting the final committer. That is, instead of having an nth committer and n-1 chaotically ordered committers, you have a first committer, second, third, ....
In the semi-ordered approach, if an adversary had control over d authorities, one of which was the final committer for a given round, then for that round they could still easily choose from 2^d values. On the other hand, with a full, strict commit ordering it would be more difficult, because the adversary would need to get all their controlled authorities into the correct positions in the ordering rather than just one.
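For concreteness, a full per-round ordering could be derived the same way as the single-last-revealer rotation above, e.g. by rotating a canonical list; this sketch is illustrative only, and as noted next, it is completely predictable:

   def reveal_order(authority_ids, round_number):
       # Full strict ordering for this round: rotate the sorted list
       # so the starting position shifts every round. Every authority
       # can compute this locally, but so can an adversary, far in
       # advance.
       ordered = sorted(authority_ids)
       k = round_number % len(ordered)
       return ordered[k:] + ordered[:k]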
How they would be ordered is an interesting question. Making the ordering unpredictable would amount to solving the same problem that this shared random value calculation idea sets out to solve. https://trac.torproject.org/projects/tor/ticket/8244
On Sat, Nov 30, 2013 at 4:24 PM, Sebastian G. (bastik) <bastik.tor@googlemail.com> wrote:
[...]
Hello. I would like to suggest this as an alternative. It's not complete, but the basic idea is there.
Filename: -
Title: VSS shared rng
Author: Kang
Created: 2013-12-03
Status: Draft, Needs-Research
1. Overview & Motivation
This protocol and document are works in progress.
This is a method of generating a shared random value among a set of directory authorities. The purpose of the shared random value is to act as a salt during HSDir (Hidden Service Directory) selection/lookup calculations. A new shared random value would be calculated periodically and served by the directory authorities.
The protocol is designed around VSS (Verifiable Secret Sharing). Appendix [SECRET_SHARING] gives a very basic overview of what secret sharing is. Secret sharing is used to prevent backing out, where ``backing out'' means changing the final result by refusing to participate any further in the protocol [for instance withholding your contribution]. Verifiable secret sharing must be used because it allows participants to verify that shares they receive are valid and consistent. That is, it ensures that each participant commits to a single value which they can't change, and that dishonest participants can't modify or lie about shares they've received.
The motivation for the shared random solution is that it allows the problem in ticket #8244 (https://trac.torproject.org/projects/tor/ticket/8244) to be solved without a complete redesign of the HSDir system. Details of how the shared random value is actually used to solve that problem are not covered here, only the calculation of the value itself.
While this protocol is designed specifically for the Tor project it should be applicable for calculating shared random values in almost any situation. It is still rather rough around the edges, so feel free to point out any flaws or mistakes, or suggest an improvement or better method.
Please note that ``participant'' and ``(directory) authority'' will be used interchangeably. ``secret'' and ``contribution'' will also be used interchangeably, although I will try to stick to ``contribution''. n will be taken to mean the number of participants (directory authorities) taking part in the protocol.
2. Design
The basic idea behind the protocol is as follows. Each participant (authority) generates a secret contribution. These contributions are then distributed simultaneously among all participants using a modified VSS scheme and a threshold of t = floor((n-1)/2) + 1. The shared random is then calculated as some function of all those contributions, for instance the XOR of each contribution together.
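As a sketch of those two pieces in Python (the function names are mine, and fixed-length byte-string contributions are assumed):

   def threshold(n):
       # t = floor((n-1)/2) + 1: the smallest majority, per the
       # analysis in appendix [THRESHOLD_CHOICE].
       return (n - 1) // 2 + 1

   def combine_contributions(contributions):
       # XOR all reconstructed contributions together; assumes every
       # contribution is a byte string of the same length. XOR is
       # just the example combiner named above.
       result = bytes(len(contributions[0]))
       for c in contributions:
           result = bytes(a ^ b for a, b in zip(result, c))
       return result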
The threshold t used in the VSS scheme is chosen such that no adversarial participant (or adversarial group of less than 50% of all participants) can calculate the secret early. If a group of adversarial participants were able to calculate the secret earlier than the honest participants, they would be able to alter the final result either by withholding their shares (backing out) or by causing the protocol to restart (and hence a completely new value to be calculated). Appendix [THRESHOLD_CHOICE] discusses the choice of threshold.
The exact VSS scheme to be used is not set in stone. The one referenced while designing the basic protocol is Feldman's VSS, which is described in the paper: Feldman, Paul. "A practical scheme for non-interactive verifiable secret sharing." Foundations of Computer Science, 1987., 28th Annual Symposium on. IEEE, 1987.
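To make the scheme concrete, here is a toy Python sketch of Feldman's VSS over a deliberately tiny (and therefore insecure) group. A real implementation would use a cryptographic library and properly sized parameters; everything here is illustrative:

   import random

   # Toy parameters only: p = 2q + 1 with p, q prime, and g
   # generating the order-q subgroup.
   P = 2579
   Q = 1289
   G = pow(2, (P - 1) // Q, P)

   def deal(secret, t, n):
       # Shamir split: f(x) = secret + a1*x + ... + a_{t-1}*x^{t-1}
       # over Z_Q; share i is (i, f(i)).
       coeffs = [secret % Q] + [random.randrange(Q) for _ in range(t - 1)]
       # Feldman commitments g^{a_j}: published so shares can be
       # checked without revealing the polynomial.
       commitments = [pow(G, a, P) for a in coeffs]
       shares = [(i, sum(a * pow(i, j, Q) for j, a in enumerate(coeffs)) % Q)
                 for i in range(1, n + 1)]
       return shares, commitments

   def verify(share, commitments):
       # Feldman consistency check: g^y == prod_j C_j^{i^j} (mod P).
       i, y = share
       expected = 1
       for j, c in enumerate(commitments):
           expected = expected * pow(c, pow(i, j, Q), P) % P
       return pow(G, y, P) == expected

   def reconstruct(shares):
       # Lagrange interpolation at x = 0 over Z_Q, given any t valid
       # shares; pow(den, Q-2, Q) is the modular inverse (Q prime).
       secret = 0
       for i, yi in shares:
           num, den = 1, 1
           for j, _ in shares:
               if j != i:
                   num = num * (-j) % Q
                   den = den * (i - j) % Q
           secret = (secret + yi * num * pow(den, Q - 2, Q)) % Q
       return secret

For instance, with n = 9 and t = threshold(9) = 5, deal(42, 5, 9) produces nine shares, any five of which recover 42 via reconstruct(), while verify() lets each recipient check its share against the published commitments.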
The protocol is split into several phases. Each phase can be summarised as follows:
--INIT phase: The initial phase. Just wait until we either reach our scheduled start time or we receive a DISTRIBUTE message, triggering the shared random calculation. There should probably be some sort of time bound here, since we don't want it to be possible to trigger creation of a new random value an arbitrary number of times; an adversary could easily abuse that by recalculating random values until a desirable one is created. An example time bound could be ``if a message is received within 10 minutes of our scheduled start time then begin the protocol, else drop the message [or try to get the authorities to resync their schedules]''.
--DISTRIBUTE phase: Perform the sharing phase for each of the n instances of the chosen VSS scheme. Create our shares and commitments/proofs, then distribute those among the other participants. Each participant should be sent a single share, plus any information they need to validate that share. This phase doesn't end until we've both distributed all n-1 of our shares and we've received a share from each participant (received n-1) [or error / timeout of course]. Whenever a participant receives a DISTRIBUTE message they must validate the share it contains (using the methods provided by the VSS scheme). If validation fails the participant must notify the other participants of the error and then the protocol restarts from the beginning.
--SYNC phase: When a participant transitions into the SYNC phase they send a SYNC message to all other participants; this contains no shares, however may contain validation information. When a participant has received n-1 SYNC messages they may transition into the REVEAL phase; additionally we may transition into the REVEAL phase if we receive t commitments and then the phase time-out timer goes off [This requires more analysis. It _might_ be possible to exploit such extra behaviour].
The purpose of the SYNC phase is to ensure that all participants have successfully completed the DISTRIBUTE phase before anybody moves to the REVEAL phase. We need to ensure that everybody has n-1 valid shares; 1 share from each secret (not counting their own secret). This is very important for preventing backing-out; it prevents anybody from REVEALing prematurely. Additionally this phase can be used for some parts of the validation (depending on the VSS method selected), specifically comparing commitment/proof information.
To calculate the result prematurely here an adversary would require control over t or more of the total participants (a majority). As long as there is an honest majority there is no danger in restarting the protocol from the beginning during this phase or earlier [other than delaying calculations].
--REVEAL phase: At this point everybody distributes the list of n-1 shares they received during the DISTRIBUTE phase [they _don't_ redistribute their own shares]. When a participant has enough shares (has received t-1 REVEAL messages) they reconstruct each of the secret contributions. Once a participant has reconstructed every secret contribution, they XOR them together to produce the shared random value [alternative combination methods are possible too]. If an error occurs during this phase (or later) we should avoid restarting the protocol if at all possible; restarting opens up the potential for backing out.
--CONCLUSION phase: In this phase we prepare the random secret for publishing. That is, gather information so any Tor client can verify that the shared random was agreed upon by some majority of the authorities. This likely won't require any additional exchanges between participants as you can probably just publish the shares and validation data used during the calculation.
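The transition conditions above can be summarized as a small per-participant state machine. This sketch is one possible reading: the message counts come from the phase descriptions, everything else is an assumption:

   from enum import Enum, auto

   class Phase(Enum):
       INIT = auto()
       DISTRIBUTE = auto()
       SYNC = auto()
       REVEAL = auto()
       CONCLUSION = auto()

   def next_phase(phase, n, t, valid_shares, syncs, reveals):
       # valid_shares, syncs, reveals: counts of messages received
       # from the other n-1 participants. The INIT -> DISTRIBUTE
       # transition (scheduled start or incoming DISTRIBUTE message)
       # and all error/timeout paths are omitted here.
       if phase is Phase.DISTRIBUTE and valid_shares == n - 1:
           return Phase.SYNC        # one valid share from everyone
       if phase is Phase.SYNC and syncs == n - 1:
           return Phase.REVEAL      # everyone finished distributing
       if phase is Phase.REVEAL and reveals >= t - 1:
           return Phase.CONCLUSION  # enough shares to reconstruct
       return phase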
3. Specification
TODO: A more detailed specification will go here when I have a sufficiently ironed out one.
4. Compatibility & Implementation
This protocol could be implemented as part of the Tor software or could be implemented as a completely separate application (provided input and output are well defined).
5. Performance and scalability notes
Each participant sends and receives approximately n messages during each phase. The total number of messages sent during execution of the protocol should be O(n^2). This should be fine considering it's intended to be used among < 20 or so authorities.
Appendix A. Choice of threshold t [THRESHOLD_CHOICE]
For the threshold used in secret sharing we chose a value of t = floor((n-1)/2) + 1. The purpose of this section is to explain and justify this choice.
Notation:

   t   = secret sharing threshold
       = number of shares required to reconstruct a secret
   n   = number of participants (participating directory authorities)
   d   = number of collaborating adversary participants
   n-d = number of honest participants
It is possible for a single adversary to cause changes in the result blindly, but doing so is pointless because they're merely exchanging one unknown value for another. For instance, they could refuse to share their contribution, causing either the protocol to restart (and an entirely new result to be calculated) or themselves to be ignored (the result calculated without their input). Additionally they could cause a fatal error which results in a full restart (and full recalculation).
If an adversary has control over a large enough subset d of the participants, however, it becomes possible to prematurely calculate the result. This is where things get dangerous; the collaborating adversaries could make informed decisions when modifying the result. For that reason careful selection of the secret sharing threshold t is important. We want to prevent all these kinds of attacks while also maximising the number of adversaries we can defend against.
There are two main points where premature calculation of the result can be done; the first is during the DISTRIBUTE or SYNC phase, and the second is during the REVEAL phase.
--In the DISTRIBUTE or SYNC phase

During the DISTRIBUTE phase each participant generates a secret, splits it into n-1 shares, and then distributes one share to each other participant.
Assume the set of d adversaries simply waits at the start of this phase; they wait and collect shares received from the honest participants. When the honest participants have all finished distributing their shares, the set of adversaries can pool all their received shares and get a list of d*(n-d) honest shares. That is, the adversaries have d shares of each honest participant's secret. If the number of adversaries d is at least as large as the threshold t (d >= t) then they can use these shares to reconstruct all of the honest contributions and prematurely calculate the result.
Therefore to prevent this we require that t > d, that the threshold is larger than the number of adversaries.
--In the REVEAL phase

Assume that the whole first half of the protocol completes without issue; the adversaries behave themselves and we reach the REVEAL phase.
At the start of the reveal phase every participant should have received n-1 shares; 1 share from each other participant's secret. If there are d adversarial participants then they have collected a total of d*(n-d) honest shares; d shares from each of the honest contributions. The honest participants (if they pooled their shares) have collected a total of (n-d)*d shares from the adversaries; (n-d) shares from each of the adversary contributions.
Since the honest participants are honest, they will abide by the protocol and each REVEAL all n-1 shares they have learned to every other participant. The adversarial participants just wait and collect shares as they did in the previous method. After all honest participants have REVEALed, the adversaries pool together what they know to get (n-1)*(n-d) shares; every share from every honest contribution. The adversaries can now reconstruct all honest contributions and calculate the final result.
If n-d < t then the honest participants can't reconstruct any of the secret contributions without at least one adversary REVEALing. Since the adversaries know the final result but the honest participants don't, they can alter it by all refusing to REVEAL. Therefore to prevent this we require that n-d >= t, that the number of honest participants is at least as large as the threshold.
--Protecting against both

To protect against both attacks we need to satisfy:

   t > d  AND  n-d >= t
We can change that to: n > 2d AND d < t <= n - d
n > 2d, so n/2 > d, and since we're dealing with integers here, floor((n-1)/2) >= d. That is, d = floor((n-1)/2) is the largest number of adversaries we can handle while defending against both attacks.
This gives us: floor((n-1)/2) < t <= n - floor((n-1)/2)
Examples:

   n = 10:  d = floor((10-1)/2) = 4,  4 < t <= 6,  t in {5, 6}
   n = 11:  d = floor((11-1)/2) = 5,  5 < t <= 6,  t in {6}
   n = 12:  d = floor((12-1)/2) = 5,  5 < t <= 7,  t in {6, 7}
   n = 13:  d = floor((13-1)/2) = 6,  6 < t <= 7,  t in {7}
...
We'll choose the smaller value for t when more than one is possible, giving us: t = d + 1 = floor((n-1)/2) + 1
[TODO: ensure the smaller value is indeed the better choice]
Appendix B. Very basic overview of secret sharing [SECRET_SHARING]
Secret sharing algorithms are algorithms that take some secret value as input, then split it into a set of _shares_. On their own, shares are useless; combine enough of them, however, and you can reconstruct the original secret value.

Secret sharing algorithms are often combined with a threshold scheme, where the original secret value can be reconstructed from any t shares. These tend to be denoted as (t, n)-threshold schemes, where t is the minimum number of shares required for reconstruction and n is the total number of shares created.
The creator of the shares is often referred to as the _dealer_. Generally the dealer splits their secret into n shares which are then distributed among n _participants_. Any t of those participants can then work together to reconstruct the secret.
In regular secret sharing it is possible to create shares such that different combinations of t shares will produce different reconstructions of the secret. Verifiable secret sharing (VSS) is a more secure type of secret sharing. Verifiable secret sharing methods allow holders of shares to verify that their shares are valid and consistent; they protect against dishonest dealers creating invalid or inconsistent shares, and against participants modifying or lying about their shares.
As it currently stands, this suffers from something like the Byzantine generals problem. Attacks may be performed based on the fact that participants don't necessarily transition between states at the same moment. Error handling must be carefully considered and the SYNC round made more robust to compensate.

For instance, if an adversary is able to convince an honest participant to restart while the rest of the participants keep going, they could drop the number of honest participants below the secret sharing threshold and the protocol loses all security benefit. Additionally, if restarts can be caused after _any_ honest participant has revealed then that's equally exploitable; an attacker could wait for the first honest reveals, calculate the result, and then cause an error that triggers a restart if they didn't like it [provided they're fast enough]. These attacks are possible because participants aren't psychic, so they don't immediately know whether somebody has revealed or reported an error.