-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hello,
I saw that the content of this section in master was corrected, yet the subtitle is still a little confusing:
4.1.6. Including the ed25519 shared randomness key in votes [SRKEY]
- From the content of this section I understand that we are going to include the ed25519 medium term signing key, certificate and master identity key. The content is clear, but maybe we should change the subtitle too, since there's no SR key:
4.1.6. Including the ed25519 medium term signing key and master identity key in votes [ED25519ID]
Edge cases are the main reason I suggested in my previous emails requiring at least 2 or 3 reveal rounds before a dirauth is allowed to participate in the shared randomness calculation for that day. However, this won't help if a dirauth needs to vote at 01:00 UTC and doesn't know anything.
The idea of adding flags in the votes so each dirauth can advertise whether it is participating (has an opinion for the <current> SR or not) is great and helps us build more defenses; it will probably also make things easier in the future if we decide to change anything.
What if the consensus for the SR calculation defined majority based on the dirauths actually participating (and advertising so with a flag in the vote)? Also, the participating / not participating flag should be used per vote/consensus and split into:
a) We know the current SR value for today, so we vote for it; or we know the previous SR value and we know for sure whether we should follow the disaster protocol or not (in case we are about to vote at 01:00 UTC). So we participate in the vote for <current SR>.
b) We are able to participate in this protocol run, which will calculate the SR value for the next day (after 00:00 UTC), so we send our commits/reveals.
This is useful if we are a dirauth that joined at 00:30 UTC and couldn't get the _latest_ consensus (to find out whether the 00:00 UTC consensus was created and, if not, the previous SR value so we can follow the disaster procedure). In that case we will not have an opinion for the <current> SR value at 01:00 UTC, but we can start participating in the protocol run for the next day by sending our commit values. Once we have decided on a <current> SR value for that day, we save it and vote normally next time.
So, if we have 5 dirauths running/signing consensus in total, out of which only 4 participate in the shared randomness protocol, the 4 participating ones should be able to create a valid consensus themselves with the assurance that the 5th one won't break consensus.
One way to do this is: the dirauth which is not participating will take the SR value voted by the majority of the participating dirauths, include it in its consensus and sign it. We need at least 3 dirauths agreeing on an SR value in order to accept it.
Is this crazy? It shouldn't open the door to new attacks, since it doesn't allow a single actor to game it; only the majority could game it.
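To make the majority rule above concrete, here is a rough Python sketch. It is only an illustration of the idea, not code from the implementation: the Vote structure, the two flag names and the agreed_srv() helper are all hypothetical, and I am assuming "majority" means a strict majority of the dirauths that advertise an opinion for the <current> SR.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

MIN_AGREEING = 3  # from the text: at least 3 dirauths must agree on an SR value


@dataclass
class Vote:
    dirauth: str
    knows_current_srv: bool   # hypothetical flag (a): has an opinion on <current> SR
    can_commit_reveal: bool   # hypothetical flag (b): takes part in the next protocol run
    current_srv: Optional[str] = None


def agreed_srv(votes: List[Vote]) -> Optional[str]:
    """Return the SR value backed by a strict majority of the participating
    dirauths and by at least MIN_AGREEING of them, or None otherwise."""
    values = [v.current_srv for v in votes
              if v.knows_current_srv and v.current_srv is not None]
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    if count >= MIN_AGREEING and count * 2 > len(values):
        return value
    return None
```

A dirauth that does not advertise flag (a) would simply adopt whatever agreed_srv() returns for the other votes, so its own missing opinion cannot break the consensus.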
Some more comments inline.
On 11/12/2015 4:25 PM, George Kadianakis wrote:
Hello there believers of prop250,
you can find the latest version of the proposal in the upstream torspec repo: https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-c...
Implementation is also constantly moving forward as you can see in the ticket:
https://trac.torproject.org/projects/tor/ticket/16943
Now that we have ironed out the whole voting procedure, it's time to finish this up by figuring out the last details of the shared random value calculation.
The logic in the current proposal seems reasonable (see section 3.3), but I have some doubts that I wanted to communicate with you.
- I'm afraid of edge cases where different authorities will calculate different shared random values. This is bad because if it happens at the wrong moment, it might break the consensus.
For example, imagine that there are 5 authorities that are doing the prop250 protocol. Since there are 5 auths, they can create a valid consensus on their own with SR info in it.
Now imagine that one of them, Alice, has a different view of the previous_SRV than the others (maybe because she _just_ booted up before voting and she doesn't have a previous consensus). In this case, Alice will calculate a different SRV than the other 4, and hence the consensus will break because 5 signatures could not be collected.
Is this something likely to happen?
If yes, a way to protect against it is to add a flag on the votes denoting support for the SR protocol, and have dirauths toggle it off if they are voting for 00:00UTC and they don't know the previous_SRV or if they don't have a consensus. This can also be used as a torrc-enabled killswitch, if the SR protocol bugs out completely and we need to disable it for the sake of the network. What do you think?
- Another bothersome thing is the disaster SRV calculation.
Specifically, the proposal says:
=============================== prop 250: ====================================
If the consensus at 00:00UTC fails to be created, then there will be no fresh shared random value for the day.

In this case, and assuming there is a previous shared random value, directory authorities should use the following construction as the shared random value of the day:

    SRV = HMAC(previous_SRV, "shared-random-disaster")

where "previous_SRV" is the previous shared random value.
===============================================================================
this logic is not implemented in the current code, and it's not straightforward to implement. Again because the previous_SRV is blended in the formula, but also because it's not easy to know that "the consensus at 00:00UTC fails to be created".
For example, if you are a dirauth that just started up at 00:30UTC, and you asked for the previous consensus and you were given the 23:00UTC consensus, then you won't know that the 00:00 consensus was not created and that you need to do the disaster procedure. This will again break the consensus.
Not sure if this is a likely scenario as well, and if we should protect against it. What do you think?
It depends on the logic that authorities have for fetching consensuses. Do they ensure that they always have the latest consensus? Do we need to add such logic as part of prop250? :/
I don't think I understand this the way I should. If we join at 00:30 UTC, instead of asking for the previous consensus, why don't we ask for the _latest_ consensus from every other dirauth? And if we are given the 23:00 UTC consensus at 00:30 UTC, we know the consensus at 00:00 UTC was not created and we need to follow the disaster procedure.
If we have the 23:00 UTC consensus, we know the previous SR value so we can participate. If we couldn't get it, we advertise that we are not participating and sign whatever the participating majority agrees on, so we don't break consensus.
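Roughly, the decision I have in mind looks like the Python sketch below. The decide() helper and the (valid_after, srv) tuples are made up for illustration. The proposal text only says HMAC(previous_SRV, "shared-random-disaster"), so using previous_SRV as the HMAC key and SHA-256 as the hash are my assumptions, not something the proposal states.

```python
import hashlib
import hmac
from datetime import datetime
from typing import List, Optional, Tuple


def disaster_srv(previous_srv: bytes) -> bytes:
    # Assumption: previous_SRV is the HMAC key and SHA-256 is the hash.
    return hmac.new(previous_srv, b"shared-random-disaster", hashlib.sha256).digest()


def decide(fetched: List[Tuple[datetime, bytes]],
           now: datetime) -> Tuple[str, Optional[bytes]]:
    """fetched holds (valid_after, srv) for every consensus we obtained by
    asking each other dirauth for its _latest_ consensus (hypothetical input)."""
    if not fetched:
        # Nothing to go on: advertise that we are not participating.
        return ("not-participating", None)
    valid_after, srv = max(fetched, key=lambda c: c[0])
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if valid_after >= midnight:
        # The 00:00 UTC consensus exists, so it carries today's SR value.
        return ("current", srv)
    # The newest consensus predates 00:00 UTC: follow the disaster procedure,
    # treating the SR value it carries as previous_SRV.
    return ("disaster", disaster_srv(srv))
```

This only works if "newest consensus any dirauth gave us" really is the latest one that exists, which is exactly the fetching question raised above.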
A way to protect against it is to use the "SR support" vote flag I talked about before, and toggle it off if you are a dirauth and you don't have the latest consensus. Terrible? Would that even allow dirauths to bootstrap SR?
I don't see why this would be terrible. What's a plausible thing that could happen so that a dirauth can't get the latest consensus?
- Another interesting part of prop250 is:
=============================== prop 250: ====================================
If the shared random value contains reveal contributions by less than 3 directory authorities, it MUST NOT be created. Instead, the old shared random value should be used as specified in section [SRDISASTER].
===============================================================================
do you think this is useful?
The fact that we use consensus methods, ensures us that at least 5 dirauths understand the SR protocol, otherwise we don't do it. Should we care about the number of reveal values? And why should the constant be 3, and not 5 or 2?
Yes, I think this is useful and 3 is a fair constant, especially combined with the participating or not participating flags. I guess the argument here is that it should be quite hard to have 3 dirauths colluding for an attack.
Those were my doubts.
Sorry for the extra confusion, but I'm currently reading the proposal and trying to minimize the amount of edge cases that can happen in the implementation (especially if they result in breaking the consensus for everyone). Maybe I'm trying too hard, and there are already multiple such edge cases in the consensus protocol that just never happen and are not worth fixing.
Your concerns make sense to me; however, this could also be true. I don't know enough to confirm or refute it, but I'm looking forward to more comments.
In any case, the shared random value calculation seems to be the last piece of the puzzle here, so let's figure it out and finish up!
Thanks!