Hello there believers of prop250,
you can find the latest version of the proposal in the upstream torspec repo: https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-c...

The implementation is also constantly moving forward, as you can see in the ticket: https://trac.torproject.org/projects/tor/ticket/16943
Now that we have ironed out the whole voting procedure, it's time to finish this up by figuring out the last details of the shared random value calculation.
The logic in the current proposal seems reasonable (see section 3.3), but I have some doubts that I wanted to share with you.
- I'm afraid of edge cases where different authorities will calculate different shared random values. This is bad because if it happens at the wrong moment, it might break the consensus.
For example, imagine that there are 5 authorities that are doing the prop250 protocol. Since there are 5 auths, they can create a valid consensus on their own with SR info in it.
Now imagine that one of them, Alice, has a different view of the previous_SRV than the others (maybe because she _just_ booted up before voting and she doesn't have a previous consensus). In this case, Alice will calculate a different SRV than the other 4, and hence the consensus will break because 5 signatures could not be collected.
Is this something likely to happen?
If yes, a way to protect against it is to add a flag to the votes denoting support for the SR protocol, and have dirauths toggle it off if they are voting at 00:00UTC and they don't know the previous_SRV, or if they don't have a consensus. This could also serve as a torrc-enabled killswitch, in case the SR protocol bugs out completely and we need to disable it for the sake of the network. What do you think?
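To make the flag idea concrete, here is a rough sketch of the decision an authority could make before voting. All names here are hypothetical and not from the tor codebase; it just encodes the conditions listed above:

```python
def should_vote_sr_support(voting_at_0000utc, have_previous_srv,
                           have_latest_consensus, torrc_sr_enabled):
    """Hypothetical sketch: decide whether this dirauth sets the
    'SR support' flag in its vote, per the conditions above."""
    if not torrc_sr_enabled:
        # Operator killswitch: SR disabled in torrc.
        return False
    if voting_at_0000utc and not have_previous_srv:
        # We can't compute the new SRV correctly without previous_SRV.
        return False
    if not have_latest_consensus:
        # Our view of previous_SRV may be stale.
        return False
    return True
```

The point being that a freshly-booted Alice would simply not claim SR support, instead of voting a divergent SRV and breaking the consensus.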
- Another bothersome thing is the disaster SRV calculation. Specifically, the proposal says:
=============================== prop 250: ====================================
If the consensus at 00:00UTC fails to be created, then there will be no
fresh shared random value for the day.

In this case, and assuming there is a previous shared random value,
directory authorities should use the following construction as the shared
random value of the day:

   SRV = HMAC(previous_SRV, "shared-random-disaster")

where "previous_SRV" is the previous shared random value.
===============================================================================
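For what it's worth, the quoted formula is a one-liner once we pick a concrete hash function. Here is a sketch assuming HMAC-SHA256 with previous_SRV as the HMAC key; both of those are assumptions for illustration, since the quoted text fixes neither:

```python
import hashlib
import hmac

def disaster_srv(previous_srv: bytes) -> bytes:
    """Sketch of SRV = HMAC(previous_SRV, "shared-random-disaster").
    SHA-256 and previous_SRV-as-key are assumptions, not pinned by
    the quoted proposal text."""
    return hmac.new(previous_srv, b"shared-random-disaster",
                    hashlib.sha256).digest()
```

The computation itself is trivial; the hard part, as noted below, is agreeing on previous_SRV and on the fact that the disaster procedure applies at all.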
This logic is not implemented in the current code, and it's not straightforward to implement: partly because the previous_SRV is blended into the formula, but also because it's not easy for an authority to know that "the consensus at 00:00UTC failed to be created".
For example, if you are a dirauth that just started up at 00:30UTC, and you asked for the previous consensus and were given the 23:00UTC consensus, then you won't know that the 00:00UTC consensus was not created and that you need to follow the disaster procedure. This will again break the consensus.
Not sure if this is a likely scenario as well, and if we should protect against it. What do you think?
It depends on the logic that authorities have for fetching consensuses. Do they ensure that they always have the latest consensus? Do we need to add such logic as part of prop250? :/
A way to protect against it is to use the "SR support" vote flag I talked about before, and toggle it off if you are a dirauth and you don't have the latest consensus. Terrible? Would that even allow dirauths to bootstrap SR?
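The "do I have the latest consensus?" check from the 00:30UTC example could be as simple as comparing the consensus valid-after time against the voting interval. A rough sketch, where the function name and the one-hour interval are assumptions:

```python
from datetime import datetime, timedelta

def consensus_is_latest(valid_after: datetime, now: datetime,
                        voting_interval: timedelta = timedelta(hours=1)) -> bool:
    """Hypothetical heuristic: treat a consensus as 'latest' if its
    valid-after time is less than one voting interval in the past."""
    return now - valid_after < voting_interval
```

Under this check, a dirauth starting at 00:30UTC holding only the 23:00UTC consensus would know its view is stale and could withhold the SR-support flag, rather than silently computing an SRV from outdated state.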
- Another interesting part of prop250 is:
=============================== prop 250: ====================================
If the shared random value contains reveal contributions by less than 3
directory authorities, it MUST NOT be created. Instead, the old shared
random value should be used as specified in section [SRDISASTER].
===============================================================================
Do you think this is useful?

The fact that we use consensus methods ensures that at least 5 dirauths understand the SR protocol; otherwise we don't do it at all. Should we also care about the number of reveal values? And why should the constant be 3, and not 5 or 2?
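Just to spell out the quoted rule, with the threshold as a parameter since its value (currently 3 in the proposal) is exactly what's being questioned:

```python
def can_create_fresh_srv(num_reveals: int, threshold: int = 3) -> bool:
    """Sketch of the quoted rule: with fewer than `threshold` reveal
    contributions, the fresh SRV MUST NOT be created and the
    [SRDISASTER] fallback applies instead. The default of 3 is the
    proposal's current constant; the function name is illustrative."""
    return num_reveals >= threshold
```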
Those were my doubts.
Sorry for the extra confusion, but I'm currently reading the proposal and trying to minimize the number of edge cases that can happen in the implementation (especially the ones that result in breaking the consensus for everyone). Maybe I'm trying too hard, and there are already multiple such edge cases in the consensus protocol that just never happen and are not worth fixing.
In any case, the shared random value calculation seems to be the last piece of the puzzle here, so let's figure it out and finish up!
Thanks!