Re: [tor-dev] Shared random value calculation edge cases (proposal 250)

17 Nov 2015

      -----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hello,
I saw that the content of this section in master was corrected, yet the
subtitle is still a little confusing:
4.1.6. Including the ed25519 shared randomness key in votes [SRKEY]
- From the content of this section I understand that we are going to
include the ed25519 medium term signing key, certificate and master
identity key. The content is clear, but maybe we should change the
subtitle too, since there's no SR key:
4.1.6. Including the ed25519 medium term signing key and master
identity key in votes [ED25519ID]
Edge cases are the main reason I suggested in my previous emails to
require at least 2 or 3 reveal rounds in order to allow a dirauth to
participate in the shared randomness calculation for that day. However
this won't help in case a dirauth needs to vote at 01:00 UTC and
doesn't know anything.
The idea of adding flags in the votes so each dirauth can advertise
whether it is participating (has an opinion for the <current> SR or
not) is great and helps us build more defenses. It will probably also
make things easier in the future if we decide to change anything.
What if the consensus for the SR calculation defined the majority based
on the dirauths actually participating (and advertising so with a flag
in the vote)? Also, the participating/not participating flag should be
set per vote/consensus and split into two parts:
a) We know the current SR value for today, so we vote for it; or we
know the previous SR value and we know for sure whether we should
follow the disaster protocol (in case we are about to vote at 01:00
UTC). In either case we participate in the vote for <current SR>.
b) We are able to participate in this protocol run, which will
calculate the SR value for the next day (after 00:00 UTC), so we send
our commits/reveals.
This is useful if we are a dirauth that joined at 00:30 UTC and
couldn't get the _latest_ consensus (to find out whether the 00:00 UTC
consensus was created and, if not, the previous SR value so we can
follow the disaster procedure). We will not have an opinion for the
<current> SR value at 01:00 UTC, but we can start participating in the
protocol run for the next day by sending our commit values. Once we
have decided on a <current> SR value for that day, we save it and vote
normally next time.
So, if we have 5 dirauths running/signing consensus in total, out of
which only 4 participate in the shared randomness protocol, the 4
participating ones should be able to create a valid consensus
themselves with the assurance that the 5th one won't break consensus.
One way to do this is: the dirauth which is not participating will
take the SR value voted by the majority of the participating dirauths
and include that in its consensus and sign. We need at least 3
dirauths agreeing on a SR value in order to accept it.
Is this crazy? It shouldn't open the door to new attacks, since this
doesn't allow a single actor to game it, only the majority could game it.
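
To make the idea above concrete, here is a minimal sketch of the majority rule, counted only over dirauths that advertise participation. The `Vote` type, the `participating` flag, and the function name are hypothetical and not part of prop250 or the tor code; the threshold of 3 comes from the paragraph above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

MIN_AGREEING = 3  # from the text: at least 3 dirauths must agree on an SR value


@dataclass
class Vote:
    authority: str
    participating: bool        # hypothetical "SR participation" flag in the vote
    srv: Optional[str] = None  # SR value this dirauth voted for, if it has one


def agreed_srv(votes: List[Vote]) -> Optional[str]:
    """Return the SR value backed by a strict majority of the participating
    dirauths and by at least MIN_AGREEING of them, else None."""
    participating = [v for v in votes if v.participating and v.srv is not None]
    if not participating:
        return None
    value, count = Counter(v.srv for v in participating).most_common(1)[0]
    if count * 2 > len(participating) and count >= MIN_AGREEING:
        return value
    return None
```

A non-participating dirauth would then put whatever `agreed_srv` returns into the consensus it signs, instead of computing a value of its own.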
Some more comments inline.
On 11/12/2015 4:25 PM, George Kadianakis wrote:
...
Hello there believers of prop250,
you can find the latest version of the proposal in the upstream
torspec repo:
https://gitweb.torproject.org/torspec.git/tree/proposals/250-commit-reveal-c...
Implementation is also constantly moving forward as you can see in the
ticket:
https://trac.torproject.org/projects/tor/ticket/16943
Now that we have ironed out the whole voting procedure, it's time
to finish this up by figuring out the last details of the shared
random value calculation.
The logic in the current proposal seems reasonable (see section
3.3), but I have some doubts that I wanted to communicate with
you.

I'm afraid of edge cases where different authorities will calculate
different shared random values. This is bad because if it happens at
the wrong moment, it might break the consensus.
For example, imagine that there are 5 authorities that are doing
the prop250 protocol. Since there are 5 auths, they can create a
valid consensus on their own with SR info in it.
Now imagine that one of them, Alice, has a different view of the
previous_SRV than the others (maybe because she _just_ booted up
before voting and she doesn't have a previous consensus). In this
case, Alice will calculate a different SRV than the other 4, and
hence the consensus will break because 5 signatures could not be
collected.
Is this something likely to happen?
If yes, a way to protect against it is to add a flag on the votes
denoting support for the SR protocol, and have dirauths toggle it
off if they are voting for 00:00UTC and they don't know the
previous_SRV or if they don't have a consensus. This can also be
used as a torrc-enabled killswitch, if the SR protocol bugs out
completely and we need to disable it for the sake of the network.
What do you think?

Another bothersome thing is the disaster SRV calculation.

Specifically, the proposal says:
=============================== prop 250 ===============================
If the consensus at 00:00UTC fails to be created, then there will be no
fresh shared random value for the day.
In this case, and assuming there is a previous shared random value,
directory authorities should use the following construction as the
shared random value of the day:
SRV = HMAC(previous_SRV, "shared-random-disaster")
where "previous_SRV" is the previous shared random value.
===============================================================================
This logic is not implemented in the current code, and it's not
straightforward to implement. Again because the previous_SRV is
blended in the formula, but also because it's not easy to know that
"the consensus at 00:00UTC fails to be created".
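
For illustration, the disaster construction quoted above could be sketched as follows. The quoted text does not name a hash function or say which HMAC argument is the key, so SHA-256 and the use of previous_SRV as the key are assumptions; the function name is hypothetical.

```python
import hashlib
import hmac


def disaster_srv(previous_srv: bytes) -> bytes:
    """SRV = HMAC(previous_SRV, "shared-random-disaster").

    Assumptions: previous_SRV is the HMAC key, the label is the message,
    and the digest is SHA-256. None of these is fixed by the quoted text.
    """
    return hmac.new(previous_srv, b"shared-random-disaster",
                    hashlib.sha256).digest()
```

The sketch also shows why the edge case matters: two dirauths that disagree on `previous_srv` will compute different outputs.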
For example, if you are a dirauth that just started up at 00:30UTC,
and you asked for the previous consensus and you were given the
23:00UTC consensus, then you won't know that the 00:00 consensus
was not created and that you need to do the disaster procedure.
This will again break the consensus.
Not sure if this is a likely scenario as well, and if we should
protect against it. What do you think?
It depends on the logic that authorities have for fetching
consensuses. Do they ensure that they always have the latest
consensus?  Do we need to add such logic as part of prop250? :/
I don't think I understand this the way I should. If we join at 00:30
UTC, instead of asking for previous consensus, why don't we ask for
_latest_ consensus from every other dirauth? And if we are given the
23:00 UTC consensus at 00:30 UTC, we know the consensus at 00:00 UTC
was not created and we need to follow the disaster procedure.
If we have the 23:00 UTC consensus, we know the previous SR value so
we can participate. If we couldn't get it, we advertise that we are
not participating and sign whatever the participating majority agrees
on, so we don't break consensus.
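
A rough sketch of that decision, assuming the consensus we were handed really is the latest one that exists. The function name and the three return labels are hypothetical, and times are assumed to be timezone-aware UTC datetimes.

```python
from datetime import datetime, timezone
from typing import Optional


def sr_decision(now: datetime, latest_valid_after: Optional[datetime]) -> str:
    """Decide how a freshly started dirauth should treat the current SR value.

    latest_valid_after is the valid-after time of the newest consensus we
    could fetch, or None if we could not fetch any.
    """
    if latest_valid_after is None:
        # No consensus at all: advertise that we are not participating.
        return "not-participating"
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if latest_valid_after >= midnight:
        # A consensus from today exists, so take the SR value from it.
        return "normal"
    # Newest consensus predates 00:00 UTC: the 00:00 consensus was not made.
    return "disaster"
```

For example, at 00:30 UTC with a 23:00 UTC consensus in hand, the function returns "disaster".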
...
A way to protect against it is to use the "SR support" vote flag I
talked about before, and toggle it off if you are a dirauth and you
don't have the latest consensus. Terrible? Would that even allow
dirauths to bootstrap SR?
I don't see why this would be terrible. What's a plausible thing that
could happen so that a dirauth can't get the latest consensus?
...

Another interesting part of prop250 is:

=============================== prop 250 ===============================
If the shared random value contains reveal contributions by less than 3
directory authorities, it MUST NOT be created. Instead, the old shared
random value should be used as specified in section [SRDISASTER].
===============================================================================
Do you think this is useful?
The fact that we use consensus methods ensures that at least 5
dirauths understand the SR protocol, otherwise we don't do it.
Should we care about the number of reveal values? And why should
the constant be 3, and not 5 or 2?
Yes, I think this is useful and 3 is a fair constant, especially
combined with the participating or not participating flags. I guess
the argument here is that it should be quite hard to have 3 dirauths
colluding for an attack.
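
As a small sketch of the rule under discussion, with the constant 3 taken from the quoted proposal text. The function name and the dictionary shape (dirauth identity mapped to reveal value) are hypothetical.

```python
from typing import Dict

MIN_REVEALS = 3  # constant quoted from the proposal text


def srv_source(reveals: Dict[str, bytes]) -> str:
    """Return "fresh" if enough distinct dirauths revealed, else "disaster".

    Keying the dictionary by dirauth identity means each authority is
    counted at most once.
    """
    return "fresh" if len(reveals) >= MIN_REVEALS else "disaster"
```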
...
Those were my doubts.
Sorry for the extra confusion, but I'm currently reading the
proposal and trying to minimize the amount of edge cases that can
happen in the implementation (especially if they result in breaking
the consensus for everyone). Maybe I'm trying too hard, and there
are already multiple such edge cases in the consensus protocol that
just never happen and are not worth fixing.
Your concerns make sense to me, however this could also be true. I
don't know enough to confirm or refute it, but I'm looking forward to
more comments.
...
In any case, the shared random value calculation seems to be the
last piece of the puzzle here, so let's figure it out and finish
up!
Thanks!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQEcBAEBCAAGBQJWS2aOAAoJEIN/pSyBJlsRPCQIALUKqo1nvVTYV0WQqrlvnRpm
ilSulg+WZNuiyB/uxxTfk6DtmBz6oqwsO2hPwr5BPzJO8SYBHm7jSGxalTOUh0nR
MEgVbjRYMOJZGqECsioxjhdOqoB7p8oK+rhnSRmBy/HxTVqb6FkkGr5Psil+RrQL
JPOlkm6r0ptF10Fg+lVbYXyiM2GGB4Ggup76MOX4MZ0Lr12aWJmrLk17JUhXk2r5
k7akAREBhwmsHnkJ1XA27lVMcBYX9gz1IR85wDUgBFdf8WI3FDVck2MPUTsp2eai
xeLs6XAfvBfKcaQMolxsJ01rxUps0V8no8sjqOH4McdYJhXDfpdLnObFqoSj3no=
=wRgo
-----END PGP SIGNATURE-----
