Re: [tor-dev] Revisiting prop224 time periods and HS descriptor upload/downloads

11 Apr 2016

      On 11 Apr (14:42:02), George Kadianakis wrote:
...
David Goulet dgoulet@ev0ke.net writes:
...
On 04 Apr (19:13:39), George Kadianakis wrote:
...
Hello,
during March we discussed the cell formats of prop224:
  https://lists.torproject.org/pipermail/tor-dev/2016-March/010534.html
The prop224 topic for this month has to do with the way descriptors get
uploaded and downloaded, how this is scheduled using time periods and how the
shared randomness subsystem interacts with all that.
Here are some discussion topics. Lots of text on the first two, less text on the rest:
<snip>
In any case, this is what it might look like:
 +------------------------------------------------------------------+
 |                                                                  |
 | 00:00      12:00       00:00       12:00       00:00       12:00 |
 | SRV#1      TP#1        SRV#2       TP#2        SRV#3       TP#3  |
 |                                                                  |
 |   $         |-----------$-----======|-----------$-----======|    |
 |                            overlap12               overlap23     |
 |                                                                  |
 +------------------------------------------------------------------+

                                  Legend:    [TP#1 = Time Period #1]
                                             [SRV#1 = Shared Random Value #1]

<snip>

So now that we have ironed out the time period stuff slightly, let's discuss
the behavior that hidden services, clients and HSDirs should inherit.
This email is quite long already so I'm going to go with examples, instead of
formal specification. However, this stuff needs to go formally in the
proposal IMO, so any help in formalizing it would be great.

Hidden Service behavior:
Example 1: Our hidden service boots up at 14:00 of TP#1. In this case, we
 are nowhere close to the overlap period, so the hidden service should just
 publish its TP#1 descriptor to the HSDir hash ring using SRV#1 (which at
 that point should be in consensuses as "shared-rand-current-value").
The hidden service might also want to calculate its overlap OFFSET (as
 specified in [TIME-OVERLAP]) and schedule a time callback for publishing
 its TP#2 descriptors.
Example 2: Our hidden service boots up at 03:00 of TP#1. That's outside of
 the overlap period again, but this time the hidden service needs to use the
 SRV from "shared-rand-previous-value" because the SRV was rotated at midnight.
Example 3: Our hidden service boots up at 09:00 of TP#1. That's inside the
 overlap period, so the hidden service should calculate its overlap
 OFFSET and compare it with the current time.
If it has not passed, then we are in the exact same case as Example 2.
If the overlap OFFSET _has_ passed, then the hidden service needs to act
 as Example 2, and _also_ publish its TP#2 descriptors to a second set of
 HSDirs using SRV#2.
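
The three examples above can be sketched roughly as follows. This is only an
illustration, not spec text: the function name, the "TP-current"/"TP-next"
labels and the 06:00 overlap start (read off the diagram) are assumptions, and
the overlap OFFSET is taken as an input rather than derived as in
[TIME-OVERLAP].

```python
from datetime import datetime, timedelta, timezone

TP_ROTATION_HOUR = 12    # time periods rotate at 12:00 UTC (from the diagram)
OVERLAP_START_HOUR = 6   # assumption: the overlap covers the last 6h of a TP


def descriptors_to_publish(now, overlap_offset):
    """Return (descriptor, consensus SRV field) pairs to publish at `now`.

    `overlap_offset` is the service's OFFSET into the overlap period,
    passed in as a timedelta (hypothetical representation).
    """
    if now.hour >= TP_ROTATION_HOUR:
        # Example 1: the SRV for this time period is still the current one.
        return [("TP-current", "shared-rand-current-value")]

    # Examples 2 and 3: the SRV rotated at midnight, so the one for this
    # time period is now listed as the previous value.
    plan = [("TP-current", "shared-rand-previous-value")]

    overlap_start = now.replace(hour=OVERLAP_START_HOUR, minute=0,
                                second=0, microsecond=0)
    if now >= overlap_start + overlap_offset:
        # Example 3, OFFSET passed: also publish for the next time period.
        plan.append(("TP-next", "shared-rand-current-value"))
    return plan
```

With an OFFSET of 2 hours, a service booting at 09:00 gets both entries, while
one booting at 03:00 or 14:00 gets a single entry.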
I think these are all the cases for the hidden service, but I would like to
formalize this in a way that can be written in the spec. Particularly, I'm
not sure how to formalize which SRV to pick at a given time point.

It sounds as simple as:
"If we are before the overlap period, use the time period's shared random
value (TP1 == SRV1). If we are in the overlap period, upload two descriptors
using _both_ SRVs."
Plausible?
I'm not sure it's so simple. As it is now, there is no indicator connecting
time periods with shared random values, so "TP1 == SRV1" might make sense to us
but it's not something that can be implemented. How does the client know
whether to use "shared-rand-previous-value" or "shared-rand-current-value"?
Well, that's not entirely true. We know that TP and SRV have a 12h difference.
You also know, with the consensus valid-after time, when the SRV was created.
For instance, take the 03:00 valid-after consensus time, I can compute:
shared-rand-current-value:  created 3 hours ago.
shared-rand-previous-value: created 27 hours ago.
With the 12h shift between TP and SRV, an SRV "lifetime" becomes 36 hours.
Here is how: TP1 starts using SRV1 12h after SRV1's creation and stops using
it 24h later, thus 36h.
As a client, I get my 03:00 consensus and I want to know which SRV I should
use. I know that the previous SRV is 27 hours old, which is < 36h, so I should
use it.
New example, as a client, I get the 12:00 consensus, I can compute the
following:
shared-rand-current-value:  created 12 hours ago.
shared-rand-previous-value: created 36 hours ago.
This time the check fails: the previous SRV is 36 hours old, which is not
< 36h (the lifetime needed for the TP). So, I use the current SRV (in our
example SRV2 for the TP2).
(For the HS, you would simply need to take into account the overlap period
and use both SRV).
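
A minimal sketch of the age check described above, assuming SRVs are created
at 00:00 UTC and that the consensus valid-after time is available as a
datetime. The function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

SRV_LIFETIME = timedelta(hours=36)  # 12h shift + 24h time period


def srv_ages(valid_after):
    """Return (age of current SRV, age of previous SRV) at `valid_after`."""
    midnight = valid_after.replace(hour=0, minute=0, second=0, microsecond=0)
    current_age = valid_after - midnight
    previous_age = current_age + timedelta(hours=24)
    return current_age, previous_age


def pick_srv_by_age(valid_after):
    """Use the previous SRV while it is younger than its 36h lifetime."""
    _, previous_age = srv_ages(valid_after)
    if previous_age < SRV_LIFETIME:
        return "shared-rand-previous-value"
    return "shared-rand-current-value"
```

For the 03:00 consensus the previous SRV is 27 hours old and gets picked; for
the 12:00 consensus it is 36 hours old and the current SRV is picked instead.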
...
Here is an idea:
"A hidden service uploading its normal descriptor using a consensus with
 valid-after between 12:00UTC (inclusive) and 00:00UTC (exclusive), uses the
 _current_ SRV. A hidden service uploading its normal descriptor using a
 consensus with valid-after between 00:00UTC (inclusive) and 12:00UTC
 (exclusive), uses the _previous_ SRV.
A hidden service uploading its overlap descriptor, always uses the current SRV
 (assuming that the HS descriptor overlap period starts after midnight UTC)."
And the client equivalent:
"A client fetching a hidden service descriptor using a consensus with
 valid-after between 12:00UTC (inclusive) and 00:00UTC (exclusive), uses the
 _current_ SRV. A client fetching a hidden service descriptor using a consensus
 with valid-after between 00:00UTC (inclusive) and 12:00UTC (exclusive), uses
 the _previous_ SRV."
In both sections above, if the right SRV is missing from the consensus,
entities are supposed to use a fallback SRV value generated as specified in
section 2.3.1 of prop224.
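
The two rules above could be sketched like this. It is an illustration under
assumptions: the consensus is modelled as a plain dict of SRV fields, the
function names are hypothetical, and the fallback SRV generation from section
2.3.1 is not reproduced here (a missing value is just reported as None).

```python
from datetime import datetime, timezone


def srv_field_for(valid_after, overlap_descriptor=False):
    """Pick the consensus SRV field for a descriptor upload or fetch.

    Clients always call this with overlap_descriptor=False.
    """
    if overlap_descriptor:
        # Overlap descriptors always use the current SRV.
        return "shared-rand-current-value"
    if valid_after.hour >= 12:
        # valid-after in [12:00 UTC, 00:00 UTC)
        return "shared-rand-current-value"
    # valid-after in [00:00 UTC, 12:00 UTC)
    return "shared-rand-previous-value"


def select_srv(consensus_srvs, valid_after, overlap_descriptor=False):
    """Return (field, value); value is None if the caller must fall back."""
    field = srv_field_for(valid_after, overlap_descriptor)
    return field, consensus_srvs.get(field)
```

A caller that gets None back would then derive the fallback SRV as the
proposal specifies.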
FWIW, I don't like how I had to use hardcoded time values in the above
sections. That's because 12:00UTC is the $TIME_PERIOD_ROTATION_TIME and
00:00UTC is the $SHARED_RANDOM_VALUE_GENERATION_TIME. Maybe we could do this
without hardcoding $SHARED_RANDOM_VALUE_GENERATION_TIME, by adding expiration
times to the SRVs in the consensus and using those to choose the right SRV.
How else could we simplify this logic?
It seems simple enough. Maybe the algorithm I sketched out above makes it
simpler? Maybe not!... It's basically the _same_ end result as yours.
The logic I sketched out above means we would need parameters (from
the consensus) like so (or hardcode them):
- TIME_PERIOD_ROTATION_TIME (currently 12:00)
- TIME_PERIOD_[LIFETIME | SPAN | DURATION] (currently 24h)
- SHARED_RANDOM_VALUE_[CREATION | ROTATION]_TIME (currently 00:00)
- SHARED_RANDOM_VALUE_[LIFETIME | SPAN | DURATION] (currently 24h)
I doubt we can go simpler than that. Both algorithms have a single check
with two outcomes: use either the previous or the current SRV.
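
To illustrate that both checks end up in the same place, here is a rough
sketch that writes the age check in terms of the four parameters listed above
and compares it with the fixed 12:00/00:00 window rule for every hourly
consensus. The constant and function names are assumptions, and hours are
plain integers for brevity.

```python
TIME_PERIOD_ROTATION_HOUR = 12   # TIME_PERIOD_ROTATION_TIME
TIME_PERIOD_DURATION_HOURS = 24  # TIME_PERIOD_DURATION
SRV_CREATION_HOUR = 0            # SHARED_RANDOM_VALUE_CREATION_TIME
SRV_DURATION_HOURS = 24          # SHARED_RANDOM_VALUE_DURATION


def use_current_by_age(hour):
    """Age check: the previous SRV is usable while younger than its lifetime."""
    shift = (TIME_PERIOD_ROTATION_HOUR - SRV_CREATION_HOUR) % SRV_DURATION_HOURS
    lifetime = shift + TIME_PERIOD_DURATION_HOURS          # 36 with defaults
    current_age = (hour - SRV_CREATION_HOUR) % SRV_DURATION_HOURS
    previous_age = current_age + SRV_DURATION_HOURS
    return not previous_age < lifetime


def use_current_by_window(hour):
    """Window check with the hardcoded values: [12:00, 00:00) means current."""
    return hour >= 12


# Compare the two checks for each hourly valid-after time of a day.
agree = all(use_current_by_age(h) == use_current_by_window(h)
            for h in range(24))
```

With the current values the two checks agree on every hour; the age form only
differs in that it does not hardcode the rotation and creation times.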
...
...
...
<snip>

HSDir behavior
Currently the spec says the following:
Hidden service directories should accept descriptors at least [TODO:
  how much?] minutes before they would become valid, and retain them
  for at least [TODO: how much?] minutes after the end of the period.
After discussion with David, we thought of chopping off the first part of
that paragraph and not imposing any such weak restrictions for accepting
descriptors (see #18332).
We still have not decided about the second part of that paragraph, that is
how long descriptors should be retained after the end of the period. We
currently think clock skew is the only thing that can bring clients to the
wrong HSDir after the end of the period. Maybe an hour is OK? David
suggested 12 hours. The current Tor is doing 48 hours... Any ideas?

It should at least be 24 hours (maximum possible) with an adjustment of at the
_very_ least the overlap period. If the overlap period is 6 hours, we can then
add the "maximum clock skew" we think is reasonable and we would end up with
an OK value imo.
Descriptor maximum lifetime:    24 hours
Overlap period span:            6 hours (taken from your diagram)
Maximum acceptable clock skew:  6 hours (dgoulet opinion!)
Thus we are talking of a 36 hours lifetime in the cache. Let's work with that
as a baseline :).
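
The arithmetic behind that baseline, as a tiny sketch. The constant and
function names are illustrative; the values are the three listed above.

```python
from datetime import timedelta

DESCRIPTOR_MAX_LIFETIME = timedelta(hours=24)
OVERLAP_PERIOD_SPAN = timedelta(hours=6)   # taken from the diagram
MAX_CLOCK_SKEW = timedelta(hours=6)        # an opinion, not a settled value


def hsdir_cache_lifetime(max_lifetime=DESCRIPTOR_MAX_LIFETIME,
                         overlap=OVERLAP_PERIOD_SPAN,
                         skew=MAX_CLOCK_SKEW):
    """How long an HSDir would keep a descriptor under these assumptions."""
    return max_lifetime + overlap + skew


baseline = hsdir_cache_lifetime()  # 24h + 6h + 6h
```

Changing the clock skew argument shows directly how the cache lifetime moves
with that choice.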
Hm, I see you are calculating the total lifetime here. How often do hidden
services refresh (reupload) their descriptor in this case? I think in the
current system, hidden services do so every hour. Do we keep this feature?
I think we can re-upload only when needed, that is, on key rotation, IP
rotation, etc... No need to do that every hour (maybe).
...
Let's consider a hidden service that uploads a single descriptor during its
overlap period and then disappears completely: should the HSDir keep and serve
that descriptor for 36 hours? It's unlikely that the HS is still up and
maintaining its intro circuits if it can't keep on refreshing its descriptor.
The issue here is for the HSDir to notice that the HS might be gone? And we
can't rely on the RendPostPeriod value since it's service side. So an operator
could literally have set that to 7 hours, meaning we might not see any new
revision counter for that period and would still be unable to tell if the HS
is gone or not.
This is why our best bet is to compute a "maximum crazy time" that a
descriptor could be valid.
Another option is to add a valid-until field in the cleartext part of the
descriptor and the HSDir could use that to expire entries plus a clock skew
delta.
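
A sketch of how an HSDir might expire entries with such a field. Everything
here is hypothetical: the valid-until field does not exist in the proposal
yet, the cache is modelled as a dict mapping a descriptor ID to its
valid-until time, and the 6 hour delta is only the value floated in this
thread.

```python
from datetime import datetime, timedelta, timezone

CLOCK_SKEW_DELTA = timedelta(hours=6)  # assumption, see discussion above


def is_expired(valid_until, now, skew=CLOCK_SKEW_DELTA):
    """An entry expires once `now` is past valid-until plus the skew delta."""
    return now > valid_until + skew


def prune_cache(cache, now, skew=CLOCK_SKEW_DELTA):
    """Return a new cache without the expired entries.

    `cache` maps descriptor IDs to their (hypothetical) valid-until times.
    """
    return {desc_id: valid_until
            for desc_id, valid_until in cache.items()
            if not is_expired(valid_until, now, skew)}
```

This lets the service bound its own descriptor lifetime instead of the HSDir
guessing a worst-case value.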
...
Also consider that whatever "maximum acceptable clock skew" we choose, the
hidden service needs to keep its introduction circuits up for that time as
well, otherwise the descriptor will be useless to the clock skewed clients.
Yup! This is why I think that above 6 hours of clock skew you won't do much
as a client... maybe even less!
...

FWIW, I'm personally not sure how to choose the best "maximum acceptable clock skew"
value here. My intuition tells me to choose a big number so that even very
skewed clients can visit hidden services. I see the following two negatives here:

Hidden services need to retain their old intro circuits for the duration of
the acceptable clock skew.

I'm pretty sure we don't do that currently. However, we could start doing that
and collect stats on how frequent it is and with how much skew! That would be
very useful information to have imo.
...

HSDirs need to cache hidden service descriptors for the duration of the
acceptable clock skew.

Is there anything else I'm missing?
Cheers!
David
