Re: [tor-dev] Revisiting prop224 client authorization

3 Nov 2016

      teor wrote:
...
...
On 3 Nov. 2016, at 10:37, s7r s7r@sky-ip.org wrote:
I am very happy with the torspec patch.
Not quoting entirely, only want to add something wrt randomizing the
value for fake clients based on David's and teor's comments:
David Goulet wrote:
[SNIP]
...

I think "superencrypted" -> "super-encrypted" would be nicer as everything
in the descriptor has that separation of words. Or even "client-encrypted" if
we want to add extra semantics. No strong opinion apart from the "-" :).
I prefer super-encrypted vs. client-encrypted.
...

[XXX consider randomization of the value 16]

If it's fixed, we basically create buckets, so a client can know that there
are 0-16 clients or 16-32 clients and so on.
If we randomize that value and let's say it's 7, then we have buckets of 7. If
that value is randomized for _every_ new descriptor, we create multiple sizes
of buckets, but over time someone could (maybe) deduce the lower bound of
clients by observing all the random values and thus assume there are
0-<low bound>.
I'm uncertain here what's best, but it seems that in any case bucketing is
happening as we pad with fake "auth-client". So I would assume here, off the
top of my head and to be safe, that we might want _all_ services to kind of
look the same, thus a fixed value would make sense following that train of
thought.
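
To make the bucketing concrete, here is a minimal sketch of what a fixed padding value of 16 would reveal. The function names are hypothetical, and it assumes that a service always publishes at least one full bucket; neither comes from the proposal text.

```python
def visible_auth_clients(real_clients: int, bucket: int = 16) -> int:
    """Number of entries an observer sees after padding with fake clients.

    Assumption: the count is rounded up to a multiple of `bucket`, and at
    least one full bucket is always published.
    """
    if real_clients < 0:
        raise ValueError("real_clients must be non-negative")
    buckets = max(1, -(-real_clients // bucket))  # ceiling division
    return buckets * bucket


def observer_range(visible: int, bucket: int = 16) -> tuple:
    """Range of real client counts consistent with what the observer sees."""
    low = 0 if visible == bucket else visible - bucket + 1
    return (low, visible)
```

With these assumptions, a service with 17 real clients shows 32 entries, and an observer learns only that the real count lies somewhere in 17-32.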
I'm liking the rest here! We'll have to think also on some padding in the
INTRODUCE1 cell to avoid leaking client auth is being used.
This is true, we create buckets no matter what, but I think it's better
if one has to watch a hidden service for a lot more time to determine
the probable number rather than being able to tell from the first
descriptor that there are 0-16 clients, 16-32 clients and so on.
I fully agree that randomizing _every_ new descriptor does not help and
probably in short time someone could deduce a possible number, but I am
slightly uncomfortable with a global fixed value for this. One more
idea, if it's not helpful we can just go ahead with a fixed value of 16.
I think it's better if we pick a random number between 8 and 32 fake
clients and remember the picked value so it will be used for every new
descriptor until something in our setup changes or enough time has
passed. In order to know when to reset it, we save it (in our state)
along with:

 - The number of real authorized clients when the random value was picked.
 - Timestamp when the random value was picked + an end of life for the
   random value.
We reset the random value of fake authorized clients and also its end of
life when:
a) number of real authorized clients in torrc changes from what we have
in our state.
b) end of life for the random value is reached. End of life will be
timestamp + a random period between 30 and 90 days.
c) obvious case when Tor is re-installed and old state is lost.
We call this function on every HUP and (re)start. We can tune the
numbers 8 - 32 and period 30 - 90 days as you like.
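
The scheme described above could be sketched roughly as follows. This is only an illustration in Python, not Tor code: the `FakeClientState` type and the `refresh_fake_clients` function are hypothetical names, the state is kept in memory rather than in Tor's state file, and the end of life is drawn uniformly in whole seconds, which is an assumption.

```python
import random
import time
from dataclasses import dataclass
from typing import Optional

DAY = 24 * 60 * 60  # seconds


@dataclass
class FakeClientState:
    fake_clients: int   # random value in [8, 32]
    real_clients: int   # real authorized clients when the value was picked
    picked_at: float    # timestamp when the value was picked
    expires_at: float   # picked_at + random period of 30 to 90 days


def refresh_fake_clients(state: Optional[FakeClientState],
                         real_clients: int,
                         now: Optional[float] = None) -> FakeClientState:
    """Meant to be called on every HUP and (re)start."""
    now = time.time() if now is None else now
    rng = random.SystemRandom()
    # Keep the saved value unless (a) the number of real clients changed,
    # (b) the end of life was reached, or (c) there is no saved state.
    if (state is not None
            and state.real_clients == real_clients
            and now < state.expires_at):
        return state
    return FakeClientState(
        fake_clients=rng.randint(8, 32),
        real_clients=real_clients,
        picked_at=now,
        expires_at=now + rng.randint(30 * DAY, 90 * DAY),
    )
```

The caller would persist the returned state and use `fake_clients` for every new descriptor until one of the reset conditions is hit.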
This way there are a lot of buckets and significantly more time needed
for an observer to deduce a probable number. It is quite possible one
can never deduce a "probable enough" number.
We combine this with extra padding, if needed, of the encrypted portion up
to the next multiple of 10k bytes.
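
As a small sketch of that rounding: the helper names below are hypothetical, "10k" is assumed to mean 10000 bytes (it could equally mean 10240), and padding the plaintext with zero bytes before encryption is also an assumption.

```python
PAD_MULTIPLE = 10_000  # assumption: "10k bytes" read as 10000


def padded_length(length: int, multiple: int = PAD_MULTIPLE) -> int:
    """Round `length` up to the next multiple of `multiple`."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return -(-length // multiple) * multiple  # ceiling division


def pad_plaintext(data: bytes, multiple: int = PAD_MULTIPLE) -> bytes:
    """Append zero bytes so the plaintext length is a multiple of `multiple`."""
    return data + b"\x00" * (padded_length(len(data), multiple) - len(data))
```

Under this reading, any encrypted portion between 10001 and 20000 bytes would look the same size to an observer.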
It's true that it won't help if the hidden service operator changes the
number of authorized clients every hour for a long period but in
practice this doesn't happen - number of authorized clients changes
rarely. And even in this scenario it still makes things a lot more
confusing.
Compared to other parts of prop 224, this is easy to code and should be
worth the effort. What do you think?
If you want to do it this way, with noise and buckets, ask someone who is
good at differential privacy to do the numbers for you, rather than guessing.
You'll need to know the level of activity you want to hide.
T
As I said the numbers can be changed - I was illustrating an example. I
guessed some numbers that seemed reasonable to me so I could give an
example, and also because it's not a critical part. We only try to hide
the number of real authorized clients, or make it as hard as possible
for an observer to deduce a number close to the realistic number of
authorized clients, that's all.
Simply using the numbers that were guessed without deep knowledge in
differential privacy is a lot better than using a global fixed value of
16, but as I said this doesn't need to be a debate because I am not
against the fixed value, only saying it's better to randomize, if the
solution exists.
