"A. Johnson" aaron.m.johnson@nrl.navy.mil writes:
Hi George,
Hello!
I recommend a change to the way that these statistics are obfuscated. The problem is that new noise is used every day, and from the distribution of the reported bins, the exact location within the bin (assuming the stat stays constant) can be inferred.
Assuming that the underlying value is constant and since our Laplace distribution is public, the adversary can observe which bin is reported each time and get a probability distribution for the underlying value.
This indeed seems plausible under the powerful assumption that the underlying stat is constant.
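
To make the concern concrete, here is a minimal simulation sketch of the current order (noise first, then round-up). The bin width, the Laplace scale, and all function names are assumptions chosen for illustration; they are not taken from any actual implementation. It compares two constant values that fall into the same bin and measures how often the daily report lands above that bin.

```python
import math
import random

BIN_SIZE = 8        # assumed bin width, illustration only
NOISE_SCALE = 8.0   # assumed Laplace scale b, illustration only


def laplace(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def round_up(x, bin_size=BIN_SIZE):
    # Round up to the next multiple of bin_size.
    return math.ceil(x / bin_size) * bin_size


def noise_then_bin(value, rng):
    # Current order: add fresh noise, then round up.
    return round_up(value + laplace(rng, NOISE_SCALE))


def fraction_above(value, days, seed):
    # Fraction of days on which the report exceeds the value's own bin.
    rng = random.Random(seed)
    own_bin = round_up(value)
    hits = sum(noise_then_bin(value, rng) > own_bin for _ in range(days))
    return hits / days


# 17 and 24 both round up to 24, but sit at opposite ends of that bin.
low = fraction_above(17, 20000, seed=1)
high = fraction_above(24, 20000, seed=1)
print(low, high)
```

Under these assumed parameters the analytic probabilities are about 0.21 for the value 17 and 0.5 for the value 24, so an observer who collects enough daily reports of a constant statistic could tell the two apart even though they share a bin.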
So instead of this
                +--------------+    +--------------------+
actual value -> |additive noise| -> |round-up obfuscation| -> public statistic
                +--------------+    +--------------------+
I recommend that you flip the order, so that it is like this

                +--------------------+    +--------------+
actual value -> |round-up obfuscation| -> |additive noise| -> public statistic
                +--------------------+    +--------------+
“Additive noise” in the context of bins is actually just a distribution over bins. You can think of it in two ways:
- Add Laplace noise to the bin center, and then report the bin of the resulting number.
Hm, you mean something like this, right?
                +---------+    +----------------+    +---------+
actual value -> | binning | -> | additive noise | -> | binning | -> public statistic
                +---------+    +----------------+    +---------+
where the additive noise is applied to the center of the first bin?
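
A rough sketch of that three-stage pipeline, as I understand it, might look like the following. The bin width, the Laplace scale, the choice of "top of bin minus half the width" as the bin center, and all names are my assumptions for illustration only.

```python
import math
import random

BIN_SIZE = 8        # assumed bin width, illustration only
NOISE_SCALE = 8.0   # assumed Laplace scale b, illustration only


def laplace(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def round_up(x, bin_size=BIN_SIZE):
    # Round up to the next multiple of bin_size.
    return math.ceil(x / bin_size) * bin_size


def bin_noise_bin(value, rng):
    # 1. Bin the actual value and take the center of its bin.
    center = round_up(value) - BIN_SIZE / 2
    # 2. Add Laplace noise to the bin center.
    noisy = center + laplace(rng, NOISE_SCALE)
    # 3. Report the bin of the noisy number.
    return round_up(noisy)


# 17 and 24 share a bin, so with identical noise draws the reports match.
reports_17 = [bin_noise_bin(17, rng) for rng in [random.Random(7)] for _ in range(1000)]
reports_24 = [bin_noise_bin(24, rng) for rng in [random.Random(7)] for _ in range(1000)]
print(reports_17 == reports_24)
```

Because the noise is applied to the bin center rather than to the actual value, the distribution of reports depends only on which bin the value falls in, not on where it sits inside that bin.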
I can see how this is better, since the underlying value gets immediately smoothed by binning. However, it does give me a weird hacky feeling...
Is this construction something that has been used before?
Thanks for the feedback!