Re: [tor-dev] Proposal 247: Alternate Path Lengths

21 Jan 2016

      Mike Perry mikeperry@torproject.org writes:
...
George Kadianakis:
...
Mike Perry <mikeperry at torproject.org> writes:
<snip>
I have mixed feelings about this.

If client guard discovery is the main reason we are doing this, I think we
should first look into these guard discovery vectors individually and figure
out how concerning they are and if there is anything else we can do to block
them,
I agree this is worthwhile, if only to better understand the design
space.  However, I think we're going to find that most applications we
envision can be induced into violating many of the ad-hoc mitigations we
try to bake in.
OK. Let's see. I feel that these guard discovery attacks can be blocked with:
a) If an IP listed on an HS descriptor tells you that it doesn't know the HS,
   then ignore it for this hidden service today.
b) If an HSDir that should have an HS descriptor tells you that it doesn't have
   it, then don't ask it again this hour.
I think we do both checks right now in the Tor codebase and we also have caches
so that we don't retry the same nodes. If we are serious, we could even write
those caches on disk.
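To make checks (a) and (b) concrete, here is a minimal sketch of such a failure cache. It is not Tor's actual code: the class name, the method names, and the reading of "today" as 24 hours are all assumptions made for illustration.

```python
import time

# Assumed cool-down periods, read off the prose above:
# "today" for intro points (taken as 24h), "this hour" for HSDirs.
COOLDOWN_SECONDS = {
    "ip": 24 * 60 * 60,
    "hsdir": 60 * 60,
}


class FailureCache:
    """Remembers relays that claimed not to know a given onion service."""

    def __init__(self, clock=time.time):
        self._clock = clock
        # (kind, service, relay) -> earliest time a retry is allowed
        self._retry_at = {}

    def note_failure(self, kind, service, relay):
        # Raises KeyError for an unknown kind instead of guessing a cool-down.
        self._retry_at[(kind, service, relay)] = (
            self._clock() + COOLDOWN_SECONDS[kind]
        )

    def may_retry(self, kind, service, relay):
        retry_at = self._retry_at.get((kind, service, relay))
        return retry_at is None or self._clock() >= retry_at
```

Writing `_retry_at` to disk would let the cache survive a restart, which is the case described above where an application restarts Tor because a hidden service does not work.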
I feel that if an application restarts Tor or flushes those caches because a
hidden service does not work, then the application is doing it wrong.
Also, even with client vanguards I think the checks above will still have to be
implemented. I could imagine an application that flushes the whole DataDirectory
if the hidden service stops working, and then even vanguards won't save them.
In general, I'm not sure how much sanity we can assume from third-party
applications.
...
...
before complicating path selection even more.
I feel like you're actually going to end up complicating the
implementation more with this position. If we have to have separate path
selection modes for service side and client side, we then have to
maintain three different path selection mechanisms in Tor: normal exit,
onion services, and onion clients.
If we gave the same options for both hidden services and clients, we are
at least down to two systems (exit vs non-exit), with some minor options
for each.
Hmmm maybe. But onion clients would look very much like normal exit circuits,
except that they would connect to RPs/IPs instead of exits. Just like the code
is now.
Also, with vanguards if we end up doing something like:
      HSDir: C - L - S - E - HSDir
      IP:    C - L - S - E - IP
      Rend:  C - L - M - RP -- S - M - L - HS
we have three different path types here. We would need to write very beautiful
interfaces if we want this to be done by the same code.
...
...

Also, I like symmetry myself, but I wouldn't change path selection and
security just for that _if I can help it_.

<snip>
...
Hsdir post/fetch:

 1. C - L - M - S - E - H
 2. C - L - S - E - H
 3. C - L - S - H

Intro:

 1. C - L - M - S - E -- I   - S - M - L - H
 2. C - L - S - E     -- I   - S - L - H
*3. C - L - S         -- I&S - L - H     (* IP Intersection attack!)

Rend:

 1. C - L - M - S - R -- E - S - M - L - H
 2. C - L - S - R     -- E - S - L - H
 3. C - L - R&S       -- S - L - H

What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every
variation of Vanguards).
I don't see any such problems with R&S though. Since R is not associated
with any publicly viewable information, I don't think it is as big of a
problem. At most it's a linkability risk for the client. But maybe I
missed something.
Hmm, the only problem I can see here is that the R&S can link clients based on
the L node. So for example, in the crazy edge case where only one client
connects to hidden services through R&S over L, then R&S could count "Ah this
client has done 42 rendezvous through me in the past 5 hours". And if that's a
ricochet client with 42 contacts maybe it's a selector. But I think this is a
pretty far-fetched example...
Another _big_ gotcha here is that let's say we end up doing:
      HSDir: C - L - M - S - E - HSDir
      IP:    C - L - M - S - E - IP
      Rend:  C - L - S - RP -- S - M - L - HS
and all the 'S' nodes are taken from the same pool, then the 'L' node will be
able to learn 'M' by looking at the IP circuits, and learn 'S' by looking at
the rend circuit. So it will basically be able to derive the full circuit.
We need to be very careful about which paths we pick, and which "guardsets" we
get the nodes from.
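A small sketch of the gotcha above, using the three client-side path shapes from that example. It only models one thing: a relay learns which hop directly follows it on each circuit type. The function and variable names are made up for illustration.

```python
# Client-side path shapes copied from the example above.
PATHS = {
    "hsdir": ["C", "L", "M", "S", "E", "HSDir"],
    "ip": ["C", "L", "M", "S", "E", "IP"],
    "rend": ["C", "L", "S", "RP"],
}


def next_hops_seen_by(position, paths):
    """Return {circuit type: layer that directly follows `position`}."""
    seen = {}
    for kind, hops in paths.items():
        # The last hop has no successor, so only search hops[:-1].
        if position in hops[:-1]:
            seen[kind] = hops[hops.index(position) + 1]
    return seen


learned = next_hops_seen_by("L", PATHS)
# Guard layers whose members L gets to see directly.
layers_exposed_to_L = sorted(set(learned.values()) & {"M", "S"})
```

Here `learned` maps the HSDir and IP circuits to `M` and the rend circuit to `S`, so `layers_exposed_to_L` contains both layers. If the 'S' nodes on all three circuit types come from one pool, that is the situation described above.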
...
...
...
Looking at these, we can see that we sacrifice the middle guards in the
second option, which will come at the cost of one less compromise attack
(but still the need to compromise the long-lived guard). We also lose
the unlinkability in the third option, and this actually bites us in
Intro 3: the hidden service L guard can perform a long-term intersection
attack, watching for published intro points and matching that to the
circuits that H makes to them. So that path length probably should not
be used.
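The intersection attack on Intro 3 can be sketched as a toy model. Assume the L guard records, for each observation period, the set of intro points published for the target service and the relays each of its directly connected clients extended to. The data layout and all names below are hypothetical; nothing here comes from real measurements.

```python
def intersect_candidates(observations):
    """Narrow down which of L's clients could be the hidden service.

    `observations` is a list of (intro_points, next_hops) pairs, one per
    period. `intro_points` is the set of intro points published for the
    target service; `next_hops` maps each client of L to the set of relays
    that client extended to directly after L during that period.
    """
    candidates = None
    for intro_points, next_hops in observations:
        matching = {
            client
            for client, relays in next_hops.items()
            if relays & intro_points
        }
        candidates = matching if candidates is None else candidates & matching
    return candidates if candidates is not None else set()


# Toy data: only client "a" touches a published intro point in both periods.
toy_observations = [
    ({"ip1", "ip2"}, {"a": {"ip1", "x"}, "b": {"ip2"}, "c": {"y"}}),
    ({"ip3"}, {"a": {"ip3"}, "b": {"z"}, "c": {"ip3"}}),
]
suspects = intersect_candidates(toy_observations)
```

With the longer paths (Intro 1 and 2) the hop after L is a guard-layer node rather than the intro point itself, so this direct match is not available to L.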
<snip>
...
However, I still have mixed feelings about changing client path selection as
part of proposal 247:

My main issue is that I think figuring out the right client path selection
will require a _heavy_ amount of security analysis that will delay prop247
even more.  I was hoping that we could treat the client-side as an orthogonal
problem and tackle it in the future separately. But maybe I'm totally wrong
and should be more patient and these two problems should be handled together.
I think patience is best, because if we don't understand this problem
really well, we're liable to miss something. Or cement ourselves off
from a potential future of interactive HS voice+video. Neither one is a
great failure mode.
Agreed.
...
I think for many applications (esp the browser and ricochet), we're
going to find that we need to protect the client just as much as the
server.
...

If the above changes only happen to HS circuits, we make it harder to make HS
circuits indistinguishable from normal circuits in the face of traffic
analysis. But maybe we have already lost this game.
We already lost that game until we have multihop padding. Proposal
247 already outlines how to use it in section 4.1 to help conceal
vanguard usage.
It is also worth pointing out that if we fail to conceal the HS vanguard
fingerprint entirely with padding, it will be especially valuable to
have more than just 30k service-side instances with the vanguard
fingerprint. Far better to have all the clients in that anonymity set,
too, I think.
Yes that's true. This seems to be the main argument for doing client vanguards
right now for me.
However, to actually achieve any sort of confusion here, we need to ensure that
the paths between clients and HSes are symmetric. So for example if we end up
doing:
C - L - S - E -- IP  - S - M - L - H
then the L guard could distinguish clients from HSes by looking at whether the
second hop is short lived ('S') or medium lived ('M').
Woohoo! Anonymity!
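The symmetry requirement above can be checked mechanically for any candidate path. This is a minimal sketch with made-up names. It assumes that the only thing L can observe is the lifetime class ('S' or 'M') of the hop directly after it.

```python
def layer_after_L(side):
    """`side` is one half of a path, listed from the endpoint outwards."""
    return side[side.index("L") + 1]


# The example path above, split at the introduction point:
#   C - L - S - E -- IP  - S - M - L - H
client_side = ["C", "L", "S", "E", "IP"]
service_side = ["H", "L", "M", "S", "IP"]

# True means an L guard can tell clients from services by its second hop.
distinguishable = layer_after_L(client_side) != layer_after_L(service_side)
```

For this path `distinguishable` is true, because the client side shows 'S' after L while the service side shows 'M'. A path would pass this particular check only if both halves show the same layer after L.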
