David Goulet:
On 21 Jan (10:28:16), Mike Perry wrote:
George Kadianakis:
Mike Perry mikeperry@torproject.org writes:
George Kadianakis:
Mike Perry mikeperry@torproject.org writes:
George Kadianakis:
I have mixed feelings about this.

- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them,
<snip>

Hsdir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H

Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H   (* IP Intersection attack!)

Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H

What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every variation of Vanguards).
I don't see any such problems with R&S, though. Since R is not associated with any publicly viewable information, I don't think it is as big of a problem. At best it's a linkability risk for the client. But maybe I missed something.
Hmm, the only problem I can see here is that the R&S can link clients based on the L node. So for example, in the crazy edge case where only one client connects to hidden services through R&S over L, then R&S could count "Ah, this client has done 42 rendezvous through me in the past 5 hours". And if that's a Ricochet client with 42 contacts, maybe it's a selector. But I think this is a pretty far-fetched example...
<snip>
If we only offered two security level options, I currently like HSDir#1+IP#1+Rend#1 for high security and HSDir#2+IP#2+Rend#3 for low security.
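For reference, here is how I am reading the three Rend layouts from the quoted list, written out as data; the letter meanings and the structure below are just my own shorthand, not prop247 definitions:

    # C=client, L=layer-1 guard, M=middle, S=second/third-layer vanguard,
    # E=extra hop, R=rendezvous point, H=hidden service.
    # "--" marks where the client and service half-circuits meet.
    REND_OPTIONS = {
        1: ("C - L - M - S - R", "E - S - M - L - H"),  # full vanguards on both sides
        2: ("C - L - S - R",     "E - S - L - H"),      # one fewer hop on each side
        3: ("C - L - R&S",       "S - L - H"),          # R doubles as a static S node
    }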
For the low security case, can we think of any reasons to decouple R&S in Rend#3, or to use Rend#2?
Another issue with Rend#3 is that the hidden service will be able to link client visits (for a short while) using the client's R&S as a selector.
I am inclined to accept this risk, since, as you said, it is really not a sure shot. You need a lot of connections before the S set is reused enough to indicate it's the same client, and even then each subsequent individual visit gives you only a probability bias, not a sure sign. I'm inclined to think that this partial linkability leak is an acceptable enough risk to say "Well yeah, lower security for better latency. You're getting what you're paying for."
Here are my two cents on Rend#3 (assuming I understand the R&S concept correctly).
Currently, a client does reuse an RP circuit for the same .onion. So, let's say TBB opens blah.onion and has 25 HTTP GETs to do for blah.onion: one single RP circuit will be used. Then I _think_ that circuit is closed after a while by timeout if unused. Then the user goes back to blah.onion after an hour and reopens a circuit with a _new_ R&S node, still drawn from the same set since an hour is well below the 12-hour rotation period. Repeat that for a while before the R&S set rotates (btw, I do that _ALL_ the time: I open an HS and refresh it like once every 30 minutes, which I think makes me use a new RP node every time).
This means that if L is malicious, it will see connections from C always going to the same small subset of relays (prop247 defines S as 4*4 nodes). That set is _small_, so it's easy to tell that this is indeed a client connecting to a low-security HS. In fact, just inducing circuit failures at L would be enough to see this, without even needing a legitimate use case that happens to allow the linkability.
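A quick back-of-the-envelope illustration of how fast L could notice this, assuming roughly 7000 relays and uniform selection (both numbers are ballpark guesses of mine, not prop247 values):

    # How distinguishable is a low-security client from L's point of view?
    N_RELAYS = 7000   # rough network size (illustrative guess)
    S_SIZE = 4 * 4    # prop247 second/third-layer set

    # Chance that two independent circuits reuse the same next hop after L:
    print(1 / N_RELAYS)   # ~0.00014 for an ordinary client
    print(1 / S_SIZE)     # ~0.0625  for a low-security client

    # Expected number of distinct next hops L sees after k circuits:
    k = 20
    for n in (N_RELAYS, S_SIZE):
        expected_distinct = n * (1 - (1 - 1 / n) ** k)
        print(n, round(expected_distinct, 1))   # ~20.0 vs ~11.6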
Ah, yes, I think you're right here.
However, that could be mitigated by an application keeping the RP circuit alive, for instance by sending heartbeats (Ricochet?). Even though the set of R&S rotates after 12 hours, a circuit using an R&S node that is being rotated out should NOT be killed under any circumstances, making it "long lived", which makes it much more difficult for L to learn anything. But this is putting lots and lots of responsibility on the application side :S. Then again, it's low security and should only be used by people who know what the hell they are doing. Is that what we want?
I don't know. I think pushing solutions off to the application layer is dangerous. One option that might make it safer is to make these addresses single-use or limited-use only (for ex: for a voice call or file xfer negotiated over a higher security long-term circuit), and enforce that in Tor itself.
Thus, I'm inclined to say that Rend#3 is fine, but only in a controlled environment... although the fact that the HS doesn't use an E before connecting to R could be dicey. As an attacker client controlling R, I make your HS fail every circuit until you go through all of your S. Chances are I do not control one of your S at first, but 12h later you rotate, so I can keep doing that for days (<90 days) until you pick an S I control, and game over: I learn L.
S size is 4*4 and rotation happens twice a day, so 160 rotations over 80 days (leaving 10 days to raid your L) == potentially 2560 nodes being tested. That is an awfully big chance of you picking one of my nodes as an S. There are multiple ways to mitigate that, but they all come down to special cases in the code :S.
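A quick sanity check of that arithmetic, plus what it means if I control a fraction of the relays eligible for S (the fractions below are just illustrative guesses):

    # 16 nodes per rotation, 2 rotations/day, 80 days of attack window
    draws = (4 * 4) * 2 * 80
    print(draws)   # 2560 candidate S slots

    # Assuming (roughly) independent, weight-proportional picks:
    for f in (0.001, 0.005, 0.01):
        p_hit = 1 - (1 - f) ** draws
        print("adversary weight %.1f%%: P(some S is hostile) ~ %.2f" % (f * 100, p_hit))
    # ~0.92 at 0.1% of the network, and effectively 1.0 at 0.5% or more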
What I really like about Rend#2, though, is symmetry. Implementation-wise it's much easier (using Mike's argument), and we treat both the client and the HS at the same security level. The tradeoff, however, is that if anything happens to our design in the future, both sides will be affected, probably meaning C and H can be deanonymized with one single technique (which is currently the case anyway).
After thinking about it, I am also leaning towards Rend#2 for low security, partly for traffic analysis reasons. If we have 3 hops on each side, we get to make use of padding to the middle in the same way for these circuits as we would for everything else.
OTOH, with only 2 hops, a malicious guard and a malicious R&S get to know where you are going with some probability that is a function of the base rate of connections from the service's S to the client's R&S node. This base rate may actually be very low in practice, making it more certain that a particular extend is for a particular low-sec service's rend connection. This is compounded by the fact that it is possible to at least learn the set of S's that corresponds to a given low-sec hidden service using lots of connections/circuit failures, as you said.
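Here is a toy Bayes-style version of that base-rate argument; all the numbers below are made up purely to show the shape of it:

    # The adversary at the client's L + R&S sees an extend arriving from a
    # relay it has already mapped to a known low-sec service's S set.
    p_rend = 0.01                   # prior: fraction of such extends that are rend legs for that service (guess)
    p_from_S_if_rend = 1.0          # rend legs for that service always come from its 16-node S set
    p_from_S_otherwise = 16 / 7000  # ordinary traffic hits those relays roughly by weight (guess)

    posterior = (p_from_S_if_rend * p_rend) / (
        p_from_S_if_rend * p_rend + p_from_S_otherwise * (1 - p_rend))
    print(round(posterior, 2))      # ~0.82 even with a 1% prior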
If we restrict Rend#3 to single-use, though, we *might* be able to disguise the rend handshake as another circuit extension from the service side, but the client side doesn't have this option.
Is single-use (or a limit of N uses) too much of a restriction for low-sec services? Or do we defer it to later, and only implement Rend#1 and Rend#2 for now? I suppose that if we do go with single-use, we still have to think about how long to keep the vanguards for it around. That also seems complicated and maybe application-dependent, unless we make the usage limits global or something...
On balance, it seems like analysis simplicity, implementation simplicity, and safety all favor Rend#1 and/or Rend#2 right now. We could implement the Rend#3-yolo option later, when we understand a bit more about how to work with vanguards, or just make it available to controllers by exposing HS path construction to their control, should they want it?
FWIW, I am similarly pessimistic about combining s7r's detector based on the current proposal text, since even if full DoS is not possible, it is certainly possible to influence the probability of RP selection if that proposal's detector is able to influence routes (and also to evade detection when the discovery attack is actually mounted). It feels a lot like the path bias detector, but with worse properties due to adversarial circuit+node selection control... And even the original path bias detector was really only meant as a stopgap until something like https://gitweb.torproject.org/torspec.git/tree/proposals/261-aez-crypto.txt can be implemented.