On 21 Jan (10:28:16), Mike Perry wrote:
George Kadianakis:
Mike Perry <mikeperry@torproject.org> writes:
George Kadianakis:
Mike Perry <mikeperry@torproject.org> writes:
George Kadianakis:
I have mixed feelings about this.
- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them,
<snip>

HSDir post/fetch:
  1. C - L - M - S - E - H
  2. C - L - S - E - H
  3. C - L - S - H

Intro:
  1. C - L - M - S - E -- I - S - M - L - H
  2. C - L - S - E -- I - S - L - H
 *3. C - L - S -- I&S - L - H   (* IP Intersection attack!)

Rend:
  1. C - L - M - S - R -- E - S - M - L - H
  2. C - L - S - R -- E - S - L - H
  3. C - L - R&S -- S - L - H

What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every variation of Vanguards).
I don't see any such problems with R&S, though. Since R is not associated with any publicly viewable information, I don't think it is as big of a problem. At best it's a linkability risk for the client. But maybe I missed something.
Hmm, the only problem I can see here is that the R&S can link clients based on the L node. For example, in the crazy edge case where only one client connects to hidden services through R&S over L, then R&S could count "Ah, this client has done 42 rendezvous through me in the past 5 hours." And if that's a Ricochet client with 42 contacts, maybe it's a selector. But I think this is a pretty far-fetched example...
<snip>
If we only offered two security level options, I currently like HSDir#1+IP#1+Rend#1 for high security and HSDir#2+IP#2+Rend#3 for low security.
For the low security case, can we think of any reasons to decouple R&S in Rend#3, or to use Rend#2?
Another issue with Rend#3 is that the hidden service will be able to link client visits (for a short while) using the client's R&S as a selector.
I am inclined to accept this risk, since, as you said, it is really not a sure shot. You need a lot of connections before the S set is reused enough to indicate it's the same client, and even then each subsequent individual visit gives you only a probability bias, not a sure sign. I'm inclined to think this partial linkability leak is an acceptable enough risk to say "Well yeah, lower security for better latency. You're getting what you're paying for."
Here are my two cents on Rend#3 (hopefully I understand the R&S concept correctly).
Currently, a client does reuse an RP circuit for the same .onion. So, let's say TBB opens blah.onion and has 25 HTTP GETs to do for blah.onion; one single RP circuit will be used. Then I _think_ that circuit is closed after a while by timeout if unused. Then the user goes back to blah.onion an hour later and will reopen a circuit with a _new_ R&S node (an hour is well below the 12-hour rotation period), and does that a few more times before rotation of the R&S happens. (Btw, I do that _ALL_ the time: I open an HS and refresh it like once every 30 minutes, which I think makes me use a new RP node every time.)
This means that if L is malicious, it will see from C some connections always going to the same small subset of relays (prop247 defines S as 4*4 nodes). This set is _small_, so it is easy to tell that it's indeed a client connecting to a low-security HS. In fact, just inducing circuit failure here at L would be enough, without even needing a legitimate use case that also allows the linkability.
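To put a rough number on how distinguishable that is, here is a quick sketch; the relay count is an illustrative assumption, not from prop247:

    N_RELAYS = 7000        # assumed network size, illustrative only
    S_SIZE = 4 * 4         # prop247 S set: 16 nodes

    # Chance that n uniformly chosen next hops would all land inside
    # one given 16-node subset purely by accident:
    def accidental_match(n):
        return (S_SIZE / N_RELAYS) ** n

    print(accidental_match(2))   # ~5.2e-06
    print(accidental_match(5))   # ~6.2e-14

So even two or three circuits through the same tiny subset are already a strong signal that C is talking to a low-security HS.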
However, that could be mitigated by an application keeping the RP circuit alive by sending heartbeats, for instance (Ricochet?). Even though the set of R&S rotates after 12 hours, a circuit using an R&S node that is being rotated out should NOT be killed under any circumstances, making it "long lived", which makes it much more difficult for L to learn anything. But this is putting lots and lots of responsibility on the application side :S. Then again, it's low security and should only be used by people knowing what the hell they are doing. Is that what we want?
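For illustration, the application-side heartbeat could be as simple as the sketch below; the one-byte ping frame and 30-second interval are made-up placeholders for whatever the application protocol (e.g. Ricochet) actually uses:

    import socket, time

    def keep_circuit_warm(sock, interval=30.0):
        # Send a small application-level no-op so the stream (and thus
        # the RP circuit underneath it) never goes idle long enough to
        # be expired. b"\x00" is a hypothetical ping frame.
        while True:
            sock.sendall(b"\x00")
            time.sleep(interval)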
Thus, I'm inclined to say that Rend#3 is fine, but in a controlled environment... although the fact that the HS doesn't use an E before connecting to R could be dicey. As an attacker client controlling R, I make your HS fail every circuit until you go through all your S. Chances are I do not control one of your S at first, but 12h later you rotate, so I can keep doing that for days (<90 days) until you pick an S I control, and then game over: I learn L.
S size is 4*4 and rotation happens twice a day, so 160 rotations over 80 days (leaving 10 days to raid your L) == potentially 2560 nodes being tested. That is an awfully big chance of you picking my S. There are multiple ways to mitigate that, but they come down to special cases in the code :S.
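For concreteness, the arithmetic above, plus a rough success probability under the (purely illustrative) assumption that the attacker runs 50 of ~7000 relays and S nodes are picked uniformly:

    S_SIZE = 4 * 4              # prop247 S set: 16 nodes
    ROTATIONS = 2 * 80          # two rotations/day for 80 days = 160
    nodes_tested = ROTATIONS * S_SIZE
    print(nodes_tested)         # 2560

    k, N = 50, 7000             # assumed attacker relays / network size
    p_win = 1 - (1 - k / N) ** nodes_tested
    print(p_win)                # ~1.0, i.e. near-certain within 80 days

(Treating the 2560 picks as independent is a simplification, but it shows the trend: the attacker wins almost surely well before guard rotation.)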
What I really like about Rend#2, though, is symmetry. Implementation-wise it's much easier (using Mike's argument), and we treat both the client and the HS at the same security level. The tradeoff, however, is that if anything happens to our design in the future, both sides will be affected, probably meaning deanonymizing C and H with one single technique (which is currently the case anyway).
I'm also really expecting these low security addresses to be most useful in P2P, where linkability is already kinda out the window (though unlinkability can still be maintained if the addresses are ephemeral/short-lived).
We should keep thinking about other issues, because unless there are other, additional problems with R&S, I don't think this one kills it.
One thing worth noting is that you definitely want separate vanguard sets for high security and low security services, of course. Prop#247 was already leaning towards separate vanguard sets for each service-side address, in Section 4.2. That now seems excessive for the high security services, due to the addition of ephemeral hops, but keeping the low and high sets separate is necessary, as you pointed out with your other attack.
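As a sketch of what that separation means implementation-wise (names are hypothetical; this is not Tor source):

    # Illustrative only: keep the low- and high-security vanguard sets
    # disjoint, so a compromised low-security path can never be used
    # to map the high-security vanguards.
    class VanguardState:
        def __init__(self):
            self.high_sec = set()   # relay fingerprints
            self.low_sec = set()

        def pick_low_sec_candidates(self, relays):
            return [fp for fp in relays if fp not in self.high_sec]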
Here's another application-layer issue: if low security services exist, then the application will need some way to differentiate them before being induced to connect to them, especially repeatedly over time. For example, Ricochet should forbid you from accidentally using a low security onion address for a contact addr, and the browser should probably forbid all low security HS addresses from being used as content elements, unless the url bar is also a low security HS address. Both apps should probably specifically generate low security addrs for WebRTC/voice calls, though.
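In browser terms, that policy might look something like the sketch below, where the low_sec_addrs marking is hypothetical (e.g. derived from the address bits discussed next):

    def subresource_allowed(urlbar_addr, resource_addr, low_sec_addrs):
        # low_sec_addrs: however the client ends up marking weak
        # addresses (hypothetical input for this sketch).
        if resource_addr not in low_sec_addrs:
            return True      # high-security content: always fine
        # Low-security content only inside a low-security first party:
        return urlbar_addr in low_sec_addrs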
I think a similar argument could be made for also differentiating RSOS/SOS addresses, post-224.
For these reasons, I'm still really liking the idea of using those spare 224 address bits to indicate HS security level (and to additionally differentiate between RSOS/SOS). That would make it much easier for the application to avoid being tricked into using weaker addresses in bad situations. In fact, the Tor client could even default to forbidding the use of them unless the application specifically turns on support.
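A minimal sketch of the spare-bits idea, assuming a hypothetical layout where the top bits of the prop224 version octet carry flags; this is NOT the actual 224 encoding, just the shape of it:

    import base64

    FLAG_LOW_SECURITY = 0x80   # hypothetical flag bits packed into
    FLAG_RSOS         = 0x40   # the version octet

    def encode_addr(pubkey32, checksum2, version_and_flags):
        # 32 + 2 + 1 = 35 bytes -> exactly 56 base32 chars, no padding
        blob = pubkey32 + checksum2 + bytes([version_and_flags])
        return base64.b32encode(blob).decode().lower() + ".onion"

    def is_low_security(addr):
        blob = base64.b32decode(addr[:-len(".onion")].upper())
        return bool(blob[34] & FLAG_LOW_SECURITY)

The default-off behavior then falls out naturally: the Tor client refuses any address with the low-security flag set unless the application has explicitly opted in.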
That could be a useful way of using those bits indeed! Versioning was also on the list of uses for this, but that can also be handled by just using one more byte anyway...
David