On 15 Mar (18:13:02), George Kadianakis wrote:
Hello,
I took a look at proposal 224 again, with the aim of revisiting the cell logic and format.
Here are some matters that require discussion:
- Should we keep backwards compatibility with old introduction and rendezvous points?
Currently, proposal 224 actually tries to maintain backwards compatibility, but at the cost of complicating the design. Specifically, to achieve backwards compatibility we need at least the following functionality:
  - Hidden services need to generate and publish an extra encryption key for each legacy introduction point (legacy enc-key).
  - Hidden services need a new cell subtype to register themselves to legacy introduction points (LEGACY_EST_INTRO).
  - Clients need a new cell to introduce themselves through legacy introduction points (LEGACY-INTRODUCE1).
The above features are not extremely hard to implement, but because of their hacky backwards-compatible nature they do complicate the protocol and the code. Also, at some point when the network has upgraded we will have to rip this code out of our codebase, otherwise it will just rot there. Also also, we will have to write non-trivial chutney tests to ensure the correctness of the backwards compatibility logic.
Alternatively, we could choose to completely drop backwards compatibility with old introduction and rendezvous points. This means that we will have to wait until a good part of the network has upgraded before we enable prop224 support for clients and services. In practice, we will probably have to wait a whole release cycle (until the relay-side prop224 code becomes stable; is that 6 months?) before we can globally enable the client/HS prop224 functionality (although we can use the network ourselves for testing prop224 until then).
Ok, I'm against backward compatibility but let's think about it for a minute in terms of how it affects the time frame of having 224 deployed and used.
Let's assume we drop it in this scenario. We would need to have the relay _and_ HSDir support by 0.2.9 (September-ish) and then client and service support by 0.2.10 or later. This means that the cell and HSDir code would need to be incredibly well tested, and we would have very little room for mistakes (we can rely on minor releases for some fixes though).
Looking at: https://metrics.torproject.org/versions.html?start=2015-01-01&end=2016-0...
... we can see how fast ~half of the network upgrades to the latest stable: roughly 6 months. This means that by 0.2.10, in the above scenario, we'll maybe already have half of the relays usable for 224. This could maybe be a partitioning concern if the HS traffic only goes to half the nodes for a somewhat "long" period of time.
Even if we waited until 0.2.11 for instance (two versions after 0.2.9, the initial relay support release), the metrics graph shows us that 0.2.4 is still quite in use (vs 0.2.7, the latest stable)...
Now, let's also keep in mind that whatever the decision, the HSDir side _needs_ to be supported for 224. There is no backward compat for that, so in either case we'll have only a subset of relays that can be used as HSDirs for 224 at first. This means that, backward compat or not, we'll need at the very least to have the HSDir support code deployed in a first stable release, so that when the full client/service support is ready, around half of the network is ready to store descriptors.
Not dropping the backward compat. will let us use all relays for IP purposes. That's the upside, but we would still need HSDir support in the first place, using half of the network. And most of the traffic goes through the RP, for which we can use all relays since there is no change on that front.
To summarize, if we are concerned about IP partitioning, I would say go backward compat. If not, we should drop it at the expense of releasing the relay cell support in an earlier stable release and then client/service support in a later release (which is what we have to do for HSDir anyway...).
Right now, because of the HSDir partitioning, I'm not too concerned about the IPs... (I could be very wrong here, more eyes on that are required.)
What do you people think we should do here?
Paradoxically, I'm currently thinking of _keeping_ the backwards compatibility design. Looking at the spec, it seems like a medium difficulty engineering issue for us (maybe 8% of the total prop224 task size), which sucks, but at least we don't have to worry about doing a proper incremental deployment of prop224 on the network or about release cycles. Also, as we move towards implementing prop224 cells, we can reevaluate our position here. I'm not confident about my position here, so feedback would be helpful.
- I'd like to simplify the ESTABLISH_INTRO logic.
Currently, ESTABLISH_INTRO seems like a needlessly _complex_ cell that is also _incomplete_.
It's _complex_ because it takes 3 different forms depending on the value of its first byte. This complexity is caused partially by our backwards compatibility needs (see above), but also because we tried to cram the MAINT_INTRO message into this cell.
It's _incomplete_ because it does not actually contain the "introduction point encryption key", so hidden services are forced to send the encryption key right after the initial ESTABLISH_INTRO cell using a second ESTABLISH_INTRO cell that is actually a MAINT_INTRO/UPDATE-KEYS-SUBCMD message.
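To make the first-byte dispatch concrete, here is a rough sketch in Python of a receiver that has to pick one of three forms from the first byte of the payload. The byte values, the enum and the function name are placeholders picked for illustration; they are assumptions and should not be read as the values or layout in the proposal.

```python
from enum import Enum


class IntroForm(Enum):
    """The three forms an ESTABLISH_INTRO cell can take in this sketch."""
    LEGACY = "legacy"      # backwards-compatible establish
    PROP224 = "prop224"    # new-style establish
    MAINT = "maint"        # MAINT_INTRO crammed into the same cell


# Hypothetical first-byte values, chosen only for this example.
LEGACY_FIRST_BYTES = {0x00, 0x01}
PROP224_FIRST_BYTE = 0x02
MAINT_FIRST_BYTE = 0xFF


def classify_establish_intro(payload: bytes) -> IntroForm:
    """Decide which form a payload takes by looking only at its first byte."""
    if not payload:
        raise ValueError("empty cell payload")
    first = payload[0]
    if first in LEGACY_FIRST_BYTES:
        return IntroForm.LEGACY
    if first == PROP224_FIRST_BYTE:
        return IntroForm.PROP224
    if first == MAINT_FIRST_BYTE:
        return IntroForm.MAINT
    raise ValueError(f"unknown first byte: {first:#04x}")
```

Every parser of this cell needs a branch per form, and a service that wants to hand over its encryption key has to send two cells that both go through this dispatch: one classified as an establish, then one classified as MAINT.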
I have two suggestions here:
- Let's include the intro point encryption key in the ESTABLISH_INTRO cell, so that hidden services can establish intro with a single cell (not for performance, but for simplicity).
And that encryption key is useful to the IP only if we want to use the feature introduced in 224 (UPDATE-KEYS-SUBCMD), which lets an HS update that key while maintaining the IP circuit so clients don't have to download a new descriptor.
I'm still puzzled about this feature. I *think* it's there because we planned a hypothetical load balancing feature that could use it in the future? IP circuits (client side) are very short-lived, so when a client shows up with the wrong encryption key, the HS would have to send that MAINT_INTRO "cell" (arbitrarily?), which will make the client re-INTRODUCE itself with the new key, and then the dance continues. I'm a tad concerned about this because it allows a client to trigger distinctive cell behavior from the service (yet another one...). Also, I don't think we should put more load on the IP/service side instead of the HSDir/client side (the need to download a new descriptor). I think the INTRODUCE dance should be as light as possible because it's the entry point for the client, which hands off the rest of the protocol to the HS after that (where we can have load balancing strategies at the RP step).
In other words, instead of building a 3-hop circuit to the HSDir, getting the new descriptor and then establishing the IP with the correct keys (for the next 24h or so), we prefer doing two round trips of INTRODUCE (3 hops to IP + 3 hops to HS) to avoid a new descriptor being downloaded by the client...? Which will be the exact same behavior the next time that client connects? Or should the client remember that key in a yet-another-cache? Unless I missed something, I don't think we win by adding that layer of complexity...
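As a back-of-the-envelope check of that comparison, here is a tiny Python sketch that counts circuit hops for the two options. It only counts the steps named above, treats every circuit as 3 hops, and ignores everything else (circuit build time, descriptor size, retries), so it is a crude proxy under those assumptions and not a measurement. All names are made up for the example.

```python
# Assumed hop counts, taken from the figures quoted in the paragraph above.
HOPS_CLIENT_TO_HSDIR = 3
HOPS_CLIENT_TO_IP = 3
HOPS_IP_TO_HS = 3


def introduce_hops() -> int:
    """Hops covered by one INTRODUCE going client -> IP -> HS."""
    return HOPS_CLIENT_TO_IP + HOPS_IP_TO_HS


def refetch_descriptor_hops() -> int:
    """Fetch a fresh descriptor from the HSDir, then INTRODUCE once."""
    return HOPS_CLIENT_TO_HSDIR + introduce_hops()


def key_update_hops() -> int:
    """INTRODUCE with the stale key, learn the new key, INTRODUCE again."""
    return 2 * introduce_hops()


print("refetch descriptor:", refetch_descriptor_hops())  # 3 + 6 = 9
print("key update dance:  ", key_update_hops())          # 2 * 6 = 12
```

Under these assumptions the key update dance covers more hops than the descriptor refetch, and it would be paid again on every connection unless the client caches the new key.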
- Let's introduce a new cell type for MAINT_INTRO instead of cramming it into ESTABLISH_INTRO. Or at least, let's make it an extension of ESTABLISH_INTRO instead of using the first byte of the cell to get the cell subtype.
I honestly need more arguments for this cell (see comment above basically) and it pretty much covers 3) which would definitely drop the ENC_KEYID in the INTRODUCE1 cell.
Thanks! David
What do you think?
Also, this brings me to the next topic which is:
- What is UPDATE-KEYS-SUBCMD good for? And why do intro points need to know the intro point encryption key?
UPDATE-KEYS-SUBCMD seems to be the only use of MAINT_INTRO currently. It seems to be able to update the encryption keys of an introduction point circuit on the fly.
But why does the introduction point need to know the encryption key in the first place? That key is only used by clients and hidden services to encrypt stuff end-to-end to each other.
After discussing with dgoulet, the only reason I can think of is so that the IP is aware of the encryption key: if an incoming client Alice does not know the correct encryption key, the IP can send it to her using an INTRODUCE_ACK message with [00 02] (and then Alice does not need to refetch the descriptor).
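A minimal sketch of that behavior, in Python, might look like the following. Only the [00 02] status comes from the text above; the success status value, the class and method names, and the idea of comparing raw key bytes (rather than a key ID) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

STATUS_SUCCESS = bytes([0x00, 0x00])        # hypothetical value
STATUS_WRONG_ENC_KEY = bytes([0x00, 0x02])  # the [00 02] mentioned above


@dataclass
class IntroduceAck:
    status: bytes
    current_enc_key: Optional[bytes] = None  # only set when the key was wrong


class IntroPoint:
    """Toy intro point that knows the service's current encryption key."""

    def __init__(self, enc_key: bytes) -> None:
        self.enc_key = enc_key

    def update_keys(self, new_enc_key: bytes) -> None:
        # What UPDATE-KEYS-SUBCMD would do: swap the key on a live circuit.
        self.enc_key = new_enc_key

    def handle_introduce1(self, client_enc_key: bytes) -> IntroduceAck:
        # Client has the current key: relay the introduction as usual.
        if client_enc_key == self.enc_key:
            return IntroduceAck(STATUS_SUCCESS)
        # Client has a stale key: hand her the current one in the ack.
        return IntroduceAck(STATUS_WRONG_ENC_KEY, self.enc_key)


ip = IntroPoint(b"old-key")
ip.update_keys(b"new-key")
ack = ip.handle_introduce1(b"old-key")
print(ack.status.hex(), ack.current_enc_key)
```

Note that this is the only place in the sketch where the IP ever looks at the encryption key; without the update feature the field would be dead weight on the IP side.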
But why would a client know the authentication key but not the encryption key? Do they have different rotation times? Why would the encryption key rotate before the authentication key?
Maybe all these things are not necessary for now and we can just ditch UPDATE-KEYS-SUBCMD completely, assuming that both of those keys have the same rotation lifetime? And maybe even the IP does not need to know the encryption key at all?
Am I missing something?
Cheers!