On Mon, Oct 28, 2013 at 07:40:12PM +0000, Christopher Baines wrote:
On 28/10/13 13:19, Matthew Finkel wrote:
This is a proposal I wrote to implement scalable hidden services. It's by no means finished (there are some slight inconsistencies which I will be correcting later today or tomorrow) but I want to make it public in the meantime. I'm also working on some additional security measures that can be used, but those haven't been written yet.
Great, I will try to link this into the earlier thread for some continuity.
Sounds good. For those just joining this discussion, the previous thread can be found at https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html
It seems to me that this is a description of "Alternative 3" in Nick's email. Multiple instances, with multiple sets of introduction points, somehow combined into one service descriptor?
Yes, mostly. The proposal describes Nick's "Alternative 3", but there is no technical reason why the instances cannot coordinate their introduction points and share some subset, assuming some additional modifications to the HS design. This obviously was not included in the proposal, but it would be easy to extend the protocol to include it.
I haven't managed to fully comprehend your proposal yet, but I thought I would try and continue the earlier discussion.
That's fine, we can work through it, but you seem to understand it pretty well.
So, going back to the goals, this "alternative" can have master nodes, but you can also just have this "captain" role dynamically self-assigned.
It's not really self-assigned, more like "assigned by the operator but it is a result of previous node failures". I think we can think of it as an "awareness" rather than an "assignment", in this scenario.
Why did you include an alternative here? Do you see these being used differently? It seems like the initial mode does not fulfil goal 2 or 3?
Yes, I see the Master-Slave design as an alternative to the Peer-Nodes design. I think they do satisfy Nick's goal 3, but they obviously don't satisfy goal 2.
One of the differences between the alternatives that keeps coming up is who (if anyone) can determine the number of nodes. Alternative 3 can keep this secret to the service operator by publishing a combined descriptor. I also discussed in the earlier thread how you could do this in the "Alternative 4: Single hidden service descriptor, multiple service instances per intro point" design, by having the instances connect to each introduction point one or more times, and possibly only connecting to a subset of the introduction points (I possibly didn't consider this in the earlier thread).
Out of the four designs in the proposal, I think the one in section 4.2, the Homogeneous shared-key peer-nodes design, is the best and the most versatile (but, as a result, also the most complex). So, technically, our two proposals can be merged without much difficulty. However, there are still some issues that I'm having trouble solving in a sane way. When you make the introduction point the focal point, there are some tradeoffs, and I'm still not sure they are the right tradeoffs just to disguise the size of the hidden service.
Another recurring point of comparison is whether anyone can determine if a particular service instance is down.
Absolutely. This is a problem I hope we can solve.
Alternative 4 can get around this by hiding the instances behind the introduction points, and to keep this information from the introduction points, each instance (as described above) can keep multiple connections open, occasionally dropping some to keep the introduction point guessing. I think this would work, provided that the introduction point cannot work out which connections correspond to which instances.
True, but this sounds extremely risky and error prone. I really hope we can do better than this to solve the problem.
If each instance has a disjoint set of introduction points, of which some subset (possibly total) is listed in the descriptor, it would be possible to work out both whether an instance goes down and which introduction points correspond to that instance, just by repeatedly trying to connect through all the introduction points? If you start failing to connect for a particular subset of the introduction points, this could suggest an instance failure. Correlating this with power or network outages could give away the location of that instance?
Sure, but as the proposal describes, the security of the multi-node hidden service is reduced to the security of the single-node hidden service. As such, the location-via-correlation attack you describe (there's probably a real/better name for it) is a result of the design, and I decided not to fix it for fear of introducing another, more dangerous, attack.
Also, compared to the setup of other alternatives, this seems more complex for the hidden service operator, both in terms of understanding what they are doing and debugging failures?
It is more complex, no argument there, but I don't think that it is unfair to impose this on the operator (nothing in the world is free). If an op wants to set up a multi-node hidden service, then it will not be much more effort than setting up a single-node service. If the application running behind the hidden service can support scaling, then its configuration will likely be a lot more complicated than configuring a multi-node HS. The way the proposal describes it, it's very repetitive. So, on the bright side, the operator will be very familiar with the torrc config lines by the time they're done. :)
I think it would be good to partition the goals, as there are quite a lot of them (which is not inherently bad). In particular, one subset of goals would be as follows:
Operator (the person or people controlling the service) Usability
- Simple Initial Setup
From the proposal, every node will run two hidden services: one for the public hidden service, and one used for inter-node communication. I don't think this will be a barrier to entry, as the operator will likely be following step-by-step directions in any case.
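To give a rough picture, a node's torrc might end up looking something like the sketch below. The directory paths and onion addresses are placeholders, and while the HiddenServicePeers option comes from the proposal, the exact syntax shown here is only my illustration, not a final format:

    # Public-facing hidden service (the one clients actually use)
    HiddenServiceDir /var/lib/tor/public_hs/
    HiddenServicePort 80 127.0.0.1:8080

    # Management hidden service, used only for inter-node communication
    HiddenServiceDir /var/lib/tor/management_hs/
    # Management addresses of the other nodes in this multi-node service
    HiddenServicePeers <node-b-mgmt>.onion,<node-c-mgmt>.onion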
- Simple Addition of new Instances
As soon as the management hidden service has been created on the new node, the operator simply appends its address to the HiddenServicePeers line on every node (a rough sketch of this edit follows the removal item below). I agree that this will require more work than most people want to spend on something like this, but there are ways we can solve it with a little more complexity in the protocol. For example, we could allow peers to sync configs with each other and then rewrite portions of the torrc. This is something I am very hesitant to do, though.
- Simple Removal of Instances
The opposite of addition: just remove the departing node's hidden service address from the HiddenServicePeers line on each node.
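To make the addition and removal steps concrete, here is a rough sketch of the torrc edit (again, placeholder onion addresses and only illustrative HiddenServicePeers syntax):

    # Before adding node C (on node A):
    HiddenServicePeers <node-b-mgmt>.onion

    # After node C's management hidden service exists, append its address
    # on every existing node (on node A):
    HiddenServicePeers <node-b-mgmt>.onion,<node-c-mgmt>.onion

    # Removal is the reverse: delete the departing node's address from
    # this line on every remaining node and reload tor.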
- Graceful behaviour regarding instance failure (with respect to the
operator)
- Well defined failure modes
- If something doesn’t work, it should be possible for the operator
to work out what, and how to resolve it.
This is tricky but doable within the scope of this proposal. The tricky part arises when returning any type of error code/message in the event of an authentication/validation failure. But we can still produce useful information that an operator can use to troubleshoot the configuration.
As an example, say we have a multi-node configuration that contains nodes A and B. The HiddenServicePeers line in A's torrc contains the hidden service for B, but the HiddenServicePeers line in B's torrc does not contain A; as a result, when A tries to connect to B, the circuit is destroyed during authentication. The two possible failure reasons here are that B doesn't have A's hidden service descriptor or that A is not in B's torrc. It is very easy for an operator to rule out the latter case, and log messages from B should allow the operator to determine any other problem.
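A rough sketch of that mismatched configuration (placeholder addresses, illustrative syntax):

    # Node A's torrc -- B is listed as a peer:
    HiddenServicePeers <node-b-mgmt>.onion

    # Node B's torrc -- A's management address was never added, so when A
    # tries to authenticate, B rejects it and the circuit is destroyed:
    # HiddenServicePeers <node-a-mgmt>.onion   <-- missing line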
I think it's a given that this configuration is much more complex than the one you proposed, but the failure mode is not much worse, because in both proposals *some node*, somewhere, will always publish a descriptor. It will certainly be difficult for the operator to determine which node actually published it (without adding a new mechanism to Tor for this), and the load balancing may not work as expected until the operator investigates and fixes it. But, as far as I can tell, neither of them fails closed, for better or worse. I'm reanalyzing my design to see if I missed a case that should require it.
Now, obviously, these are minor compared to the more technical goals, but I think they are worth considering, as we have a few technically workable proposals on the table.
As for what I am doing on this currently, I have been reading lots of the related, and not-so-related, papers. I hope to begin doing some more Hidden Service related Chutney stuff this week or next, such that I have something to test with when I start implementing something (note: what I implement might not be adopted/adoptable by the project, but I am doing it anyway as part of my degree). I am available on #tor-dev as cbaines for any and all discussion.
Awesome! Are you planning on revising your proposal and sending it to tor-dev again? I know I am interested in seeing what changes you decided to make.
Thanks for your feedback and thoughts on information leakage!