On 06 Apr (17:08:12), Mike Perry wrote:
[snip]
I think you'll be bound by the amount of data a connection inbuf can take, which has an upper bound of 32 cells per read event.
Then tor will have to empty the inbuf at once, queue all INTRODUCE2 cells (at most 32) in that priority queue, and once done, process them before returning to handling the connection inbuf.
In other words, with tor's architecture, the queue size is bounded by the maximum number of cells a single recv() pass can yield, which is 32.
Nevertheless, that limit is weirdly hardcoded in tor, so you should definitely think of a way to upper-bound the queue and just drop the rest. A good starting point would be that 32-cell number?
dgoulet and I think the following will work better than always doing 32 cells at a time. Basically, the idea is to split our INTRO2 handling into "top-half" and "bottom-half" handlers.
Top-half handling:
1) Read 32 cells off inbuf as usual
2) Do AES relay cell decryption as usual
3) Parse relay headers and handle all cells as usual, except:
   a) In hs_service_receive_introduce2(), add to pqueue and return without further processing of those cells
4) Return to rest of mainloop
Agree with the above. It is trivial to do this today, so the engineering cost is very low.
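To make "trivial" concrete, here is a rough sketch of the top-half queueing in tor-style C. Everything named pending_intro2_* is made up for illustration; only the smartlist pqueue helpers are tor's existing API:

  /* Sketch only; assumes tor's smartlist.h, hs_cell.h, <stdint.h> and
   * <stddef.h>.  pending_intro2_t and its fields are hypothetical. */
  typedef struct pending_intro2_t {
    int pqueue_idx;        /* Heap index, managed by smartlist_pqueue_*(). */
    uint32_t effort;       /* PoW effort (difficulty) the client spent. */
    hs_cell_introduce2_data_t *data; /* Decrypted INTRODUCE2 payload. */
  } pending_intro2_t;

  /* Max-heap comparator: higher effort wins. */
  static int
  compare_intro2_effort_(const void *a, const void *b)
  {
    const pending_intro2_t *e1 = a, *e2 = b;
    if (e1->effort > e2->effort) return -1;
    if (e1->effort < e2->effort) return 1;
    return 0;
  }

  /* Top-half: hs_service_receive_introduce2() would call this instead
   * of launching the RP circuit, then return to the mainloop. */
  static void
  intro2_enqueue(smartlist_t *intro2_pqueue, pending_intro2_t *item)
  {
    smartlist_pqueue_add(intro2_pqueue, compare_intro2_effort_,
                         offsetof(pending_intro2_t, pqueue_idx), item);
  }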
Then, separately, also in mainloop, do the bottom half. (TODO: What group priority?)
Bottom-half handling:
  I) Pop a single intro2 off of pqueue (ie: max difficulty in queue)
 II) Compare this difficulty to desc difficulty. If lower, lower desc difficulty
III) Parse it and launch RP circuit (as per current bottom half of hs_service_receive_introduce2())
 IV) Trim pqueue elements, if queue "too big" (TODO: how to trim?)
  V) Compare trim point difficulty to descriptor difficulty; if trim point was higher than descriptor value, raise desc difficulty
 VI) Return to libevent/mainloop again
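A matching sketch of that bottom half. Again, all service-side names (intro2_pqueue, desc_pow_effort, pending_intro2_free) are made up; only mainloop_event_t (tor's compat_libevent wrapper) and the pqueue helpers are real API:

  /* Bottom-half: its own mainloop event, activated by the top-half
   * whenever a cell gets queued.  Follows steps I)-VI) above. */
  static void
  intro2_bottom_half_cb(mainloop_event_t *ev, void *arg)
  {
    hs_service_t *service = arg;
    smartlist_t *pq = service->intro2_pqueue;      /* hypothetical field */

    /* I) Pop the single highest-effort INTRODUCE2. */
    pending_intro2_t *item =
      smartlist_pqueue_pop(pq, compare_intro2_effort_,
                           offsetof(pending_intro2_t, pqueue_idx));
    if (!item)
      return;

    /* II) If this effort is below what the descriptor advertises,
     *     lower the advertised difficulty. */
    if (item->effort < service->desc_pow_effort)   /* hypothetical field */
      service->desc_pow_effort = item->effort;

    /* III) As per the current bottom half of
     *      hs_service_receive_introduce2(): parse the payload and
     *      launch the RP circuit (elided here). */

    /* IV)+V) Trim the pqueue if "too big", raising the descriptor
     *        difficulty if the trim point was above it (a trim sketch
     *        is further down, in the next-steps discussion). */

    /* VI) Back to libevent; re-activate ourselves if work remains. */
    pending_intro2_free(item);                     /* hypothetical */
    if (smartlist_len(pq) > 0)
      mainloop_event_activate(ev);
  }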
I would maybe try to convince you that we could dequeue more than one cell here, because this behavior changes the current state of HS quite a bit.
Right now, we get 32 cells out of the inbuf, process them one at a time, and then go back to the mainloop.
This new algorithm means that we would process 1 single cell at each mainloop event instead of 32. This is quite a decrease. OK, it is not exactly that ratio, because draining the inbuf without processing the decrypted INTRO2 cells is probably fast, but one full mainloop run per cell is still clearly slower than what we do right now.
We need some sort of performance measurements here to make an informed decision, but my gut feeling tells me that we might want to process, I don't know, 5 or 10 cells instead of 1 per mainloop round.
We _should_ run timing measurements here to see how much deferring INTRO2 processing to another mainloop event affects the overall rate of introduction.
But let's say a full mainloop run takes 100 msec: at one cell per run, that is only 10 introductions per second, and even at 5 cells per run, only 50 per second... that looks quite low? But it could already be what we do now; unknown.
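For what it's worth, batching would be a tiny change to the bottom-half sketch above; the constant and the helper are made up, and the constant is exactly what the measurements should pin down:

  /* Hypothetical batch size; the 5-10 guessed above, to be measured. */
  #define INTRO2_BATCH_SIZE 5

  static void
  intro2_bottom_half_batched_cb(mainloop_event_t *ev, void *arg)
  {
    /* Same per-cell work as the single-cell bottom half, batched so
     * one mainloop round handles several INTRODUCE2 cells. */
    for (int i = 0; i < INTRO2_BATCH_SIZE; i++) {
      if (!intro2_process_one(arg))  /* hypothetical helper; false on
                                      * empty pqueue. */
        break;
    }
    (void) ev;
  }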
The astute reader will note that even without PoW, the above can provide almost exactly the same functionality as the naive rate limiting currently done at intropoints - just cut the queue arbitrarily. Basically, PoW and the pqueue just give us a smarter way to decide whom to reply to.
However, it still has the following potential issues:
A) AES will bottleneck us at ~100Mbit-300Mbit at #2 in top-half above
B) Extra mainloop() iterations for INTRO2s may be expensive (or not?)
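Rough numbers for (A), back-of-the-envelope only: a link cell is 514 bytes, so 100 Mbit/s of relay crypto works out to about 12.5 MB/s / 514 B ≈ 24,000 cells/s, and 300 Mbit/s to roughly 73,000 cells/s. Real throughput needs profiling, but it gives a scale to compare against the per-mainloop-round introduction rates discussed above.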
Some analysis of the above should probably happen; I can easily do that once we get the tracing API upstream.
For A, onionbalance will still help by adding new back-end instances providing intropoints via separate back-end Tor daemons, either on the same box or different boxes.
But it will only help up to a point. An HSDesc maxes out at 30k, so at some point we'll run out of space to list more intropoints in a single descriptor. At that point, we can still list different intropoints at each HSDir position, but after that, we are screwed.
Small correction: HSDesc maxes out at 50k for v3 and 20k for v2. But let's just consider v3 for the foreseeable future :D.
[snip]
I would _love_ to, but it could be too early for that, considering we are still unsure whether this defense will be useful or not (according to Mike, from a discussion on IRC).
As described above, multithreading still provides a multiplier in the AES bottleneck case, even over onionbalance.
But there may be more bottlenecks than just AES crypto, so this is a further argument for not jumping the gun just yet: try v1 first (or even a simple prototype without PoW that just cuts the queue arbitrarily) and get some more profiling data.
As an initial step, I agree. Onionbalance provides an easy way for a service to outsource client introduction to more CPUs.
But, in my opinion, onionbalance is a power-user solution, so usually only a small percentage of our .onion users can take advantage of it. As .onion services move more and more into the mobile sphere and client-to-client applications (OnionShare, Ricochet), it ain't much of an option :S.
Without using more CPUs in Tor, I have a _hard_ time seeing tor scale over time, especially for services. As long as we keep that in mind with our designs, I'm good :).
Next steps (not necessarily in order):
a) pqueue plan review + more detailed description
b) Figure out pqueue trim mechanism - can we do better than O(n)?
Initially, we could simply go with an upper limit and just drop cells as we queue them if we are above the limit? As in, drop the back() every time we queue past the limit?
Else, we can get into more complicated schemes with queueing rate versus processing rate and come up with a golden number to strike a memory and CPU balance...?
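A sketch of that simple scheme (cap and free helper made up; smartlist_pqueue_remove() and the foreach macros are tor's real API). Note that the O(n) part is finding the minimum, since a max-heap only gives cheap access to its maximum; doing better probably means a different structure (e.g. a min-max heap):

  #define INTRO2_PQUEUE_MAX 1024   /* Made-up cap; needs tuning. */

  /* Drop lowest-effort entries while we are above the cap. */
  static void
  intro2_pqueue_trim(smartlist_t *pq, hs_service_t *service)
  {
    while (smartlist_len(pq) > INTRO2_PQUEUE_MAX) {
      /* O(n) scan: the max-heap only knows where its maximum is. */
      pending_intro2_t *min = NULL;
      SMARTLIST_FOREACH_BEGIN(pq, pending_intro2_t *, e) {
        if (!min || e->effort < min->effort)
          min = e;
      } SMARTLIST_FOREACH_END(e);

      /* Remove it from the heap in O(log n). */
      smartlist_pqueue_remove(pq, compare_intro2_effort_,
                              offsetof(pending_intro2_t, pqueue_idx), min);

      /* Step V) of the bottom half: if even the dropped entry beat the
       * advertised difficulty, raise the descriptor difficulty. */
      if (min->effort > service->desc_pow_effort)  /* hypothetical field */
        service->desc_pow_effort = min->effort;

      pending_intro2_free(min);                    /* hypothetical */
    }
  }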
c) Test using RandomX as a hash function in various ways, especially wrt key usage, cache setup, and VM info
d) Test pqueue refactoring, maybe without PoW involved yet
e) Specify a v1.5 that works at the intropoint (to avoid the AES bottleneck)
f) Merge this thread back into a single proposal document
We could put the engineering details in an annex of the proposal, if we don't want to lose track of them and have them dispersed on a mailing list :)?
I'm happy to help with this, let me know.
g) other stuff we forgot, XXX's, TODOs, etc.
Cheers! David