Yes, the initial plan is to maintain ordering within a circuit. Although, even this is not strictly necessary if the application wishes to handle out of order packets. For example, a web server could theoretically handle requests for different resources in any order. But that's a different story.
Yes, we are aware of TCP-over-DTLS and Defenestrator. I'm not familiar with PCTCP. I'll read up on that.
What you're saying about HOL blocking in the output queue for a relay makes sense if the receive window fills up, but I didn't explain how uTCP actually works. uTCP (and paired with uTLS) is a kernel patch that will expose to the application (in this case, Tor) segments that have arrived even if they are NOT the next logical segments in the stream. uTCP breaks TCP's delivery semantics, without affecting the on-the-wire protocol. Thus, it prevents HOL blocking from occurring in the TCP receive buffer. Now, it should be clear why the queue is important: I need to keep unordered cells in the queue if they are out of order within a circuit, but if they are the "next expected cell" for its circuit, then I can process it, regardless of whether it came in order or out of order from TCP.
I'm implementing this by changing the wire format to include a sequence number for each cell. The sequence number increments within a circuit. The receiver increments its "expected" sequence number with each in order arrival within a circuit.
It makes sense why the code is written as it is to process cells immediately, since, as you point out, the HOL issue has already been resolved by the time it gets to Tor. I was really just trying to manually inject some cell reordering to test Tor's ability to handle cells from different circuits out-of-order with respect to how TCP delivered them. But I'll skip this testing step and go to the next one, which is just running it atop uTCP/uTLS directly and using the incoming_queue to hold cells as needed. I'll have to change a bit more of the code, particularly the asserts that the incoming_queue is empty after a call to channel_process_cells(…).
If you have any other thoughts on this issue, please send them along. I appreciate your insight.
Regards,
Fitz
On May 4, 2013, at 9:05 AM, Ian Goldberg iang@cs.uwaterloo.ca wrote:
On Thu, May 02, 2013 at 10:58:54AM -0400, MF Nowlan wrote:
I am working on integrating uTCP and uTLS ( http://arxiv.org/abs/1103.0463) into Tor to see if we can lower the latency due to head of line blocking across circuits.
You have to be careful to preserve cell ordering *within* circuits, of course.
[Do you know about TCP-over-DTLS and PCTCP?]
- Now, look for case when two cells from different circuits are
present in the incoming_queue and process the second cell before the first cell. channel_process_cells(...) { ... while (NULL != (q = TOR_SIMPLEQ_FIRST(&chan->incoming_queue))) { // Remove q from the incoming_queue. // if the queue is still not empty, get the next one, // if the circuit ids do not match, Swap the cells. } ... }
My problem is that number 2. above never occurs. I have not observed a case where the incoming_queue ever has two cells from different circuits. In fact, I don't think I've ever had a time when the incoming_queue has more than 1 cell in it. I am hammering a small tor test network with 30+ curl requests at once. I have two proxies, each of them uses the same entry node, and the same exit node, and there is only one other node in the network, so the circuit that is set up should be the same for every single request. Am I missing something? Will this function (channel_process_cells) ever process more than one cell at a time? I've checked the logs to verify that different circuits are actually set up for the different requests (rather than the Proxies just reusing the existing circuit and giving each new request a new stream id).
I'm not very familiar with the new channel code, unfortunately (it's on our "todo" list to port TCP-over-DTLS and/or PCTCP to channels, though). But at least in the pre-channel code, the input queues would indeed almost always be empty. Any why wouldn't they be? As soon as TLS reports a cell has arrived, you should be able to process it immediately. Where the HOL blocking occurs is in the Tor *output* queue on the sending side and/or the TCP receive buffer on the receiving side. If a IP packet containing (...TCP...TLS...) a cell gets dropped, all future data on that connection will get stalled in the TCP receive buffer (if the window is open) or the Tor output queue (if the window is closed). Once it gets delivered to Tor, then the stall has already been resolved.
- Ian
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev