On 2/22/13 5:36 PM, Rob Jansen wrote:
On Tue, Feb 19, 2013 at 10:20 AM, Karsten Loesing karsten@torproject.orgwrote:
On 2/12/13 4:54 PM, Damian Johnson wrote:
AppWard and ExitWard indicate the direction of cell flow through the circuit. Previous cell events aggregated statistics for both directions
and
therefore did not capture information which I believe may be important. These are much more informative than N and P. Do you have suggestions
for
improving this further?
Ahhh, gotcha. Personally I'd go for 'inbound' and 'outbound' since they better connotate directionality. AppWard and ExitWard sound weird, but they make sense now too and would be fine if CELL_STATS got a quick description saying what they mean.
Sure, I'm fine with calling it 'inbound' and 'outbound'. Will change that the next time I touch the proposal. Thanks!
I also thought more about Rob's idea to link circuits from client to exit in a testing network. I think we should add two more fields InboundName and OutboundName to CELL_STATS events. These fields would contain LongName ($CC88FB3B78599580F1EE4F6F73E26A7EC3DF2CA1~tokenconn) or Target (like 60.1.0.0:10002) of the previous or next node in a circuit. While it may be possible to extract this information from ORCONN events, that seems somewhat fragile to me. (When testing this, I had a case when we were missing ORCONN events, because the controller was not yet connected, so it was unclear where the CELL_STATS events belonged.)
Here's an example of the extended CELL_STATS event (still using positional arguments, because I didn't change the code there yet):
CELL_STATS 48957 15 $9441269F5989487F07AE824063345A0A6BCAB279 created=1;relay=1 created=1;relay=1 created=0;relay=0 29201 16 $DC284D0012ECB812C0CA2DEB080CB2F839A77E1A create=1 create=1 create=0
And here's how we'd put together relevant CELL_STATS events to learn what happened in a given circuit:
00:23:16 [fileclient-60.1.0.0] >>> circ=34760 conn=12 create_fast=1;relay_early=2 create_fast=1;relay_early=2 00:23:16 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=1;created_fast=1 relay=1;created_fast=1 00:23:17 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=1 relay=1 00:25:01 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=1 relay=1 00:25:01 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay_early=1 relay_early=1 00:25:02 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay_early=1 relay_early=1 00:25:02 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=6 relay=6 00:25:03 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=40 relay=40 00:25:04 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay_early=1 relay_early=1 00:25:04 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=61 relay=61 00:25:05 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay=1;relay_early=3 relay=1;relay_early=3 00:25:06 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=1 relay=1 00:25:06 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay=1 relay=1 00:25:07 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=44 relay=44 00:25:08 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=63 relay=63 00:25:08 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay=3 relay=3 00:25:09 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay=2 relay=2 00:25:10 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=1 relay=1 00:25:10 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay=1 relay=1 00:25:11 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=94 relay=94 00:25:12 [fileclient-60.1.0.0] >>> circ=34760 conn=12 relay=4 relay=4 00:25:12 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=12 relay=12 [tokenconn-59.1.0.0]:[tokenglobal-57.1.0.0]:9272 00:23:16 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 create=1;relay_early=1 create=1;relay_early=1 00:23:16 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 created=1;relay=1 created=1;relay=1 00:25:01 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay_early=1 relay_early=1 00:25:01 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=1 relay=1 00:25:02 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=6 relay=6 00:25:02 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay_early=1 relay_early=1 00:25:03 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=41 relay=41 00:25:04 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=60 relay=60 00:25:04 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay_early=1 relay_early=1 00:25:05 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay=1;relay_early=3 relay=1;relay_early=3 00:25:06 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay=1 relay=1 00:25:06 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=1 relay=1 00:25:07 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=52 relay=52 00:25:08 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay=1 relay=1 00:25:08 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=55 relay=55 00:25:09 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay=4 relay=4 00:25:10 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay=1 relay=1 00:25:10 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=1 relay=1 00:25:11 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=98 relay=98 00:25:12 [tokenglobal-57.1.0.0] <<< circ=9272 conn=15 relay=8 relay=8 00:25:12 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 relay=4 relay=4 [exit1-61.1.0.0]:[tokenglobal-57.1.0.0]:38910 00:23:16 [exit1-61.1.0.0] <<< circ=38910 conn=24 created=1 created=1 00:23:16 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 create=1 create=1 00:25:01 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=1 relay=1 00:25:01 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay_early=1 relay_early=1 00:25:02 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=30 relay=30 00:25:02 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay_early=1 relay_early=1 00:25:03 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=77 relay=77 00:25:04 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay_early=1 relay_early=1 00:25:05 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay=1;relay_early=3 relay=1;relay_early=3 00:25:06 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=1 relay=1 00:25:06 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay=1 relay=1 00:25:07 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=80 relay=80 00:25:08 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=27 relay=27 00:25:08 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay=1 relay=1 00:25:09 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=1 relay=1 00:25:09 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay=4 relay=4 00:25:10 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=27 relay=27 00:25:10 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay=1 relay=1 00:25:11 [exit1-61.1.0.0] <<< circ=38910 conn=24 relay=79 relay=79 00:25:12 [tokenglobal-57.1.0.0] >>> circ=38910 conn=27 relay=4 relay=4
I think this works for all circuits. Rob, what do you think about extending the CELL_STATS events as described above?
I don't think I understand the format. Is this what would appear in the log file? Including the ">>>" and "<<<"? Those seem a little strange. Also, I'm not seeing where e.g. the ID of the circuit as the entry relay knows it is mapped to the ID of the circuit as the middle relay knows it. Circuit IDs are relay-specific, right? Am I missing something here?
Ah, sorry, it's the output of my little Java tool that puts together CELL_STATS events from a scallion.log file. (I can share that code if you want, but it's mostly a hack.) Here's how you'd read that format:
Client fileclient and relay tokenconn have one or more connections that carry circuit 34760. IIUC, fileclient picked that circuit ID. The names [fileclient-60.1.0.0] and [tokenconn-59.1.0.0] are ordered alphabetically, not in the order of whoever opened the circuit.
00:23:16 [fileclient-60.1.0.0] >>> circ=34760 conn=12 create_fast=1;relay_early=2 create_fast=1;relay_early=2
In that second, fileclient sent three cells over circuit 34760 using connection 12, one CREATE_FAST cell and two RELAY_EARLY cells. It's unclear who opened that connection and picked number 12 for it, but it doesn't really matter. All cells were added to and removed from the queue in the same second. The ">>>" stands for outbound.
00:23:16 [tokenconn-59.1.0.0] <<< circ=34760 conn=31 relay=1;created_fast=1 relay=1;created_fast=1
tokenconn sent two cells in inbound direction belonging to the same circuit 34760 but using connection 31, not 12 (which I didn't know was possible, but hey).
[...] [tokenconn-59.1.0.0]:[tokenglobal-57.1.0.0]:9272
Relay tokenconn and relay tokenglobal use one or more connections for circuit 9772, which is the extended circuit from above. The format doesn't show this, but the program generating this format makes sure that these two circuit parts belong together.
00:23:16 [tokenconn-59.1.0.0] >>> circ=9272 conn=18 create=1;relay_early=1 create=1;relay_early=1
Two cells from tokenconn in outbound direction. Similar to above.
[...] [exit1-61.1.0.0]:[tokenglobal-57.1.0.0]:38910
Relay exit1 and tokenglobal use ID 38910 for the last part of this circuit.
If anything here sounds strange to you, please let me know. I'm not 100% certain that this is the best approach to track circuits from client to exit, or if it's even correct.
For example, I assume here that circuit IDs are unique between two nodes, which I think is correct. But before working on this I also assumed that a circuit uses a single connection for both inbound and outbound directions (which is apparently not the case).
Also, do we still need the ID= in ORCONN events if CELL_STATS events are sufficient to reconstruct circuits?
Would the ORCONN IDs still be useful to track state on a per-connection basis, like bandwidth used per connection (from the CONN_BW event)?
Maybe, yes. Okay, leaving it in.
Finally, Rob, should I look into CIRC_BW events you suggested a while ago? If yes, what did you have in mind how that event would look like, and when/by whom would it be emitted?
If we want to do this, it would likely be an aggregation of all STREAM_BW events for a circuit, but only during the time when those streams belonged to that circuit. I don't think it makes sense to emit it for every STREAM_BW event though, so what if we aggregate and emit it once per second? A format similar to the STREAM_BW format should work fine.
Sounds doable. I'll see if I can build this over the weekend.
Best, Karsten