Re: [tor-dev] Proposal: Controller events to better understand connection/circuit usage

23 Feb 2013


On 2/22/13 8:50 PM, Rob Jansen wrote:
...
On Fri, Feb 22, 2013 at 12:59 PM, Karsten Loesing karsten@torproject.org wrote:
[...]
...
If anything here sounds strange to you, please let me know.  I'm not
100% certain that this is the best approach to track circuits from
client to exit, or if it's even correct.
For example, I assume here that circuit IDs are unique between two
nodes, which I think is correct.  But before working on this I also
assumed that a circuit uses a single connection for both inbound and
outbound directions (which is apparently not the case).
Whether or not your assumption about circuit ids is correct depends on
which circuit id you are referring to - the source relay circuit id, the
destination relay circuit id, or the id(s) written in the cells. Here is
how I understand the ids work:
Suppose relayA and relayB are part of the same circuit, and relayA is
closer to the client than relayB. The id by which relayA refers to its
circuit_t state is only unique to relayA and is chosen when the circuit_t
struct is created. Similarly, the id by which relayB refers to its
circuit_t state is only unique to relayB and also chosen when the circuit_t
struct is created. Check this in the *circuit_new() functions in
circuitlist.c. Let's refer to these as circuit UIDs, as they are unique and
only known to individual relays.
Now, relayA writes a circuit id into cells it sends to relayB, so that
relayB knows which circuit the cells belong to, but it does not necessarily
use its circuit UID. The id for this is computed in
get_unique_circ_id_by_conn() in circuitbuild.c and is stored in n_circ_id
in the circuit_t struct at relayA. Similarly, relayB uses p_circ_id from
the or_circuit_t struct when sending cells back to relayA.
Upon receiving cells from relayA, relayB immediately uses the id written in
the cell (relayA's circ->n_circ_id) to look up relayB's UID for its
circuit_t state. There is a circuit id map that is kept for this purpose.
This works similarly from relayB to relayA.
Now, the problem that prevents us from linking these is that all of the
controller events print out the UIDs, but not the n_circ_id or p_circ_id. I
tried printing out all of these various IDs but never verified that it
actually worked.
Please let me know if my understanding is flawed in some way.
Your understanding of n_circ_id and p_circ_id matches mine, but are you
sure there's a UID for circuits other than origin circuits?  I think you
mean origin_circuit_t->global_identifier.  But there's no such field for
or_circuit_t or circuit_t.  Or do you mean something else?
Anyway, your description made me realize that my previous attempt to
link CELL_STATS events was flawed for (at least) two reasons:
1. I assumed that a circuit ID (n_circ_id or p_circ_id) is unique for a
given pair of nodes, but it's really only unique for a given connection
between two nodes.
2. I also assumed that a circuit uses two different connections for
inbound and outbound queues, but it's actually the same connection that
is known under different connection IDs by the involved nodes.
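
To illustrate point 1, here is a minimal Python sketch (not Tor code) of why a lookup keyed by node pair breaks while one keyed by connection does not. The table names, the `register` helper, and connection ID 13 are made up for illustration.

```python
# Hypothetical sketch: circuit IDs are only unique per connection,
# so a lookup table has to be keyed by (connection ID, circuit ID).
circuits_by_pair = {}   # flawed: keyed by (node pair, circuit ID)
circuits_by_conn = {}   # better: keyed by (connection ID, circuit ID)


def register(table, key, circuit_name):
    """Store circuit_name under key; return True if the key was already taken."""
    collided = key in table
    table[key] = circuit_name
    return collided


# Two connections between the same two nodes, each carrying circuit ID 34760.
pair = ("fileclient", "tokenconn")
pair_collisions = [
    register(circuits_by_pair, (pair, 34760), "circuit A"),
    register(circuits_by_pair, (pair, 34760), "circuit B"),
]
conn_collisions = [
    register(circuits_by_conn, (12, 34760), "circuit A"),
    register(circuits_by_conn, (13, 34760), "circuit B"),  # 13 is invented
]
print(pair_collisions)
print(conn_collisions)
```

With the node-pair key the second circuit silently overwrites the first; with the connection key both survive.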
Maybe this gets clearer when looking again at the example:
[fileclient-60.1.0.0]:[tokenconn-59.1.0.0]:34760
  00:23:16 [fileclient-60.1.0.0] >>> circ=34760 conn=12
    create_fast=1;relay_early=2 create_fast=1;relay_early=2
  00:23:16 [tokenconn-59.1.0.0] <<< circ=34760 conn=31
    relay=1;created_fast=1 relay=1;created_fast=1
Circuit ID 34760 is what fileclient picked as n_circ_id for this circuit
and what tokenconn stored as p_circ_id.  There's a single OR connection
carrying this circuit which fileclient identifies as 12 and tokenconn as 31.
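
For anyone who wants to post-process output like the above, here is a rough Python sketch that parses the timestamped lines into records. It only assumes the line layout shown in this example; the function and field names are my own invention, and lines that don't match (such as the cell-count lines) are simply skipped.

```python
import re

# Matches lines like: "00:23:16 [fileclient-60.1.0.0] >>> circ=34760 conn=12"
LINE_RE = re.compile(
    r"^\s*(?P<time>\d{2}:\d{2}:\d{2}) \[(?P<node>[^\]]+)\] "
    r"(?P<arrow>[<>]{3}) circ=(?P<circ>\d+) conn=(?P<conn>\d+)\s*$"
)

# ">>>" marks an outbound queue and "<<<" an inbound queue in the example.
DIRECTIONS = {">>>": "outbound", "<<<": "inbound"}


def parse_queue_line(line):
    """Return a dict for a timestamped queue line, or None if it doesn't match."""
    m = LINE_RE.match(line)
    if m is None or m.group("arrow") not in DIRECTIONS:
        return None
    return {
        "time": m.group("time"),
        "node": m.group("node"),
        "direction": DIRECTIONS[m.group("arrow")],
        "circ": int(m.group("circ")),
        "conn": int(m.group("conn")),
    }


print(parse_queue_line("  00:23:16 [fileclient-60.1.0.0] >>> circ=34760 conn=12"))
```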
The part where this gets tricky is when we need to find out (reliably)
that connection IDs 12 and 31 refer to the same connection.  fileclient
never tells tokenconn that it locally refers to this connection as ID
12, and vice versa.  Matching on the two node names alone isn't enough
either, because there can be more than one connection between the two
nodes, and it's perfectly valid for those connections to each have a
circuit with ID 34760 on them.
I modified my Java program to parse ORCONN events to match corresponding
connection IDs based on state transitions from NEW or LAUNCHED to
CONNECTED.  This works in 99% of cases and only fails for OR connections
that were launched before the controller registered for ORCONN events.
But we probably don't care about that early bootstrapping phase anyway.
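
The matching idea can be sketched roughly as follows. This is not my Java program, just a simplified Python illustration: it assumes the ORCONN events have already been parsed into (time, node, conn_id, peer, status) tuples sorted by time, that the peer of an incoming connection can be identified, and that connections between the same two nodes can be paired in the order they reach CONNECTED. All names and the sample events are made up.

```python
from collections import defaultdict, deque


def pair_connections(events):
    """Map (node, conn_id) -> (peer, peer_conn_id) for both ends of each connection.

    events: iterable of (time, node, conn_id, peer, status), sorted by time.
    """
    first_status = {}              # (node, conn_id) -> "NEW" or "LAUNCHED"
    launched = defaultdict(deque)  # (initiator, acceptor) -> conn IDs at initiator
    accepted = defaultdict(deque)  # (initiator, acceptor) -> conn IDs at acceptor
    for _time, node, conn_id, peer, status in events:
        key = (node, conn_id)
        if status in ("NEW", "LAUNCHED"):
            first_status[key] = status
        elif status == "CONNECTED" and key in first_status:
            if first_status.pop(key) == "LAUNCHED":
                launched[(node, peer)].append(conn_id)
            else:
                accepted[(peer, node)].append(conn_id)
        # A CONNECTED without an earlier NEW/LAUNCHED stays unmatched.
    pairs = {}
    for (initiator, acceptor), out_ids in launched.items():
        in_ids = accepted.get((initiator, acceptor), deque())
        for out_id, in_id in zip(out_ids, in_ids):
            pairs[(initiator, out_id)] = (acceptor, in_id)
            pairs[(acceptor, in_id)] = (initiator, out_id)
    return pairs


# Invented sample events; integer times stand in for real timestamps.
events = [
    (1, "fileclient", 12, "tokenconn", "LAUNCHED"),
    (1, "tokenconn", 31, "fileclient", "NEW"),
    (2, "fileclient", 12, "tokenconn", "CONNECTED"),
    (2, "tokenconn", 31, "fileclient", "CONNECTED"),
    (3, "tokenconn", 40, "fileclient", "CONNECTED"),  # launched before we listened
]
pairs = pair_connections(events)
print(pairs)
```

The last sample event shows the failure case mentioned above: a connection whose NEW or LAUNCHED transition was never observed ends up without a partner.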
Here's the same circuit from my original example, this time with
explanations (IDs might differ, because this example comes from another
Shadow run):
[fileclient-60.1.0.0]:12:34761
This queue is for circuit 34761 which runs over connection 12 at
fileclient.  It's an outbound queue, though that isn't explicitly stated
here.  The ID 34761 is what fileclient picked as n_circ_id.
00:23:16 [fileclient-60.1.0.0] >>> circ=34761 conn=12
    create_fast=1;relay_early=2 create_fast=1;relay_early=2
There were three cells in the outbound direction reported in this
CELL_STATS event.  I'm leaving out the other CELL_STATS events here.
[tokenconn-59.1.0.0]:31:34761
This queue is also for circuit 34761, but running over connection 31 at
tokenconn.  It's an inbound queue, so 34761 is a p_circ_id, chosen by
fileclient.
00:23:16 [tokenconn-59.1.0.0] <<< circ=34761 conn=31
    relay=1;created_fast=1 relay=1;created_fast=1
Two cells in the inbound direction.
[tokenconn-59.1.0.0]:16:26405
Queue for circuit 26405 on connection 16 at relay tokenconn.
[tokenglobal-57.1.0.0]:15:26405
See above.
[tokenglobal-57.1.0.0]:16:53557
See above.
[exit2-62.1.0.0]:15:53557
See above.
tl;dr: I _think_ it's possible to reconstruct circuits from ORCONN and
CELL_STATS events as they are currently specified in proposal 218.
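
As a rough illustration of the chaining step, here is a Python sketch using the queue IDs from the example above. It assumes two inputs that have to come from elsewhere: a table of matched connection IDs (from the ORCONN events), and a table telling us which inbound queue at a relay feeds which outbound queue at the same relay. Both tables and the function name are hypothetical, and the relay-internal links below are simply read off the order of the queues listed above.

```python
def reconstruct(start, conn_pairs, relay_links):
    """Follow a circuit from its first outbound queue to the last inbound queue.

    start:       (node, conn_id, circ_id) of the client's outbound queue
    conn_pairs:  (node, conn_id) -> (peer, peer_conn_id)
    relay_links: inbound queue key -> outbound queue key at the same relay
    """
    path = [start]
    seen = {start}
    current = start
    while True:
        node, conn, circ = current
        peer = conn_pairs.get((node, conn))
        if peer is None:
            break  # no matched connection for this hop
        inbound = (peer[0], peer[1], circ)  # same circuit ID on both ends
        path.append(inbound)
        nxt = relay_links.get(inbound)
        if nxt is None or nxt in seen:
            break  # last hop reached, or a loop in the input data
        seen.add(nxt)
        path.append(nxt)
        current = nxt
    return path


conn_pairs = {
    ("fileclient-60.1.0.0", 12): ("tokenconn-59.1.0.0", 31),
    ("tokenconn-59.1.0.0", 16): ("tokenglobal-57.1.0.0", 15),
    ("tokenglobal-57.1.0.0", 16): ("exit2-62.1.0.0", 15),
}
relay_links = {  # assumed pairing of inbound and outbound queues per relay
    ("tokenconn-59.1.0.0", 31, 34761): ("tokenconn-59.1.0.0", 16, 26405),
    ("tokenglobal-57.1.0.0", 15, 26405): ("tokenglobal-57.1.0.0", 16, 53557),
}
path = reconstruct(("fileclient-60.1.0.0", 12, 34761), conn_pairs, relay_links)
for queue in path:
    print("[%s]:%d:%d" % queue)
```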
Best,
Karsten