Sorry for the spam, but I have a critical typo in the previous message. 
Instead of "with 11 nodes (2 relays) and 2 clients who have path restriction." I meant 11 nodes (2 authorities) and 2 clients. 

On Sun, May 8, 2016 at 1:52 AM, Xiaofan Li <xli2@andrew.cmu.edu> wrote:
Tim, 

I'm not sure if Tor is looking for alternative transport protocols like QUIC.

What if it's a lot faster than TCP on Tor? 
 
One of the issues is that any modified client is easy to fingerprint.
So, as with IPv6, we'd need relays to run QUIC and TCP in parallel for some time, then clients could optionally use QUIC when there were enough relays supporting it. Perhaps relays could open a QUIC UDP port on the same port as their TCP ORPort, and then advertise support in their descriptors. But TCP would remain the default for the foreseeable future.
For example, our IPv6 adoption is still at the stage where clients need to be explicitly configured to use it.
(And parts of it are only coming out in 0.2.8.)
If your modifications don't work like this, then it would be very hard for us to adopt them.

It does work like this. Our testing version has a parallel codepath and supports both QUIC and TCP, and we designed our QUIC API to look almost exactly like the traditional UNIX socket API, so the code changes are minimal. 
 
Even if they did, I don't know if they solve any pressing issues for us. 

What about the head-of-line blocking issue and the congestion control issue raised in 2009? From this paper, it seems they haven't been completely solved. 
 
(And we'd need both a theoretical security analysis, and a code review. And new features come with new risks and new bugs.)

Of course! We don't expect Tor to suddenly start using QUIC because of a couple of emails. But I believe we do have a case for QUIC based on both theory and experimental results, and we will probably make a formal, published argument soon. 

I've given you credit for reporting this issue, please feel free to provide your preferred name (or decline) on the ticket.
Thanks! 


About the issue, I've checked out the 0.2.8 commit and tested on that. The problem is still there, so I looked deeper into it. I've run it many times, and it seems like once I start restricting paths, it becomes nondeterministic whether the bootstrap will succeed. I think it might have something to do with the cache-microdesc-consensus file fetched by that client. Just to recap, I'm running a network with 11 nodes (2 relays) and 2 clients who have path restrictions. My observations are: 
  • Each client will have a cache-microdesc-consensus file with 4 relays in it. Relays 0, 1, and 2 will always be there, and the last one changes each time I start the network. 
  • When all 3 nodes of the restricted path are in the cache-microdesc-consensus file, the bootstrap succeeds quickly. For example, if my path is restricted to R2->R3->R1, since 0, 1, and 2 are always present in the consensus, whenever R3 is there, the bootstrap will work. 
  • When one of the nodes is not in the consensus, the bootstrap gets stuck and never reaches 100%. Depending on which node of the path is not included in the consensus, the error message varies. In the above example, if R3 is not in the consensus, we fail to connect to hop 1 (assuming 0-based logging). 
  • I waited a long time (~30 min) and nothing improved: the consensus never contained more nodes, and the bootstrap stayed stuck. 
I think the root of the problem might be the consensus having too few nodes. Is it normal for a cache-microdesc-consensus file to only have 4 nodes in an 11-node network? Should I look into the code that generates the consensus? 


The routerlist_t I mentioned is in routerlist.c, line 124:

  /** Global list of all of the routers that we know about. */
  static routerlist_t *routerlist = NULL;

But now I think this probably just stores the same info as the cache-microdesc-consensus file, right? 

Hmm, then it's likely a configuration issue with your network.

Shouldn't chutney also fail if it is a configuration issue? Or are you saying it's a configuration issue with my underlying network topology?
The only things different in the torrc files between the chutney run and the Emulab run are "Sandbox 1" and "RunAsDaemon 1", but I don't think those cause any issues. 

Thanks!
Li. 
 

On Fri, May 6, 2016 at 6:18 PM, Tim Wilson-Brown - teor <teor2345@gmail.com> wrote:

> On 7 May 2016, at 05:10, Xiaofan Li <xli2@andrew.cmu.edu> wrote:
>
> Thanks for the replies!
>
> 1. About the name:
> Thanks for the heads-up! We'll definitely pay attention to the trademark rules when we publish our results. We are not planning to roll out our own version of Tor. I think our most important goal is to demonstrate that a UDP-based protocol can solve some of Tor's hard performance problems, and we hope you would consider using it in future versions.
> This leads me to a question about licensing: I believe Tor and QUIC have different (conflicting) licenses. Would it even be a possibility that QUIC ever makes it into Tor?

I am not a lawyer or a software architect for Tor, so these are simply my opinions:

Here is the tor license:
https://gitweb.torproject.org/tor.git/tree/LICENSE

Is QUIC under the Chromium license? I couldn't find a QUIC-specific one.
https://chromium.googlesource.com/chromium/src/+/master/LICENSE

If so, the licenses look compatible to me: they're both BSD-style and almost identical.

What restrictions concern you?

On the architecture side:

I'm not sure if Tor is looking for alternative transport protocols like QUIC.
One of the issues is that any modified client is easy to fingerprint.
So, as with IPv6, we'd need relays to run QUIC and TCP in parallel for some time, then clients could optionally use QUIC when there were enough relays supporting it. Perhaps relays could open a QUIC UDP port on the same port as their TCP ORPort, and then advertise support in their descriptors. But TCP would remain the default for the foreseeable future.

For example, our IPv6 adoption is still at the stage where clients need to be explicitly configured to use it.
(And parts of it are only coming out in 0.2.8.)

If your modifications don't work like this, then it would be very hard for us to adopt them.
Even if they did, I don't know if they solve any pressing issues for us.
(And we'd need both a theoretical security analysis, and a code review. And new features come with new risks and new bugs.)

> ...
> I also wonder why you need to use path restrictions at all.
> 3. For path restriction, we have our own implementation. We parse the config file and use the nicknames in the choose_good_*() functions to return the corresponding nodes. We have to use this because restricting the middle node is very important for testing the HOL blocking problem. (We have to manually create 1-hop-overlapping paths for two clients and test the interference.)
>
> 4. Regarding the issue, it's probably not entry guard problem, because: 1) Shouldn't that give "failed to select hop 0" instead of hop 1?

Yes, you're right, in that message, hop counts are 0-based.

But the code is inconsistent in onion_extend_cpath:

  if (!info) {
    log_warn(LD_CIRC,"Failed to find node for hop %d of our path. Discarding "
             "this circuit.", cur_len);
    return -1;
  }

  log_debug(LD_CIRC,"Chose router %s for hop %d (exit is %s)",
            extend_info_describe(info),
            cur_len+1, build_state_get_exit_nickname(state));

The control spec says hops are 1-based, so we should fix the logging.

See:
https://trac.torproject.org/projects/tor/ticket/18982

I've given you credit for reporting this issue, please feel free to provide your preferred name (or decline) on the ticket.

> 2) I can see in our debugging log that we failed on the extending info with the second node. The node returned by choose_good_middle_server is not NULL but the routerinfo_t pointer is NULL. Any idea why?

Perhaps you looked it up the wrong way, or it's not in the consensus.
What code are you using to look up the node?

Are you using extend_info_from_node()?
If not, please note that different fields are present depending on whether you use descriptors (ri) or microdescriptors (md).

> My guess is that consensus is a little short for some reasons, how do I validate this guess?

Read the plain-text cached-{microdesc-}consensus file in the tor client's data directory and check if the middle node is in it.
Read the plain-text cached-{descriptors,microdescs} file in the tor client's data directory and check if the middle node is in it.

> Does the global router list contain everything on the consensus?

I'm not sure exactly what you're referring to here, please provide a function or global variable name for this list.

> 5. More observation on this issue:
> For both tor and our tor, when I decrease the size of the network (i.e. the number of relays in the network), the hanging issue resolves itself.

Hmm, then it's likely a configuration issue with your network.

> I'll try rebase back to an official release today.

That might help, we are still fixing bugs in 0.2.8.

Tim

Tim Wilson-Brown (teor)

teor2345 at gmail dot com
PGP 968F094B
ricochet:ekmygaiu4rzgsk6n




_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev