Karsten, could I ask you to take a quick look at this code?
https://trac.torproject.org/projects/tor/ticket/7549#comment:14 https://trac.torproject.org/projects/tor/attachment/ticket/7549/onionoo-quer...
It's a daemon that keeps a local cache of potential exit relays, sourced from onionoo. It listens on a local port and sends "EXIT" when asked about an IP address that is possibly an exit. The idea behind this code was to have a fast local database that we can rapidly query from the flash proxy facilitator, in order to prevent Tor users from being flash proxies themselves.
Is this code the most straightforward way you can think to achieve the goal? Do you have any suggestions on the use of onionoo? I wonder if something like the daemon exists already. The code looks reasonable, though I would make some changes before merging it. I want to gauge whether detection of exits is worth the additional code.
David Fifield
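For readers without the attachment at hand, the query side of such a daemon might look roughly like the sketch below. It is not the code from the ticket. The port number, the `NOT_EXIT` reply, and the names `EXIT_ADDRESSES`, `answer`, `QueryHandler`, and `serve` are all assumptions made for illustration; only the "EXIT" reply comes from the description above.

```python
import socketserver

# Hypothetical in-memory cache; a real daemon would refill this from Onionoo.
EXIT_ADDRESSES = set()


def answer(query, exits):
    """Return the reply line for one queried IP address."""
    ip = query.strip()
    # "EXIT" is the reply named in the description; "NOT_EXIT" is an
    # assumed placeholder for the negative case.
    return "EXIT" if ip in exits else "NOT_EXIT"


class QueryHandler(socketserver.StreamRequestHandler):
    """Read one IP address per connection and write back a one-line reply."""

    def handle(self):
        line = self.rfile.readline().decode("ascii", "replace")
        reply = answer(line, EXIT_ADDRESSES)
        self.wfile.write((reply + "\n").encode("ascii"))


def serve(host="127.0.0.1", port=9999):
    # Bind to localhost only; port 9999 is an arbitrary choice.
    with socketserver.TCPServer((host, port), QueryHandler) as server:
        server.serve_forever()
```

Keeping the lookup in a plain function such as `answer` makes it possible to check the behaviour without opening a socket.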
On 5/12/13 12:38 PM, David Fifield wrote:
Karsten, could I ask you to take a quick look at this code?
https://trac.torproject.org/projects/tor/ticket/7549#comment:14 https://trac.torproject.org/projects/tor/attachment/ticket/7549/onionoo-quer...
It's a daemon that keeps a local cache of potential exit relays, sourced from onionoo. It listens on a local port and sends "EXIT" when asked about an IP address that is possibly an exit. The idea behind this code was to have a fast local database that we can rapidly query from the flash proxy facilitator, in order to prevent Tor users from being flash proxies themselves.
Is this code the most straightforward way you can think to achieve the goal?
I think so, yes.
The only downside I can see is that it takes about 30--45 minutes for new exits to show up in your local cache. An alternative would be to query the exit list yourself, download the most recent consensus, and compile a list of exit addresses yourself. But that's probably too complicated for the purpose. (A further downside of that approach is that you'll have to change your code once TorBEL is deployed.)
Do you have any suggestions on the use of onionoo?
The code looks sane to me. The only improvement might be to lower ELAPSED_UPDATE_TOR_NODES_TIME to, say, 300 or 600 seconds. Onionoo updates its data once per hour, and with the current 3600 seconds you might be unlucky and download its data right before it gets updated. Given that you're sending the If-Modified-Since header, querying every 5 or 10 minutes (or even more often) is perfectly fine.
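The polling pattern described above can be sketched as follows. This is a minimal illustration, not the code under review: the URL and its query parameters, the 600-second interval, and the function names are assumptions, and echoing the server's `Last-Modified` value back verbatim is just one way to fill in `If-Modified-Since`.

```python
import urllib.error
import urllib.request

# Assumed query URL, for illustration only.
ONIONOO_URL = "https://onionoo.torproject.org/details?type=relay&running=true&flag=Exit"
# Stands in for ELAPSED_UPDATE_TOR_NODES_TIME, lowered as suggested above.
POLL_INTERVAL = 600


def build_request(url, last_modified=None):
    """Build a GET request, conditional if we have seen a Last-Modified value."""
    req = urllib.request.Request(url)
    if last_modified is not None:
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_modified(url, last_modified=None, opener=urllib.request.urlopen):
    """Return (body, new_last_modified), or (None, None) if nothing changed."""
    try:
        with opener(build_request(url, last_modified)) as resp:
            return resp.read(), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as e:
        if e.code == 304:
            # Not Modified: keep using the cached data.
            return None, None
        raise
```

A caller would invoke `fetch_if_modified` every `POLL_INTERVAL` seconds and replace its cache only when a body comes back, so frequent polling costs little when the data has not changed.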
I wonder if something like the daemon exists already.
I'm not sure, but Tor2web might do something similar. From Onionoo's project page: "Tor2web is a web proxy to Tor Hidden Services. It uses Onionoo to get the list of currently running Tor Exits to detect if the client is a Tor user and if so redirect them to the .onion address."
The code looks reasonable, though I would make some changes before merging it. I want to gauge whether detection of exits is worth the additional code.
Hope this helps you decide. If you plan to use Onionoo, please let me know, so that I can put flash proxy on the list of Onionoo clients and remember to inform you of upcoming protocol changes.
Best, Karsten
On Mon, May 13, 2013 at 08:58:27AM +0200, Karsten Loesing wrote:
The only downside I can see is that it takes about 30--45 minutes for new exits to show up in your local cache. An alternative would be to query the exit list yourself, download the most recent consensus, and compile a list of exit addresses yourself.
Speaking of delays: the place that knows about new relays first is each directory authority. Not only that, but they also know what IP address the relay is exiting from, since that's where the relay publishes its descriptor from. E.g.,

@uploaded-at 2013-05-12 16:57:35
@source "173.246.102.12"
router nthdimension 173.246.101.241 443 0 0 [...]
Seems like this info could provide an alternative, simpler way to generate the exit-addresses file: https://exitlist.torproject.org/exit-addresses
which if we're doing our modularity right, should be the input to the various other scripts.
I guess I should make a trac ticket of this idea. But which component? We sure seem to have a lot of projects that overlap tordnsel / torbel in some way.
--Roger
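To make the idea concrete, here is a rough sketch that reads the annotations in front of a cached descriptor and compares the `@source` address with the address on the `router` line. The function names and the sample constant are made up for illustration, and the sample omits the part of the `router` line that was elided with "[...]" above.

```python
SAMPLE_DESCRIPTOR = """\
@uploaded-at 2013-05-12 16:57:35
@source "173.246.102.12"
router nthdimension 173.246.101.241 443 0 0
"""


def parse_annotations(text):
    """Collect the '@key value' lines that precede the descriptor body."""
    annotations = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("@"):
            break  # annotations end where the descriptor itself begins
        key, _, value = line[1:].partition(" ")
        annotations[key] = value.strip('"')
    return annotations


def listed_address(text):
    """Return the address from the 'router' line, or None if there is none."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "router":
            return parts[2]  # router <nickname> <address> ...
    return None


def source_and_listed(text):
    """Return (publishing address, listed address) for one descriptor."""
    return parse_annotations(text).get("source"), listed_address(text)
```

For the sample, the two addresses differ, which is exactly the case where a check against the listed address alone would miss the address the descriptor was published from.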
On 5/13/13 9:38 AM, Roger Dingledine wrote:
On Mon, May 13, 2013 at 08:58:27AM +0200, Karsten Loesing wrote:
The only downside I can see is that it takes about 30--45 minutes for new exits to show up in your local cache. An alternative would be to query the exit list yourself, download the most recent consensus, and compile a list of exit addresses yourself.
Speaking of delays: the place that knows about new relays first is each directory authority. Not only that, but they also know what IP address the relay is exiting from, since that's where the relay publishes its descriptor from. E.g.,

@uploaded-at 2013-05-12 16:57:35
@source "173.246.102.12"
router nthdimension 173.246.101.241 443 0 0 [...]
Seems like this info could provide an alternative, simpler way to generate the exit-addresses file: https://exitlist.torproject.org/exit-addresses
which if we're doing our modularity right, should be the input to the various other scripts.
Interesting. Haven't thought of using that information. metrics-db even has this information available from gabelmoo, because it rsyncs gabelmoo's cached-* files (and v3-status-votes) once per hour. But metrics-db discards all descriptor annotations so far.
However, I don't think this information can replace the information we learn from TorDNSEL or TorBEL. Some concerns:
- Relays may exit from more than just one IP address, but the directory authorities would only see at most one of these addresses. Here's an exit list entry with two exit IP addresses:
ExitNode 49A75EE0B80C1963482FDDFCE579D1A0C568D8BB
Published 2013-05-12 20:59:32
LastStatus 2013-05-12 22:02:59
ExitAddress 46.165.221.166 2013-05-12 22:03:11
ExitAddress 46.166.163.169 2013-05-12 22:03:11
- The directory authorities sometimes download descriptors they don't have from other directory authorities. In that case we don't learn the IP address that the relay exits from. Here's an example:
@downloaded-at 2013-05-12 18:50:10
@source "154.35.32.5"
- The directory authorities are indeed the first to learn these source IP addresses. But we probably don't want arbitrary services to query the authorities frequently for their cached descriptors to learn their annotations. That means we'd have to aggregate and cache this information at another place, which introduces a delay.
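The first concern is easy to see when parsing the exit list. The following sketch groups `ExitAddress` lines by relay fingerprint, using the entry quoted above as input. The function name and the sample constant are illustrative, and the parser only looks at the two keywords it needs.

```python
SAMPLE_ENTRY = """\
ExitNode 49A75EE0B80C1963482FDDFCE579D1A0C568D8BB
Published 2013-05-12 20:59:32
LastStatus 2013-05-12 22:02:59
ExitAddress 46.165.221.166 2013-05-12 22:03:11
ExitAddress 46.166.163.169 2013-05-12 22:03:11
"""


def parse_exit_list(text):
    """Map each relay fingerprint to the list of its exit addresses."""
    exits = {}
    current = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        if parts[0] == "ExitNode":
            current = parts[1]
            exits[current] = []
        elif parts[0] == "ExitAddress" and current is not None:
            # A relay can contribute more than one exit address.
            exits[current].append(parts[1])
    return exits


def all_exit_addresses(text):
    """Flatten the per-relay lists into one set for fast membership tests."""
    return {ip for ips in parse_exit_list(text).values() for ip in ips}
```

With the sample entry, one fingerprint maps to two addresses, whereas a single `@source` annotation could account for at most one of them.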
I guess I should make a trac ticket of this idea. But which component? We sure seem to have a lot of projects that overlap tordnsel / torbel in some way.
For now, I'd say it's an "Analysis" ticket, because we don't yet know how to use this information. If you want to make a ticket, I'll paste my concerns above there.
And you're right that Onionoo overlaps with TorDNSEL/TorBEL to a certain extent. Or rather, it uses their data and presents them in a more convenient way. This wasn't planned, and it would be better if TorDNSEL/TorBEL had a more convenient interface that people could use instead. Until that's the case, people can easily use Onionoo.
Best, Karsten
Thank you for taking a look.
On Mon, May 13, 2013 at 08:58:27AM +0200, Karsten Loesing wrote:
I'm not sure, but Tor2web might do something similar. From Onionoo's project page: "Tor2web is a web proxy to Tor Hidden Services. It uses Onionoo to get the list of currently running Tor Exits to detect if the client is a Tor user and if so redirect them to the .onion address."
If I read this right, Tor2web is doing it not only for exits, but for all relays:
https://github.com/globaleaks/Tor2web-3.0/blob/c6e26b35e83fd897f9c4f9cb6787e...
David Fifield
On May 13, 2013, at 7:15 PM, David Fifield <david@bamsoftware.com> wrote:
Thank you for taking a look.
On Mon, May 13, 2013 at 08:58:27AM +0200, Karsten Loesing wrote:
I'm not sure, but Tor2web might do something similar. From Onionoo's project page: "Tor2web is a web proxy to Tor Hidden Services. It uses Onionoo to get the list of currently running Tor Exits to detect if the client is a Tor user and if so redirect them to the .onion address."
If I read this right, Tor2web is doing it not only for exits, but for all relays:
https://github.com/globaleaks/Tor2web-3.0/blob/c6e26b35e83fd897f9c4f9cb6787e...
Yeah, we are doing this because downloading all the exit lists is too much. If there were a better way to do this, it would be very useful to us.
We only need the exit list, but we consider any relay as a possible exit. This means that stuff like a relay that exits through a different address will not be detected as coming from Tor (this is something torbel accounts for).
These are the relevant tickets on the topic:
https://trac.torproject.org/projects/tor/ticket/6488
https://github.com/globaleaks/Tor2web-3.0/issues/10
https://github.com/globaleaks/Tor2web-3.0/issues/85
~ Art.
On 5/13/13 7:21 PM, Arturo Filastò wrote:
On May 13, 2013, at 7:15 PM, David Fifield <david@bamsoftware.com> wrote:
Thank you for taking a look.
On Mon, May 13, 2013 at 08:58:27AM +0200, Karsten Loesing wrote:
I'm not sure, but Tor2web might do something similar. From Onionoo's project page: "Tor2web is a web proxy to Tor Hidden Services. It uses Onionoo to get the list of currently running Tor Exits to detect if the client is a Tor user and if so redirect them to the .onion address."
If I read this right, Tor2web is doing it not only for exits, but for all relays:
https://github.com/globaleaks/Tor2web-3.0/blob/c6e26b35e83fd897f9c4f9cb6787e...
Ah, you're right.
Yeah, we are doing this because downloading all the exit lists is too much. If there were a better way to do this, it would be very useful to us.
We only need the exit list, but we consider any relay as a possible exit. This means that stuff like a relay that exits through a different address will not be detected as coming from Tor (this is something torbel accounts for).
Hmm, can you be more specific? What feature are you missing, and would that be a feature in Onionoo or TorDNSEL/TorBEL?
Best, Karsten
These are the relevant tickets on the topic:
https://trac.torproject.org/projects/tor/ticket/6488
https://github.com/globaleaks/Tor2web-3.0/issues/10
https://github.com/globaleaks/Tor2web-3.0/issues/85
~ Art.
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev