Detecting if an IP address belongs to a Tor Exit node.

Jorge Couchet

4 Dec 2012

12:25 p.m.

A FlashProxy (https://crypto.stanford.edu/flashproxy/) is an ordinary browser running JavaScript code that makes it act as a proxy. The browser contacts a special server (the "Facilitator", written in Python) to ask for a client and a Tor relay. If the Facilitator's answer is positive, the FlashProxy acts as a bridge that connects the client to the Tor relay (i.e. it helps the client reach the Tor Network from a country where the known relays are censored).

I'm working with the ticket 7549 (https://trac.torproject.org/projects/tor/ticket/7549). The ticket's goal is to avoid a "Tor in Tor situation" when a FlashProxy is serving a client request. This "Tor in Tor situation" arises when the FlashProxy is itself inside the Tor Network while it is trying to help a client computer connect to the Tor Network.

To avoid this situation, the Facilitator should check whether the FlashProxy's public IP address belongs to a Tor Exit node and, if so, give a negative answer to the proxy. One possible solution is for the Facilitator to run an online lookup that queries a locally running Tor instance in order to know whether a given IP address belongs to a Tor Exit node.

So, the question is: is there any other reasonable way (efficient -development and execution time- and safe) to see if an IP address belongs to a Tor Exit node?

Thanks in advance for your help!

Julian Yon

4 Dec 2012

1:10 p.m.

On Tue, 4 Dec 2012 13:25:15 +0100 Jorge Couchet jorge.couchet@gmail.com wrote:

...

I'm working with the ticket 7549 (https://trac.torproject.org/projects/tor/ticket/7549). ... So, the question is: is there any other reasonable way (efficient -development and execution time- and safe) to see if an IP address belongs to a Tor Exit node?

*looks at the ticket and your approach*

Why not just run and query an Onionoo server? It seems silly to duplicate this effort. You can still put that code in a separate daemon if you want to minimise changes to the Facilitator itself, but it won't have to handle any of the hard stuff. Just ping it a request like GET http://onionoo.local/details?search=10.9.8.7&type=relay and parse the returned JSON to check the exit policy.
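To make the "parse the returned JSON to check the exit policy" step concrete, here is a minimal sketch in Python. It assumes that each relay object in a details document carries an `exit_policy_summary` field holding either an `accept` or a `reject` list of ports and port ranges; that is my reading of the Onionoo protocol, so please check it against the specification. The function names and the sample document are made up for illustration, and no network request is made.

```python
import json

def port_in_summary(port, entries):
    """Return True if port matches an entry such as "80" or "6660-6667"."""
    for entry in entries:
        low, _, high = entry.partition("-")
        if int(low) <= port <= int(high or low):
            return True
    return False

def relay_allows_exit(relay, port):
    """Decide from one relay object whether it permits exiting to port."""
    # Assumption: the summary holds either an "accept" or a "reject" list.
    summary = relay.get("exit_policy_summary", {})
    if "accept" in summary:
        return port_in_summary(port, summary["accept"])
    if "reject" in summary:
        return not port_in_summary(port, summary["reject"])
    return False  # no summary present: treat the relay as a non-exit

def is_exit(details_json, port=80):
    """details_json is the text body of a details response."""
    document = json.loads(details_json)
    return any(relay_allows_exit(r, port) for r in document.get("relays", []))

# Invented sample document, shaped like the assumed response.
SAMPLE = json.dumps({
    "relays": [
        {"nickname": "example",
         "exit_policy_summary": {"accept": ["80", "443", "6660-6667"]}}
    ]
})

print(is_exit(SAMPLE, 80))
print(is_exit(SAMPLE, 25))
```

An empty `relays` list (no relay matched the searched address) yields False, which is the "not an exit" answer the Facilitator would want in that case.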

Julian

-- 3072D/F3A66B3A Julian Yon (2012 General Use) pgp.2012@jry.me

Michael Zeltner

5:51 p.m.

Excerpts from Julian Yon's message of 2012-12-04 14:10:50 +0100:

...

On Tue, 4 Dec 2012 13:25:15 +0100 Jorge Couchet jorge.couchet@gmail.com wrote:

...
I'm working with the ticket 7549 (https://trac.torproject.org/projects/tor/ticket/7549). ... So, the question is: is there any other reasonable way (efficient -development and execution time- and safe) to see if an IP address belongs to a Tor Exit node?

*looks at the ticket and your approach*

Why not just run and query an Onionoo server?

Onionoo isn't really optimised in regards to giving out lists of exits, the parsing of the JSON sounds like a duplicate effort to me. Also, shipping Onionoo with every facilitator seems a bit overkill.

Please correct me if I'm missing something, but there's two options that are easy to integrate that already exist:

Tor Bulk Exit List, if all you need is checking for access on port 80 https://check.torproject.org/cgi-bin/TorBulkExitList.py

And TorDNSEL, which would include checking for IP:port https://www.torproject.org/projects/tordnsel.html.en

Best, Michael

-- https://niij.org/

Julian Yon

8:25 p.m.

On Tue, 04 Dec 2012 18:51:16 +0100 Michael Zeltner m@niij.org wrote:

...

Excerpts from Julian Yon's message of 2012-12-04 14:10:50 +0100:

...
On Tue, 4 Dec 2012 13:25:15 +0100 Jorge Couchet jorge.couchet@gmail.com wrote:

...
I'm working with the ticket 7549 (https://trac.torproject.org/projects/tor/ticket/7549). ... So, the question is: is there any other reasonable way (efficient -development and execution time- and safe) to see if an IP address belongs to a Tor Exit node?

*looks at the ticket and your approach*

Why not just run and query an Onionoo server?

Onionoo isn't really optimised in regards to giving out lists of exits, the parsing of the JSON sounds like a duplicate effort to me. Also, shipping Onionoo with every facilitator seems a bit overkill.

Have you actually read the ticket? This is in contrast with running a full Tor client and connecting to its ControlPort. Now that is what I call overkill! And parsing JSON is hardly difficult. But you're right: there's no need to run the entire Onionoo server. But there is need for a mechanism to retrieve the relevant data.

...

Please correct me if I'm missing something, but there's two options that are easy to integrate that already exist:

Tor Bulk Exit List, if all you need is checking for access on port 80 https://check.torproject.org/cgi-bin/TorBulkExitList.py

And TorDNSEL, which would include checking for IP:port https://www.torproject.org/projects/tordnsel.html.en

While this is the canonical answer to the question, I held back from saying so because: “This ideally will use a locally cached database of exits. (Not an on-demand DNS lookup.) It should continue to work (perhaps with some classification errors) even if the database can't be refreshed for some time.” So, it needs to maintain its own cache, be explicitly non-realtime, be able to refresh its own database but also to gracefully degrade to disconnected operation. By the time you've coded all that up, you've replicated a big chunk of functionality.

But, hey, maybe the requirements are poorly stated. It's hard to tell without further info.

Julian

-- 3072D/F3A66B3A Julian Yon (2012 General Use) pgp.2012@jry.me

David Fifield

6 Dec 2012

4:37 a.m.

On Tue, Dec 04, 2012 at 08:25:25PM +0000, Julian Yon wrote:

...

On Tue, 04 Dec 2012 18:51:16 +0100 Michael Zeltner m@niij.org wrote:

...
Excerpts from Julian Yon's message of 2012-12-04 14:10:50 +0100:

...
On Tue, 4 Dec 2012 13:25:15 +0100 Jorge Couchet jorge.couchet@gmail.com wrote:

...
I'm working with the ticket 7549 (https://trac.torproject.org/projects/tor/ticket/7549). ... So, the question is: is there any other reasonable way (efficient -development and execution time- and safe) to see if an IP address belongs to a Tor Exit node?

*looks at the ticket and your approach*

Why not just run and query an Onionoo server?

Onionoo isn't really optimised in regards to giving out lists of exits, the parsing of the JSON sounds like a duplicate effort to me. Also, shipping Onionoo with every facilitator seems a bit overkill.

Have you actually read the ticket? This is in contrast with running a full Tor client and connecting to its ControlPort. Now that is what I call overkill! And parsing JSON is hardly difficult. But you're right: there's no need to run the entire Onionoo server. But there is need for a mechanism to retrieve the relevant data.

Is running a Tor client really so heavyweight? Let me explain more about what we're trying to do. The facilitator needs to know, for each request, whether the requestor is a Tor exit. The facilitator gets many requests. It's on the order of several per second now, and we haven't advertised it yet. We're designing for a few thousand requests per second. I think that rules out anything like a DNS query or TorBulkExitList.py per request.

A reasonable solution is to update a local cache once an hour. What Jorge is asking is, what's the best way to feed this cache? Any source needs to be at least authenticated and should reflect a consensus, not just one authority. This is why I suggested a local Tor client, because it will check the authentication.
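The hourly cache could be sketched roughly as below, in Python since the facilitator is written in Python. Everything here is hypothetical: the class name `ExitCache`, the `fetch` callable (whatever source ends up feeding the cache) and the `clock` parameter are illustration only, and the sketch does nothing about authenticating the source. It only shows a per-request lookup that is a set membership test, with the stale set kept when a refresh fails.

```python
import time

class ExitCache:
    """In-memory set of exit addresses, refreshed at most once per interval."""

    def __init__(self, fetch, interval=3600, clock=time.time):
        self.fetch = fetch          # callable returning an iterable of IP strings
        self.interval = interval    # seconds between refresh attempts
        self.clock = clock          # injectable so the timing can be tested
        self.addresses = frozenset()
        self.next_refresh = 0.0

    def refresh_if_due(self):
        now = self.clock()
        if now < self.next_refresh:
            return
        self.next_refresh = now + self.interval
        try:
            self.addresses = frozenset(self.fetch())
        except Exception:
            # Source unreachable or unparsable: keep serving the stale set.
            pass

    def is_exit(self, ip):
        self.refresh_if_due()
        return ip in self.addresses

# Usage with a stand-in source; a real fetch would download and parse a list.
cache = ExitCache(lambda: ["192.0.2.1"])
print(cache.is_exit("192.0.2.1"))
```

Refreshing inline on the first request after the interval keeps the sketch short; a separate timer or thread would avoid making one request wait for the download.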

Another design (non-)constraint: There are few facilitators (currently one), so ease of deployment is not the biggest concern. It does not have to be as easy as setting up a relay (figure that there will be many more websocket relays than facilitators).

The command in the ticket,

    cat $HOME/auto-naming/moria1/cached-des* | python $HOME/git/contrib/exitlist <ip>:<port> > exitlist

seems to me to be reading a list of exits from a local Tor. This seems pretty reasonable to me. I read that Onionoo reads its information from metrics; where does metrics get the data from?

David Fifield

Karsten Loesing

7:39 a.m.

...
Why not just run and query an Onionoo server?

Onionoo isn't really optimised in regards to giving out lists of exits, the parsing of the JSON sounds like a duplicate effort to me. Also, shipping Onionoo with every facilitator seems a bit overkill.

It's not Onionoo's primary purpose to give out lists of exit addresses, but it provides that information, too. It just doesn't offer good query parameters for that use case. But I think you should do okay downloading the full set of relay summaries once per hour and cache that data locally. The URL is:

https://onionoo.torproject.org/summary?type=relay&running=true

The protocol specification is here:

https://onionoo.torproject.org/

I wouldn't recommend running your own Onionoo server, particularly not on every facilitator. But if you cache results, you don't really have to do that.
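For the summary-based approach, a minimal Python sketch of turning one downloaded summary document into a set of addresses might look like this. It assumes the short field names of the summary format as I understand them (`n` nickname, `f` fingerprint, `a` list of addresses, `r` running); please verify against the protocol specification. The function name and the sample document are invented, and the download itself is left out.

```python
import json

def addresses_from_summary(summary_json):
    """Collect the addresses of all running relays in a summary document."""
    document = json.loads(summary_json)
    addresses = set()
    for relay in document.get("relays", []):
        # Assumption: "r" is the running flag and "a" the address list.
        if relay.get("r", False):
            addresses.update(relay.get("a", []))
    return addresses

# Invented sample, shaped like the assumed summary format.
SAMPLE = json.dumps({
    "relays": [
        {"n": "example", "f": "0123456789ABCDEF0123456789ABCDEF01234567",
         "a": ["192.0.2.10"], "r": True},
        {"n": "stopped", "f": "89ABCDEF0123456789ABCDEF0123456789ABCDEF",
         "a": ["192.0.2.11"], "r": False},
    ]
})

print(sorted(addresses_from_summary(SAMPLE)))
```

As far as I can tell, the summaries describe relays in general rather than exits only, so the resulting set may be a superset of the exit addresses.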

...

Is running a Tor client really so heavyweight? Let me explain more about what we're trying to do. The facilitator needs to know, for each request, whether the requestor is a Tor exit.

A Tor client won't tell you that, or at least not very reliably. The reason is that some relays are multi-homed, using different IP addresses for registering in the network (which is what the Tor client would tell you) and for exiting to the Internet.

If you want to learn about both network-internal and external IP addresses, you want to download TorDNSEL's exit list. The URL is:

http://exitlist.torproject.org/exit-addresses

Metrics archives these exit lists and has a format description here:

https://metrics.torproject.org/formats.html#exitlist
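A minimal Python sketch of reading such an exit list into a set follows. It assumes the line-oriented layout I know from the exit list format (`ExitNode`, `Published`, `LastStatus` and one or more `ExitAddress` lines per relay); the format description linked above is the authority. The function name and the sample entries are invented for illustration.

```python
def parse_exit_addresses(text):
    """Return the set of addresses found on ExitAddress lines."""
    addresses = set()
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: "ExitAddress <ip> <date> <time>"
        if len(fields) >= 2 and fields[0] == "ExitAddress":
            addresses.add(fields[1])
    return addresses

# Invented sample entry with two exit addresses for one relay.
SAMPLE = """\
ExitNode 0123456789ABCDEF0123456789ABCDEF01234567
Published 2012-12-05 10:00:00
LastStatus 2012-12-05 11:00:00
ExitAddress 192.0.2.20 2012-12-05 11:05:00
ExitAddress 192.0.2.21 2012-12-05 11:06:00
"""

print(sorted(parse_exit_addresses(SAMPLE)))
```

Only the addresses are kept here; the timestamps could be used to drop entries that have not been seen for a long time.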

...

I read that Onionoo reads its information from metrics; where does metrics get the data from?

Metrics aggregates data from network statuses, relay descriptors, and TorDNSEL exit lists, among others.

So, in summary, I could see you using either TorDNSEL's original data or Onionoo's summaries.

If you decide to use Onionoo, we should add Flashproxy to the list of known clients on Onionoo's project page, so that we know who we have to contact when we're planning changes to Onionoo's protocol.

Best, Karsten

David Fifield

7 Dec 2012

4:05 a.m.

On Thu, Dec 06, 2012 at 08:39:09AM +0100, Karsten Loesing wrote:

...
Why not just run and query an Onionoo server?

Onionoo isn't really optimised in regards to giving out lists of exits, the parsing of the JSON sounds like a duplicate effort to me. Also, shipping Onionoo with every facilitator seems a bit overkill.

It's not Onionoo's primary purpose to give out lists of exit addresses, but it provides that information, too. It just doesn't offer good query parameters for that use case. But I think you should do okay downloading the full set of relay summaries once per hour and cache that data locally. The URL is:

https://onionoo.torproject.org/summary?type=relay&running=true

The protocol specification is here:

https://onionoo.torproject.org/

I wouldn't recommend running your own Onionoo server, particularly not on every facilitator. But if you cache results, you don't really have to do that.

Thank you, Karsten, for the helpful information.

Jorge, I think that using Onionoo as a data source is the thing to do. You should be able to adapt your program from https://trac.torproject.org/projects/tor/ticket/7549#comment:4. You can assume that the Python json library is present.

David Fifield

Jorge Couchet

7:16 a.m.

Thanks to all for the help!

Best!

Jorge

tor-dev@lists.torproject.org
