Apologies, I am waiting for a train and don't have much bandwidth, so I will be brief:
1) There is no point in issuing <any header of any kind> to anyone unless they are accessing <website> via an exit node.
2) It's inefficient to issue the header upon every web access by every person in the world; when the header is only relevant to 1-in-a-few-thousand users, you will be imposing extra bandwidth cost upon the remaining 99.99...% -- which is unfair to them
3) Therefore: the header should only be issued to people arriving via an exit node. The means of achieving this are
a) Location
b) Bending Alt-Svc to fit and breaking web standards
c) Creating an entirely new header
4) Location already works and does the right thing. Privacy International already use this and issue it to people who connect to their .ORG site from an Exit Node.
5) Bending Alt-Svc to fit, is pointless, because Location already works
6) Creating a new header? Given (4) and (5) above, the only potential material benefit of it that I can see would be to "promote Tor branding" - and (subjective opinion) this would actually harm the cause of Tor, precisely because it is *special*.
6 Rationale) The majority of the "Dark Net" shade which has been thrown at Tor over the past 10 years has pivoted upon "needing special software to access", and creating (pardon me) a "special" header to onionify a fetch seems to be promoting the weirdness of Tor, again.
The required goal of redirection to the corresponding Onion site does not require anything more than a redirect, and - pardon me - but there are already 4x different kinds of redirects that are supported by the Location header (301, 302, 307, 308) with useful semantics. Why reinvent 4 wheels specially for Tor?
7) Story: when I was implementing the Facebook onion, I built infra to support such (eventual) redirection and/or exit-node-usage tracking. Hit "facebook.com/si/proxy/" from Tor/non-Tor to see it in action. The most challenging thing for me was finding a way to look up, locally, quickly, cheaply and reliably, whether a given IP address corresponded to an exit node. The closest that I could get to that idea was to scrape Onionoo every so often and to cache the results into a distributed, almost-memcache-like table for the entire site - ie: squillions of machines. This mechanism suffers from all the obvious flaws, notably Onionoo crashes and/or "lag" behind the state of the consensus.
8) So, to pass concrete advice on the basis of experience: rather than pursue novel headers and reinvent a bunch of established, widely-understood web redirection technologies, I would ask that Tor focus its efforts instead upon providing a service - perhaps a listener service embedded in little-t tor as an enable-able service akin to SOCKSListener - which can accept a request from <cidr-subnetmask>, receive a newline-terminated IP address, and return a set of flags associated with that IP (exit node, relay, whatever) - or "none" where the IP is not part of the tor network. Riff on this protocol as you see fit.
This would mean more people running more tor daemons in more datacentres (and possibly configuring some of them as relays) - using this lookup service to establish quickly whether $CLIENT_ADDR is an exit node or not, and whether it should be issued "308 Permanent Redirect With Same Method"
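[Ed.: the proposed lookup listener does not exist, so purely as a sketch of the shape of the protocol Alec describes - the port number and flag names here are made up - the web-server side might look something like:]

```python
import socket

def lookup_tor_flags(ip, host="127.0.0.1", port=9059, timeout=0.05):
    """Ask the (hypothetical) tor flags-lookup listener about one IP.

    Protocol sketch: send the IP followed by a newline; read back one
    newline-terminated, space-separated flag list such as "exit relay",
    or "none" when the IP is not part of the Tor network at all.
    """
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(ip.encode("ascii") + b"\n")
        reply = s.makefile().readline().strip()
    return set() if reply == "none" else set(reply.split())

def should_redirect_to_onion(flags):
    """Only clients arriving via an exit node should get the 308."""
    return "exit" in flags
```

A front-end would call lookup_tor_flags($CLIENT_ADDR) per request and, when should_redirect_to_onion() says yes, issue the 308 to the onion address.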
I think this is a better goal for Tor to be pursuing. What do you think?
- alec
Hi,
On 15/11/17 11:35, Alec Muffett wrote:
- So, to pass concrete advice on the basis of experience: rather than
pursue novel headers and reinvent a bunch of established, widely-understood web redirection technologies, I would ask that Tor focus its efforts instead upon providing a service - perhaps a listener service embedded in little-t tor as an enable-able service akin to SOCKSListener - which can accept a request from <cidr-subnetmask>, receive a newline-terminated IP address, and return a set of flags associated with that IP (exit node, relay, whatever) - or "none" where the IP is not part of the tor network. Riff on this protocol as you see fit.
Is this not what TorDNSEL does?
https://www.torproject.org/projects/tordnsel.html.en
Thanks, Iain.
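[Ed.: a DNS exit list of this kind is queried with an ordinary A-record lookup on a reversed-octet name. This sketch assumes the query form of the dnsel.torproject.org service, where an answer means "known exit" and NXDOMAIN means "not a known exit":]

```python
import socket

def dnsel_query_name(client_ip, zone="dnsel.torproject.org"):
    """Build the reversed-octet query name,
    e.g. 1.2.3.4 -> "4.3.2.1.dnsel.torproject.org"."""
    return ".".join(reversed(client_ip.split("."))) + "." + zone

def is_exit_via_dnsel(client_ip):
    """True if the DNS exit list answers for this IP (network lookup)."""
    try:
        socket.gethostbyname(dnsel_query_name(client_ip))
        return True
    except socket.gaierror:  # NXDOMAIN and friends: not a known exit
        return False
```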
On 15 Nov 2017 12:18, "Iain R. Learmonth" irl@torproject.org wrote:
Is this not what TorDNSEL does? https://www.torproject.org/projects/tordnsel.html.en
Hi Iain!
That certainly sounds like it will give you the answer! But although it would give the right kind of answer, it is not what I am asking for.
At the scale of websites like Facebook or the New York Times, a timely response is required for the purposes of rendering a page. The benefits of solving the problem at "enterprise" scale then trickle down to implementations of all sizes.
Speaking as a programmer, it would be delightfully easy to make a DNS query and wait for a response to give you an answer... but then you have to send the query, wait for propagation, wait for a result, trust the result, debug cached versions of the results, leak the fact that all these lookups are going on, and so forth.
This all adds up to latency and cost, as well as leaking metadata about your lookups; plus your local DNS administrator will hate you (cf: doing name resolution on every webpage fetch just to write Apache logs is frowned upon; better to log the raw IP address and resolve it later if you need to).
On the other hand: if you are running a local Tor daemon, a copy of the entire consensus is held locally and is (basically) definitive. You query it with near zero lookup latency, you get an instant response with no practical lag behind "real time", plus there are no men in the middle, and there is no unwanted metadata leakage.
If the Tor daemon is on the local machine, then the lookup cost is near-zero, and - hey! - you are encouraging more people to run more tor daemons, which (as above) has to be a good thing.
So: the results are very close to what TorDNSEL provides, but what I seek is something with different and better latency, security, reliability and privacy qualities than TorDNSEL offers.
- alec
Alec Muffett alec.muffett@gmail.com writes:
On 15 Nov 2017 12:18, "Iain R. Learmonth" irl@torproject.org wrote:
Is this not what TorDNSEL does? https://www.torproject.org/projects/tordnsel.html.en
Hi Iain!
Hey Alec,
thanks for the feedback.
That certainly sounds like it will give you the answer! But although it would give the right kind of answer, it is not what I am asking for.
At the scale of websites like Facebook or the New York Times, a timely response is required for the purposes of rendering a page. The benefits of solving the problem at "enterprise" scale then trickle down to implementations of all sizes.
Speaking as a programmer, it would be delightfully easy to make a DNS query and wait for a response to give you an answer... but then you have to send the query, wait for propagation, wait for a result, trust the result, debug cached versions of the results, leak the fact that all these lookups are going on, and so forth.
This all adds up to latency and cost, as well as leaking metadata about your lookups; plus your local DNS administrator will hate you (cf: doing name resolution on every webpage fetch just to write Apache logs is frowned upon; better to log the raw IP address and resolve it later if you need to).
On the other hand: if you are running a local Tor daemon, a copy of the entire consensus is held locally and is (basically) definitive. You query it with near zero lookup latency, you get an instant response with no practical lag behind "real time", plus there are no men in the middle, and there is no unwanted metadata leakage.
I think it's important to point out that a Tor client is never guaranteed to hold a *definitive* consensus. Currently Tor clients can stay perfectly happy with a consensus that is up to 3 hours old, even if they don't fetch the latest one (which gets made every hour).
In general, the Tor network does not have a definitive state at any point, and different clients/relays can have different states at the same time.
If we were to create "the definitive exit node oracle" we would need a Tor client that polls the dirauths the second a new consensus comes out, and maybe even then there could be desynchs. Perhaps it's worthwhile doing such a thing, and maybe that's exactly what tordnsel is doing, but it's something that can bring extra load to the dirauths and should not be done in many instances.
Furthermore, you said that enterprises might be spooked out by tor-specific "special" HTTP headers, but now we are discussing weird tor modules that communicate with the Tor daemon to decide whether to redirect clients, so it seems to me like an equally "special" Tor setup for sysadmins.
I think it's important to point out that a Tor client is never guaranteed to hold a *definitive* consensus.
That's why I say "(mostly) definitive" in my text - my feeling is that a locally-held copy of the consensus to be queried is going to be on average of far higher quality, completeness, and non-stagnancy than something that one tries to scrape out of Onionoo every 15 minutes.
True "definitiveness" can wait. A solution which does not require treading beyond the local area network for a "good enough" result is a sufficient 90+% solution :-)
If we were to create "the definitive exit node oracle" we would need a Tor client that polls the dirauths the second a new consensus comes out,
So let's not do that, then.
Furthermore, you said that enterprises might be spooked out by tor-specific "special" HTTP headers,
Yes.
but now we are discussing weird tor modules that communicate with the Tor daemon to decide whether to redirect clients, so it seems to me like an equally "special" Tor setup for sysadmins.
I can see how you would think that, and I would kind-of agree, but at least this would be local and cheap. Perhaps instead of a magic protocol, it should be a REST API that's embedded in the local Tor daemon? That would be a really, REALLY common pattern for an enterprise to query.
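[Ed.: as a sketch of how small such an embedded REST API could be - everything here is hypothetical: the /api/ip/<addr> path, the port, and the static placeholder exit set, which in the real scheme would be answered by the local tor daemon rather than a hard-coded table:]

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder data (a TEST-NET address); the real answer would come
# from the local tor daemon's view of the consensus, not a static set.
KNOWN_EXITS = {"203.0.113.7"}

def istor_response(ip, exits):
    """The entire decision, as a JSON-serialisable dict."""
    return {"IsTor": ip in exits, "IP": ip}

class IsTorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expects paths like /api/ip/203.0.113.7
        ip = self.path.rstrip("/").rsplit("/", 1)[-1]
        body = json.dumps(istor_response(ip, KNOWN_EXITS)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve locally:
#   HTTPServer(("127.0.0.1", 8053), IsTorHandler).serve_forever()
```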
How about that?
- alec
On 15 November 2017 at 05:35, Alec Muffett alec.muffett@gmail.com wrote:
Apologies, I am waiting for a train and don't have much bandwidth, so I will be brief:
- There is no point in issuing <any header of any kind> to anyone unless
they are accessing <website> via an exit node.
- It's inefficient to issue the header upon every web access by every
person in the world; when the header is only relevant to 1-in-a-few-thousand users, you will be imposing extra bandwidth cost upon the remaining 99.99...% -- which is unfair to them
Agreed (mostly). I could see use cases where users not accessing a website via Tor may wish to know an onionsite is available, but they are also the vast minority.
- Therefore: the header should only be issued to people arriving via an
exit node. The means of achieving this are
a) Location
b) Bending Alt-Svc to fit and breaking web standards
c) Creating an entirely new header
- Location already works and does the right thing. Privacy International
already use this and issue it to people who connect to their .ORG site from an Exit Node.
Bending Alt-Svc to fit, is pointless, because Location already works
Creating a new header? Given (4) and (5) above, the only potential material benefit of it that I can see would be to "promote Tor branding" - and (subjective opinion) this would actually harm the cause of Tor, precisely because it is *special*.
6 Rationale) The majority of the "Dark Net" shade which has been thrown at Tor over the past 10 years has pivoted upon "needing special software to access", and creating (pardon me) a "special" header to onionify a fetch seems to be promoting the weirdness of Tor, again.
The required goal of redirection to the corresponding Onion site does not require anything more than a redirect, and - pardon me - but there are already 4x different kinds of redirects that are supported by the Location header (301, 302, 307, 308) with useful semantics. Why reinvent 4 wheels specially for Tor?
I think there are some additional things to gain by using a new header:
Software that understands the header can handle it differently than Location. I think the notification bar and the 'Don't redirect me to the onionsite' options are pretty good UI things we should consider. They're actually not great UX, but it might be 'doing our part' to try and not confuse users about trusted browser chrome.[0]
Users who _appear_ to be coming from an exit node but are not using Tor are not blackholed. How common is this? I've seen reports from users who do this. If I were in a position to, I would consider having exit node traffic 'blend into' more general non-exit traffic (like a university connection) just to make the political statement that "Tor traffic is internet traffic".
Detecting exit nodes is error prone, as you point out. Some exit nodes have their traffic exit a different address than their listening port.[1]
Location is really close to what we need, but it is limited in some ways. I'm still on the fence.
[0] Except of course that notification bars are themselves spoofable chrome, but let's ignore that for now...
[1] Hey, does Exonerator handle these?
On 15 November 2017 at 07:38, Alec Muffett alec.muffett@gmail.com wrote:
I can see how you would think that, and I would kind-of agree, but at least this would be local and cheap. Perhaps instead of a magic protocol, it should be a REST API that's embedded in the local Tor daemon? That would be a really, REALLY common pattern for an enterprise to query.
This information should already be exposed via the Control Port, although there would be more work on behalf of the implementer to parse more information than desired and pare it down to what is needed.
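[Ed.: for example - a sketch, stdlib only, of the pare-it-down step Tom mentions. The control-port conversation assumes a local tor with an open ControlPort and no authentication configured, and the field positions assume the full "ns" router-status format:]

```python
import socket

def fetch_ns_all(host="127.0.0.1", port=9051):
    """Fetch 'GETINFO ns/all' from a local tor control port (sketch;
    assumes no authentication is required on the ControlPort)."""
    with socket.create_connection((host, port)) as s:
        f = s.makefile("rw", newline="\r\n")
        f.write("AUTHENTICATE\r\nGETINFO ns/all\r\n")
        f.flush()
        lines = []
        for line in f:
            lines.append(line.rstrip("\r\n"))
            # First "250 OK" answers AUTHENTICATE; the second ends GETINFO.
            if line.startswith("250 OK") and len(lines) > 1:
                break
        return "\n".join(lines)

def parse_ns_flags(ns_text):
    """Map relay IP -> set of flags from router-status ("r"/"s") lines.

    Assumed field positions:
      r <nick> <identity> <digest> <date> <time> <ip> <orport> <dirport>
      s <Flag> <Flag> ...
    """
    flags_by_ip, current_ip = {}, None
    for line in ns_text.splitlines():
        parts = line.split()
        if parts[:1] == ["r"] and len(parts) >= 7:
            current_ip = parts[6]
        elif parts[:1] == ["s"] and current_ip:
            flags_by_ip[current_ip] = set(parts[1:])
    return flags_by_ip
```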
-tom
On Wed, Nov 15, 2017 at 10:03:39AM -0600, Tom Ritter wrote:
Detecting exit nodes is error prone, as you point out. Some exit nodes have their traffic exit a different address than their listening port.[1]
Right. It's not trivial for tor to figure out what exit relays are multi-homed -- at least not without actually establishing circuits and fetching content over each exit relay.
I just finished an exitmap scan and found 17 exit relays that exit from an IP address that is different from what's listed in the consensus:
193.171.202.146 -> 193.171.202.150 for https://atlas.torproject.org/#details/01A9258A46E97FF8B2CAC7910577862C14F2C524
104.223.123.99 -> 104.223.123.98 for https://atlas.torproject.org/#details/D4010FAD096CFB59278015F711776D8CCB2735EC
87.118.83.3 -> 87.118.82.3 for https://atlas.torproject.org/#details/A8EA2EBB29B0BA4472F26A04A342967FF06CC104
89.31.57.58 -> 89.31.57.5 for https://atlas.torproject.org/#details/7DD29A65C370B86B5BE706EA3B1417745714C8AF
37.187.105.104 -> 196.54.55.14 for https://atlas.torproject.org/#details/91824956DFA430C071BF6B94B623DF10931D1D40
77.247.181.164 -> 77.247.181.162 for https://atlas.torproject.org/#details/204DFD2A2C6A0DC1FA0EACB495218E0B661704FD
198.211.103.26 -> 185.165.169.23 for https://atlas.torproject.org/#details/E56E6976ED9C6B72528ECEDA6C6CEEAC767FA26C
52.15.62.13 -> 69.181.127.85 for https://atlas.torproject.org/#details/833B03789A2A98C6B53D792156FEA3D2E1ECE967
138.197.4.77 -> 163.172.45.46 for https://atlas.torproject.org/#details/D5D6DBED4BEB90DB089AC1E57EA3A13B9B8AA769
52.15.62.13 -> 104.132.0.104 for https://atlas.torproject.org/#details/BF0E33F3897A2109D03DAA2F73AAF8ED25FB6F4D
31.185.27.203 -> 31.185.27.201 for https://atlas.torproject.org/#details/5D263037FC175596B3A344132B0B755EB8FB1D1C
104.223.123.101 -> 104.223.123.98 for https://atlas.torproject.org/#details/02A627FA195809A3ABE031B7864CCA7A310F1D44
77.247.181.166 -> 77.247.181.162 for https://atlas.torproject.org/#details/77131D7E2EC1CA9B8D737502256DA9103599CE51
149.56.223.240 -> 149.56.223.241 for https://atlas.torproject.org/#details/B6718125C43ECA2E5011B3C681BB6638617A9686
88.190.118.95 -> 94.23.201.80 for https://atlas.torproject.org/#details/8C8F0AA30AD7819F16BBD530586CFE58EBA39948
192.241.79.175 -> 192.241.79.178 for https://atlas.torproject.org/#details/DA6CB6C05F4A404184FC3A85FDB83F935C6620DC
143.106.60.70 -> 193.15.16.4 for https://atlas.torproject.org/#details/6BF913C31A47E020637121014DB2AFE0877BD31B
Detecting exit nodes is error prone, as you point out. Some exit nodes have their traffic exit a different address than their listening port. Hey does Exonerator handle these?
Right. It's not trivial for tor to figure out what exit relays are multi-homed -- at least not without actually establishing circuits and fetching content over each exit relay.
I just finished an exitmap scan and found 17 exit relays that exit from an IP address that is different from what's listed in the consensus:
This mode of operation, regardless of how it happens, is not in itself a problem, nor cause for alarm. In fact, the nature of these "exit IP different than ORPort" relays can and often does assist users in circumventing censorship... a fundamental use case of Tor. For instance, the arbitrary, automated, blind blocking via dumb blocklists that prevents even such basic user activity, and such a basic human right to knowledge, as simply reading websites via Tor. Such blocking examples can often be found here: https://trac.torproject.org/projects/tor/wiki/org/doc/ListOfServicesBlocking...
It's also entirely up to the exit operator to determine whether the third-party, non-contractual / non-SLA Exonerator service is of any particular use or benefit to them... perhaps they have other notary means, or are immune to or not subject to any such legal or jurisdictional issues, in which case it becomes moot.
Similarly, realtime TorDNSEL and the like could be considered to be censorship enabling tools.
On 16 Nov 2017, at 00:38, Alec Muffett alec.muffett@gmail.com wrote:
I think it's important to point out that a Tor client is never guaranteed to hold a *definitive* consensus.
That's why I say "(mostly) definitive" in my text - my feeling is that a locally-held copy of the consensus to be queried is going to be on average of far higher quality, completeness, and non-stagnancy than something that one tries to scrape out of Onionoo every 15 minutes.
Please don't use a consensus or a tor client to check for exits for this purpose. It produces significant numbers of false negatives, because some exits use other IP addresses for their exit traffic.
Using Onionoo or TorDNSEL reduces your false negatives, because it pulls data from Exitmap to populate exit_addresses. (Tor clients do not pull data from Exitmap, and that data is not in the consensus.)
On 16 Nov 2017, at 03:03, Tom Ritter tom@ritter.vg wrote:
Detecting exit nodes is error prone, as you point out. Some exit nodes have their traffic exit a different address than their listening port.[1]
... [1] Hey does Exonerator handle these?
Exonerator uses data from Exitmap, which queries a service through each exit to discover the address(es) the exit uses to send client requests to websites.
The list is updated every 24 hours, so there's really no need to scrape Onionoo every 15 minutes.
but now we are discussing weird tor modules that communicate with the Tor daemon to decide whether to redirect clients, so it seems to me like an equally "special" Tor setup for sysadmins.
I can see how you would think that, and I would kind-of agree, but at least this would be local and cheap. Perhaps instead of a magic protocol, it should be a REST API that's embedded in the local Tor daemon? That would be a really, REALLY common pattern for an enterprise to query.
You can download the set of exit addresses every 24 hours, and write a small tool that implements a REST API to query it:
https://check.torproject.org/exit-addresses
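[Ed.: that file is line-oriented, so the parsing side of such a tool is tiny. A sketch, assuming observed addresses appear on "ExitAddress <ip> <timestamp>" lines of that file:]

```python
import urllib.request

EXIT_ADDRESSES_URL = "https://check.torproject.org/exit-addresses"

def parse_exit_addresses(text):
    """Collect the set of observed exit IPs from the exit-addresses
    file: lines of the form "ExitAddress <ip> <timestamp>"."""
    exits = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "ExitAddress":
            exits.add(parts[1])
    return exits

def fetch_exit_set():
    """Download and parse the current exit set (network access)."""
    with urllib.request.urlopen(EXIT_ADDRESSES_URL) as resp:
        return parse_exit_addresses(resp.read().decode("utf-8", "replace"))
```

Refresh the set on a 24-hour timer and the per-request check is a local set-membership test.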
In fact, you could even adapt the "check" service to your needs, if it doesn't do what you want already:
https://gitweb.torproject.org/check.git
Is this the kind of JSON reply you would want?
https://check.torproject.org/api/ip
{"IsTor":true,"IP":"176.10.104.243"}
Or for the interactive version, see:
https://check.torproject.org/cgi-bin/TorBulkExitList.py
(And if you supply a destination port, it's more accurate, because it checks exit policies as well.)
T
-- Tim / teor
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B ricochet:ekmygaiu4rzgsk6n
teor teor2345@gmail.com writes:
On 16 Nov 2017, at 00:38, Alec Muffett alec.muffett@gmail.com wrote:
I think it's important to point out that a Tor client is never guaranteed to hold a *definitive* consensus.
That's why I say "(mostly) definitive" in my text - my feeling is that a locally-held copy of the consensus to be queried is going to be on average of far higher quality, completeness, and non-stagnancy than something that one tries to scrape out of Onionoo every 15 minutes.
Please don't use a consensus or a tor client to check for exits for this purpose. It produces significant numbers of false negatives, because some exits use other IP addresses for their exit traffic.
I'm actually not a fan of Alec's idea, and I agree with you that there will be a significant number of false negatives, but it might be worth pointing out that, IIUC, false negatives are probably not so damaging in this use case: the website doesn't realize the visitors are Tor users, so they simply get the normal website instead of the onion site. So not much damage done there.
False positives are a bit more damaging for reachability, because they mean that the website would throw normal users to the onion site, which would fail for them; but that's not so likely (except if exit node operators surf from their exit node, or if an exit node IP is shared with other people).