Hi,
After some interesting discussions irl last week with knowledgeable DNS and security people (hi Jakob), I'd like to hear from people involved with DNS in Tor what the current status is and what needs to be done.
More specifically, what's the status of ttdnsd and TorDNSd? Are they being used? Any thoughts about having a local validating resolver?
I know there's been some discussions (4zm, are you here?) about using libunbound (which could be interesting for DNSSEC support). Did that evolve into anything useful?
I'm by no means a DNS expert but would love to see some discussion about this, partly because future IPv6 work will depend on changes to our DNS system.
Thanks, Linus
On Thu, Jan 19, 2012 at 7:39 AM, Linus Nordberg linus@nordberg.se wrote:
Hi,
After some interesting discussions irl last week with knowledgeable DNS and security people (hi Jakob), I'd like to hear from people involved with DNS in Tor what the current status is and what needs to be done.
More specifically, what's the status of ttdnsd and TorDNSd? Are they being used? Any thoughts about having a local validating resolver?
I know there's been some discussions (4zm, are you here?) about using libunbound (which could be interesting for DNSSEC support). Did that evolve into anything useful?
I'm by no means a DNS expert but would love to see some discussion about this, partly because future IPv6 work will depend on changes to our DNS system.
Hi, Linus!
So, I think that what we actually need from a proper way to do DNS over Tor is a way for the Tor client to make real DNS requests to get handled by an exit node's DNS server or servers. Right now, we don't have that; we have a pile of half-measures instead.
Specifically, here's Tor's DNS support now:

* When the client uses a BEGIN relay cell to open a new stream, the exit node does a lookup on the requested hostname at its nameservers, connects there, and tells the client what the IP was. No info about the lookup other than the IPv4 address is returned.

* A client can use a RESOLVE relay cell to do an A lookup, an AAAA lookup (not supported iirc), or a PTR lookup at the exit node's nameservers. But they don't get back the full answer; they only get back the IP address or hostname.
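(For concreteness: the RESOLVE mechanism is what Tor's SOCKS extensions expose to local applications. Below is a minimal client-side sketch, assuming the 0xF0 RESOLVE command documented in socks-extensions.txt and a SOCKSPort at 127.0.0.1:9050; error handling is abbreviated.)

  /* Resolve a hostname through Tor's SOCKS RESOLVE extension (0xF0). */
  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <unistd.h>

  int main(int argc, char **argv)
  {
      const char *name = argc > 1 ? argv[1] : "www.torproject.org";
      unsigned char buf[262];
      struct sockaddr_in sa = { .sin_family = AF_INET, .sin_port = htons(9050) };
      int fd = socket(AF_INET, SOCK_STREAM, 0);

      inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
      if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
          perror("connect");
          return 1;
      }
      write(fd, "\x05\x01\x00", 3);            /* SOCKS5, no authentication */
      read(fd, buf, 2);                        /* expect 05 00 */

      size_t len = strlen(name), n = 0;
      buf[n++] = 0x05; buf[n++] = 0xF0;        /* version, RESOLVE command */
      buf[n++] = 0x00; buf[n++] = 0x03;        /* reserved, ATYP = hostname */
      buf[n++] = (unsigned char)len;
      memcpy(buf + n, name, len); n += len;
      buf[n++] = 0x00; buf[n++] = 0x00;        /* port = 0 */
      write(fd, buf, n);

      read(fd, buf, 10);                       /* VER REP RSV ATYP ADDR PORT */
      if (buf[1] == 0x00 && buf[3] == 0x01) {  /* success, IPv4 in BND.ADDR */
          char ip[INET_ADDRSTRLEN];
          inet_ntop(AF_INET, buf + 4, ip, sizeof(ip));
          printf("%s -> %s\n", name, ip);
      } else {
          fprintf(stderr, "RESOLVE failed, REP=%d\n", buf[1]);
      }
      close(fd);
      return 0;
  }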
Originally, we limited the DNS functionality that the exit node would expose for you because we were worried about what kind of shenanigans somebody could do with an arbitrarily crafted DNS request, and so we restricted ourselves to a minimalist subset. (This was back when Dan Kaminsky's favorite hobby was finding unexpected applications of DNS, like streaming video and whatnot.)
But I think the right design is probably something like allowing clients to request more DNS info via exit nodes' nameservers, and get more info back. We should think of ways to do this that avoid extra round trips, but that should be doable.
At the most extreme, this could just give clients the ability to generate arbitrary DNS requests and get the entire response back. If that seems worrisome, we could limit the form of the requests to a reasonable subset, prevent various "christmas-tree" requests, and so on. I don't personally understand the security issues here too well, but I know they exist.
As an aside, DNSSEC for hostname lookup only helps so much here: If I know for certain that www.example.com is 10.2.3.4, that doesn't really help me if I can't know whether I'm really talking to 10.2.3.4. But there are DNSSEC uses, such as TLS certificate stapling, that would still be reasonable to do over Tor.
yrs,
On 01/19/2012 11:13 PM, Nick Mathewson wrote:
On Thu, Jan 19, 2012 at 7:39 AM, Linus Nordberg linus@nordberg.se wrote:
Hi,
After some interesting discussions irl last week with knowledgeable DNS and security people (hi Jakob), I'd like to hear from people involved with DNS in Tor what the current status is and what needs to be done.
More specifically, what's the status of ttdnsd and TorDNSd? Are they being used? Any thoughts about having a local validating resolver?
So far I've seen ttdnsd used only in Tails; TorDNSd I've seen mentioned only on the Tor mailing lists (not sure how many individuals may be using it, though).
ttdnsd: kind of works, unless validation is required (ttdnsd fails as an unbound forwarder, most likely because it does not handle DS queries correctly).
It seems that the bunch of people who experimented with DNS over Tor came to the conclusion that using an existing caching resolver like unbound is simpler than using specialized resolvers like ttdnsd.
Originally, we limited the DNS functionality that the exit node would expose for you because we were worried about what kind of shenanigans somebody could do with an arbitrarily crafted DNS request, and so we restricted ourselves to a minimalist subset. (This was back when Dan Kaminsky's favorite hobby was finding unexpected applications of DNS, like streaming video and whatnot.)
The same queries can be done by anyone by setting up a tunnel to a recursive resolver (except for the constraint that it has to be over TCP). So video streaming over DNS and other DNS-tunnelling tricks would work.
But I think the right design is probably something like allowing clients to request more DNS info via exit nodes' nameservers, and get more info back. We should think of ways to do this that avoid extra round trips, but that should be doable.
A naive/straightforward idea is to bundle unbound with Tor/TBB and have it accessible through an exit enclave (unless a new cell type is explicitly desired). However, that adds another thing to maintain. And, while rare, there exist networks that either "transparent-proxy" DNS or scrub DNSSEC data from answers.
At the most extreme, this could just give clients the ability to generate arbitrary DNS requests and get the entire response back. If that seems worrisome, we could limit the form of the requests to a reasonable subset, prevent various "christmas-tree" requests, and so on. I don't personally understand the security issues here too well, but I know they exist.
The only problematic "christmas tree request" I can think of is DNSSEC traffic amplification for some crafted queries (but that can be done now over a tunnel to a recursive resolver as well).
As an aside, DNSSEC for hostname lookup only helps so much here: If I know for certain that www.example.com is 10.2.3.4, that doesn't really help me if I can't know whether I'm really talking to 10.2.3.4. But there are DNSSEC uses, such as TLS certificate stapling, that would still be reasonable to do over Tor.
Sure, the exit node must be trusted (the same way routers must be trusted not to do some funny DNAT-ing). DNSSEC validation would mitigate a DNS poisoning attack on the exit node's resolver (resolvers using static/sequential ports are still widespread).
Ondrej
Hi,
Ondrej Mikle wrote (21 Jan 2012 01:47:56 GMT) :
So far I've seen ttdnsd used only in Tails; TorDNSd I've seen mentioned only on the Tor mailing lists (not sure how many individuals may be using it, though).
ttdnsd: kind of works, unless validation is required (ttdnsd fails as an unbound forwarder, most likely because it does not handle DS queries correctly).
It seems that the bunch of people who experimented with DNS over Tor came to the conclusion that using an existing caching resolver like unbound is simpler than using specialized resolvers like ttdnsd.
For the record, Tails uses a combination of the pdnsd caching DNS server, the Tor resolver (for the request types it supports) and ttdnsd (as a fallback for other requests); details:
https://tails.boum.org/contribute/design/Tor_enforcement/DNS/
Cheers, -- intrigeri | GnuPG key @ https://gaffer.ptitcanardnoir.org/intrigeri/intrigeri.asc | OTR fingerprint @ https://gaffer.ptitcanardnoir.org/intrigeri/otr.asc | Do not be trapped by the need to achieve anything. | This way, you achieve everything.
On Thu, Jan 19, 2012 at 05:13:19PM -0500, Nick Mathewson wrote:
But I think the right design is probably something like allowing clients to request more DNS info via exit nodes' nameservers, and get more info back. We should think of ways to do this that avoid extra round trips, but that should be doable.
Ha. That'll teach me to answer tor-dev threads assuming nobody broke the threading. :)
So Nick, are you thinking we want a way for exit relays to receive an already-formatted dns query inside the Tor protocol, and get it onto the network somehow heading towards their configured nameservers? Or did you have something else in mind?
I wonder if we want a begin_dns relay command, sort of like the current begin and begin_dir commands, and then just let them talk TCP to one of our nameservers? Or is that too much like the previous hacks?
--Roger
On 01/30/2012 07:59 AM, Roger Dingledine wrote:
On Thu, Jan 19, 2012 at 05:13:19PM -0500, Nick Mathewson wrote:
But I think the right design is probably something like allowing clients to request more DNS info via exit nodes' nameservers, and get more info back. We should think of ways to do this that avoid extra round trips, but that should be doable.
So Nick, are you thinking we want a way for exit relays to receive an already-formatted dns query inside the Tor protocol, and get it onto the network somehow heading towards their configured nameservers? Or did you have something else in mind?
I wonder if we want a begin_dns relay command, sort of like the current begin and begin_dir commands, and then just let them talk TCP to one of our nameservers? Or is that too much like the previous hacks?
In GNUnet, we simply send the raw DNS payload over the mesh network to the exit node (in what for you would be a new cell type), the exit node sends it out via UDP to whatever DNS server the user provided, and the exit sends the response back to the initiator. So the exit never parses the DNS request or response at all. From what I've seen so far, 512 byte cells might do just fine >90% of the time, unless of course DNSSEC somehow takes off. However, GNUnet's message size limit is 64k, so this is not something I've been studying extensively.
In cases where we need to parse DNS queries (likely outside of Tor's scope), we have our own library to do so, but even there we never parse DNS queries that did not originate from our own system.
In summary, I think begin_dns is a good idea, but I'm not sure you need to then talk TCP to the nameserver -- UDP ought to suffice.
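To make the exit-side step concrete, here is a minimal sketch of the forwarding described above: treat the DNS payload as opaque bytes, push it to the configured nameserver over UDP, and hand back whatever arrives. The function name and the missing timeout/retry logic are illustrative.

  /* Forward an opaque DNS query to a nameserver over UDP; returns the
   * number of response bytes, or -1 on error. A real exit would use a
   * non-blocking socket with a timeout instead of a bare recv(). */
  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <sys/socket.h>
  #include <sys/types.h>
  #include <unistd.h>

  ssize_t forward_dns_udp(const unsigned char *query, size_t query_len,
                          const char *ns_ip,
                          unsigned char *reply, size_t reply_cap)
  {
      struct sockaddr_in ns = { .sin_family = AF_INET, .sin_port = htons(53) };
      int fd = socket(AF_INET, SOCK_DGRAM, 0);
      ssize_t n = -1;

      inet_pton(AF_INET, ns_ip, &ns.sin_addr);
      if (sendto(fd, query, query_len, 0,
                 (struct sockaddr *)&ns, sizeof(ns)) == (ssize_t)query_len)
          n = recv(fd, reply, reply_cap, 0);
      close(fd);
      return n;
  }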
My 2 cents
Happy hacking!
Christian
On 01/30/2012 01:09 AM, Christian Grothoff wrote:
On 01/30/2012 07:59 AM, Roger Dingledine wrote:
On Thu, Jan 19, 2012 at 05:13:19PM -0500, Nick Mathewson wrote:
But I think the right design is probably something like allowing clients to request more DNS info via exit nodes' nameservers, and get more info back. We should think of ways to do this that avoid extra round trips, but that should be doable.
So Nick, are you thinking we want a way for exit relays to receive an already-formatted dns query inside the Tor protocol, and get it onto the network somehow heading towards their configured nameservers? Or did you have something else in mind?
I wonder if we want a begin_dns relay command, sort of like the current begin and begin_dir commands, and then just let them talk TCP to one of our nameservers? Or is that too much like the previous hacks?
In GNUnet, we simply send the raw DNS payload over the mesh network to the exit node (in what for you would be a new cell type), the exit node sends it out via UDP to whatever DNS server the user provided, and the exit sends the response back to the initiator. So the exit never parses the DNS request or response at all. From what I've seen so far, 512 byte cells might do just fine >90% of the time, unless of course DNSSEC somehow takes off. However, GNUnet's message size limit is 64k, so this is not something I've been studying extensively.
In cases where we need to parse DNS queries (likely outside of Tor's scope), we have our own library to do so, but even there we never parse DNS queries that did not originate from our own system.
In summary, I think begin_dns is a good idea, but I'm not sure you need to then talk TCP to the nameserver -- UDP ought to suffice.
I think begin_dns is a good idea as well.
It seems to me that there are a few ways to do it:
- send the query and the type
- send a raw packet that is then forwarded
- send a variable query and a fixed type (what we do now)
I think that even if DNSSEC dies tomorrow, we'd be silly to stick with the way we do things now. I also think that sending a raw packet is a bit sketchy as it basically means that you're sending client side crafted data - this usually isn't good news from an anonymity perspective.
If begin_dns worked by sending the query and the type, we'd remove almost all possibilities of client side DNS fingerprinting but we'd add attack surface to the exit nodes...
However, I imagine that if we wanted, we could add a new flag 'dns' that would allow various exit nodes to declare themselves open for begin_dns cells. When a user opens the DNSPort, they could select nodes flagged with 'dns' to query directly. If none existed or the query was of a generic CNAME, PTR or A record type, we could use any normally available node.
On the 'dns'-flagged exit nodes, a client could begin_dns and then we'd parse the query and the type, generate the DNS query, and then ask our locally configured name server. In an ideal world, we'd use something like unbound to do the parsing and perhaps even to do caching.
All the best, Jacob
On 01/30/2012 11:18 AM, Jacob Appelbaum wrote:
On 01/30/2012 01:09 AM, Christian Grothoff wrote:
In summary, I think begin_dns is a good idea, but I'm not sure you need to then talk TCP to the nameserver -- UDP ought to suffice.
I think begin_dns is a good idea as well.
Seconded, I also find it a good idea.
It seems to me that there are a few ways to do it:
- send the query and the type
- send a raw packet that is then forwarded
- send a variable query and a fixed type (what we do now)
I think that even if DNSSEC dies tomorrow, we'd be silly to stick with the way we do things now. I also think that sending a raw packet is a bit sketchy as it basically means that you're sending client side crafted data - this usually isn't good news from an anonymity perspective.
I'd suggest that the client sends the query string, RR type and class in the cell. The class is almost always INTERNET, but CHAOS can be useful for debugging which server of an anycast cluster you are actually talking to. You'll almost never need the CHAOS class, but when you do, it will come in handy (see TXT "hostname.bind" and "version.bind").
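For concreteness, the cell payload for this variant could be as small as the following (field names and sizes are illustrative, not from tor-spec):

  #include <stdint.h>

  /* Hypothetical DNS_BEGIN payload for the "query + type + class" variant. */
  struct dns_begin_payload {
      uint16_t qtype;      /* RR type: 1 = A, 28 = AAAA, 16 = TXT, ... */
      uint16_t qclass;     /* 1 = IN almost always; 3 = CH for debugging */
      uint8_t  qname_len;  /* length of the name that follows */
      uint8_t  qname[255]; /* query name, uncompressed presentation form */
  };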
DNSSEC: it will become very useful once the DANE protocol is standardized (see https://datatracker.ietf.org/doc/draft-ietf-dane-protocol/). DANE is a certificate-pinning protocol, saying which site should have which TLS certificate or which CA should have issued it (maybe Sovereign Keys or Auditable CAs will catch on first, but there's no way of knowing yet).
If begin_dns worked by sending the query and the type, we'd remove almost all possibilities of client side DNS fingerprinting but we'd add attack surface to the exit nodes...
I agree. How do we evaluate exit nodes' attack surface? (I suggested fuzzing libunbound/ldns as one method). How could we hide the CHAOS queries?
However, I imagine that if we wanted, we could add a new flag 'dns' that would allow various exit nodes to declare themselves open for begin_dns cells. When a user opens the DNSPort, they could select nodes flagged with 'dns' to query directly. If none existed or the query was of a generic CNAME, PTR or A record type, we could use any normally available node.
With the current relay code, CNAME, A and PTR for in-addr.arpa would work. These three RR types have the advantage that they can easily be checked for resolution of private addresses (like Tor does now; though banning resolution of ".local" FQDNs might be added, it's a damn special case).
I'd add NS, DNAME and AAAA to the default-allowed set (DNAME is quite rare, but nevertheless used; there's also a BNAME RFC draft that seems to have expired).
If we want to support DNSSEC, then DS, DNSKEY, RRSIG, NSEC, NSEC3 should be allowed as well.
On the 'dns' flagged exit nodes, a client could begin_dns and then we'd parse the query and the type, generate the DNS query and then ask the our locally configured name server. In an ideal world, we'd use something like unbound to do the parsing and perhaps even to do caching.
libunbound as well as unbound do caching. ldns can do parsing (libunbound uses ldns).
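To illustrate how little code that takes, here is a minimal libunbound sketch (the forwarder at 127.0.0.1 and the trust anchor path /etc/unbound/root.key are assumptions for the example):

  #include <stdio.h>
  #include <unbound.h>

  int main(void)
  {
      struct ub_ctx *ctx = ub_ctx_create();
      struct ub_result *res;

      ub_ctx_set_fwd(ctx, "127.0.0.1");                 /* forwarder to use */
      ub_ctx_add_ta_file(ctx, "/etc/unbound/root.key"); /* enables DNSSEC */

      if (ub_resolve(ctx, "www.torproject.org", 1 /* A */, 1 /* IN */,
                     &res) == 0) {
          printf("havedata=%d secure=%d bogus=%d\n",
                 res->havedata, res->secure, res->bogus);
          ub_resolve_free(res);
      }
      ub_ctx_delete(ctx);
      return 0;
  }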
Ondrej
On 01/30/2012 06:07 PM, Ondrej Mikle wrote:
On 01/30/2012 11:18 AM, Jacob Appelbaum wrote:
On 01/30/2012 01:09 AM, Christian Grothoff wrote:
In summary, I think begin_dns is a good idea, but I'm not sure you need to then talk TCP to the nameserver -- UDP ought to suffice.
I think begin_dns is a good idea as well.
Seconded, I also find it a good idea.
Glad to hear it.
It seems to me that there are a few ways to do it:
- send the query and the type
- send a raw packet that is then forwarded
- send a variable query and a fixed type (what we do now)
I think that even if DNSSEC dies tomorrow, we'd be silly to stick with the way we do things now. I also think that sending a raw packet is a bit sketchy as it basically means that you're sending client side crafted data - this usually isn't good news from an anonymity perspective.
I'd suggest that the client sends the query string, RR type and class in the cell. The class is almost always INTERNET, but CHAOS can be useful for debugging which server of an anycast cluster you are actually talking to. You'll almost never need the CHAOS class, but when you do, it will come in handy (see TXT "hostname.bind" and "version.bind").
I think that almost any record type is fine.
DNSSEC: it will become very useful once DANE protocol is standardized (see https://datatracker.ietf.org/doc/draft-ietf-dane-protocol/). DANE is a certificate-pinning protocol, saying which site should have which TLS certificate or which CA should have issued it (maybe Sovereign Keys or Auditable CAs will catch on first, but there's no way of knowing yet).
Agreed. DANE is an important nail in the CA Racket's coffin. :)
If begin_dns worked by sending the query and the type, we'd remove almost all possibilities of client side DNS fingerprinting but we'd add attack surface to the exit nodes...
I agree. How do we evaluate exit nodes' attack surface? (I suggested fuzzing libunbound/ldns as one method). How could we hide the CHAOS queries?
Well - first off, we'd want to determine the places where new code is added - if we don't change current things and only add a cell type, I think that's quite easy to do. Secondly, I'd imagine that we'd want to audit the underlying library quite extensively.
However, I imagine that if we wanted, we could add a new flag 'dns' that would allow various exit nodes to declare themselves open for begin_dns cells. When a user opens the DNSPort, they could select nodes flagged with 'dns' to query directly. If none existed or the query was of a generic CNAME, PTR or A record type, we could use any normally available node.
With the current relay code, CNAME, A and PTR for in-addr.arpa would work. These three RR types have the advantage that they can easily be checked for resolution of private addresses (like Tor does now; though banning resolution of ".local" FQDNs might be added, it's a damn special case).
Right. We could certainly enable inspection at DNSPort time - it can check for RFC1918 addresses. I personally want a way to know what a server replied with - even if it might be harmful, I want a true, verifiable answer. I also want a way to ensure that it doesn't shoot people in the foot. So, perhaps we can do both?
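The inspection itself is cheap; a sketch of the predicate such a filter could apply to each IPv4 address in a reply (host byte order):

  #include <stdint.h>

  /* True iff ip falls in an RFC 1918 private range. */
  static int is_rfc1918(uint32_t ip)
  {
      return (ip >> 24) == 10        /* 10.0.0.0/8 */
          || (ip >> 20) == 0xAC1     /* 172.16.0.0/12 */
          || (ip >> 16) == 0xC0A8;   /* 192.168.0.0/16 */
  }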
I'd add NS, DNAME and AAAA to the default-allowed set (DNAME is quite rare, but nevertheless used; there's also a BNAME RFC draft that seems to have expired).
I'd like to see - TXT, SSHFP, CHAOS, NS, DNAME, AAAA, A, PTR, CNAME, DS, DNSKEY, RRSIG, NSEC, NSEC3, CERT, IPSECKEY, KEY, SOA, MX, SRV, SPF
It's becoming very difficult to use Tor without native SRV record support for, say, Jabber, and the same is true for MX and other types.
Basically, the entire list: http://en.wikipedia.org/wiki/List_of_DNS_record_types
If we want to support DNSSEC, then DS, DNSKEY, RRSIG, NSEC, NSEC3 should be allowed as well.
Agreed.
On the 'dns' flagged exit nodes, a client could begin_dns and then we'd parse the query and the type, generate the DNS query and then ask the our locally configured name server. In an ideal world, we'd use something like unbound to do the parsing and perhaps even to do caching.
I think it's reasonable to separate this into two tasks: 'dns'-flagged exits would be required to support begin_dns, and caching is something we should probably have, though a full unbound cache is perhaps too huge to put into the same process.
libunbound as well as unbound do caching. ldns can do parsing (libunbound uses ldns).
I think that seems OK. I think the first step is a proposal, the second step is likely to implement whatever "begin_dns" means, and a third step is another proposal where we add the "dns" flag to the Tor spec; likely we'd find that the second step requires a cache...
Thanks for hacking on this!
All the best, Jacob
On Tue, Jan 31, 2012 at 1:08 AM, Jacob Appelbaum jacob@appelbaum.net wrote:
I think that seems OK. I think the first step is a proposal,
Anybody volunteering for this, or should I throw it on my pile?
On 01/31/2012 03:42 PM, Nick Mathewson wrote:
On Tue, Jan 31, 2012 at 1:08 AM, Jacob Appelbaum jacob@appelbaum.net wrote:
I think that seems OK. I think the first step is a proposal,
Anybody volunteering for this, or should I throw it on my pile?
I volunteer for writing the proposal.
Ondrej
On Tue, Jan 31, 2012 at 4:22 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
On 01/31/2012 03:42 PM, Nick Mathewson wrote:
On Tue, Jan 31, 2012 at 1:08 AM, Jacob Appelbaum jacob@appelbaum.net wrote:
I think that seems OK. I think the first step is a proposal,
Anybody volunteering for this, or should I throw it on my pile?
I volunteer for writing the proposal.
Great! If you haven't seen it already, please have a look at the repository linked from https://gitweb.torproject.org/torspec.git , especially spec/proposals/* , for our proposal format.
cheers,
On 01/31/2012 06:42 AM, Nick Mathewson wrote:
On Tue, Jan 31, 2012 at 1:08 AM, Jacob Appelbaum jacob@appelbaum.net wrote:
I think that seems OK. I think the first step is a proposal,
Anybody volunteering for this, or should I throw it on my pile?
I think it might make sense for you, me and Ondrej to write one up?
All the best, Jacob
On Tue, Jan 31, 2012 at 6:20 PM, Jacob Appelbaum jacob@appelbaum.net wrote:
On 01/31/2012 06:42 AM, Nick Mathewson wrote:
On Tue, Jan 31, 2012 at 1:08 AM, Jacob Appelbaum jacob@appelbaum.net wrote:
I think that seems OK. I think the first step is a proposal,
Anybody volunteering for this, or should I throw it on my pile?
I think it might make sense for you, me and Ondrej to write one up?
I'll wait to see what Ondrej comes up with; it's pretty normal to do revisions on this stuff, after all, and he's already said he'd like to give it a try. If you want to do one too, I'd be glad to take on merging. Or ask Ondrej if he wants to collaborate on the first draft.
On 01/31/2012 03:29 PM, Nick Mathewson wrote:
On Tue, Jan 31, 2012 at 6:20 PM, Jacob Appelbaum jacob@appelbaum.net wrote:
On 01/31/2012 06:42 AM, Nick Mathewson wrote:
On Tue, Jan 31, 2012 at 1:08 AM, Jacob Appelbaum jacob@appelbaum.net wrote:
I think that seems OK. I think the first step is a proposal,
Anybody volunteering for this, or should I throw it on my pile?
I think it might make sense for you, me and Ondrej to write one up?
I'll wait to see what Ondrej comes up with; it's pretty normal to do revisions on this stuff, after all, and he's already said he'd like to give it a try. If you want to do one too, I'd be glad to take on merging. Or ask Ondrej if he wants to collaborate on the first draft.
That sounds good. I'll wait for the first draft and send feedback.
All the best, Jacob
On 02/01/2012 10:01 AM, Jacob Appelbaum wrote:
That sounds good. I'll wait for the first draft and send feedback.
First draft is ready here:
https://github.com/hiviah/torspec/blob/master/proposals/ideas/xxx-dns-dnssec...
Hopefully I reflected all the main points made in the DNS threads. There are a few TODOs where I couldn't decide what the best course of action would be (options are usually listed).
I tried to keep Tor as "DNS/DNSSEC-agnostic" as possible. There exists a combination of answers to the TODOs under which Tor won't have to touch the DNSSEC part at all, except for calling ub_resolve ;-) And the DNS packet itself would be touched only in the DNSPort/SOCKS part (still no fiddling with the DNSSEC part).
Ondrej
Ondrej,
I may have missed parts of the previous discussion, but why are you not encapsulating the whole DNS request from the client? Various flags and other options (e.g. EDNS0) would be quite useful to be able to transport across the Tor network.
jakob
On 02/07/2012 03:18 PM, Jakob Schlyter wrote:
I may have missed parts of the previous discussion, but why are you not encapsulating the whole DNS request from the client? Various flags and other options (e.g. EDNS0) would be quite useful to be able to transport across the Tor network.
There were two main objections:
1. A full packet might leak identifying information about the OS or resolver used; quoting Nick:
There are parts of a DNS packet that we wouldn't want to have the Tor client make up. For example, DNS transaction IDs would need to avoid collisions. Similarly, I don't see why the client should be setting most of the possible flags.
The query will work as if the following was set: flags 0x110 (recursive, non-authenticated data ok), DO bit set. Is there any reason for setting some flags otherwise or making some optional?
2. Roger wanted Tor to know as little as possible about DNS internals.
Ondrej
On 7 feb 2012, at 22:08, Ondrej Mikle wrote:
- A full packet might leak identifying information about the OS or resolver used,
quoting Nick:
There are parts of a DNS packet that we wouldn't want to have the Tor client make up. For example, DNS transaction IDs would need to avoid collisions. Similarly, I don't see why the client should be setting most of the possible flags.
The query will work as if the following was set: flags 0x110 (recursive, non-authenticated data ok), DO bit set. Is there any reason for setting some flags otherwise or making some optional?
If you bundle a full resolver (e.g. libunbound) with the Tor client, you will be much better off doing full DNS packet transport; otherwise you have to rewrite the upstream forwarding code. I do know about the potential fingerprinting issues (I'm one of the people behind Net::DNS::Fingerprint), but in this case I believe we can mitigate them (if considered important) by masking/rewriting some DNS request fields within the Tor client and/or exit node.
jakob
On 02/10/2012 08:20 AM, Jakob Schlyter wrote:
On 7 feb 2012, at 22:08, Ondrej Mikle wrote:
- A full packet might leak identifying information about the OS or resolver used,
quoting Nick:
There are parts of a DNS packet that we wouldn't want to have the Tor client make up. For example, DNS transaction IDs would need to avoid collisions. Similarly, I don't see why the client should be setting most of the possible flags.
The query will work as if the following was set: flags 0x110 (recursive, non-authenticated data ok), DO bit set. Is there any reason for setting some flags otherwise or making some optional?
If you bundle a full resolver (e.g. libunbound) with the Tor client, you will be much better off doing full DNS packet transport; otherwise you have to rewrite the upstream forwarding code. I do know about the potential fingerprinting issues (I'm one of the people behind Net::DNS::Fingerprint), but in this case I believe we can mitigate them (if considered important) by masking/rewriting some DNS request fields within the Tor client and/or exit node.
I guess you are right, as long as the DNS packet transmitted to the exit node is always generated by libunbound (BTW, fpdns is a neat tool).
Validation must be on by default as well, otherwise it would be really easy to fingerprint users who turned it on manually.
I'll update the draft in a few days; just a quick summary of changes:

- drop IDs (use StreamID), drop length from DNS_RESPONSE, keep just uint16_t total_length
- separate tool for AXFR/IXFR so that the server can be specified
- validation always on client side by default
Ondrej
Hi,
I've updated the Tor DNS/DNSSEC draft from what was said in this thread. Short summary of changes:
- drop IDs (use StreamID), drop length from DNS_RESPONSE, keep just uint16_t total_length
- separate tool for AXFR so that the server can be specified
- validation always on client side by default
- full DNS packets sent in DNS_BEGIN (generated by libunbound)
Other changes (mostly minor):
- IXFR not supported (rare corner case)
- "common" DNS policy - if updates between Tor versions change this "allowed set" (e.g. a new RR type), an exit node with an old Tor version simply returns REFUSED
- specified the algorithm of TTL normalization
Link to full text (diff is pasted at the end of this mail):
https://github.com/hiviah/torspec/blob/master/proposals/ideas/xxx-dns-dnssec...
Ondrej
Diff:
diff --git a/proposals/ideas/xxx-dns-dnssec.txt b/proposals/ideas/xxx-dns-dnssec.txt
index 865e06d..ea711ce 100644
--- a/proposals/ideas/xxx-dns-dnssec.txt
+++ b/proposals/ideas/xxx-dns-dnssec.txt
@@ -33,26 +33,22 @@ Status: Draft

   DNS_BEGIN payload:

-    RR type (2 octets)
-    RR class (2 octets)
-    ID (2 octets)
-    length (1 octet)
-    query (variable)
+    DNS packet data (variable length)

-  The RR type and class match counterparts in DNS packet. ID is for
-  identifying which data belong together, since response can be longer than
-  single cell's payload. The ID MUST be random and MUST NOT be copied from
-  xid of request DNS packet (in case of using DNSPort).
+  The DNS packet must be generated internally by libunbound to avoid
+  fingerprinting users by differences in client resolvers' behavior.

   DNS_RESPONSE payload:

-    ID (2 octets)
-    data length (2 octets)
-    total length (4 octets)
+    total length (2 octets)
     data (variable)

-  Data contains the reply DNS packet. Total length describes length of
-  complete response packet.
+  Data contains the reply DNS packet or its part if packet would not fit into
+  the cell. Total length describes length of complete response packet.
+
+  AXFR and IXFR are not supported in this cell by design (see specialized tool
+  below).

 2. Interfaces to applications

@@ -80,11 +76,9 @@ Status: Draft
   for asking authoritative servers.

   For client side, full validation would be optional described by option
-  DNSValidation (0|1). (TODO: what is a sensible default? Validation is not
-  much useful in A/AAAA case, but for instance SRV, TXT and TLSA are a
-  different case. Only reason for turning validation off is a faster
-  round-trip. We can also leave it to validating resolver that uses DNSPort as
-  forwarder.)
+  DNSValidation (0|1). By default validation is turned on, otherwise it would
+  be easy to fingerprint people who turned it on and asked for not-so-common
+  records like SRV.

 4. Changes to directory flags

@@ -93,9 +87,12 @@ Status: Draft
   - CommonDNS - reflects "common" DNSQueryPolicy
   - FullDNS - reflects "full" DNSQueryPolicy

-  (TODO: how do we handle adding new RR types to "common" as they are created?
-  One option would be to create CommonDNS_1 ... CommonDNS_N such that
-  CommonDNS_{N-1} is subset of CommonDNS_N.)
+  Exit node asked for a RR type not in CommonDNS policy will return REFUSED
+  as status in the reply DNS packet contained in DNS_RESPONSE cell.
+
+  If new types are added to CommonDNS set (e.g. new RFC adds a record type)
+  and exit node's Tor version does not recognize it as allowed, it will send
+  REFUSED as well.

 5. Implementation notes

@@ -105,8 +102,7 @@ Status: Draft
   Client will periodically purge incomplete DNS replies. Any unexpected
   DNS_RESPONSE will be dropped.

-  Request for special names (.onion, .exit, .noconnect) will return SERVFAIL
-  (for NXDOMAIN we'd have to implement NSEC/NSEC3).
+  Request for special names (.onion, .exit, .noconnect) will return REFUSED.

   RELAY_BEGIN would function "normally", there is no need for returning DNS
   data. In case of malicious exit, client can't check he's really connected to

@@ -116,21 +112,52 @@ Status: Draft

   AD flag must be zeroed out on client unless validation is performed.

-6. Security implications
+6. Separate tool for AXFR

-  Client as well as exit MUST block attempts to resolve local RFC 1918, 4193,
-  4291 adresses (PTR) or local names (e.g. "*.local") in order not to leak
-  unnecessary information about home network. (TODO: TLD whitelist instead of
-  filtering "*.local" names? That would require exit node to periodically
-  update list from ICANN.)
+  The AXFR tool will have similar interface like tor-resolve, but will
+  return raw DNS data.
+
+  Parameters are: query domain, server IP of authoritative DNS.

-  An exit resolving names SHOULD use libunbound for all types of resolving so
-  that an attacker eavesdropping will have it harder to distinguish which
-  names were queried by connect command and which using the DNS subsystem
-  (TODO: will this really help, since attacker can guess from RR type and
-  whether or not a TCP connection follows?)
+  The tool will transfer the data through "ordinary" tunnel using RELAY_BEGIN
+  and related cells.
+
+  This design decision serves two goals:
+
+  - DNS_BEGIN and DNS_RESPONSE will be simpler to implement (lower chance of
+    bugs)
+  - in practice it's often useful to do AXFR queries on secondary
+    authoritative DNS servers
+
+  IXFR will not be supported (infrequent corner case, can be done by manual
+  tunnel creation over Tor if truly necessary).
+
+7. Security implications
+
+  Client as well as exit MUST block attempts to resolve local RFC 1918, 4193,
+  4291 addresses (PTR).

-  TTL in reply DNS packet MUST be somehow normalized at exit node so that
-  client won't learn what other clients queried. Transaction ID is provided
-  randomly by libunbound, no need to modify. This affects only DNSPort and
+  An exit node resolving names will use libunbound for all types of resolving,
+  including lookup of A/AAAA records when connecting stream to desired
+  server. Ordinary streams will gain a small benefit of defense against DNS
+  cache poisoning on exit node's network.
+
+  Transaction ID is provided randomly by libunbound, no
+  need to modify. This affects only DNSPort and
   SOCKS interfaces.
+
+8. TTL normalization idea
+
+  Complex on implementation, because it requires parsing DNS packets at exit
+  node.
+
+  TTL in reply DNS packet MUST be normalized at exit node so that client won't
+  learn what other clients queried. The normalization is done in following
+  way:
+
+  - for a RR, the original TTL value received from authoritative DNS server
+    should be used when sending DNS_RESPONSE, trimming the values to interval
+    [5, 600]
+  - does not pose "ghost-cache-attack", since once RR is flushed from
+    libunbound's cache, it must be fetched anew
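(The TTL normalization in section 8 amounts to a small clamp at the exit; a sketch:)

  #include <stdint.h>

  /* Trim the upstream TTL into [5, 600] before relaying, so a client
   * learns little about what is already cached at the exit. */
  static uint32_t normalize_ttl(uint32_t ttl)
  {
      if (ttl < 5)
          return 5;
      if (ttl > 600)
          return 600;
      return ttl;
  }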
On Sat, Feb 4, 2012 at 10:38 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
On 02/01/2012 10:01 AM, Jacob Appelbaum wrote:
That sounds good. I'll wait for the first draft and send feedback.
First draft is ready here:
https://github.com/hiviah/torspec/blob/master/proposals/ideas/xxx-dns-dnssec...
Cool! Since you're calling it a draft, I'm not assigning it a number yet; please let me know when you want that to change.
Some initial comments:
DNS_BEGIN payload:
RR type (2 octets)
RR class (2 octets)
ID (2 octets)
length (1 octet)
query (variable)
The RR type and class match counterparts in DNS packet. ID is for identifying which data belong together, since response can be longer than single cell's payload. The ID MUST be random and MUST NOT be copied from xid of request DNS packet (in case of using DNSPort).
I think you can dispense with the "ID" field entirely; the "StreamID" part of the relay cell header should already fulfill this role, if I'm understanding the purpose of "ID" correctly.
Like Jakob, I'm wondering why there isn't any support for setting flags.
I wonder whether the "length" field here is redundant with the "length" field in the relay header. Probably not, I guess: Having a length field here means we can send
DNS_RESPONSE payload:
ID (2 octets)
data length (2 octets)
total length (4 octets)
data (variable)
So to be clear, if the reply is 1200 bytes long, then the user will receive three cells, with relay payload contents:

{ ID = x, data_len = 490, total_len = 1200, data = (bytes[0..489]) }
{ ID = x, data_len = 490, total_len = 1200, data = (bytes[490..979]) }
{ ID = x, data_len = 220, total_len = 1200, data = (bytes[980..1199], zero padding) }
As above, I think we can eliminate the ID field. Also, in this case, I think the length field in this packet _is_ redundant with the length field of the relay cell header.
I think the total_len field could be replaced with a single bit to indicate "this is the last cell".
Data contains the reply DNS packet. Total length describes length of complete response packet.
I think we want to do some sanitization on the reply DNS packet. In particular, we have no need to say what the transaction ID was, or
Initial Questions:
When running in dnsport mode, it seems we risk leaking information about the client resolver based on which requests it makes in what order. Is that so?
How many round trips are we looking at here for typical use cases, and what can we do to reduce them? We've found that anything that adds extra round trips to opening a connection in Tor is a real problem for a lot of use cases, and so we should try to avoid them as much as possible.
On 02/07/2012 07:18 PM, Nick Mathewson wrote:
On Sat, Feb 4, 2012 at 10:38 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
First draft is ready here:
https://github.com/hiviah/torspec/blob/master/proposals/ideas/xxx-dns-dnssec...
Some initial comments:
DNS_BEGIN payload:
RR type (2 octets)
RR class (2 octets)
ID (2 octets)
length (1 octet)
query (variable)
The RR type and class match counterparts in DNS packet. ID is for identifying which data belong together, since response can be longer than single cell's payload. The ID MUST be random and MUST NOT be copied from xid of request DNS packet (in case of using DNSPort).
I think you can dispense with the "ID" field entirely; the "StreamID" part of the relay cell header should already fulfill this role, if I'm understanding the purpose of "ID" correctly.
You're understanding the purpose correctly. I thought that more requests could be used in a single stream, but after re-reading tor-spec.txt, we can just use StreamID the same way as for RELAY_RESOLVE(D). So let's ditch the ID.
Like Jakob, I'm wondering why there isn't any support for setting flags.
See my response to Jakob. I don't think it's worth using anything other than flags 0x110 (normal query, recursive, non-authenticated data ok) with the DO bit set. Unless there is a really good reason for other flags, they would only have the potential to leak identifying bits.
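Spelled out, those fixed bits are just RD and CD in the DNS header flags word, plus DO in the EDNS0 OPT record:

  /* Standard DNS wire-format flag bits (RFC 1035 / RFC 4035). */
  #define DNS_FLAG_RD 0x0100   /* recursion desired */
  #define DNS_FLAG_CD 0x0010   /* checking disabled: we validate ourselves */
  /* DNS_FLAG_RD | DNS_FLAG_CD == 0x0110, the value discussed above. */

  /* DO ("DNSSEC OK") lives in the EDNS0 OPT RR, in the high bit of the
   * extended-flags half of its TTL field. */
  #define EDNS0_DO 0x8000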
We could make extra reserved fields in the spec for flags and the OPT RR; for now the client would memset them to zero and the exit node would ignore them.
I wonder whether the "length" field here is redundant with the "length" field in the relay header. Probably not, I guess: Having a length field here means we can send
DNS_RESPONSE payload:
ID (2 octets)
data length (2 octets)
total length (4 octets)
data (variable)
So to be clear, if the reply is 1200 bytes long, then the user will receive three cells, with relay payload contents:

{ ID = x, data_len = 490, total_len = 1200, data = (bytes[0..489]) }
{ ID = x, data_len = 490, total_len = 1200, data = (bytes[490..979]) }
{ ID = x, data_len = 220, total_len = 1200, data = (bytes[980..1199], zero padding) }
Your example with 1200 byte reply is correct.
Also, in this case, I think the length field in this packet _is_ redundant with the length field of the relay cell header.
The inner "length" might be useful in case we wanted to add an extra field (maybe not a good idea for some other reason, like confusing older OP if we did add a field later?).
I think the total_len field could be replaced with a single bit to indicate "this is the last cell".
"End" bit would work, but I find it easier to know beforehand how much data to expect - we don't have to worry about realloc and memory fragmentation. Client could deny request if claimed total_length is too high right away (or later if OR keeps pushing more data than claimed).
That also means AXFR/IXFR would be off limits (I'm OK with that).
Data contains the reply DNS packet. Total length describes length of complete response packet.
I think we want to do some sanitization on the reply DNS packet. In particular, we have no need to say what the transaction ID was, or
Sure, we can scrub the transaction ID in the reply (the xid should be random and the client knows where the exit node is anyway, but why not).
Initial Questions:
When running in dnsport mode, it seems we risk leaking information about the client resolver based on which requests it makes in what order. Is that so?
Yes. For example, a validating vs. a non-validating resolver is very easy to spot. An attacker eavesdropping on the exit node might have a harder time due to caching in libunbound, but a malicious exit node can spot a validating resolver just by the fact that it's asking for DS/DNSKEY records.
Thus client-side validation when using DNSPort or SOCKS resolve must be on by default.
How many round trips are we looking at here for typical use cases, and what can we do to reduce them? We've found that anything that adds extra round trips to opening a connection in Tor is a real problem for a lot of use cases, and so we should try to avoid them as much as possible.
Requiring client-side validation for A/AAAA in RELAY_BEGIN is pointless (it would only make things slower): the client cannot check where the exit node connects, and an eavesdropping attacker can easily tell which DNS request belongs to a DNSPort request and which to RELAY_BEGIN (that's true in the current implementation as well - if no TCP connection follows, it's a DNSPort/SOCKS resolve request).
So no additional overhead for RELAY_BEGIN.
Case of DNSPort queries - example for addons.mozilla.org with empty cache:
Standard query A addons.mozilla.org
Standard query DNSKEY <Root>
Standard query DS org
Standard query DNSKEY org
Standard query DS mozilla.org
Standard query DNSKEY mozilla.org
Note that we could "preheat" the cache by resolving DS and DNSKEY for common TLDs like com, net and org at Tor start (regardless of whether DNSPort is on or not), just as TBB "preheats" check.torproject.org now :-)
To give you an idea of how it looks as the cache fills up, here are three requests for "addons.mozilla.org", "api-dev.bugzilla.mozilla.org" and "www.torproject.org", starting with an empty cache:
Standard query A addons.mozilla.org
Standard query DNSKEY <Root>
Standard query DS org
Standard query DNSKEY org
Standard query DS mozilla.org
Standard query DNSKEY mozilla.org
Standard query A api-dev.bugzilla.mozilla.org
Standard query A www.torproject.org
Standard query DS torproject.org
Standard query DNSKEY torproject.org
Ondrej
On Tue, Feb 7, 2012 at 7:33 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
On 02/07/2012 07:18 PM, Nick Mathewson wrote:
part of the relay cell header should already fulfill this role, if I'm understanding the purpose of "ID" correctly.
You're understanding the purpose correctly. I thought that more requests could be used in a single stream, but after re-reading tor-spec.txt, we can just use StreamID the same way as for RELAY_RESOLVE(D). So let's ditch the ID.
Agreed. It means you can only have 65536 total streams and requests inflight per circuit at a time, but that's a pretty generous limit.
Like Jakob, I'm wondering why there isn't any support for setting flags.
See my response to Jakob. I don't think it's worth using anything other than flags 0x110 (normal query, recursive, non-authenticated data ok) with the DO bit set. Unless there is a really good reason for other flags, they would only have the potential to leak identifying bits.
I can't think of one offhand; I had at first thought that non-recursive queries were good for something, but I'm not really sure what.
[...]
Your example with 1200 byte reply is correct.
Also, in this case, I think the length field in this packet _is_ redundant with the length field of the relay cell header.
The inner "length" might be useful in case we wanted to add an extra field (maybe not a good idea for some other reason, like confusing older OP if we did add a field later?).
I think if we want an extra field in the future, we want to put it after the end of the response (that is, after total_len), rather than having it be optionally in every cell.
I think the total_len field could be replaced with a single bit to indicate "this is the last cell".
"End" bit would work, but I find it easier to know beforehand how much data to expect - we don't have to worry about realloc and memory fragmentation. Client could deny request if claimed total_length is too high right away (or later if OR keeps pushing more data than claimed).
Hm. If so, maybe total_len only needs to be in the first cell then.
That also means AXFR/IXFR would be off limits (I'm OK with that).
Me too.
[...]
Initial Questions:
When running in dnsport mode, it seems we risk leaking information about the client resolver based on which requests it makes in what order. Is that so?
Yes. For example, a validating vs. a non-validating resolver is very easy to spot. An attacker eavesdropping on the exit node might have a harder time due to caching in libunbound, but a malicious exit node can spot a validating resolver just by the fact that it's asking for DS/DNSKEY records.
Thus client-side validation when using DNSPort or SOCKS resolve must be on by default.
How many round trips are we looking at here for typical use cases, and what can we do to reduce them? We've found that anything that adds extra round trips to opening a connection in Tor is a real problem for a lot of use cases, and so we should try to avoid them as much as possible.
Requiring client-side validation for A/AAAA in RELAY_BEGIN is pointless (it would only make things slower): the client cannot check where the exit node connects, and an eavesdropping attacker can easily tell which DNS request belongs to a DNSPort request and which to RELAY_BEGIN (that's true in the current implementation as well - if no TCP connection follows, it's a DNSPort/SOCKS resolve request).
So no additional overhead for RELAY_BEGIN.
Case of DNSPort queries - example for addons.mozilla.org with empty cache:
Hang on, is each one of these a *round trip*? I don't think so. That is, you don't need to get the answer for the A query before you do the other lookups; the client can launch them all at once.
Having extra queries isn't a huge problem; it's having extra round trips specifically that would hurt. From a cursory look, it doesn't seem like we're adding any extra round trips here.
I wonder, do we want to add a "resolve and connect" mode to relay_begin, as discussed elsewhere in this thread?
On Tue, 07 Feb 2012, Nick Mathewson wrote:
On Tue, Feb 7, 2012 at 7:33 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
On 02/07/2012 07:18 PM, Nick Mathewson wrote:
Like Jakob, I'm wondering why there isn't any support for setting flags.
See my response to Jakob. I don't think it's worth using anything other than flags 0x110 (normal query, recursive, non-authenticated data ok) with the DO bit set. Unless there is a really good reason for other flags, they would only have the potential to leak identifying bits.
I can't think of one offhand; I had at first thought that non-recursive queries were good for something, but I'm not really sure what.
CD (checking disabled) is quite an important flag in my opinion. In fact, we should set it every time the Tor client is able to validate DNSSEC itself.
There also probably ought to be a Tor-made-up flag for "give me the (or one) entire cert chain from the root so I can validate this thing myself without a gazillion round trips". (If we set this, we probably also leak less about what we have cached already.) That might require that we come up with a way to serialize a number of DNS replies that are the response to a single query.
Cheers,
On 02/08/2012 09:09 AM, Peter Palfrader wrote:
On Tue, 07 Feb 2012, Nick Mathewson wrote:
On Tue, Feb 7, 2012 at 7:33 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
On 02/07/2012 07:18 PM, Nick Mathewson wrote:
Like Jakob, I'm wondering why there isn't any support for setting flags.
See my response to Jakob. I don't think it's worth using anything other than flags 0x110 (normal query, recursive, non-authenticated data ok) with the DO bit set. Unless there is a really good reason for other flags, they would only have the potential to leak identifying bits.
I can't think of one offhand; I had at first thought that non-recursive queries were good for something, but I'm not really sure what.
CD (checking disabled) is quite an important flag in my opinion. In fact, we should set it every time the Tor client is able to validate DNSSEC itself.
Sorry, I named the CD flag wrong ("unauthenticated data ok"), but it is set.
There also probably ought to be a Tor-made-up flag for "give me the (or one) entire cert chain from the root so I can validate this thing myself without a gazillion round trips". (If we set this, we probably also leak less about what we have cached already.) That might require that we come up with a way to serialize a number of DNS replies that are the response to a single query.
I like the idea - every lookup would be a single round trip and would not leak cache state.

It might be very tricky to do it right, though. There's one (incomplete) draft about serializing DNSSEC data into its own structures (https://tools.ietf.org/html/draft-agl-dane-serializechain-01). I find that using our own structures essentially means rewriting validation from scratch (which should definitely be avoided).
A naive implementation of simply putting DNS packets together and throwing them in front of libunbound to sort them out might be much less error-prone.
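As a strawman for that serialization, even simple length-prefixed concatenation of complete response packets would do, with the client feeding them to libunbound one at a time. The format and function below are purely illustrative; no such format exists in tor-spec today:

  #include <stddef.h>
  #include <stdint.h>

  /* Count length-prefixed DNS messages in a blob; -1 if truncated,
   * e.g. when the exit did not send all the needed packets. */
  int count_chain_packets(const uint8_t *blob, size_t blob_len)
  {
      size_t off = 0;
      int n = 0;
      while (off + 2 <= blob_len) {
          size_t msg_len = ((size_t)blob[off] << 8) | blob[off + 1];
          off += 2;
          if (off + msg_len > blob_len)
              return -1;
          off += msg_len;
          n++;
      }
      return off == blob_len ? n : -1;
  }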
We should also think about error states and corner cases: what happens if the exit node does not send all the needed packets? Retry? Declare it a failure?
Ondrej
On 02/08/2012 02:59 AM, Nick Mathewson wrote:
On Tue, Feb 7, 2012 at 7:33 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
I think if we want an extra field in the future, we want to put it after the end of the response (that is, after total_len), rather than having it be optionally in every cell.
OK.
That also means AXFR/IXFR would be off limits (I'm OK with that).
Me too.
Without AXFR/IXFR we could limit total_len to 2 octets.
"End" bit would work, but I find it easier to know beforehand how much data to expect - we don't have to worry about realloc and memory fragmentation. Client could deny request if claimed total_length is too high right away (or later if OR keeps pushing more data than claimed).
Hm. If so, maybe total_len only needs to be in the first cell then.
True. Though I'd prefer it in every DNS_RESPONSE cell: I find a "firm" cell structure less error-prone (saving 2 octets per cell does not seem so substantial). The total_length in subsequent cells belonging to the same StreamID could just be ignored.
Just to sum up the changes of DNS_RESPONSE, the new structure would be:
total length (2 octets)
data (variable)
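Client-side reassembly under this structure stays simple; a sketch (names and the size cap are illustrative) that reads the 2-octet field in every cell but only trusts the first, per the above:

  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>

  #define MAX_DNS_RESPONSE 4096    /* illustrative cap on claimed size */

  struct dns_reassembly {
      uint8_t *buf;
      uint16_t total_len;
      uint16_t received;
  };

  /* Feed one DNS_RESPONSE cell payload; returns 1 when the response is
   * complete, 0 to keep waiting, -1 to tear down the stream. */
  int dns_response_add_cell(struct dns_reassembly *r,
                            const uint8_t *payload, size_t payload_len)
  {
      if (payload_len < 2)
          return -1;
      uint16_t claimed = (payload[0] << 8) | payload[1];
      payload += 2;
      payload_len -= 2;

      if (r->buf == NULL) {              /* first cell: trust its total_len */
          if (claimed == 0 || claimed > MAX_DNS_RESPONSE)
              return -1;                 /* deny oversized claims early */
          r->total_len = claimed;
          r->buf = malloc(claimed);
          if (r->buf == NULL)
              return -1;
      }                                  /* later cells: field is ignored */

      if (payload_len > (size_t)(r->total_len - r->received))
          payload_len = r->total_len - r->received;  /* drop zero padding */
      memcpy(r->buf + r->received, payload, payload_len);
      r->received += payload_len;
      return r->received == r->total_len;
  }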
So no additional overhead for RELAY_BEGIN.
Case of DNSPort queries - example for addons.mozilla.org with empty cache:
Hang on, is each one of these a *round trip*? I don't think so. That is, you don't need to get the answer for the A query before you do the other lookups; the client can launch them all at once.
libunbound tries to parallelize requests as much as possible, sending a bunch of requests first and continuing as the responses return.

(Hm, I've just noticed that when asking a forwarder, libunbound serializes them instead. I'll have to ask about this.)
I wonder, do we want to add a "resolve and connect" mode to relay_begin, as discussed elsewhere in this thread?
The only use I can think of is getting an NSEC/NSEC3 proof that a domain does not exist. That does not seem worth the extra complexity, unless someone thinks of a better use.
Ondrej
On 02/08/2012 11:47 PM, Ondrej Mikle wrote:
On 02/08/2012 02:59 AM, Nick Mathewson wrote:
On Tue, Feb 7, 2012 at 7:33 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
I think if we want an extra field in the future, we want to put it after the end of the response (that is, after total_len), rather than having it be optionally in every cell.
OK.
That also means AXFR/IXFR would be off limits (I'm OK with that).
Me too.
Without AXFR/IXFR we could limit total_len to 2 octets.
I'd really like to be able to AXFR. I think it's important to have Tor's DNSPort able to do some of the most basic and common DNS stuff.
All the best, Jacob
On 02/09/2012 12:24 AM, Jacob Appelbaum wrote:
On 02/08/2012 11:47 PM, Ondrej Mikle wrote:
On 02/08/2012 02:59 AM, Nick Mathewson wrote:
On Tue, Feb 7, 2012 at 7:33 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
I think if we want an extra field in the future, we want to put it after the end of the response (that is, after total_len), rather than having it be optionally in every cell.
OK.
That also means AXFR/IXFR would be off limits (I'm OK with that).
Me too.
Without AXFR/IXFR we could limit total_len to 2 octets.
I'd really like to be able to AXFR. I think it's important to have Tor's DNSPort able to do some of the most basic and common DNS stuff.
What about making a specialized tool for AXFR/IXFR (like tor-resolve)? Its interface could listen for DNS packets and return a DNS stream with the AXFR/IXFR data. Since practically every DNS server open to AXFR/IXFR must listen on TCP, this can be implemented much more easily using the already existing TCP tunneling in Tor.

I think this solution would make the rest of the design simpler.
Ondrej
On 02/09/2012 10:58 PM, Ondrej Mikle wrote:
On 02/09/2012 12:24 AM, Jacob Appelbaum wrote:
On 02/08/2012 11:47 PM, Ondrej Mikle wrote:
On 02/08/2012 02:59 AM, Nick Mathewson wrote:
On Tue, Feb 7, 2012 at 7:33 PM, Ondrej Mikle ondrej.mikle@gmail.com wrote:
I think if we want an extra field in the future, we want to put it after the end of the response (that is, after total_len), rather than having it be optionally in every cell.
OK.
That also means AXFR/IXFR would be off limits (I'm OK with that).
Me too.
Without AXFR/IXFR we could limit total_len to 2 octets.
I'd really like to be able to AXFR. I think it's important to have Tor's DNSPort able to do some of the most basic and common DNS stuff.
What about making a specialized tool for AXFR/IXFR (like tor-resolve)? Its interface could listen for DNS packets and return a DNS stream with the AXFR/IXFR data. Since practically every DNS server open to AXFR/IXFR must listen on TCP, this can be implemented much more easily using the already existing TCP tunneling in Tor.

I think this solution would make the rest of the design simpler.
Another good reason for a separate tool is being able to specify the actual server to ask for *XFR. Using just standard recursion and asking the master NS may not always give the results you want.

Standard recursion with AXFR has never worked for me in the case of the servers listed here: http://axfr.nohack.se/
Ondrej
On Mon, Jan 30, 2012 at 1:59 AM, Roger Dingledine arma@mit.edu wrote:
On Thu, Jan 19, 2012 at 05:13:19PM -0500, Nick Mathewson wrote:
But I think the right design is probably something like allowing clients to request more DNS info via exit nodes' nameservers, and get more info back. We should think of ways to do this that avoid extra round trips, but that should be doable.
Ha. That'll teach me to answer tor-dev threads assuming nobody broke the threading. :)
So Nick, are you thinking we want a way for exit relays to receive an already-formatted dns query inside the Tor protocol, and get it onto the network somehow heading towards their configured nameservers? Or did you have something else in mind?
Approximately. There are parts of a DNS packet that we wouldn't want to have the Tor client make up. For example, DNS transaction IDs would need to avoid collisions. Similarly, I don't see why the client should be setting most of the possible flags.
I wonder if we want a begin_dns relay command, sort of like the current begin and begin_dir commands, and then just let them talk TCP to one of our nameservers? Or is that too much like the previous hacks?
I think the exit should be able to make the TCP/UDP decision, and we'd want the first part of any query to nest inside the begin_dns cell to avoid using two cells where one would do. Perhaps it should also be something where the last cell of a query gets a "this is the last cell" flag, to avoid having to use an END_QUERY cell.
I think exits should do some rudimentary validation on client queries, to avoid shenanigans.
I think that we should also consider having an improved resolve+connect mechanism so that we can get the performance of a BEGIN cell (by avoiding a redundant round-trip) while still getting the DNS information we want.