Hi all,
the DNS/DNSSEC resolving draft seems to be finished.
I added a few thoughts on mitigating circuit correlation (mentioned in proposal 171). Somebody could look at those if they are not totally stupid (last two paragraphs of section 7).
A note is added about the "DNSSEC stapling" [1] (extremely difficult, won't be implemented).
The draft is here (full text pasted at the end of this mail):
https://github.com/hiviah/torspec/blob/master/proposals/ideas/xxx-dns-dnssec...
The draft could probably be given a "proposal number" and merged into torspec proposals directory unless there is an objection.
I'll leave a few weeks (2-3) in case someone finds a vulnerability or has an objection. After that I could slowly begin implementing it in a separate branch.
[1] https://lists.torproject.org/pipermail/tor-dev/2012-February/003285.html
Ondrej
---- pasted proposal (hopefully will wrap well) ----
Filename: xxx-dns-dnssec.txt
Title: Support for full DNS and DNSSEC resolution in Tor
Authors: Ondrej Mikle
Created: 4 February 2012
Modified: 10 March 2012
Status: Draft
0. Overview
Adding support for any DNS query type to Tor, as well as DNSSEC support.
0.1. Motivation
Many applications running over Tor need more than just resolving an FQDN to an IPv4 address and vice versa. Sometimes, to prevent DNS leaks, applications have to be hacked so that the necessary data is supplied by hand (e.g. SRV records in XMPP). TLS connections will benefit from the planned TLSA record, which provides certificate pinning to avoid another DigiNotar-like fiasco.
DNSSEC is part of the DNS protocol, and the most appropriate place for a DNSSEC API would probably be in OS libraries (e.g. libc). However, it will probably take time before that becomes widespread.
On Tor's side (as opposed to the application's side), DNSSEC will provide protection against DNS cache-poisoning attacks. This holds only if the exit is not itself malicious, but it still reduces the attack surface.
1. Design
1.1 New cells
There will be two new cells, RELAY_DNS_BEGIN and RELAY_DNS_RESPONSE (we'll use DNS_BEGIN and DNS_RESPONSE for short below).
DNS_BEGIN payload:
DNS packet data (variable length)
The DNS packet must be generated internally by libunbound to avoid fingerprinting users by differences in client resolvers' behavior.
DNS_RESPONSE payload:
total length (2 octets)
data (variable)
Data contains the reply DNS packet, or a part of it if the packet does not fit into one cell. Total length is the length of the complete response packet.
AXFR and IXFR are not supported in this cell by design (see the specialized tool in section 6).
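A minimal sketch of how a reply could be split across several DNS_RESPONSE cells and put back together. The proposal does not pin down the details, so several things here are assumptions: that every cell repeats the 2-octet total length, that the length is in network byte order, and that 498 bytes of relay data are usable per cell. The function names are hypothetical.

```python
import struct
from typing import List

RELAY_DATA_MAX = 498               # assumed usable bytes per relay cell
LEN_FIELD = struct.Struct("!H")    # 2-octet total length, big-endian (assumed)


def split_response(dns_packet: bytes) -> List[bytes]:
    """Split a reply DNS packet into DNS_RESPONSE payloads."""
    if len(dns_packet) > 0xFFFF:
        raise ValueError("DNS packet does not fit a 2-octet length field")
    chunk = RELAY_DATA_MAX - LEN_FIELD.size
    header = LEN_FIELD.pack(len(dns_packet))
    # max(..., 1) makes sure an empty packet still yields one payload.
    return [header + dns_packet[i:i + chunk]
            for i in range(0, max(len(dns_packet), 1), chunk)]


def reassemble(payloads: List[bytes]) -> bytes:
    """Rebuild the DNS packet; raise ValueError if the cells disagree."""
    data = bytearray()
    total = None
    for payload in payloads:
        if len(payload) < LEN_FIELD.size:
            raise ValueError("short payload")
        (length,) = LEN_FIELD.unpack_from(payload)
        if total is None:
            total = length
        elif length != total:
            raise ValueError("total length changed between cells")
        data += payload[LEN_FIELD.size:]
    if total is None or len(data) != total:
        raise ValueError("incomplete or oversized reply")
    return bytes(data)
```

With these assumptions, a 1280-byte reply needs three cells, and a client that has received only some of them can tell that the reply is still incomplete.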
2. Interfaces to applications
DNSPort evdns - existing implementation will be updated to use DNS_BEGIN.
SOCKS proxy - new command will be added, containing RR type, class and query. Response will simply contain the DNS packet.
3. New options in configuration file
libunbound takes a couple of parameters, e.g. trust anchors and cache size. In order not to put them all into torrc, there will be only one option, the configuration file name. Tor will be distributed with some sensible defaults. The new option will be named UnboundConfig and its value will be a filename.
An option DNSQueryPolicy will determine what query types and classes are permitted:
- common - class INTERNET, RR types listed on https://en.wikipedia.org/wiki/List_of_DNS_record_types#Resource_records
- full - any query type and class is allowed
Class CHAOS in "common" would not be of much use, since its prevalent use is for asking authoritative servers.
On the client side, full validation would be optional, controlled by the option DNSValidation (0|1). Validation is turned on by default; otherwise it would be easy to fingerprint people who turned it on and asked for not-so-common records like SRV.
4. Changes to directory flags
Exit nodes will signal their resolving capability by two flags:
- CommonDNS - reflects "common" DNSQueryPolicy
- FullDNS - reflects "full" DNSQueryPolicy
An exit node asked for an RR type not in the CommonDNS policy will return REFUSED as the status in the reply DNS packet contained in the DNS_RESPONSE cell.
If new types are added to CommonDNS set (e.g. new RFC adds a record type) and exit node's Tor version does not recognize it as allowed, it will send REFUSED as well.
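The policy check described above could look roughly like the following sketch. The set of "common" RR types is only an illustrative subset chosen for this example (the proposal defers to an external list), and the function name is hypothetical. The numeric RR type, class and RCODE values are the standard DNS ones.

```python
from typing import Optional

RCODE_REFUSED = 5
CLASS_IN = 1

# Illustrative subset only; the real "common" set is defined elsewhere.
COMMON_RR_TYPES = {
    1: "A", 2: "NS", 5: "CNAME", 6: "SOA", 12: "PTR", 15: "MX",
    16: "TXT", 28: "AAAA", 33: "SRV", 43: "DS", 46: "RRSIG",
    47: "NSEC", 48: "DNSKEY", 52: "TLSA",
}


def refusal_rcode(policy: str, qtype: int, qclass: int) -> Optional[int]:
    """Return RCODE_REFUSED if the exit must refuse the query, else None."""
    if policy == "full":
        return None                      # any type and class is allowed
    if policy != "common":
        raise ValueError("unknown DNSQueryPolicy: %r" % policy)
    if qclass == CLASS_IN and qtype in COMMON_RR_TYPES:
        return None
    # Unknown or newly standardized types fall through to REFUSED too.
    return RCODE_REFUSED
```

An exit that does not yet know a newly added RR type refuses it simply because the type is missing from its table, which matches the behavior described above.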
5. Implementation notes
There will be one instance of ub_ctx (libunbound resolver structure) in Tor; libunbound is thread-safe.
Client will periodically purge incomplete DNS replies. Any unexpected DNS_RESPONSE will be dropped.
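One way to picture the client-side bookkeeping is the sketch below. It is not taken from the proposal: the class, the keying by stream ID and the 30-second limit are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class PendingReplies:
    """Tracks partially received DNS replies, keyed by stream ID (assumed)."""

    def __init__(self, max_age: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_age = max_age          # hypothetical purge threshold
        self._clock = clock
        self._pending: Dict[int, Tuple[float, bytearray]] = {}

    def expect(self, stream_id: int) -> None:
        """Record that a DNS_BEGIN was sent on this stream."""
        self._pending[stream_id] = (self._clock(), bytearray())

    def on_response(self, stream_id: int, data: bytes) -> bool:
        """Store data for an expected reply; return False to drop the cell."""
        entry = self._pending.get(stream_id)
        if entry is None:
            return False                 # unexpected DNS_RESPONSE: drop
        entry[1].extend(data)
        return True

    def purge(self) -> List[int]:
        """Forget replies that stayed incomplete for too long."""
        now = self._clock()
        stale = [sid for sid, (started, _) in self._pending.items()
                 if now - started > self._max_age]
        for sid in stale:
            del self._pending[sid]
        return stale
```

Passing the clock in as a parameter keeps the purge logic easy to exercise without waiting for real time to pass.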
Request for special names (.onion, .exit, .noconnect) will return REFUSED.
RELAY_BEGIN would function "normally", there is no need for returning DNS data. In case of malicious exit, client can't check he's really connected to whatever IP is in A/AAAA. We won't send any NSEC/NSEC3 back in case FQDN does not exist, it would needlessly complicate things. Client can check by extra query on DNSPort.
AD flag must be zeroed out on client unless validation is performed.
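As a sketch of the rule above: the AD bit is 0x0020 in the 16-bit flags word of the DNS header, so clearing it amounts to masking one bit. The function name and the `validated` parameter are hypothetical.

```python
import struct

AD_FLAG = 0x0020      # "Authenticated Data" bit in the DNS header flags
DNS_HEADER_LEN = 12


def clear_ad_unless_validated(packet: bytes, validated: bool) -> bytes:
    """Return the packet with AD zeroed unless the client validated it."""
    if len(packet) < DNS_HEADER_LEN:
        raise ValueError("truncated DNS header")
    if validated:
        return packet
    ident, flags = struct.unpack_from("!HH", packet)
    return struct.pack("!HH", ident, flags & ~AD_FLAG) + packet[4:]
```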
6. Separate tool for AXFR
The AXFR tool will have an interface similar to tor-resolve, but will return raw DNS data.
Parameters are: query domain, server IP of authoritative DNS.
The tool will transfer the data through "ordinary" tunnel using RELAY_BEGIN and related cells.
This design decision serves two goals:
- DNS_BEGIN and DNS_RESPONSE will be simpler to implement (lower chance of bugs)
- in practice it's often useful to do AXFR queries on secondary authoritative DNS servers
IXFR will not be supported (infrequent corner case, can be done by manual tunnel creation over Tor if truly necessary).
7. Security implications
Both the client and the exit MUST block attempts to resolve (via PTR) local addresses defined in RFC 1918, RFC 4193 and RFC 4291.
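A rough sketch of such a PTR filter follows. The exact set of blocked ranges is an assumption: the RFC 1918 and RFC 4193 blocks are taken as-is, while for RFC 4291 only link-local and loopback are listed here. The sketch handles only full reverse names, not partial zone names, and the function names are hypothetical.

```python
import ipaddress

BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",   # RFC 1918
    "fc00::/7",                                        # RFC 4193 unique local
    "fe80::/10",                                       # RFC 4291 link-local
    "::1/128",                                         # RFC 4291 loopback
)]
HEX_DIGITS = set("0123456789abcdef")


def ptr_name_to_address(qname: str):
    """Convert a full reverse-lookup name to an address, or return None."""
    labels = qname.rstrip(".").lower().split(".")
    host, suffix = labels[:-2], labels[-2:]
    try:
        if suffix == ["in-addr", "arpa"] and len(host) == 4:
            return ipaddress.IPv4Address(".".join(reversed(host)))
        if suffix == ["ip6", "arpa"] and len(host) == 32:
            if not all(len(l) == 1 and l in HEX_DIGITS for l in host):
                return None
            return ipaddress.IPv6Address(int("".join(reversed(host)), 16))
    except ValueError:
        return None
    return None


def is_blocked_ptr(qname: str) -> bool:
    """True if the PTR query targets one of the blocked local ranges."""
    addr = ptr_name_to_address(qname)
    return addr is not None and any(addr in net for net in BLOCKED_NETWORKS)
```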
An exit node resolving names will use libunbound for all types of resolving, including lookup of A/AAAA records when connecting stream to desired server. Ordinary streams will gain a small benefit of defense against DNS cache poisoning on exit node's network.
Transaction ID is provided randomly by libunbound, no need to modify. This affects only DNSPort and SOCKS interfaces.
As proposal 171 mentions, we need to mitigate circuit correlation. One solution would be keeping multiple streams to multiple exit nodes and picking one at random for DNS resolution. Another would be keeping the DNS-resolving circuit open only for a short time (e.g. 1-2 minutes).
Yet another option for mitigating circuit correlation would be having a separate circuit for each application, but that would require some cooperation between the application and Tor, e.g. via some LD_PRELOAD mechanism.
8. TTL normalization idea
This is a bit complex to implement, because it requires parsing DNS packets at the exit node.
TTL in the reply DNS packet MUST be normalized at the exit node so that a client won't learn what other clients queried. The normalization is done in the following way:
- for a RR, the original TTL value received from the authoritative DNS server should be used when sending DNS_RESPONSE, trimming the values to the interval [5, 600]
- this does not pose a "ghost-cache-attack", since once a RR is flushed from libunbound's cache, it must be fetched anew
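The trimming step itself is a simple clamp, sketched below with a hypothetical function name. The input is meant to be the TTL as originally received from the authoritative server, not the remaining lifetime of the cached entry, since the remaining lifetime would reveal how long ago someone else asked.

```python
TTL_MIN = 5      # lower bound from the proposal, in seconds
TTL_MAX = 600    # upper bound from the proposal, in seconds


def normalize_ttl(original_ttl: int) -> int:
    """Clamp the authoritative TTL to the interval [TTL_MIN, TTL_MAX]."""
    return max(TTL_MIN, min(TTL_MAX, original_ttl))
```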
9. Implementation notes
I noticed that libunbound does not always parallelize requests that could be parallelized when using a forwarder (this does not apply to unrelated queries). Thus, an A query for addons.mozilla.org looks like this (note the interleaving of query/response):
Time      Info
0.000000  Standard query A addons.mozilla.org
0.178366  Standard query response A 63.245.217.112 RRSIG
0.178572  Standard query DNSKEY <Root>
0.178617  Standard query response DNSKEY DNSKEY RRSIG
0.178981  Standard query DS org
0.179041  Standard query response DS DS RRSIG
0.179192  Standard query DNSKEY org
0.179233  Standard query response DNSKEY DNSKEY DNSKEY DNSKEY RRSIG RRSIG
0.179505  Standard query DS mozilla.org
0.179562  Standard query response DS RRSIG
0.179717  Standard query DNSKEY mozilla.org
0.179762  Standard query response DNSKEY DNSKEY DNSKEY RRSIG RRSIG
Further investigation is needed to find out how to work around this. Maybe a future version will have it fixed, since I see that DNS queries going from an unbound forwarder to authoritative DNS servers are parallelized.
10. "DNSSEC stapling"
The following idea tries to mitigate attack where observer of exit node can learn the fact that client's OR is "heating up DNS cache".
Instead of asking for several records (DS, DNSKEY, etc.), exit node would send all of them at once in a "stapled response".
Unfortunately this is extremely difficult to implement correctly [1] [2]. Thus we need to live with the fact that an exit node, or an eavesdropper on such an exit node, will know that an OR used some TLD for the first time.
Causing unrelated errors or vulnerabilities in Tor by implementing this algorithm is not worth the risk.
References
[1] https://www.ietf.org/mail-archive/web/dane/current/msg02823.html
[2] http://unbound.net/pipermail/unbound-users/2012-February/002239.html
On 03/10/2012 03:22 PM, Ondrej Mikle wrote:
The draft is here (full text pasted at the end of this mail):
https://github.com/hiviah/torspec/blob/master/proposals/ideas/xxx-dns-dnssec...
Just a quick fix, I've noticed I have two sections named "Implementation notes".
s/9. Implementation notes/9. Notes on libunbound parallelization/ (it's already pushed into the github repo above).
Ondrej
On Sat, Mar 10, 2012 at 9:22 AM, Ondrej Mikle <ondrej.mikle@gmail.com> wrote:
Hi all,
the DNS/DNSSEC resolving draft seems to be finished.
Hi, Ondrej! I've got a few questions and comments. I might have more once I've thought a little more about the issues on this.
I added a few thoughts on mitigating circuit correlation (mentioned in proposal 171). Somebody could look at those if they are not totally stupid (last two paragraphs of section 7).
A note is added about the "DNSSEC stapling" [1] (extremely difficult, won't be implemented).
The draft is here (full text pasted at the end of this mail):
https://github.com/hiviah/torspec/blob/master/proposals/ideas/xxx-dns-dnssec...
The draft could probably be given a "proposal number" and merged into torspec proposals directory unless there is an objection.
I'll leave few weeks (2-3) in case someone finds a vulnerability or has an objection. After that I could slowly begin implementing it in a separate branch.
[1] https://lists.torproject.org/pipermail/tor-dev/2012-February/003285.html
Ondrej
---- pasted proposal (hopefully will wrap well) ----
Filename: xxx-dns-dnssec.txt
Title: Support for full DNS and DNSSEC resolution in Tor
Authors: Ondrej Mikle
Created: 4 February 2012
Modified: 10 March 2012
Status: Draft
0. Overview
Adding support for any DNS query type to Tor, as well as DNSSEC support.
0.1. Motivation
Many applications running over Tor need more than just resolving FQDN to IPv4 and vice versa. Sometimes to prevent DNS leaks the applications have to be hacked around to be supplied necessary data by hand (e.g. SRV records in XMPP). TLS connections will benefit from planned TLSA record that provides certificate pinning to avoid another Diginotar-like fiasco.
DNSSEC is part of the DNS protocol and the most appropriate place for DNSSEC API would be probably in OS libraries (e.g. libc). However that will probably take time until it becomes widespread.
On the Tor's side (as opposed to application's side), DNSSEC will provide protection against DNS cache-poisoning attacks (provided that exit is not malicious itself, but still reduces attack surface).
1. Design
1.1 New cells
There will be two new cells, RELAY_DNS_BEGIN and RELAY_DNS_RESPONSE (we'll use DNS_BEGIN and DNS_RESPONSE for short below).
DNS_BEGIN payload:
DNS packet data (variable length)
The DNS packet must be generated internally by libunbound to avoid fingerprinting users by differences in client resolvers' behavior.
Have you looked at the ldns API? From what I can tell, it is what libunbound uses internally, and is what actually generates and handles the queries.
Also, from a spec POV, it's better to say "The format must match that used by"... than "the packet must be generated by"
Last time we talked about this, we mentioned that there were some fields (like the request ID) that we wanted to clean up, and some flags we wanted to disallow. Did we decide not to do that?
DNS_RESPONSE payload:
total length (2 octets)
data (variable)
Data contains the reply DNS packet or its part if packet would not fit into the cell. Total length describes length of complete response packet.
AXFR and IXFR are not supported in this cell by design (see specialized tool below).
As noted in the last mail, total_length is needless here; RELAY packets already have a length field.
2. Interfaces to applications
DNSPort evdns - existing implementation will be updated to use DNS_BEGIN.
SOCKS proxy - new command will be added, containing RR type, class and query. Response will simply contain the DNS packet.
This would need an actual specification.
3. New options in configuration file
libunbound takes couple of parameters, e.g. trust anchors and cache-size. In order not to put them all into torrc, there will be only one option, configuration file name. Tor will be distributed with some sensible defaults. New option will be named UnboundConfig and value will be filename.
An option DNSQueryPolicy will determine what query types and classes are permitted:
- common - class INTERNET, RR types listed on https://en.wikipedia.org/wiki/List_of_DNS_record_types#Resource_records
- full - any query type and class is allowed
Class CHAOS in "common" would not be of much use, since its prevalent use is for asking authoritative servers.
For client side, full validation would be optional described by option DNSValidation (0|1). By default validation is turned on, otherwise it would be easy to fingerprint people who turned it on and asked for not-so-common records like SRV.
4. Changes to directory flags
Exit nodes will signal their resolving capability by two flags:
- CommonDNS - reflects "common" DNSQueryPolicy
- FullDNS - reflects "full" DNSQueryPolicy
An exit node asked for an RR type not in the CommonDNS policy will return REFUSED as the status in the reply DNS packet contained in the DNS_RESPONSE cell.
If new types are added to CommonDNS set (e.g. new RFC adds a record type) and exit node's Tor version does not recognize it as allowed, it will send REFUSED as well.
5. Implementation notes
There will be one instance of ub_ctx (libunbound resolver structure) in Tor, libunbound is thread-safe.
Hm. Looking at the libunbound codebase, it makes me pretty sad that Libunbound wants to open up a separate thread so that it can do its own libevent-based event loop. Is there no way we can make libunbound (or ldns) integrate with our own event loop?
Also, for the record, I'm a little confused about the feature sets here. What does libunbound add to ldns here that we need?
Client will periodically purge incomplete DNS replies. Any unexpected DNS_RESPONSE will be dropped.
Request for special names (.onion, .exit, .noconnect) will return REFUSED.
RELAY_BEGIN would function "normally", there is no need for returning DNS data. In case of malicious exit, client can't check he's really connected to whatever IP is in A/AAAA. We won't send any NSEC/NSEC3 back in case FQDN does not exist, it would needlessly complicate things. Client can check by extra query on DNSPort.
What fraction of clients actually use DNSPort as opposed to just doing everything via SOCKS connect requests? I worry that, by leaving RELAY_BEGIN users out of this entirely, we're making a feature that most clients just won't wind up using. I wonder whether the earlier idea of having a RELAY_BEGIN_DNS that does both the lookup and a connect wouldn't be a good idea -- both to save the round-trip, and to give the client the appropriate dnssec information.
And I *do* think that the dnssec information would be useful to the client: Even though we can't check whether the exit really connected to the requested IP or not, we're going to cache that IP, and perhaps ask other exits to connect to it when we want to connect to the corresponding hostname.
[...]
In a final version of this document, I'd like to see a more rigorous (pseudocode?) description of what the client and the exit node need to check when, and what they do in response. (e.g., "upon receiving a FOO cell, the exit node verifies that Bar. If not, ...") This would make the implementation easier to check against the spec, and the spec easier for dns gurus to audit.
cheers,
On 03/12/2012 07:08 PM, Nick Mathewson wrote:
On Sat, Mar 10, 2012 at 9:22 AM, Ondrej Mikle <ondrej.mikle@gmail.com> wrote:
1. Design
1.1 New cells
There will be two new cells, RELAY_DNS_BEGIN and RELAY_DNS_RESPONSE (we'll use DNS_BEGIN and DNS_RESPONSE for short below).
DNS_BEGIN payload:
DNS packet data (variable length)
The DNS packet must be generated internally by libunbound to avoid fingerprinting users by differences in client resolvers' behavior.
Have you looked at the ldns API? From what I can tell, it is what libunbound uses internally, and is what actually generates and handles the queries.
Yes, libunbound uses ldns internally. However, with ldns you have to do the full traversal to the root manually and watch out for things like CNAME/DNAME. It's a real PITA (for example, in the DNSSEC Validator Firefox add-on, which uses ldns, we have 13 states that describe various "levels" of validation result).
Also, from a spec POV, it's better to say "The format must match that used by"... than "the packet must be generated by"
OK.
Last time we talked about this, we mentioned that there were some fields (like the request ID) that we wanted to clean up, and some flags we wanted to disallow. Did we decide not to do that?
Seems I've forgotten to add the part about DNS flags (other things like IDs are cleaned from the proposal).
I originally proposed to "hardcode" flags: 0x110 (recursive, checking disabled), EDNS0 DO bit set.
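For illustration, a query with exactly those settings could be built as in the sketch below. In the DNS header flags word, 0x0100 is RD and 0x0010 is CD, and the DO bit is 0x8000 in the flags half of the EDNS0 OPT pseudo-RR's TTL field. Setting the ID to zero and advertising a 4096-byte UDP payload size are assumptions made here, as are the function names.

```python
import struct

FLAGS_RD_CD = 0x0110    # RD (0x0100) | CD (0x0010)
TYPE_OPT = 41           # EDNS0 OPT pseudo-RR
EDNS_DO = 0x8000        # "DNSSEC OK" bit in the OPT TTL field


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels (no compression)."""
    stripped = name.rstrip(".")
    if not stripped:
        return b"\x00"                  # root name
    out = bytearray()
    for label in stripped.split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("bad label length")
        out += bytes([len(raw)]) + raw
    return bytes(out) + b"\x00"


def build_query(name: str, qtype: int, qclass: int = 1,
                udp_size: int = 4096) -> bytes:
    """Build a query with fixed flags 0x0110 and the EDNS0 DO bit set."""
    # ID is fixed to 0 here (an assumption); QDCOUNT=1, ARCOUNT=1 for OPT.
    header = struct.pack("!HHHHHH", 0, FLAGS_RD_CD, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    # OPT RR: root name, TYPE, CLASS=UDP size, TTL=ext-rcode/version/flags,
    # RDLEN=0.
    opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, udp_size, EDNS_DO, 0)
    return header + question + opt
```

Because every field is fixed except the question itself, two clients asking for the same record would produce byte-identical packets under these assumptions.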
DNS_RESPONSE payload:
total length (2 octets)
data (variable)
Data contains the reply DNS packet or its part if packet would not fit into the cell. Total length describes length of complete response packet.
AXFR and IXFR are not supported in this cell by design (see specialized tool below).
As noted in the last mail, total_length is needless here; RELAY packets already have a length field.
One length field is gone, but we still need total_length since reply DNS packet may not fit in a single cell (most replies that include DNSSEC data fit within 1-3 cells).
2. Interfaces to applications
DNSPort evdns - existing implementation will be updated to use DNS_BEGIN.
SOCKS proxy - new command will be added, containing RR type, class and query. Response will simply contain the DNS packet.
This would need an actual specification.
OK, I'll write one.
5. Implementation notes
There will be one instance of ub_ctx (libunbound resolver structure) in Tor, libunbound is thread-safe.
Hm. Looking at the libunbound codebase, it makes me pretty sad that Libunbound wants to open up a separate thread so that it can do its own libevent-based event loop. Is there no way we can make libunbound (or ldns) integrate with our own event loop?
I'll have a look at whether it can be done with some reasonably small changes to the original code. Why is an extra thread an issue? IIRC libunbound can open multiple threads, depending on what configuration it is given via ub_ctx_config().
There are ub_poll/ub_process/ub_cancel that could possibly allow integrating into Tor's libevent loop.
Also, for the record, I'm a little confused about the feature sets here. What does libunbound add to ldns here that we need?
Libunbound makes life much easier: it does full validation of the chain up to the root, including special cases such as CNAME/DNAME, and it has a cache and load-balancing logic (if multiple threads are used). Basically everything mentioned in unbound.conf can be done with libunbound.
Client will periodically purge incomplete DNS replies. Any unexpected DNS_RESPONSE will be dropped.
Request for special names (.onion, .exit, .noconnect) will return REFUSED.
RELAY_BEGIN would function "normally", there is no need for returning DNS data. In case of malicious exit, client can't check he's really connected to whatever IP is in A/AAAA. We won't send any NSEC/NSEC3 back in case FQDN does not exist, it would needlessly complicate things. Client can check by extra query on DNSPort.
What fraction of clients actually use DNSPort as opposed to just doing everything via SOCKS connect requests? I worry that, by leaving RELAY_BEGIN users out of this entirely, we're making a feature that most clients just won't wind up using. I wonder whether the earlier idea of having a RELAY_BEGIN_DNS that does both the lookup and a connect wouldn't be a good idea -- both to save the round-trip, and to give the client the appropriate dnssec information.
I suspect only a minimal portion of clients use DNSPort. Against an attacker eavesdropping on an exit node, making the exit node use libunbound for all resolving hides DNSPort use (unless queries are for RRs other than A/AAAA/PTR). However, a malicious exit can see the difference.
RELAY_BEGIN_DNS would work for lookup of A/AAAA, but all other RRs "stick out" (and as I understand it, the DNSPort is meant exactly for supporting other RRs like SRV for XMPP). I don't know if this can be somehow worked around.
And I *do* think that the dnssec information would be useful to the client: Even though we can't check whether the exit really connected to the requested IP or not, we're going to cache that IP, and perhaps ask other exits to connect to it when we want to connect to the corresponding hostname.
I've been thinking about this for a while but came to the conclusion that it only proves one thing to the client: that the exit node at some point learned the DNS "translation" of the FQDN. All the high-profile sites that state-level attackers would be interested in run on some sort of CDN/cloud, meaning IPs are exchanged often through CNAME redirection.
I guess using a cached IP later through another exit would work for _most_ cases. But what kind of attack does it prevent compared to not sending the resolved IP data back to the client? (It's the same issue that Perspectives/Convergence have with CDN services, except it's with certificates instead of IPs.)
In a final version of this document, I'd like to see a more rigorous (pseudocode?) description of what the client and the exit node need to check when, and what they do in response. (e.g., "upon receiving a FOO cell, the exit node verifies that Bar. If not, ...") This would make the implementation easier to check against the spec, and the spec easier for dns gurus to audit.
Sure. I'll add it to the next version.
Ondrej