tor-dev July 2016

tor-dev@lists.torproject.org

37 participants
49 discussions

Proposal 247 (Hidden Service Vanguards) Overhaul
by Mike Perry 09 Jun '17

I spent some time trying to clean up proposal 247 based on everyone's comments, as well as based on my own thoughts. Please have a look if you commented on the original proposal, and complain if I've not taken your thoughts into account.

(Aaron: In particular, I made several tradeoffs in favor of performance and DoS resistance that may be at odds with some of your suggestions, but I think the end result is still OK after looking into the Sybil rates and thinking about the adversary model in more detail. You may disagree).

I've attached my updated version of the proposal inline in this mail, but the canonical updated proposal is in my remote at:
https://gitweb.torproject.org/user/mikeperry/torspec.git/tree/proposals/247…

Here's a summary of the changes (which are also listed in Git):

 * Try to make a coherent threat model and specify its assumptions
 * Fold in my comments about using disjoint sets ("buckets") for the third level guard.
 * Make the parameter discussion subsection its own section, and include tables with far more detail for the Sybil success rates.
 * Put the rotation period in a separate subsection from the number of guards
 * Switch to using min(X,X) and max(X,X) for the distribution for the second and third layer guard lifespans, respectively. Add a subsection describing this distribution (3.2.3)
 * Changed the default parameters based on these tables, and based on my own intuition about Tor's performance properties.
 * Move the load balancing, torrc, and other performance considerations to their own section (Section 5).
 * Move "3.2. Distinguishing new HS circuits from normal HS circuits" to section 4.1.
 * Fold in some of "3.3. Circuit nodes can now be linked to specific hidden services" into 4.1. Some of it I just removed, though, because I did not find it credible.
 * Added Roger's concerns about guard linkability to Section 4.2.
 * Added a denial of service subsection to Section 4.3.

================================

Filename: 247-hs-guard-discovery.txt
Title: Defending Against Guard Discovery Attacks using Vanguards
Author: George Kadianakis
Created: 2015-07-10
Status: Draft

0. Motivation

  A guard discovery attack allows attackers to determine the guard node of a Tor client. The hidden service rendezvous protocol provides an attack vector for a guard discovery attack since anyone can force an HS to construct a 3-hop circuit to a relay (#9001).

  Following the guard discovery attack with a compromise and/or coercion of the guard node can lead to the deanonymization of a hidden service.

1. Overview

  This document tries to make the above guard discovery + compromise attack harder to launch. It introduces an optional configuration option which makes the hidden service also pin the second and third hops of its circuits for a longer duration.

  With this new path selection, we force the adversary to perform a Sybil attack and two compromise attacks before succeeding. This is an improvement over the current state where the Sybil attack is trivial to pull off, and only a single compromise attack is required.

  With this new path selection, an attacker is forced to perform one or more node compromise attacks before learning the guard node of a hidden service. This increases the uncertainty of the attacker, since compromise attacks are costly and potentially detectable, so an attacker will have to think twice before beginning a chain of node compromise attacks that he might not be able to complete.

1.1. Visuals

  Here is what a hidden service rendezvous circuit currently looks like:

                 -> middle_1 -> middle_A
                 -> middle_2 -> middle_B
                 -> middle_3 -> middle_C
                 -> middle_4 -> middle_D
   HS -> guard   -> middle_5 -> middle_E -> Rendezvous Point
                 -> middle_6 -> middle_F
                 -> middle_7 -> middle_G
                 -> middle_8 -> middle_H
                 ->   ...    ->   ...
                 -> middle_n -> middle_n

  This proposal pins the two middle nodes to a much more restricted set, as follows:

                                 -> guard_3A_A
                   -> guard_2_A  -> guard_3A_B
                                 -> guard_3A_C -> Rendezvous Point
   HS -> guard_1
                                 -> guard_3B_D
                   -> guard_2_B  -> guard_3B_E
                                 -> guard_3B_F -> Rendezvous Point

  Note that the third level guards are partitioned into buckets such that they are only used with one specific second-level guard. In this way, we ensure that even if an adversary is able to execute a Sybil attack against the third layer, they only get to learn one of the second-layer Guards, and not all of them. This prevents the adversary from gaining the ability to take their pick of the weakest of the second-level guards for further attack.

2. Design

  This feature requires the HiddenServiceGuardDiscovery torrc option to be enabled.

  When a hidden service picks its guard nodes, it also picks two additional sets of middle nodes `second_guard_set` and `third_guard_set` of size NUM_SECOND_GUARDS and NUM_THIRD_GUARDS respectively for each hidden service. These sets are unique to each hidden service created by a single Tor client, and must be kept separate and distinct.

  When a hidden service needs to establish a circuit to an HSDir, introduction point or a rendezvous point, it uses nodes from `second_guard_set` as the second hop of the circuit and nodes from that second hop's corresponding `third_guard_set` as third hops of the circuit.

  A hidden service rotates nodes from the 'second_guard_set' at a random time between MIN_SECOND_GUARD_LIFETIME hours and MAX_SECOND_GUARD_LIFETIME hours.

  A hidden service rotates nodes from the 'third_guard_set' at a random time between MIN_THIRD_GUARD_LIFETIME and MAX_THIRD_GUARD_LIFETIME hours.

  These extra guard nodes should be picked with the same path selection procedure that is used for regular middle nodes (though see Section 5.1 for performance reasons to restrict this slightly).

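The bucketed set layout from Sections 1.1 and 2 can be sketched roughly as follows. This is an illustration only, not Tor's implementation: the relay names, the `build_vanguard_sets` and `pick_path` helpers, and the uniform random choice (real path selection is bandwidth-weighted) are all assumptions made for the example. The set sizes are the ones proposed in Section 2.1 (4 second-level guards, 16 third-level guards as four sets of four).

```python
import random

NUM_SECOND_GUARDS = 4        # value proposed in Section 2.1
THIRD_GUARDS_PER_BUCKET = 4  # 16 third-level guards, as four sets of four


def build_vanguard_sets(middle_relays, rng):
    """Pick second-level guards and one disjoint third-level bucket for each.

    Hypothetical helper: it samples uniformly without replacement, whereas
    real Tor path selection weights relays by bandwidth.
    """
    needed = NUM_SECOND_GUARDS * (1 + THIRD_GUARDS_PER_BUCKET)
    chosen = rng.sample(middle_relays, needed)
    second = chosen[:NUM_SECOND_GUARDS]
    rest = chosen[NUM_SECOND_GUARDS:]
    buckets = {}
    for i, g2 in enumerate(second):
        start = i * THIRD_GUARDS_PER_BUCKET
        # Each second-level guard owns its own bucket of third-level guards.
        buckets[g2] = rest[start:start + THIRD_GUARDS_PER_BUCKET]
    return buckets


def pick_path(guard_1, buckets, rng):
    """Choose hops 1-3: the third hop always comes from the second hop's bucket."""
    g2 = rng.choice(list(buckets))
    g3 = rng.choice(buckets[g2])
    return [guard_1, g2, g3]
```

Because a third-level guard only ever appears behind one second-level guard, an adversary who wins the Sybil attack at the third layer observes a single second-level guard rather than the whole set.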
  Each node's rotation time is tracked independently, to avoid disclosing the rotation times of the primary and second-level guards.

  XXX how should proposal 241 ("Resisting guard-turnover attacks") be applied here?

2.1. Security parameters

  We set NUM_SECOND_GUARDS to 4 nodes and NUM_THIRD_GUARDS to 16 nodes (i.e. four sets of four).

  XXX: 3 and 12 might be another option here, in which case our rotation period for the second guard position can be reduced to 15 days.

  We set MIN_SECOND_GUARD_LIFETIME to 1 day, and MAX_SECOND_GUARD_LIFETIME to 33 days, for an average rotation rate of ~11 days, using the min(X,X) distribution specified in Section 3.2.3.

  We set MIN_THIRD_GUARD_LIFETIME to 1 hour, and MAX_THIRD_GUARD_LIFETIME to 18 hours, for an average rotation rate of ~12 hours, using the max(X,X) distribution specified in Section 3.2.3.

  XXX make all the above consensus parameters? Yes. Very yes, especially if we decide to change the primary guard lifespan.

  See Section 3 for more analysis on these constants.

3. Rationale and Security Parameter Selection

3.1. Threat model, Assumptions, and Goals

  Consider an adversary with the following powers:

     - Can launch a Sybil guard discovery attack against any node of a rendezvous circuit. The slower the rotation period of the node, the longer the attack takes. Similarly, the higher the percentage of the network that is compromised, the faster the attack runs.

     - Can compromise any node on the network, but this compromise takes time and potentially even coercive action, and also carries risk of discovery.

  We also make the following assumptions about the types of attacks:

  1. A Sybil attack is noisy. It will require either large amounts of traffic, multiple test circuits, or both.

  2. A Sybil attack against the second or first layer Guards will be more noisy than a Sybil attack against the third layer guard, since the second and first layer Sybil attack requires a timing side channel in order to determine success, whereas the Sybil success is almost immediately obvious to the third layer guard, since it will now be returned as a rend point for circuits for the hidden service in question.

  3. As soon as the adversary is confident they have won the Sybil attack, an even more aggressive circuit building attack will allow them to determine the next node very fast (an hour or less).

  4. The adversary is strongly disincentivized from compromising nodes that may prove useless, as node compromise is even more risky for the adversary than a Sybil attack in terms of being noticed.

  Given this threat model, our security parameters were selected so that the first two layers of guards should be hard to attack using a Sybil guard discovery attack and hence require a node compromise attack. Ideally, we want the node compromise attacks to carry a non-negligible probability of being useless to the adversary by the time they complete.

  On the other hand, the outermost layer of guards should rotate fast enough to _require_ a Sybil attack.

3.2. Parameter Tuning

3.2.1. Sybil rotation counts for a given number of Guards

  The probability of Sybil success for Guard discovery can be modeled as the probability of choosing 1 or more malicious middle nodes for a sensitive circuit over some period of time.

  P(At least 1 bad middle) = 1 - P(All Good Middles)
                           = 1 - P(One Good middle)^(num_middles)
                           = 1 - (1 - c/n)^(num_middles)

  c/n is the adversary compromise percentage

  In the case of Vanguards, num_middles is the number of Guards you rotate through in a given time period. This is a function of the number of vanguards in that position (v), as well as the number of rotations (r).

  P(At least one bad middle) = 1 - (1 - c/n)^(v*r)

  Here are detailed tables of the number of rotations required to reach a given Sybil success rate, for a given number of guards.

  1.0% Network Compromise:
    Sybil Success     One     Two   Three    Four    Five     Six   Eight    Nine     Ten  Twelve Sixteen
    10%                11       6       4       3       3       2       2       2       2       1       1
    15%                17       9       6       5       4       3       3       2       2       2       2
    25%                29      15      10       8       6       5       4       4       3       3       2
    50%                69      35      23      18      14      12       9       8       7       6       5
    60%                92      46      31      23      19      16      12      11      10       8       6
    75%               138      69      46      35      28      23      18      16      14      12       9
    85%               189      95      63      48      38      32      24      21      19      16      12
    90%               230     115      77      58      46      39      29      26      23      20      15
    95%               299     150     100      75      60      50      38      34      30      25      19
    99%               459     230     153     115      92      77      58      51      46      39      29

  5.0% Network Compromise:
    Sybil Success     One     Two   Three    Four    Five     Six   Eight    Nine     Ten  Twelve Sixteen
    10%                 3       2       1       1       1       1       1       1       1       1       1
    15%                 4       2       2       1       1       1       1       1       1       1       1
    25%                 6       3       2       2       2       1       1       1       1       1       1
    50%                14       7       5       4       3       3       2       2       2       2       1
    60%                18       9       6       5       4       3       3       2       2       2       2
    75%                28      14      10       7       6       5       4       4       3       3       2
    85%                37      19      13      10       8       7       5       5       4       4       3
    90%                45      23      15      12       9       8       6       5       5       4       3
    95%                59      30      20      15      12      10       8       7       6       5       4
    99%                90      45      30      23      18      15      12      10       9       8       6

  10.0% Network Compromise:
    Sybil Success     One     Two   Three    Four    Five     Six   Eight    Nine     Ten  Twelve Sixteen
    10%                 2       1       1       1       1       1       1       1       1       1       1
    15%                 2       1       1       1       1       1       1       1       1       1       1
    25%                 3       2       1       1       1       1       1       1       1       1       1
    50%                 7       4       3       2       2       2       1       1       1       1       1
    60%                 9       5       3       3       2       2       2       1       1       1       1
    75%                14       7       5       4       3       3       2       2       2       2       1
    85%                19      10       7       5       4       4       3       3       2       2       2
    90%                22      11       8       6       5       4       3       3       3       2       2
    95%                29      15      10       8       6       5       4       4       3       3       2
    99%                44      22      15      11       9       8       6       5       5       4       3

  The rotation counts in these tables were generated with:

     import math

     def count_rotations(c, v, success):
       # c: compromised fraction of the network, v: vanguards in this position,
       # success: target Sybil success probability
       r = 0
       while 1-math.pow((1-c), v*r) < success:
         r += 1
       return r

3.2.2. Rotation Period

  As specified in Section 3.1, the primary driving force for the third layer selection was to ensure that these nodes rotate fast enough that it is not worth trying to compromise them, because it is unlikely for compromise to succeed and yield useful information before the nodes stop being used. For this reason we chose 1 to 18 hours, with a weighted distribution (Section 3.2.3) causing the expected average to be 12 hours.

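As a cross-check on the tables in Section 3.2.1, the rotation count also has a closed form: solving 1 - (1 - c)^(v*r) >= s for r gives r >= ln(1 - s) / (v * ln(1 - c)). The sketch below is illustrative only. The names `rotations_needed` and `attack_days` are made up here, and the rotation period passed in is simply whatever expected lifetime is being considered (for example the 12-hour third-layer average mentioned above).

```python
import math


def rotations_needed(c, v, success):
    """Smallest integer r with 1 - (1 - c)**(v * r) >= success.

    c: compromised fraction of the network, v: number of vanguards in the
    position under attack, success: target Sybil success probability.
    """
    return math.ceil(math.log(1 - success) / (v * math.log(1 - c)))


def attack_days(c, v, success, rotation_hours):
    """Expected attack duration in days, given an average rotation period."""
    return rotations_needed(c, v, success) * rotation_hours / 24
```

For the "Sixteen" column at 99% success this gives 29, 6, and 3 rotations for the 1%, 5%, and 10% adversaries, matching the tables. Floating-point rounding could in principle differ from the loop-based version when the ratio lands exactly on an integer, so treat it as a convenience rather than a replacement.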
  From the table in Section 3.2.1, it can be seen that this means that the Sybil attack will complete with near-certainty (99%) in 29*12 hours (14.5 days) for the 1% adversary, 3 days for the 5% adversary, and 1.5 days for the 10% adversary.

  Since rotation of each node happens independently, the distribution of when the adversary expects to win this Sybil attack in order to discover the next node up is uniform. This means that on average, the adversary should expect that half of the rotation period of the next node is already over by the time that they win the Sybil.

  With this fact, we choose our range and distribution for the second layer rotation to be short enough to cause the adversary to risk compromising nodes that are useless, yet long enough to require a Sybil attack to be noticeable in terms of client activity. For this reason, we choose a minimum second-layer guard lifetime of 1 day, since this gives the adversary a minimum expected value of 12 hours during which they can compromise a guard before it might be rotated. If the total expected rotation rate is 11 days, then the adversary can expect overall to have 5.5 days remaining after completing their Sybil attack before a second-layer guard rotates away.

3.2.3. Rotation distribution

  In order to skew the distribution of the third layer guard towards higher values, we use max(X,X) for the distribution (that is, the larger of two independent draws of X), where X is a random variable that takes on values from the uniform distribution.

  In order to skew the distribution of the second layer guard towards lower values (to increase the risk of compromising useless nodes), we use min(X,X), the smaller of two independent draws.

  Here's a table of expectation (arithmetic means) for relevant ranges of X (sampled from 0..N). The current choice for second-layer guards is noted with **, and the current choice for third-layer guards is noted with ***.

     Range   Min(X,X)    Max(X,X)
      10       2.85        6.15
      11       3.18        6.82
      12       3.51        7.49
      13       3.85        8.15
      14       4.18        8.82
      15       4.51        9.49
      16       4.84       10.16
      17       5.18       10.82***
      18       5.51       11.49
      19       5.84       12.16
      20       6.18       12.82
      21       6.51       13.49
      22       6.84       14.16
      23       7.17       14.83
      24       7.51       15.49
      25       7.84       16.16
      26       8.17       16.83
      27       8.51       17.49
      28       8.84       18.16
      29       9.17       18.83
      30       9.51       19.49
      31       9.84       20.16
      32      10.17**     20.83
      33      10.51       21.49
      34      10.84       22.16
      35      11.17       22.83
      36      11.50       23.50
      37      11.84       24.16
      38      12.17       24.83
      39      12.50       25.50

4. Security concerns and mitigations

4.1. Mitigating fingerprinting of new HS circuits

  By pinning the middle nodes of rendezvous circuits, we make it easier for all hops of the circuit to detect that they are part of a special hidden service circuit with varying degrees of certainty.

  The Guard node is able to recognize a Vanguard client with a high degree of certainty because it will observe a client IP creating the overwhelming majority of its circuits to just a few middle nodes in any given 10-18 day time period.

  The middle nodes will be able to tell with a variable certainty that depends on both their traffic volume and upon the popularity of the service, because they will see a large number of circuits that tend to pick the same Guard and Exit.

  The final nodes will be able to tell with a similar level of certainty that depends on their capacity and the service popularity, because they will see a lot of rend handshakes that all tend to have the same second hop.

  The most serious of these is the Guard fingerprinting issue. When proposal xxx-padding-negotiation is implemented, services that enable this feature should use those padding primitives to create fake circuits to random middle nodes that are not their guards, in an attempt to look more like a client.

  Additionally, if Tor Browser implements "virtual circuits" based on SOCKS username+password isolation in order to enforce the re-use of paths when SOCKS username+passwords are re-used, then the number of middle nodes in use during a typical user's browsing session will be proportional to the number of sites they are viewing at any one time. This is likely to be much lower than one new middle node every ten minutes, and for some users, may be close to the number of Vanguards we're considering.

  This same reasoning is also an argument for increasing the number of second-level guards beyond just two, as it will spread the hidden service's traffic over a wider set of middle nodes, making it both easier to cover, and behave closer to a client using SOCKS virtual circuit isolation.

4.2. Hidden service linkability

  Multiple hidden services on the same Tor instance should use separate second and third level guard sets, otherwise an adversary is trivially able to determine that the two hidden services are co-located by inspecting their current chosen rend point nodes.

  Unfortunately, if the adversary is still able to determine that two or more hidden services are run on the same Tor instance through some other means, then they are able to take advantage of this fact to execute a Sybil attack more effectively, since there will now be an extra set of guard nodes for each hidden service in use.

  For this reason, if Vanguards are enabled, and more than one hidden service is configured, the user should be advised to ensure that they do not accidentally leak that the two hidden services are from the same Tor instance.

4.3. Denial of service

  Since it will be fairly trivial for the adversary to enumerate the current set of rend nodes for a hidden service, denial of service becomes a serious risk for Vanguard users.
  For this reason, it is important to support a large number of third-level guards, to increase the amount of resources required to bring a hidden service offline by DoSing just a few Tor nodes.

5. Performance considerations

  The switch to a restricted set of nodes will very likely cause significant performance issues, especially for high-traffic hidden services. If any of the nodes they select happen to be temporarily overloaded, performance will suffer dramatically until the next rotation period.

5.1. Load Balancing

  Since the second and third level "guards" are chosen from the set of all nodes eligible for use in the "middle" hop (as per hidden services today), this proposal should not significantly affect the long-term load on various classes of the Tor network, and should not require any changes to either the node weight equations, or the bandwidth authorities.

  Unfortunately, transient load is another matter, as mentioned previously. It is very likely that this scheme will increase instances of transient overload at nodes selected by high-traffic hidden services.

  One option to reduce the impact of this transient overload is to restrict the set of middle nodes that we choose from to some percentage of the fastest middle-capable relays in the network. This may have some impact on load balancing, but since the total volume of hidden service traffic is low, it may be unlikely to matter.

5.2. Circuit build timeout

  The adaptive circuit build timeout mechanism in Tor is what corrects for instances of transient node overload right now.

  The timeout will naturally tend to select the current fastest and least-loaded paths even through this set of restricted routes, but it may fail to behave correctly if there is a very small set of nodes in each guard set, as it is based upon assumptions about the current path selection algorithm, and it may need to be tuned specifically for Vanguards, especially if the set of possible routes is small.

5.3. OnionBalance

  At first glance, it seems that this scheme makes multi-homed hidden services such as OnionBalance[1] even more important for high-traffic hidden services.

  Unfortunately, if it is equally damaging to the user for any of their multi-homed hidden service locations to be discovered, then OnionBalance is strictly equivalent to simply increasing the number of second-level guard nodes in use, because an active adversary can perform simultaneous Sybil attacks against all of the rend points offered by the multi-homed OnionBalance introduction points.

5.4. Default vs optional behavior

  We suggest this torrc option to be optional because it changes path selection in a way that may seriously impact hidden service performance, especially for high traffic services that happen to pick slow guard nodes.

  However, by having this setting be disabled by default, we make hidden services that use it stand out a lot. For this reason, we should in fact enable this feature globally, but only after we verify its viability for high-traffic hidden services, and ensure that it is free of second-order load balancing effects.

  Even after that point, until Single Onion Services are implemented, there will likely still be classes of very high traffic hidden services for whom some degree of location anonymity is desired, but for which performance is much more important than the benefit of Vanguards, so there should always remain a way to turn this option off.

6. Future directions

  Here are some more ideas for improvements that should be done sooner or later:

  - Maybe we should make the size and rotation period of secondary/third guard sets configurable by the user.

  - To make it harder for an adversary, a hidden service MAY extend the path length of its circuits by an additional static hop. This forces the adversary to use another coercion attack to walk the chain up to the hidden service.

7. Acknowledgments

  Thanks to Aaron Johnson, John Brooks, Mike Perry and everyone else who helped with this idea.

1. https://onionbalance.readthedocs.org/en/latest/design.html#overview

--
Mike Perry
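
The expectation table in Section 3.2.3 of the proposal above can be reproduced by exact enumeration. One assumption is needed to match the published numbers: X is taken to be uniform over the integers 0..N-1 (as with Python's `random.randrange(N)`), which is an inference from the table values rather than something the proposal states. The function names are made up for this sketch.

```python
from fractions import Fraction
from itertools import product


def expected_min_max(n):
    """Exact E[min(X1, X2)] and E[max(X1, X2)] for X1, X2 iid uniform on 0..n-1.

    Assumption: integer samples from 0..n-1, inferred from the table values.
    """
    pairs = list(product(range(n), repeat=2))
    e_min = Fraction(sum(min(a, b) for a, b in pairs), len(pairs))
    e_max = Fraction(sum(max(a, b) for a, b in pairs), len(pairs))
    return float(e_min), float(e_max)


def print_table(ranges=range(10, 40)):
    """Print one row per range in the same column order as the proposal."""
    print("Range   Min(X,X)   Max(X,X)")
    for n in ranges:
        e_min, e_max = expected_min_max(n)
        print(f"{n:5d} {e_min:10.2f} {e_max:10.2f}")
```

Under that assumption, the starred rows come out as in the proposal: a range of 32 gives a min(X,X) mean of about 10.17, and a range of 17 gives a max(X,X) mean of about 10.82.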

onionoo.tpo stuck at 2016-05-13 12:00
by nusenu 29 Jan '17

Hi Karsten, I was surprised that ornetradar did not send a single email for yesterday's new relays. After looking into it, it turned out it is an onionoo problem. "relays_published":"2016-05-13 12:00:00" https://onionoo.torproject.org/details?limit=4 regards, nusenu

[Proposal] A simple way to make Tor-Browser-Bundle more portable and secure
by Daniel Simon 31 Oct '16

Hello.

How it's currently done - The Tor Browser Bundle is dynamically linked against glibc.

Security problem - The Tor Browser Bundle has the risk of information about the host system's library ecosystem leaking out onto the network.

Portability problem - The Tor Browser Bundle can't be run on systems that don't use glibc, making it unusable due to different syscalls.

Solution proposed - Statically link the Tor Browser Bundle with musl libc.[1] It is a simple and fast libc implementation that was especially crafted for static linking. This would solve both security and portability issues.

What is the Tor developers' opinion about this? I personally don't see any drawbacks and would be interested in discussing this further.

Sincerely,
Daniel

[1] https://www.musl-libc.org/

Re: [tor-dev] Onioncat and Prop224
by str4d 09 Oct '16

On 27/04/16 22:31, grarpamp wrote:
> On 4/25/16, Tim Wilson-Brown - teor <teor2345(a)gmail.com> wrote:
>>
>>> On 22 Apr 2016, at 17:03, grarpamp <grarpamp(a)gmail.com> wrote:
>>>
>>> FYI: The onioncat folks are interested in collaborating
>>> with tor folks regarding prop224.
>>>
>>> https://gitweb.torproject.org/torspec.git/tree/proposals/224-rend-spec-ng.t…
>>
>> I'm interested in what kind of collaboration onioncat would like to do on
>> prop224, next-generation hidden services.
>> It would be great to work this out in the next few weeks, as we're coding
>> parts of the proposal right now.
>
> Yep :) And I know Bernhard was hoping to get in touch with Roger
> on this before long.
>
> Basically, prop224 HS being wider than 80 bits will break onioncat's
> current HS onion <---> IPv6 addressing mechanism.
>
> They're looking at various backward compatibility options, as well
> as possibly making side use of the HSDir DHT, or even integrating
> more directly with the tor client.
>

Just FYI, I recently migrated all of I2P's spec proposals to the website, and came across a seven-year-old proposal that Bernhard wrote about improving I2P support in GarliCat:

https://geti2p.net/spec/proposals/105-garlicat-name-translation

I don't know how well it has aged, but given that Tor is now facing the same issues that I2P has, perhaps it can be of some use if resurrected from the dead :)

>
>> But the tor-onions mailing list is to discuss the technical details running
>> onion services.
>
> Readers of tor-onions / newbies may have been unfamiliar with
> onioncat. It's a way to get non-TCP between TorHS onions, thus in
> the thread "Hidden datagram service".
>
> I think there are a nontrivial number of users interested in, and
> using, non-strictly-TCP transport over an IPv6 tunnel interface.
> For example, look at users of CJDNS...
>
> For which we should try to continue a way, in v2, to do that over
> anonymous overlay network Tor / I2P.
>

There is already some work on doing this in I2P:

https://github.com/majestrate/i2p-tools/tree/master/i2tun
https://github.com/majestrate/i2p-tools/tree/master/pyi2tun

I2P also natively supports non-TCP protocols if that helps (only datagrams implemented thus far).

str4d

prop224: Ditching key blinding for shorter onion addresses
by George Kadianakis 29 Sep '16

Hello people,

this is an experimental mail meant to address legitimate usability concerns with the size of onion addresses after proposal 224 gets implemented. It's meant for discussion and it's far from a full blown proposal.

Anyway, after prop224 gets implemented, we will go from 16-character onion addresses to 52-character onion addresses. See here for more details:
https://gitweb.torproject.org/torspec.git/tree/proposals/224-rend-spec-ng.t…

This happens because we want the onion address to be a real public key, and not the truncated hash of a public key as it is now. We want that so that we can do fun cryptography with that public key. Specifically, we want to do key blinding as specified here:
https://gitweb.torproject.org/torspec.git/tree/proposals/224-rend-spec-ng.t…

As I understand it the key blinding scheme is trying to achieve the following properties:
a) Every HS has a permanent identity onion address
b) Clients use an ephemeral address to fetch descriptors from HSDir
c) Knowing the ephemeral address never reveals the permanent onion address
c) Descriptors are encrypted and can only be read by clients that know the identity onion key
d) Descriptors are signed and verifiable by clients who know the identity onion key
e) Descriptors are also verifiable in a weaker manner by HSDirs who know the ephemeral address

In this email I'm going to sketch a scheme that has all of the above properties except for (e).

The suggested scheme is basically the current HSDir protocol, but with clients using ephemeral addresses for fetching HS descriptors. Also, we truncate onion address hashes to something larger than 80 bits.

Here is a sketch of the scheme:

------

Hidden service Alice has a long-term public identity key: A
Hidden service Alice has a long-term private identity key: a

The onion address of Alice, as in the current scheme, is a truncated H(A). So let's say: onion_address = H(A) truncated to 128 bits.

The full public key A is contained in Alice's descriptor, as is currently the case.

When Alice wants to publish a descriptor she computes an ephemeral address based on the current time period 't':

    ephemeral_address = H(t || onion_address)

Legitimate clients who want to fetch the descriptor also do the same, since they know both 't' and 'onion_address'.

Descriptors are encrypted using a key derived from the onion_address. Hence, only clients that know the onion_address can decrypt it.

Descriptors are signed using the long-term private key of the hidden service, and can be verified by clients who manage to decrypt the descriptor.

---

Assuming the above is correct and makes sense (need more brain), it should maintain all the security properties above except for (e).

So basically in this scheme, HSDirs won't be able to verify the signatures of received descriptors.

The obvious question here is, is this a problem?

IIUC, having the HSDirs verify those signatures does not offer any additional security, except for making sure that the descriptor signature was actually created using a legitimate ed25519 key. Other than that, I don't see it offering much.

So, what does this additional HSDir verification offer? It seems like a weak way to ensure that no garbage is uploaded on the HSDir hash ring. However, any reasonable attacker will put their garbage in a descriptor and sign it with a random ed25519 key, and it will trivially pass the HSDir validation.

So do we actually care about this property enough to introduce huge onion addresses to the system?

Please discuss and poke holes at the above system.

Cheers!
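
The address derivation in the sketch above can be illustrated as follows. This is only a toy under stated assumptions: the mail does not say which hash H is, so SHA-256 stands in for it here, and the 8-byte big-endian encoding of the time period 't' and both function names are made up for the example.

```python
import hashlib


def onion_address(public_key: bytes) -> bytes:
    """onion_address = H(A) truncated to 128 bits.

    Assumption: SHA-256 stands in for the unspecified hash H.
    """
    return hashlib.sha256(public_key).digest()[:16]


def ephemeral_address(time_period: int, onion_addr: bytes) -> bytes:
    """ephemeral_address = H(t || onion_address).

    Assumption: t is encoded as 8 big-endian bytes before concatenation.
    """
    t = time_period.to_bytes(8, "big")
    return hashlib.sha256(t + onion_addr).digest()
```

Both the service and a client who knows the onion address compute the same ephemeral address for a given time period, while an HSDir that only sees the ephemeral address would have to invert the hash to recover the onion address.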

onion moshing
by David Stainton 25 Sep '16

I was inspired by onioncat to write a twisted python implementation. Onionvpn doesn't have as many features as onioncat. I've successfully tested that onionvpn and onioncat can talk to each other and play nice. Both onionvpn and onioncat implement a virtual public network. Anyone can send packets to you if they know your onion address or ipv6 address... however injection attacks are unlikely since the attacker cannot know the contents of your traffic without compromising the tor process managing the onion service.

I've also tested with mosh; that is, you can use mosh which only works with ipv4 over an ipv4-to-ipv6 tunnel over onionvpn/onioncat. Like this:

mosh-client -> udp/ipv4 -> ipv6 -> tun device -> tcp-to-tor -> onion service decodes ipv6 to tun device -> ipv6 -> udp/ipv4 -> mosh-server

https://github.com/david415/onionvpn

If an onionvpn/onioncat operator were to NAT the onion ipv6 traffic to the Internet then that host essentially becomes a special IPv6 exit node for the tor network. The same can be done for IPv4. Obviously operating such an exit node might be risky due to the potential for abuse... however don't you just love the idea of being able to use low-level network scanners over tor? I wonder if Open Observatory of Network Interference would be interested in this.

david

onionoo: poor reverse DNS results
by nusenu 02 Sep '16

Hi Karsten,

some time ago I reported onionoo's poor reverse DNS results
https://trac.torproject.org/projects/tor/ticket/18342
and it hasn't changed since then.

As of 2016-07-02 08:00 (tpo instance) 3144 out of 8473 (~37%) still have no reverse DNS result (=IP address).

Do you have an idea whether this will improve sometime before 2016-09-01?

thanks,
nusenu

btw: Thanks for working around the maxmind AS problem by reverting to the May version.

Proposal 271: Another algorithm for guard selection
by Nick Mathewson 19 Aug '16

Filename: 271-another-guard-selection.txt
Title: Another algorithm for guard selection
Author: Isis Lovecruft, George Kadianakis, Ola Bini, Nick Mathewson
Created: 2016-07-11
Supersedes: 259, 268
Status: Open

0.0. Preliminaries

   This proposal derives from proposals 259 and 268; it is meant to supersede both. It is in part a restatement of it, in part a simplification, and in part a refactoring so that it does not have the serialization problems noted by George Kadianakis. It makes other numerous small changes. Isis, George, and Ola should all get the credit for the well-considered ideas.

   Whenever I say "Y is a subset of X" you can think in terms of "Y-membership is a flag that can be set on members of X" or "Y-membership is a predicate that can be evaluated on members of X."

   "More work is needed." There's a to-do at the end of the document.

0.1. Notation: identifiers

   We mention identifiers of these kinds:

      [SECTIONS]
      {INPUTS}, {PERSISTENT_DATA}, and {OPERATING_PARAMETERS}.
      {non_persistent_data}
      <states>.

   Each named identifier receives a type where it is defined, and is used by reference later on.

   I'm using this convention to make it easier to tell for certain whether every thingy we define is used, and vice versa.

1. Introduction and motivation

   Tor uses entry guards to prevent an attacker who controls some fraction of the network from observing a fraction of every user's traffic. If users chose their entries and exits uniformly at random from the list of servers every time they build a circuit, then an adversary who had (k/N) of the network would deanonymize F=(k/N)^2 of all circuits... and after a given user had built C circuits, the attacker would see them at least once with probability 1-(1-F)^C. With large C, the attacker would get a sample of every user's traffic with probability 1.

   To prevent this from happening, Tor clients choose a small number of guard nodes (currently 3). These guard nodes are the only nodes that the client will connect to directly.
If they are not compromised, the user's paths are not compromised.

But attacks remain. Consider an attacker who can run a firewall between a target user and the Tor network, and make many of the guards they don't control appear to be unreachable. Or consider an attacker who can identify a user's guards, and mount denial-of-service attacks on them until the user picks a guard that the attacker controls.

In the presence of these attacks, we can't continue to connect to the Tor network unconditionally. Doing so would eventually result in the user choosing a hostile node as their guard, and losing anonymity.

This proposal outlines a new entry guard selection algorithm, which tries to meet the following goals:

  - Heuristics and algorithms for determining how and which guards are chosen should be kept as simple and easy to understand as possible.

  - Clients in censored regions or who are behind a fascist firewall who connect to the Tor network should not experience any significant disadvantage in terms of reachability or usability.

  - Tor should make a best attempt at discovering the most appropriate behaviour, with as little user input and configuration as possible.

  - Tor clients should discover usable guards without too much delay.

  - Tor clients should resist (to the extent possible) attacks that try to force them onto compromised guards.

2. State instances

In the algorithm below, we describe a set of persistent and non-persistent state variables. These variables should be treated as an object, of which multiple instances can exist.

In particular, we specify the use of three particular instances:

A. UseBridges

If UseBridges is set, then we replace the {GUARDS} set in [Sec:GUARDS] below with the list of configured bridges. We maintain a separate persistent instance of {SAMPLED_GUARDS} and {CONFIRMED_GUARDS} and other derived values for the UseBridges case.

B. EntryNodes / ExcludeNodes / Reachable*Addresses / FascistFirewall / ClientUseIPv4=0

If one of the above options is set, and UseBridges is not, then we compare the fraction of usable guards in the consensus to the total number of guards in the consensus.

If this fraction is less than {MEANINGFUL_RESTRICTION_FRAC}, we use a separate instance of the state.

If this fraction is less than {EXTREME_RESTRICTION_FRAC}, we use a separate instance of the state, and warn the user.

[TODO: should we have a different instance for each set of heavily restricted options?]

C. Default

If neither of the above variant-state instances is used, we use a default instance.

3. The algorithm.

3.0. The guards listed in the current consensus. [Section:GUARDS]

By {set:GUARDS} we mean the set of all guards in the current consensus that are usable for all circuits. (They must have the flags: Stable, Fast, V2Dir, Guard.)

**Rationale**

We require all guards to have the flags that we potentially need from any guard, so that all guards are usable for all circuits.

3.1. The Sampled Guard Set. [Section:SAMPLED]

We maintain a set, {set:SAMPLED_GUARDS}, that persists across invocations of Tor. It is an unordered subset of the nodes that we have seen listed as a guard in the consensus at some point. For each such guard, we record persistently:

  - {pvar:ADDED_ON_DATE}: The date on which it was added to sampled_guards. We base this value on RAND(now, {GUARD_LIFETIME}/10). See Appendix [RANDOM] below.

  - {pvar:ADDED_BY_VERSION}: The version of Tor that added it to sampled_guards.

  - {pvar:IS_LISTED}: Whether it was listed as a usable Guard in the _most recent_ consensus we have seen.

  - {pvar:FIRST_UNLISTED_AT}: If IS_LISTED is false, the publication date of the earliest consensus in which this guard was listed such that we have not seen it listed in any later consensus. Otherwise "None."
    We randomize this, based on RAND(added_at_time, {REMOVE_UNLISTED_GUARDS_AFTER} / 5)

For each guard in {SAMPLED_GUARDS}, we also record this data, non-persistently:

  - {tvar:last_tried_connect}: A 'last tried to connect at' time. Default 'never'.

  - {tvar:is_reachable}: an "is reachable" tristate, with possible values { <state:yes>, <state:no>, <state:maybe> }. Default '<maybe>.'

    [Note: "yes" is not strictly necessary, but I'm making it distinct from "maybe" anyway, to make our logic clearer. A guard is "maybe" reachable if it's worth trying. A guard is "yes" reachable if we tried it and succeeded.]

  - {tvar:failing_since}: The first time when we failed to connect to this guard. Defaults to "never". Reset to "never" when we successfully connect to this guard.

  - {tvar:is_pending} A "pending" flag. This indicates that we are trying to build an exploratory circuit through the guard, and we don't know whether it will succeed.

We require that {SAMPLED_GUARDS} contain at least {MIN_SAMPLE_THRESHOLD} of the number of guards in the consensus (if possible), but not more than {MAX_SAMPLE_THRESHOLD} of the number of guards in the consensus.

To add a new guard to {SAMPLED_GUARDS}, pick an entry at random from ({GUARDS} - {SAMPLED_GUARDS}), weighted by bandwidth.

We remove an entry from {SAMPLED_GUARDS} if:

  * We have a live consensus, and {IS_LISTED} is false, and {FIRST_UNLISTED_AT} is over {REMOVE_UNLISTED_GUARDS_AFTER} days in the past.

  OR

  * We have a live consensus, and we cannot parse {ADDED_BY_VERSION}.

  OR

  * We have a live consensus, and {ADDED_ON_DATE} is over {GUARD_LIFETIME} ago, *and* {CONFIRMED_ON_DATE} is either "never", or over {GUARD_CONFIRMED_MIN_LIFETIME} ago.

Note that {SAMPLED_GUARDS} does not depend on our configuration. It is possible that we can't actually connect to any of these guards.

**Rationale**

The {SAMPLED_GUARDS} set is meant to limit the total number of guards that a client will connect to in a given period.
The upper limit on its size prevents us from considering too many guards.

The first expiration mechanism is there so that our {SAMPLED_GUARDS} list does not accumulate so many dead guards that we cannot add new ones.

The second expiration mechanism makes us rotate our guards slowly over time.

3.2. The Usable Sample [Section:FILTERED]

We maintain another set, {set:FILTERED_GUARDS}, that does not persist. It is derived from:

  - {SAMPLED_GUARDS}
  - our current configuration,
  - the path bias information.

A guard is a member of {set:FILTERED_GUARDS} if and only if all of the following are true:

  - It is a member of {SAMPLED_GUARDS}, with {IS_LISTED} set to true.
  - It is not disabled because of path bias issues.
  - It is not disabled because of the ReachableAddresses policy, the ClientUseIPv4 setting, the ClientUseIPv6 setting, the FascistFirewall setting, or some other option that prevents using some addresses.
  - It is not disabled because of ExcludeNodes.
  - It is a bridge if UseBridges is true; or it is not a bridge if UseBridges is false.

We have an additional subset, {set:USABLE_FILTERED_GUARDS}, which is defined to be the subset of {FILTERED_GUARDS} where {is_reachable} is <yes> or <maybe>.

We try to maintain a requirement that {USABLE_FILTERED_GUARDS} contain at least {MIN_FILTERED_SAMPLE} elements:

Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we add new elements to {SAMPLED_GUARDS} until one of the following is true:

  * {USABLE_FILTERED_GUARDS} is large enough,

  OR

  * {SAMPLED_GUARDS} is at its maximum size.

**Rationale**

These filters are applied _after_ sampling: if we applied them before the sampling, then our sample would reflect the set of filtering restrictions that we had in the past.

3.3. The confirmed-guard list. [Section:CONFIRMED]

[formerly USED_GUARDS]

We maintain a persistent ordered list, {list:CONFIRMED_GUARDS}.
It contains guards that we have used before, in our preference order of using them. It is a subset of {SAMPLED_GUARDS}. For each guard in this list, we store persistently:

  - {pvar:IDENTITY} Its fingerprint

  - {pvar:CONFIRMED_ON_DATE} When we added this guard to {CONFIRMED_GUARDS}. Randomized as RAND(now, {GUARD_LIFETIME}/10).

We add new members to {CONFIRMED_GUARDS} when we mark a circuit built through a guard as "for user traffic."

Whenever we remove a member from {SAMPLED_GUARDS}, we also remove it from {CONFIRMED_GUARDS}.

[Note: You can also regard the {CONFIRMED_GUARDS} list as a total ordering defined over a subset of {SAMPLED_GUARDS}.]

Definition: we call Guard A "higher priority" than another Guard B if, when A and B are both reachable, we would rather use A. We define priority as follows:

  * Every guard in {CONFIRMED_GUARDS} has a higher priority than every guard not in {CONFIRMED_GUARDS}.

  * Among guards in {CONFIRMED_GUARDS}, the one appearing earlier on the {CONFIRMED_GUARDS} list has a higher priority.

  * Among guards that do not appear in {CONFIRMED_GUARDS}, {is_pending}==true guards have higher priority.

  * Among those, the guard with the earlier {last_tried_connect} time has higher priority.

  * Finally, among guards that do not appear in {CONFIRMED_GUARDS} with {is_pending}==false, all have equal priority.

**Rationale**

We add elements to this ordering when we have actually used them for building a usable circuit. We could mark them at some other time (such as when we attempt to connect to them, or when we actually connect to them), but this approach keeps us from committing to a guard before we actually use it for sensitive traffic.

3.4. The Primary guards [Section:PRIMARY]

We keep a run-time non-persistent ordered list of {list:PRIMARY_GUARDS}. It is a subset of {FILTERED_GUARDS}. It contains {N_PRIMARY_GUARDS} elements.
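
The priority definition in [Section:CONFIRMED] can be read as a sort key. The following is only a minimal Python sketch of that reading, not code from Tor: the Guard class, its field types, and the priority_key function are hypothetical names introduced here, and a smaller key is taken to mean a higher priority.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Guard:
    # Hypothetical in-memory record; field names mirror the proposal's variables.
    identity: str
    is_pending: bool = False
    last_tried_connect: float = 0.0  # assumed: seconds since the epoch


def priority_key(guard: Guard, confirmed: List[str]) -> Tuple[int, float]:
    """Return a sort key; a smaller key means a higher priority."""
    if guard.identity in confirmed:
        # Confirmed guards beat all others, ordered by list position.
        return (0, float(confirmed.index(guard.identity)))
    if guard.is_pending:
        # Pending unconfirmed guards come next, earliest attempt first.
        return (1, guard.last_tried_connect)
    # All remaining guards share the same (lowest) priority.
    return (2, 0.0)


confirmed_ids = ["A", "B"]
guards = [
    Guard("E"),
    Guard("C", is_pending=True, last_tried_connect=5.0),
    Guard("B"),
    Guard("D", is_pending=True, last_tried_connect=3.0),
    Guard("A"),
]
ordered = sorted(guards, key=lambda g: priority_key(g, confirmed_ids))
order = [g.identity for g in ordered]
```

Using a tuple keeps the three tiers separate while still letting position and attempt time break ties within a tier.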

To compute primary guards, take the ordered intersection of {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first {N_PRIMARY_GUARDS} elements. If there are fewer than {N_PRIMARY_GUARDS} elements, add additional elements to PRIMARY_GUARDS chosen _uniformly_ at random from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).

Note that {PRIMARY_GUARDS} do not have to be in {USABLE_FILTERED_GUARDS}: they might be unreachable.

**Rationale**

These guards are treated differently from other guards. If one of them is usable, then we use it right away. For other guards in {FILTERED_GUARDS}, if it's usable, then before using it we might first double-check whether perhaps one of the primary guards is usable after all.

3.5. Retrying guards. [Section:RETRYING]

(We run this process as frequently as needed. It can be done once a second, or just-in-time.)

If a primary sampled guard's {is_reachable} status is <no>, then we decide whether to update its {is_reachable} status to <maybe> based on its {last_tried_connect} time, its {failing_since} time, and the {PRIMARY_GUARDS_RETRY_SCHED} schedule.

If a non-primary sampled guard's {is_reachable} status is <no>, then we decide whether to update its {is_reachable} status to <maybe> based on its {last_tried_connect} time, its {failing_since} time, and the {GUARDS_RETRY_SCHED} schedule.

**Rationale**

An observation that a guard has been 'unreachable' only lasts for a given amount of time, since we can't infer that it's unreachable now from the fact that it was unreachable a few minutes ago.

3.6. Selecting guards for circuits. [Section:SELECTING]

Every origin circuit is now in one of these states: <state:usable_on_completion>, <state:usable_if_no_better_guard>, <state:waiting_for_better_guard>, or <state:complete>.

You may only attach streams to <complete> circuits. (Additionally, you may only send RENDEZVOUS cells, ESTABLISH_INTRO cells, and INTRODUCE cells on <complete> circuits.)
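
As a rough illustration, the four states named above and the transitions this section describes can be written as a small table. This is a sketch in Python, not Tor's implementation: the CircuitState enum, the ALLOWED table, and the extra ENDED member (standing in for "failed" or "closed") are assumptions made for the example.

```python
from enum import Enum


class CircuitState(Enum):
    USABLE_ON_COMPLETION = "usable_on_completion"
    USABLE_IF_NO_BETTER_GUARD = "usable_if_no_better_guard"
    WAITING_FOR_BETTER_GUARD = "waiting_for_better_guard"
    COMPLETE = "complete"
    ENDED = "ended"  # hypothetical: covers both "failed" and "closed"


S = CircuitState

# Transitions as stated in [Section:SELECTING]; ENDED is terminal.
ALLOWED = {
    S.USABLE_ON_COMPLETION: {S.COMPLETE, S.ENDED},
    S.USABLE_IF_NO_BETTER_GUARD: {
        S.USABLE_ON_COMPLETION,
        S.WAITING_FOR_BETTER_GUARD,
        S.ENDED,
    },
    S.WAITING_FOR_BETTER_GUARD: {S.COMPLETE, S.ENDED},
    S.COMPLETE: {S.ENDED},
    S.ENDED: set(),
}


def can_transition(old: CircuitState, new: CircuitState) -> bool:
    """True if the per-circuit state machine permits old -> new."""
    return new in ALLOWED[old]


def may_attach_streams(state: CircuitState) -> bool:
    """Streams (and rendezvous/intro cells) are only allowed on <complete>."""
    return state is S.COMPLETE
```

Note that in this reading a <usable_if_no_better_guard> circuit never becomes <complete> directly; it has to pass through one of the two intermediate states first.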

The per-circuit state machine is:

  New circuits are <usable_on_completion> or <usable_if_no_better_guard>.

  A <usable_on_completion> circuit may become <complete>, or may fail.

  A <usable_if_no_better_guard> circuit may become <usable_on_completion>; may become <waiting_for_better_guard>; or may fail.

  A <waiting_for_better_guard> circuit will become <complete>, or will be closed, or will fail.

  A <complete> circuit remains <complete> until it fails or is closed.

Each of these transitions is described below.

We keep, as global transient state:

  * {tvar:last_time_on_internet} -- the last time at which we successfully used a circuit or connected to a guard. At startup we set this to "infinitely far in the past."

When we want to build a circuit, and we need to pick a guard:

  * If any entry in PRIMARY_GUARDS has {is_reachable} status of <maybe> or <yes>, return the first such guard. The circuit is <usable_on_completion>.

    [Note: We do not use {is_pending} on primary guards, since we are willing to try to build multiple circuits through them before we know for sure whether they work, and since we will not use any non-primary guards until we are sure that the primary guards are all down. (XX is this good?)]

  * Otherwise, if the ordered intersection of {CONFIRMED_GUARDS} and {USABLE_FILTERED_GUARDS} is nonempty, return the first entry in that intersection that has {is_pending} set to false. Set its value of {is_pending} to true. The circuit is now <usable_if_no_better_guard>. (If all entries have {is_pending} true, pick the first one.)

  * Otherwise, if there is no such entry, select a member at random from {USABLE_FILTERED_GUARDS}. Set its {is_pending} field to true. The circuit is <usable_if_no_better_guard>.

We update the {last_tried_connect} time for the guard to 'now.'

**Rationale**

We're getting to the core of the algorithm here. Our main goals are to make sure that

  1. If it's possible to use a primary guard, we do.
  2. We probably use the first primary guard.

So we only try non-primary guards if we're pretty sure that all the primary guards are down, and we only try a given primary guard if the earlier primary guards seem down.

When we _do_ try non-primary guards, however, we only build one circuit through each, to give it a chance to succeed or fail. If ever such a circuit succeeds, we don't use it until we're pretty sure that it's the best guard we're getting. (see below).

[XXX timeout.]

3.7. When a circuit fails. [Section:ON_FAIL]

When a circuit fails in a way that makes us conclude that a guard is not reachable, we take the following steps:

  * We set the guard's {is_reachable} status to <no>. If it had {is_pending} set to true, we make it non-pending.

  * We close the circuit, of course. (This removes it from consideration by the algorithm in [UPDATE_WAITING].)

  * Update the list of waiting circuits. (See [UPDATE_WAITING] below.)

[Note: the existing Tor logic will cause us to create more circuits in response to some of these steps; and also see [ON_CONSENSUS].]

**Rationale**

See [SELECTING] above for rationale.

3.8. When a circuit succeeds [Section:ON_SUCCESS]

When a circuit succeeds in a way that makes us conclude that a guard _was_ reachable, we take these steps:

  * We set its {is_reachable} status to <yes>.
  * We set its {failing_since} to "never".
  * If the guard was {is_pending}, we clear the {is_pending} flag.
  * If the guard was not a member of {CONFIRMED_GUARDS}, we add it to the end of {CONFIRMED_GUARDS}.

  * If this circuit was <usable_on_completion>, this circuit is now <complete>. You may attach streams to this circuit, and use it for hidden services.

  * If this circuit was <usable_if_no_better_guard>, it is now <waiting_for_better_guard>. You may not yet attach streams to it. Then check whether the {last_time_on_internet} is more than {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:

    * If it is, then mark all {PRIMARY_GUARDS} as "maybe" reachable.

    * If it is not, update the list of waiting circuits.
      (See [UPDATE_WAITING] below)

[Note: the existing Tor logic will cause us to create more circuits in response to some of these steps; and see [ON_CONSENSUS].]

**Rationale**

See [SELECTING] above for rationale.

3.9. Updating the list of waiting circuits [Section:UPDATE_WAITING]

We run this procedure whenever it's possible that a <waiting_for_better_guard> circuit might be ready to be called <complete>.

  * If any circuit is <waiting_for_better_guard>, and every currently {is_pending} circuit whose guard has higher priority has been in state <usable_if_no_better_guard> for at least {NONPRIMARY_GUARD_CONNECT_TIMEOUT} seconds, and all primary guards have reachable status of <no>, then call that circuit <complete>.

  * If any circuit is <complete>, then do not use any <waiting_for_better_guard> or <usable_if_no_better_guard> circuits whose guards have lower priority. (Time them out after {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.)

**Rationale**

If we open a connection to a guard, we might want to use it immediately (if we're sure that it's the best we can do), or we might want to wait a little while to see if some other circuit which we like better will finish.

When we mark a circuit <complete>, we don't close the lower-priority circuits immediately: we might decide to use them after all if the <complete> circuit goes down before {NONPRIMARY_GUARD_IDLE_TIMEOUT} seconds.

3.10. Whenever we get a new consensus. [Section:ON_CONSENSUS]

We update {GUARDS}.

For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and {FIRST_UNLISTED_AT}.

[**] We remove entries from {SAMPLED_GUARDS} if appropriate, according to the sampled-guards expiration rules. If they were in {CONFIRMED_GUARDS}, we also remove them from {CONFIRMED_GUARDS}.

We recompute {FILTERED_GUARDS}, and everything that derives from it, including {USABLE_FILTERED_GUARDS}, and {PRIMARY_GUARDS}.

(Whenever one of the configuration options that affects the filter is updated, we repeat the process above, starting at the [**] line.)

3.11. Deciding whether to generate a new circuit. [Section:NEW_CIRCUIT_NEEDED]

In current Tor, we generate a new circuit when we don't have enough circuits either built or in-progress to handle a given stream, or an expected stream.

For the purpose of this rule, we say that <waiting_for_better_guard> circuits are neither built nor in-progress; that <complete> circuits are built; and that the other states are in-progress.

A. Appendices

A.1. Parameters with suggested values. [Section:PARAM_VALS]

(All suggested values chosen arbitrarily)

  {param:MIN_SAMPLE_THRESHOLD} -- 15

  {param:MAX_SAMPLE_THRESHOLD} -- 50

  {param:GUARD_LIFETIME} -- 120 days

  {param:REMOVE_UNLISTED_GUARDS_AFTER} -- 20 days
     [previously ENTRY_GUARD_REMOVE_AFTER]

  {param:MIN_FILTERED_SAMPLE} -- 10

  {param:N_PRIMARY_GUARDS} -- 3

  {param:PRIMARY_GUARDS_RETRY_SCHED}
     -- every 30 minutes for the first 6 hours.
     -- every 2 hours for the next 3.75 days.
     -- every 4 hours for the next 3 days.
     -- every 9 hours thereafter.

  {param:GUARDS_RETRY_SCHED} -- 1 hour
     -- every hour for the first 6 hours.
     -- every 4 hours for the next 3.75 days.
     -- every 18 hours for the next 3 days.
     -- every 36 hours thereafter.

  {param:INTERNET_LIKELY_DOWN_INTERVAL} -- 10 minutes

  {param:NONPRIMARY_GUARD_CONNECT_TIMEOUT} -- 15 seconds

  {param:NONPRIMARY_GUARD_IDLE_TIMEOUT} -- 10 minutes

  {param:MEANINGFUL_RESTRICTION_FRAC} -- .2

  {param:EXTREME_RESTRICTION_FRAC} -- .01

  {param:GUARD_CONFIRMED_MIN_LIFETIME} -- 60 days

A.2. Random values [Section:RANDOM]

Frequently, we want to randomize the expiration time of something so that it's not easy for an observer to match it to its start time. We do this by randomizing its start date a little, so that we only need to remember a fixed expiration interval.

By RAND(now, INTERVAL) we mean a time between now and INTERVAL in the past, chosen uniformly at random.

A.3. Why not a sliding scale of primaryness? [Section:CVP]

At one meeting, I floated the idea of having "primaryness" be a continuous variable rather than a boolean.

I'm no longer sure this is a great idea, but I'll try to outline how it might work.

To begin with: being "primary" gives it a few different traits:

  1) We retry primary guards more frequently. [Section:RETRYING]

  2) We don't even _try_ building circuits through lower-priority guards until we're pretty sure that the higher-priority primary guards are down. (With non-primary guards, on the other hand, we launch exploratory circuits which we plan not to use if higher-priority guards succeed.) [Section:SELECTING]

  3) We retry them all one more time if a circuit succeeds after the net has been down for a while. [Section:ON_SUCCESS]

We could make each of the above traits continuous:

  1) We could make the interval at which a guard is retried depend continuously on its position in CONFIRMED_GUARDS.

  2) We could change the number of guards we test in parallel based on their position in CONFIRMED_GUARDS.

  3) We could change the rule for how long the higher-priority guards need to have been down before we call a <usable_if_no_better_guard> circuit <complete> based on a possible network-down condition. For example, we could retry the first guard if we tried it more than 10 seconds ago, the second if we tried it more than 20 seconds ago, etc.

I am pretty sure, however, that if these are worth doing, they need more analysis! Here's why:

  * They all have the potential to leak more information about a guard's exact position on the list. Is that safe? Is there any way to exploit that? I don't think we know.

  * They all seem like changes which it would be relatively simple to make to the code after we implement the simpler version of the algorithm described above.

TODO. Still non-addressed issues [Section:TODO]

Formats to use when making information persistent

Migration from old data format to new.

Explain the overall flow of the circuit creation and guard picking algorithms, if they are not clear.

Simulate to answer: Will this work in a dystopic world?

Simulate actual behavior.

For all lifetimes: instead of storing the "this began at" time, store the "remove this at" time, slightly randomized.

Clarify that when you get a <complete> circuit, you might need to relaunch circuits through that same guard immediately, if they are circuits that have to be independent.

Fix all items marked XX or TODO.

"Directory guards" -- do they matter?

   Suggestion: require that all guards support downloads via BEGINDIR. We don't need to worry about directory guards for relays, since we aren't trying to prevent relay enumeration.

IP version preferences via ClientPreferIPv6ORPort

   Suggestion: Treat it as a preference when adding to {CONFIRMED_GUARDS}, but not otherwise.
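
To make the guard-picking rule in [Section:SELECTING] more concrete, here is a minimal Python sketch of one way to read it. It is not Tor code: the Guard class and pick_guard function are hypothetical, "at random" is assumed to mean uniformly at random, and returning None when no usable guard exists is an assumption, since the proposal does not say what happens in that case.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Guard:
    # Hypothetical record; fields mirror the proposal's transient variables.
    identity: str
    is_reachable: str = "maybe"  # one of "yes", "no", "maybe"
    is_pending: bool = False
    last_tried_connect: Optional[float] = None


def pick_guard(
    primary: List[Guard],
    confirmed: List[Guard],
    usable_filtered: List[Guard],
    now: float,
    rng: random.Random,
) -> Optional[Tuple[Guard, str]]:
    """Return (guard, initial circuit state), or None if nothing is usable."""
    # Step 1: the first primary guard that might be reachable wins outright.
    for guard in primary:
        if guard.is_reachable in ("yes", "maybe"):
            guard.last_tried_connect = now
            return guard, "usable_on_completion"

    # Step 2: ordered intersection of confirmed and usable filtered guards.
    usable_ids = {g.identity for g in usable_filtered}
    candidates = [g for g in confirmed if g.identity in usable_ids]
    if candidates:
        # Prefer the first non-pending entry; if all are pending, the first.
        chosen = next((g for g in candidates if not g.is_pending), candidates[0])
    elif usable_filtered:
        # Step 3: fall back to a random usable filtered guard.
        chosen = rng.choice(usable_filtered)
    else:
        return None  # assumption: the proposal leaves this case unspecified

    chosen.is_pending = True
    chosen.last_tried_connect = now
    return chosen, "usable_if_no_better_guard"
```

Passing the random source in explicitly keeps the sketch deterministic under test; a real client would also need the retry and timeout rules from the other sections before this ordering is safe to rely on.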


txtorcon 0.15.0
by meejah 15 Aug '16


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm happy to announce txtorcon 0.15.0:

 * added support for NULL control-port-authentication which is often appropriate when used with a UNIX domain socket
 * switched to https://docs.python.org/3/library/ipaddress.html instead of Google's ipaddr; the API should be the same from a user perspective but **packagers and tutorials** will want to change their instructions slightly (``pip install ipaddress`` or ``apt-get install python-ipaddress`` are the new ways).
 * support the new ADD_ONION and DEL_ONION "ephemeral hidden services" commands in TorConfig
 * a first stealth-authentication implementation (for "normal" hidden services, not ephemeral)
 * bug-fix from https://github.com/david415 to raise ConnectionRefusedError instead of StopIteration when running out of SOCKS ports.
 * new feature from https://github.com/david415 adding a ``build_timeout_circuit`` method which provides a Deferred that callbacks only when the circuit is completely built and errbacks if the provided timeout expires. This is useful because :doc:`TorState.build_circuit` callbacks as soon as a Circuit instance can be provided (and then you'd use :doc:`Circuit.when_built` to find out when it's done building).
 * new feature from https://github.com/coffeemakr falling back to password authentication if cookie authentication isn't available (or fails, e.g. because the file isn't readable).
 * both TorState and TorConfig now have a ``.from_protocol`` class-method.
 * spec-compliant string-un-escaping from https://github.com/coffeemakr
 * fix https://github.com/meejah/txtorcon/issues/176

You can download the release from PyPI or GitHub (or of course "pip install txtorcon"):

   https://pypi.python.org/pypi/txtorcon/0.15.0
   https://github.com/meejah/txtorcon/releases/tag/v0.15.0

Releases are also available from the hidden service:

   http://timaq4ygg2iegci7.onion/txtorcon-0.15.0.tar.gz
   http://timaq4ygg2iegci7.onion/txtorcon-0.15.0.tar.gz.asc

You can verify the sha256sum of both by running the following 4 lines in a shell wherever you have the files downloaded:

cat <<EOF | sha256sum --check
f2e8cdb130aa48d63c39603c2404d9496c669fa8b4c724497ca6bfa7752a9475  dist/txtorcon-0.15.0.tar.gz
a359fb5e560263499400018262494378b3d347cd04a361adb08939df95ecedf6  dist/txtorcon-0.15.0-py2-none-any.whl
EOF

thanks,
meejah
-----BEGIN PGP SIGNATURE-----

iQEcBAEBAgAGBQJXl/KEAAoJEMJgKAMSgGmn76gH/1du7i9dmkMpr2PJrexVeXSo
9mSaeX/7KKaW71pEMmaCXfvhDJ6dMZDQpZ7saTM31zJZTp+MXjtHf0DZI2QTwgDw
NYEBH+LO8PINN1ezPomgeZE6E4eJYlaDCyO6c7j3cOsEmohST+GPpvvdWdft+Sw2
hWvVf2+I4BV7vcIx6WQx4jKBS2gmlHbxuUv3LAnjj/Tn6oSYpft1IUK39pM66DX4
FzdYeBTloC6nzyH4sRTxnax+l9MfQJ2ZR+5alJi8uEvGlk580ciFASQNCVLaBY9r
4YALoipEg2Fm4BFA7qLsH0aFoLgx0lv7ng8lmpaP7XlPjUCuA7OcDp5jSqhGt2A=
=hWml
-----END PGP SIGNATURE-----
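
For readers without a sha256sum binary, the checksum step in the announcement can be approximated with the Python standard library. This is only a sketch and not part of txtorcon: sha256_of and verify are hypothetical helper names, and the file paths are whatever you downloaded.

```python
import hashlib
from typing import Dict


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected: Dict[str, str]) -> Dict[str, bool]:
    """Map each path to True if its digest matches the expected hex string."""
    return {
        path: sha256_of(path) == want.lower()
        for path, want in expected.items()
    }
```

Calling verify with a dictionary that maps dist/txtorcon-0.15.0.tar.gz and dist/txtorcon-0.15.0-py2-none-any.whl to the two digests listed in the announcement performs the same comparison as the shell snippet. It checks integrity only; the PGP signature still has to be verified separately.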


Tor and Namecoin
by Jeremy Rand 04 Aug '16


Hello Tor devs,

Namecoin is interested in collaboration with Tor in relation to human-readable .onion names; I'm reaching out to see how open the Tor community would be to this, and to get feedback on how exactly the integration might work.

The new hidden service spec is going to substantially increase the length of .onion names, which presents usability concerns. Namecoin provides a way to resolve a human-readable .bit name to a .onion name. Another benefit of Namecoin is that it provides a way to look up TLS fingerprints for clearnet .bit sites, which reduces the risk of MITM attacks on clearnet websites from malicious or compromised CAs.

I had the pleasure of meeting Mike Perry at the Decentralized Web Summit at the Internet Archive in June; I talked to him about Namecoin's rough plans and he suggested I post here. I understand that Riccardo Spagni from Monero discussed this topic as well with Roger Dingledine at the Security in Times of Surveillance conference at Ei_PSI.

The two major concerns that I expect would be brought up involve anonymity and blockchain size. Here's how we plan to deal with these issues:

Namecoin already provides location-anonymity for name registrations assuming that it's routed via Tor. It's also necessary to broadcast transactions for different names to different peers, which isn't coded yet, but this is just coding work rather than an engineering challenge -- a usable workaround today is running multiple Namecoin wallets. The more interesting challenge is blockchain anonymity for registrations, due to the linkability required for blockchain validation. An important point here is that transactions for a given name are inherently linkable to each other, and that this isn't problematic. The problem would come when multiple names are linked together, or when a name is linked with currency transactions.

The solution I've come up with is to use atomic cross-chain trades, which let a user buy namecoins using a cryptocurrency that is designed to provide anonymity (such as Monero or Zcash, both of which have cryptographic proofs of anonymity, given a certain anonymity set and security assumptions). The user would use an anonymous cryptocurrency to buy a small amount of namecoins (enough to register a single name and keep it renewed for a while). If the user wanted to register another name, she would perform another atomic cross-chain trade, receiving namecoins that are not linked to the namecoins obtained for the first name. As long as those namecoins are not mixed by the wallet software, the names remain unlinked.

Many users won't want to download the full Namecoin blockchain (around 3 GB at the moment). I have a proof-of-concept SPV-based Namecoin name lookup client working as of early June. I just got a large part of that code upstreamed into libdohj, and I'm working on getting the rest upstreamed and released. It's in Java (based on BitcoinJ), so it's not subject to the memory safety concerns that C/C++ code is. The SPV name lookups are implemented in 3 ways, depending on the user's needs:

Option A:

1. Block headers are synced over the Namecoin P2P network. (Over clearnet this takes about 5 minutes the first time it runs.)
2. An index mapping unexpired block heights to block hashes is constructed, so that lookups can be done quickly. (This occurs when the SPV client starts, after syncup has completed; it's fast enough that I haven't found a need to benchmark it.)
3. When a name lookup request is received, the client asks a remote API server for the height of the last update of the name.
4. The client looks up the block hash of that height from its index, and requests that block over the P2P network.
5. The client verifies that the received block matches the correct hash and that the block follows Namecoin rules (e.g. verifying the merkle root).
6. The client looks through the transactions in the block until it finds the one that updates the name.
7. The client retrieves the value of the name from that transaction, and returns it to the user.

Option B:

1 through 3. Same as Option A.
4. The API server also provides the full content of the transaction, as well as a merkle proof of inclusion in the block.
5. The client verifies that the merkle proof links the hash of the provided transaction to the merkle root of the block header with the given height.
6. The client retrieves the value of the name from the provided transaction, and returns it to the user.

Option C:

1. Block headers are synced over the Namecoin P2P network, as well as full blocks for the past year (meaning that all full blocks that contain unexpired name data will be synced). (Over clearnet this takes about 10 minutes the first time it runs.)
2. An index mapping names to transactions is constructed as the full blocks are downloaded. (This uses LevelDB.)
3. When a name lookup request is received, the client looks up the transaction in the LevelDB index.
4. The client retrieves the value of the name from that transaction, and returns it to the user.

For Options A and B, if the API server is malicious, it can do any of the following:

1. Falsely claim that the name doesn't exist.
2. Provide outdated name data that is less than 36000 blocks old (the expiration period for Namecoin).

(Option C is not vulnerable to either of those attacks.) If multiple API servers are consulted, and they return different results, it is easy to tell which is lying (although I haven't implemented any such logic yet).

The API server cannot do any of the following:

1. Provide name data that isn't from the blockchain with the most work.
2. Provide name data that is more than 36000 blocks old (the expiration period for Namecoin).

The reason an API server is used in Options A and B instead of the P2P network is that the P2P network is unauthenticated and easy to Sybil.
The P2P network is great for getting data that is independently verifiable (e.g. block headers and contents of blocks), but it's unwise to rely on the P2P network to get unverifiable data such as a block height of a name. An API server is authenticated (currently via CA-based TLS, but a cert pin or PGP signing is certainly doable), which reduces the possible points of attack. This is analogous to why Tor uses centralized directory authorities -- authenticated trust points are harder to Sybil. (We do have longer term plans to introduce a way for SPV clients to get the latest transaction associated with a name, without using an API server or needing to download any full blocks, but that's out of scope of this email.)

Options A and B do reveal to the API server which name is being looked up. If Option A is used, it also reveals to a P2P peer which block height is being looked up (which narrows the set of names by a factor of ~36000). Therefore, Tor stream isolation should be used in such cases. (That's not implemented yet.) Option C doesn't generate any network traffic on lookups, so it doesn't reveal anything.

In my testing, an SPV-based name lookup using Option A takes around 650 milliseconds (over clearnet). The vast majority of this is latency to the API server (the server I'm testing with is on a low-budget hosting plan). The portion consisting of a block retrieval over P2P takes around 98 milliseconds (although it varies by block size). Option C takes around 4 milliseconds. The storage overhead of Option C's LevelDB database is around 400 MB right now, although I believe it's feasible to reduce this significantly.

There are a few options I can think of for integrating this with Tor for .onion naming. One would be to modify OnioNS to call the Namecoin SPV client. This would concern me because OnioNS is in C++, which introduces the risk of memory safety vulnerabilities. Another would be to use an intermediate proxy like Yawning's or-ctl-filter.
A third option would be to try to get external name resolution implemented in Tor itself, which I believe Jeff Burdges has suggested in the past. If Option A or B is used, any solution would need to pass the stream isolation info to the SPV client.

Integrating this with Tor Browser for TLS certificate validation might involve a Firefox patch. There are tricks that can be done with the CertDB and SiteSecurityService XPCOM interfaces that will do the job without Firefox patches, but XPCOM is being phased out by Mozilla in favor of WebExtensions, and I'm unaware of any equivalent features in WebExtensions. (Also, it's unclear to me whether CertDB and SiteSecurityService would introduce isolation issues -- I can't think of any obvious attacks, but I haven't thought very hard about it.) I'm trying to engage with Mozilla to see if we can work out a WebExtensions feature for this, but nothing conclusive has happened on that front yet.

On the subject of reproducible builds, I've never tried to build Java code in Gitian, so I'm not certain how difficult it's going to be. Since Android uses Java, maybe the Guardian Project devs would have some insight into the best way to do it. One of the Namecoin developers (Joseph Bisch) is really good with reproducible builds (you probably know him since he's the author of the Debian guest support in Gitian), so I'm reasonably confident that a way to do it can be found.

I'd love to hear feedback on all of this.

Cheers,
-Jeremy Rand
Lead Application Engineer of Namecoin
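
To illustrate step 5 of Option B, here is a minimal Python sketch of checking a merkle proof against a block header's merkle root. It is not the libdohj code: it assumes a Bitcoin-style tree built from double SHA-256 over concatenated child hashes, the proof format (a list of sibling hashes with a left/right flag) is hypothetical, and byte-order conventions of real block headers are ignored.

```python
import hashlib
from typing import List, Tuple


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as assumed here for Bitcoin-style merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_merkle_proof(
    tx_hash: bytes,
    branch: List[Tuple[bytes, bool]],
    merkle_root: bytes,
) -> bool:
    """Walk from the transaction hash up to the root.

    branch lists (sibling_hash, sibling_is_left) pairs from the leaf
    level upward; this layout is an assumption made for the example.
    """
    node = tx_hash
    for sibling, sibling_is_left in branch:
        if sibling_is_left:
            node = dsha256(sibling + node)
        else:
            node = dsha256(node + sibling)
    return node == merkle_root


# Build a tiny four-leaf tree by hand to exercise the check.
leaves = [dsha256(bytes([i])) for i in range(4)]
n01 = dsha256(leaves[0] + leaves[1])
n23 = dsha256(leaves[2] + leaves[3])
root = dsha256(n01 + n23)
proof_for_leaf2 = [(leaves[3], False), (n01, True)]
```

The point of the check is that the API server cannot substitute a transaction that is absent from the block: any change to the transaction or to a sibling hash yields a different root than the one in the header the client already synced.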

