Hi,
tldr:
- more outdated relays
(that is a claim I'm making and you could
easily prove me wrong by recreating the 0.3.3.x alpha
repos, shipping 0.3.3.7 in them, and seeing how things evolve
after a week or so)
- more work for the tpo website maintainer
- less happy relay operators [3][4]
- more work for repo maintainers? (since a new repo needs to be created)
When the tor 0.3.4 alpha repos (deb.torproject.org) first appeared on 2018-05-23
I was about to submit a PR for the website to include it in the sources.list
generator [1] on tpo but didn't do it because I wanted to wait for a previous PR to be merged first.
The outstanding PR got merged eventually (2018-06-28) but I still did not submit a PR to
update the repo generator for 0.3.4.x, and here is why.
Recently I was wondering: why are there so many relays running tor version 0.3.3.5-rc?
(see OrNetStats or Relay Search; > 3.2% CW fraction)
Then I realized that this was the last version the tor-experimental-0.3.3.x-*
repos were shipping before they got abandoned due to the new 0.3.4.x-* repos
(I can no longer verify this since those repos have been removed by now).
Peter made it clear in the past that the current way to
have per-major-version debian alpha repos (i.e. tor-experimental-0.3.4.x-jessie)
will not change [2]:
> If you can't be bothered to change your sources.list once or twice a
> year, then you probably should be running stable.
but maybe someone else would be willing to invoke an
"ln" command every time a new alpha repo is born:
tor-alpha-jessie -> tor-experimental-0.3.4.x-jessie
once 0.3.5.x repos are created the link would point to
tor-alpha-jessie -> tor-experimental-0.3.5.x-jessie
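A tiny script along these lines could repoint such an alias whenever a new
alpha series appears (purely a sketch; the repository path and suite names
below are made up for illustration):

  #!/usr/bin/env python3
  # Sketch only: repoint the "tor-alpha-*" aliases to the newest alpha series.
  # The DISTS path and SUITES list are assumptions, not the real repo layout.
  import os

  DISTS = "/srv/deb.torproject.org/dists"   # hypothetical repo root
  SERIES = "0.3.5.x"                        # the newly created alpha series
  SUITES = ["jessie", "stretch", "sid"]

  for suite in SUITES:
      alias = os.path.join(DISTS, "tor-alpha-" + suite)
      target = "tor-experimental-%s-%s" % (SERIES, suite)
      if os.path.islink(alias):
          os.remove(alias)                  # drop the old link
      os.symlink(target, alias)             # e.g. tor-alpha-jessie -> tor-experimental-0.3.5.x-jessie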
It is my opinion that this will help reduce the number of relays running
outdated versions of tor.
It will certainly avoid having to update the tpo website, which isn't a big task
and could probably be automated, but that isn't done currently.
"..but that would cause relay operators to jump from i.e. 0.3.3.x to 0.3.4.x alphas
(and break setups)!"
Yes, and I think that is better than relays stuck on an older version because
the former repo no longer exists and operators still can choose the old repos
which will not jump to newer major versions.
[1] https://www.torproject.org/docs/debian.html.en#ubuntu
[2] https://trac.torproject.org/projects/tor/ticket/14997#comment:3
[3] https://lists.torproject.org/pipermail/tor-relays/2018-June/015549.html
[4] https://trac.torproject.org/projects/tor/ticket/26474
--
https://twitter.com/nusenu_
https://mastodon.social/@nusenu
Hi,
every now and then I'm in contact with relay operators
about the "health" of their relays.
Following these 1:1 discussions and the discussion on tor-relays@
I'd like to raise two issues with you (the developers) with the goal
of helping improve relay operations and the end user experience in the long term:
1) DNS (exits only)
2) tor relay health data
1) DNS
------
Current situation:
Arthur Edelstein provides public measurements to tor exit relay operators via
his page at: https://arthuredelstein.net/exits/
This page is updated once daily.
The process for exit operators to use that data looks like this:
- first they watch Arthur's measurement results
- if their failure rate is non-zero, they try to tweak/improve/change their setup
- they wait another 24 hours (for the next measurement)
This is a somewhat suboptimal and slow feedback loop, and the data is probably
also less accurate and less valuable than what the tor
process itself could provide.
Suggestion for improvement:
Expose the following DNS status information
via tor's controlport to help debug and detect DNS issues on exit relays
(total numbers since startup):
- number of DNS queries sent to the resolver
- number of DNS queries sent to the resolver due to a RESOLVE request
- number of DNS queries sent to the resolver due to a reverse RESOLVE request
- number of queries that did not result in any answer from the resolver
- breakdown of the number of responses by response code (RCODE)
https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-pa…
- maximum number of DNS queries sent per circuit
If this causes a significant performance impact this feature should be disabled
by default.
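To illustrate how operators could consume such counters if they existed,
here is a rough sketch using stem; the "dns/*" GETINFO keys below are
invented for illustration and do not exist in current tor:

  #!/usr/bin/env python3
  # Sketch: poll hypothetical DNS counters over the controlport with stem.
  from stem.control import Controller

  HYPOTHETICAL_KEYS = [
      "dns/queries/total",
      "dns/queries/resolve",
      "dns/queries/resolve-reverse",
      "dns/queries/unanswered",
      "dns/responses/rcode",          # breakdown by RCODE
      "dns/queries/max-per-circuit",
  ]

  with Controller.from_port(port=9051) as controller:
      controller.authenticate()
      for key in HYPOTHETICAL_KEYS:
          try:
              print(key, "=", controller.get_info(key))
          except Exception as exc:    # key not implemented (yet)
              print(key, "-> not available:", exc)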
2) general relay health metrics
--------------------------------
Compared to other server daemons (webserver, DNS server, ..)
tor provides little data for operators to detect operational issues
and anomalies.
I'd suggest providing the following stats via the control port
(most of them are already written to logfiles by default but not accessible
via the controlport, as far as I've seen):
- total amount of memory used by the tor process
- number of currently open circuits
- circuit handshake stats (TAP / NTor)
DoS mitigation stats:
- number of circuits killed with too many cells
- number of circuits rejected
- number of marked addresses
- number of connections closed
- number of single hop clients refused
- number of closed/failed circuits broken down by their reason value
https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n1402
https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n1994
- number of closed/failed OR connections broken down by their reason value
https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n2205
If this causes a significant performance impact this feature should be disabled
by default.
cell stats
- extra info cell stats
as defined in:
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n1072
This data should be useful to answer the following questions:
- High level questions: Is the tor relay healthy?
- is it hitting any resource limits?
- is the tor process under unusual load?
- why is tor using more memory?
- is it slower than usual at handling circuits?
- can the DNS resolver handle the volume of DNS queries tor is sending it?
This data could help prevent errors from occurring or provide
additional data when trying to narrow down issues.
When it comes to the question:
**Is it "safe" to make this data accessible via the controlport?**
I assume it is safe for all information that current versions of
tor write to logfiles or even publish as part of their extra-info descriptors.
Should tor provide this or similar data,
I'm planning to write scripts for operators to make use
of that data (for example a munin plugin that connects to tor's controlport).
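As a rough illustration of what such a plugin could look like, here is a
minimal munin-style sketch using stem and GETINFO keys that already exist
(traffic/read, traffic/written) plus a circuit count; the counters proposed
above would be emitted the same way once (if) they exist:

  #!/usr/bin/env python3
  # Minimal munin-style sketch: print a few values the controlport exposes today.
  from stem.control import Controller

  with Controller.from_port(port=9051) as controller:
      controller.authenticate()
      print("traffic_read.value", controller.get_info("traffic/read"))
      print("traffic_written.value", controller.get_info("traffic/written"))
      print("open_circuits.value", len(controller.get_circuits()))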
I'm happy to help write updates for control-spec should these features
seem reasonable to you.
Looking forward to hearing your feedback.
nusenu
--
https://twitter.com/nusenu_
https://mastodon.social/@nusenu
The unbearable situation with Google's reCAPTCHA
motivated this email (but it is not limited to this
specific case).
This idea came up when seeing a similar functionality
in unbound (which has it for a different reason).
Assumption: There are systems that block some tor exit
IP addresses (most likely the bigger ones), but they
are not blocked due to the fact that they are tor exits.
It just so happens that the IP got flagged
because of "automated / malicious" requests and IP reputation
systems.
What if every circuit had its "own" IP
address at the exit relay to avoid causing collateral damage
to all users of the exit if one was bad? (until the exit runs out of IPs and
starts to recycle previously used IPs again)
The goal is to avoid accumulating a bad "reputation" for the
single used exit IP address that affects all tor users
of that exit.
Instead of doing it on the circuit level you could do it
based on time. Change the exit IP every 5 minutes (but
do _not_ change the exit IPs for _existing_ circuits even if they
live longer than 5 minutes).
Yes, no one has that many IPv4 addresses, but with the
increasing availability of IPv6 at exits and destinations
this could be feasible to a certain extent, depending on
how many IPv6 addresses the exit operator has.
There are exit operators that have entire /48 IPv6 blocks.
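Just to illustrate the numbers involved: a /48 leaves 80 host bits, i.e.
2^80 possible exit addresses, so the address space itself is not the
bottleneck. A toy sketch of picking one address per circuit (tor does not
implement anything like this; the prefix below is the documentation prefix):

  #!/usr/bin/env python3
  # Illustration only: derive a distinct source address per circuit from a /48.
  import ipaddress
  import secrets

  EXIT_PREFIX = ipaddress.ip_network("2001:db8:1234::/48")
  print("addresses available:", EXIT_PREFIX.num_addresses)   # 2**80

  def address_for_new_circuit():
      """Pick a random address inside the prefix for a newly built circuit."""
      host_bits = 128 - EXIT_PREFIX.prefixlen
      offset = secrets.randbelow(2 ** host_bits)
      return EXIT_PREFIX.network_address + offset

  print("example circuit address:", address_for_new_circuit())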
problems:
- will not solve anything since reputation will shift to netblocks as well
(How big of a netblock are you willing to block?)
- you can easily tell two tor users apart from each other
even if they use the same exit (or more generally: you can
tell circuits apart). There might be all kinds of bad implications
that I'm not thinking of right now.
- check.tpo would no longer be feasible
- how do we still provide the list of exit IPs for easy blocking?
Exits could signal their used netblock via their descriptor. What if they don't?
(that in turn opens new kinds of attacks where an exit claims to be /0
and the target effectively blocks everything)
- more state to track and store at the exit
-...
some random thoughts,
nusenu
--
https://mastodon.social/@nusenu
twitter: @nusenu_
Hi!
I'm sending a new version of proposal 295 from Tomer Ashur, Orr
Dunkelman, and Atul Luykx. It's an updated version of their design
for an improved relay cell encryption scheme, to prevent tagging
attacks.
This proposal is checked into the torspec repository. I'm also
linking to a diagram for this scheme (and its latex source) from Atul
Luykx: https://people.torproject.org/~nickm/prop295/
Finally, I have a draft python reference implementation for an older
version of this proposal. I hope to be updating it soon and sending
out a link next week.
cheers! -- Nick
Filename: 295-relay-crypto-with-adl.txt
Title: Using ADL for relay cryptography (solving the crypto-tagging attack)
Author: Tomer Ashur, Orr Dunkelman, Atul Luykx
Created: 22 Feb 2018
Last-Modified: 1 March 2019
Status: Open
0. Context
Although Crypto Tagging Attacks were identified already in the
original Tor design, it was not before the rise of the
Procyonidae in 2012 that their severity was fully realized. In
Proposal 202 (Two improved relay encryption protocols for Tor
cells) Nick Mathewson discussed two approaches to stymie tagging
attacks and generally improve Tor's cryptography. In Proposal 261
(AEZ for relay cryptography) Mathewson puts forward a concrete
approach which uses the tweakable wide-block cipher AEZ.
This proposal suggests an alternative approach to Proposal 261
using the notion of Release (of) Unverified Plaintext (RUP)
security. It describes an improved algorithm for circuit
encryption based on CTR-mode which is already used in Tor, and an
additional component for hashing.
Incidentally, and similar to Proposal 261, this proposal employs
the ENCODE-then-ENCIPHER approach thus it improves Tor's E2E
integrity by using (sufficient) redundancy.
For more information about the scheme and a security proof for
its RUP-security see
Tomer Ashur, Orr Dunkelman, Atul Luykx: Boosting
Authenticated Encryption Robustness with Minimal
Modifications. CRYPTO (3) 2017: 3-33
available online at https://eprint.iacr.org/2017/239 .
For authentication between the OP and the edge node we use
the PIV scheme: https://eprint.iacr.org/2013/835
2. Preliminaries
2.1 Motivation
For motivation, see proposal 202.
2.2. Notation
Symbol Meaning
------ -------
M Plaintext
C_I Ciphertext
CTR Counter Mode
N_I A de/encryption nonce (to be used in CTR-mode)
T_I A tweak (to be used to de/encrypt the nonce)
T'_I A running digest
^ XOR
|| Concatenation
(This is more readable than a single | but must be adapted
before integrating the proposal into tor-spec.txt)
2.3. Security parameters
HASH_LEN -- The length of the hash function's output, in bytes.
PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509)
DIG_KEY_LEN -- The key length used to digest messages (e.g.,
using GHASH). Since GHASH is only defined for 128-bit keys, we
recommend DIG_KEY_LEN = 128.
ENC_KEY_LEN -- The key length used for encryption (e.g., AES). We
recommend ENC_KEY_LEN = 128.
2.4. Key derivation (replaces Section 5.2.2)
For newer KDF needs, Tor uses the key derivation function HKDF
from RFC5869, instantiated with SHA256. The generated key
material is:
K = K_1 | K_2 | K_3 | ...
where, if H(x,t) denotes HMAC_SHA256 with value x and key t,
and m_expand denotes an arbitrarily chosen value,
and INT8(i) is an octet with the value "i", then
K_1 = H(m_expand | INT8(1) , KEY_SEED )
and K_(i+1) = H(K_i | m_expand | INT8(i+1) , KEY_SEED ),
in RFC5869's vocabulary, this is HKDF-SHA256 with info ==
m_expand, salt == t_key, and IKM == secret_input.
When used in the ntor handshake a string of key material is
generated and is used in the following way:
Length Purpose Notation
------ ------- --------
HASH_LEN forward digest IV DF *
HASH_LEN backward digest IV DB *
ENC_KEY_LEN encryption key Kf
ENC_KEY_LEN decryption key Kb
DIG_KEY_LEN forward digest key Khf
DIG_KEY_LEN backward digest key Khb
ENC_KEY_LEN forward tweak key Ktf
ENC_KEY_LEN backward tweak key Ktb
DIGEST_LEN nonce to use in the *
hidden service protocol
* I am not sure that we need these any longer.
Excess bytes from K are discarded.
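[Illustrative sketch, not part of the proposal: the expansion above is just
HKDF-SHA256 expand from RFC5869, written out explicitly in Python.]

  import hashlib
  import hmac

  def H(x, t):
      """HMAC_SHA256 with value x and key t, as defined above."""
      return hmac.new(t, x, hashlib.sha256).digest()

  def kdf(key_seed, m_expand, n_bytes):
      """Return the first n_bytes of K = K_1 | K_2 | K_3 | ..."""
      k, k_i, i = b"", b"", 0
      while len(k) < n_bytes:
          i += 1
          k_i = H(k_i + m_expand + bytes([i]), key_seed)
          k += k_i
      return k[:n_bytes]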
2.6. Ciphers
For hashing(*) we use GHASH with a DIG_KEY_LEN-bit key. We write
this as Digest(K,M) where K is the key and M the message to be
hashed.
We use AES with an ENC_KEY_LEN-bit key. For AES encryption
(resp., decryption) we write E(K,X) (resp., D(K,X)) where K is an
ENC_KEY_LEN-bit key and X the block to be encrypted (resp.,
decrypted).
For a stream cipher, unless otherwise specified, we use
ENC_KEY_LEN-bit AES in counter mode, with a nonce that is
generated as explained below. We write this as Encrypt(K,N,X)
(resp., Decrypt(K,N,X)) where K is the key, N the nonce, and X
the message to be encrypted (resp., decrypted).
(*) The terms hash and digest are used interchangeably.
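[Illustrative sketch, not part of the proposal: the E/D and Encrypt/Decrypt
primitives above written with the Python "cryptography" package. The
Digest (GHASH) primitive is omitted here since that package does not expose
GHASH as a standalone function.]

  from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

  def E(K, X):
      """Single-block AES encryption E(K,X); X is one 16-byte block."""
      enc = Cipher(algorithms.AES(K), modes.ECB()).encryptor()
      return enc.update(X) + enc.finalize()

  def D(K, X):
      """Single-block AES decryption D(K,X)."""
      dec = Cipher(algorithms.AES(K), modes.ECB()).decryptor()
      return dec.update(X) + dec.finalize()

  def Encrypt(K, N, X):
      """AES-CTR with key K and 16-byte nonce N, as Encrypt(K,N,X) above."""
      enc = Cipher(algorithms.AES(K), modes.CTR(N)).encryptor()
      return enc.update(X) + enc.finalize()

  Decrypt = Encrypt   # CTR-mode encryption and decryption are identical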
3. Routing relay cells
3.1. Forward Direction
The forward direction is the direction that CREATE/CREATE2 cells
are sent.
3.1.1. Routing from the Origin
Let n denote the integer representing the destination node. For
I = 1...n+1, T'_{I} is initialized to the 128-bit string consisting
entirely of '0's. When an OP sends a relay cell, they prepare the
cell as follows:
The OP prepares the authentication part of the message:
C_{n+1} = M
T_{n+1} = Digest(Khf_n,T'_{n+1}||C_{n+1})
N_{n+1} = T_{n+1} ^ E(Ktf_n,T_{n+1} ^ 0)
T'_{n+1} = T_{n+1}
Then, the OP prepares the multi-layered encryption:
For I=n...1:
C_I = Encrypt(Kf_I,N_{I+1},C_{I+1})
T_I = Digest(Khf_I,T'_I||C_I)
N_I = T_I ^ E(Ktf_I,T_I ^ N_{I+1})
T'_I = T_I
The OP sends C_1 and N_1 to node 1.
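[Illustrative sketch, not part of the proposal: the OP-side preparation
above as straight-line Python. It reuses the E()/Encrypt() sketch from
Section 2.6; Digest() is stubbed with a truncated HMAC purely so the sketch
runs and is NOT the GHASH call the proposal specifies.]

  import hashlib
  import hmac

  def Digest(K, M):
      # Stand-in for GHASH(K,M); illustration only.
      return hmac.new(K, M, hashlib.sha256).digest()[:16]

  def xor16(a, b):
      return bytes(x ^ y for x, y in zip(a, b))

  def op_prepare_forward(M, hops, T_prime):
      """hops[i] = (Kf, Khf, Ktf) for node i+1; T_prime holds the running
      digests T'_1 ... T'_{n+1} (16 bytes each) and is updated in place."""
      n = len(hops)
      ZERO = bytes(16)

      # Authentication part (layer n+1), using the end node's keys:
      C = M
      T = Digest(hops[-1][1], T_prime[n] + C)
      N = xor16(T, E(hops[-1][2], xor16(T, ZERO)))
      T_prime[n] = T

      # Multi-layered encryption, I = n ... 1:
      for i in range(n - 1, -1, -1):
          Kf, Khf, Ktf = hops[i]
          C = Encrypt(Kf, N, C)
          T = Digest(Khf, T_prime[i] + C)
          N = xor16(T, E(Ktf, xor16(T, N)))
          T_prime[i] = T

      return C, N   # C_1 and N_1, sent to node 1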
3.1.2. Relaying Forward at Onion Routers
When a forward relay cell is received by OR I, it decrypts the
payload with the stream cipher, as follows:
'Forward' relay cell:
T_I = Digest(Khf_I,T'_I||C_I)
N_{I+1} = T_I ^ D(Ktf_I,T_I ^ N_I)
C_{I+1} = Decrypt(Kf_I,N_{I+1},C_I)
T'_I = T_I
The OR then decides whether it recognizes the relay cell as
described below. If the OR recognizes the cell, it processes the
contents of the relay cell. Otherwise, it passes C_{I+1}||N_{I+1}
along the circuit if the circuit continues.
For more information, see section 4 below.
3.2. Backward Direction
The backward direction is the opposite direction from
CREATE/CREATE2 cells.
3.2.1. Relaying Backward at Onion Routers
When a backward relay cell is received by OR I, it encrypts the
payload with the stream cipher, as follows:
'Backward' relay cell:
T_I = Digest(Khb_I,T'_I||C_{I+1})
N_I = T_I ^ E(Ktb_I,T_I ^ N_{I+1})
C_I = Encrypt(Kb_I,N_I,C_{I+1})
T'_I = T_I
with C_{n+1} = M and N_{n+1}=0. Once encrypted, the node passes
C_I and N_I along the circuit towards the OP.
3.2.2. Routing to the Origin
When a relay cell arrives at an OP, the OP decrypts the payload
with the stream cipher as follows:
OP receives relay cell from node 1:
For I=1...n, where n is the end node on the circuit:
C_{I+1} = Decrypt(Kb_I,N_I,C_I)
T_I = Digest(Khb_I,T'_I||C_{I+1})
N_{I+1} = T_I ^ D(Ktb_I,T_I ^ N_I)
T'_I = T_I
If the payload is recognized (see Section 4.1),
then:
The sending node is I. Stop, process the
payload and authenticate.
4. Application connections and stream management
4.1. Relay cells
Within a circuit, the OP and the end node use the contents of
RELAY packets to tunnel end-to-end commands and TCP connections
("Streams") across circuits. End-to-end commands can be initiated
by either edge; streams are initiated by the OP.
The payload of each unencrypted RELAY cell consists of:
Relay command [1 byte]
'Recognized' [2 bytes]
StreamID [2 bytes]
Length [2 bytes]
Data [PAYLOAD_LEN-23 bytes]
The 'recognized' field is used as a simple indication that the
cell is still encrypted. It is an optimization to avoid
calculating expensive digests for every cell. When sending cells,
the unencrypted 'recognized' MUST be set to zero.
When receiving and decrypting cells the 'recognized' will always
be zero if we're the endpoint that the cell is destined for. For
cells that we should relay, the 'recognized' field will usually
be nonzero, but will accidentally be zero with P=2^-16.
If the cell is recognized, the node moves to verifying the
authenticity of the message as follows(*):
forward direction (executed by the end node):
T_{n+1} = Digest(Khf_n,T'_{n+1}||C_{n+1})
Tag = T_{n+1} ^ D(Ktf_n,T_{n+1} ^ N_{n+1})
T'_{n+1} = T_{n+1}
The message is authenticated (i.e., M = C_{n+1}) if
and only if Tag = 0
backward direction (executed by the OP):
The message is authenticated (i.e., C_{n+1} = M) if
and only if N_{n+1} = 0
The old Digest field is removed since sufficient information for
authentication is now included in the nonce part of the payload.
(*) we should consider dropping the 'recognized' field
altogether and always try to authenticate. Note that this is
an optimization question and the crypto works just as well
either way.
The 'Length' field of a relay cell contains the number of bytes
in the relay payload which contain real payload data. The
remainder of the payload is padding bytes.
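[Illustrative sketch, not part of the proposal: packing the unencrypted
relay payload laid out above. Zero padding is used here purely for
illustration; PAYLOAD_LEN is 509 and the trailing 16 bytes of the cell
carry the encrypted nonce, which is why Data is PAYLOAD_LEN-23 bytes.]

  import struct

  PAYLOAD_LEN = 509
  DATA_LEN = PAYLOAD_LEN - 23      # 7 header bytes + 16-byte nonce accounted for

  def pack_relay_payload(command, stream_id, data):
      assert len(data) <= DATA_LEN
      recognized = 0               # MUST be zero when sending
      header = struct.pack("!BHHH", command, recognized, stream_id, len(data))
      return header + data + bytes(DATA_LEN - len(data))   # zero padding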
4.2. Appending the encrypted nonce and dealing with version-homogenic
and version-heterogenic circuits
When a cell is prepared to be routed from the origin (see Section
3.1.1) the encrypted nonce N is appended to the encrypted cell
(occupying the last 16 bytes of the cell). If the cell is
prepared to be sent to a node supporting the new protocol, N is
combined with other sources to generate the layer's
nonce. Otherwise, if the node only supports the old protocol, N
is still appended to the encrypted cell (so that following nodes
can still recover their nonce), but a synchronized nonce (as per
the old protocol) is used in CTR-mode.
When a cell is sent along the circuit in the 'backward'
direction, nodes supporting the new protocol always assume that
the last 16 bytes of the input are the nonce used by the previous
node, which they process as per Section 3.2.1. If the previous
node also supports the new protocol, these cells are indeed the
nonce. If the previous node only supports the old protocol, these
bytes are either encrypted padding bytes or encrypted data.
5. Security
5.1. Resistance to crypto-tagging attacks
A crypto-tagging attack involves a circuit with two colluding
nodes and at least one honest node between them. The attack works
when one node makes a change to the cell (tagging) in a way that
can be undone by the other colluding party. In between, the
tagged cell is processed by honest nodes which do not detect the
change. The attack is possible due to the malleability property
of CTR-mode: a change to a ciphertext bit affects only the
respective plaintext bit in a predictable way. This proposal
frustrates the crypto-tagging attack by linking the nonce to the
encrypted message such that any change to the ciphertext results
in a random nonce and hence, random plaintext.
Let us consider the following 3-hop scenario: the entry and end
nodes are malicious and colluding and the middle node is honest.
5.1.1. forward direction
Suppose that node I tags the ciphertext part of the message
(C'_{I+1} != C_{I+1}) then forwards it to the next node (I+1). As
per Section 3.1.2, Node I+1 digests C'_{I+1} to generate T_{I+1}
and N_{I+2}. Since C'_{I+1} is different than it should be, so
are the resulting T_{I+1} and N_{I+2}. Hence, decrypting C'_{I+1}
using these values results in a random string for C_{I+2}. Since
C_{I+2} is now just a random string, it is decrypted into a
random string and cannot be 'recognized' nor
authenticated. Furthermore, since C'_{I+1} is different than what
it should be, T'_{I+1} (i.e., the running digest of the middle
node) is now out of sync with that of the OP, which means that
all future cells sent through this node will decrypt into garbage
(random strings).
Likewise, suppose that instead of tagging the ciphertext, Node I
tags the encrypted nonce, i.e., N'_{I+1} != N_{I+1}. Now, when Node
I+1 digests the payload the tweak T_{I+1} is fine, but using it
to decrypt N'_{I+1} again results in a random nonce for
N_{I+2}. This random nonce is used to decrypt C_{I+1} into a
random C_{I+2} which is not recognized by the end node. Since
C_{I+2} is now a random string, the running digest of the end
node is now out of sync, which prevents the end node from
decrypting further cells.
5.1.2. Backward direction
In the backward direction the tagging is done by Node I+2 and the
untagging by Node I. Suppose first that Node I+2 tags the
ciphertext C_{I+2} and sends it to Node I+1. As per Section
3.2.1, Node I+1 first digests C_{I+2} and uses the resulting
T_{I+1} to generate a nonce N_{I+1}. From this it is clear that
any change introduced by Node I+2 influences the entire payload
and cannot be removed by Node I.
Unlike in Section 5.1.1., the cell is blindly delivered by Node I
to the OP which decrypts it. However, since the payload leaving
the end node was modified, the message cannot be authenticated by
the OP which can be trusted to tear down the circuit.
Suppose now that tagging is done by Node I+2 to the nonce part of
the payload, i.e., N_{I+2}. Since this value is encrypted by Node
I+1 to generate its own nonce N_{I+1}, again, a random nonce is
used which affects the entire keystream of CTR-mode. The cell
again cannot be authenticated by the OP and the circuit is torn
down.
We note that the end node can modify the plain message before
ever encrypting it and this cannot be discovered by the Tor
protocol. This vulnerability is outside the scope of this
proposal and users should always use TLS to make sure that their
application data is encrypted before it enters the Tor network.
5.2. End-to-end authentication
Similar to the old protocol, this proposal only offers end-to-end
authentication rather than per-hop authentication. However,
unlike the old protocol, the ADL-construction is non-malleable
and hence, once a non-authentic message was processed by an
honest node supporting the new protocol, it is effectively
destroyed for all nodes further down the circuit. This is because
the nonce used to de/encrypt all messages is linked to (a digest
of) the payload data.
As a result, while honest nodes cannot detect non-authentic
messages, such nodes still destroy the message thus invalidating
its authentication tag when it is checked by edge nodes. As a
result, security against crypto-tagging attacks is ensured as
long as an honest node supporting the new protocol processes the
message between two dishonest ones.
5.3 The Running Digest
Unlike the old protocol, the running digest is now computed as
the output of a GHASH call instead of a hash function call
(SHA256). Since GHASH does not provide the same type of security
guarantees as SHA256, it is worth discussing why security is not
lost from computing the running digest differently.
The running digest is used to ensure that if the same payload is
encrypted twice, then the resulting ciphertext does not remain
the same. Therefore, all that is needed is that the digest should
repeat with low probability. GHASH is a universal hash function,
hence it gives such a guarantee assuming its key is chosen
uniformly at random.
Hi Nick, George, David,
(I'm sending this email to tor-dev so everyone knows how Core Tor
merges are going.)
Mainline Mergers
David is back from leave, so I'm going to stop doing mainline merges.
But please let me know if there's a merge I can help with.
(Email or Signal is best, IRC has a lot of backlog.)
Do we need to do a handover some time?
The next team meeting might be a good time.
Mainline Merge Ready Tickets
I moved my mainline merge trac wiki queries to this page:
https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/Mainlin…
That page should show all of the mainline merge_ready tickets, sorted
by owner and reviewer. Your name is in bold, so you can work out which
tickets you should merge. (We want 3 people to look at every ticket
before it merges, except for trivial changes.)
Here is our full list of task tracking wiki pages:
https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam#TaskTra…
When does 0.4.0 stop being mainline?
It looks like people aren't merging backports to 0.4.0 any more.
That's probably a good idea: we should minimise release candidate changes.
When should I start doing 0.4.0 merges as part of the backports?
Backport Status
We released 0.4.0.4-rc last week, so I'm going to backport some
low-risk changes to 0.2.9 and later. Most of these changes have been
tested in 0.4.0.3-alpha.
I should be able to do the backports tomorrow or Tuesday.
Here are the backports for the next few days:
https://trac.torproject.org/projects/tor/wiki/user/teor#Backports:0.5dayspe…
Here are the backports I will do after I get back from my leave in May:
https://trac.torproject.org/projects/tor/wiki/user/teor/HiddenBackports
T
--
teor
----------------------------------------------------------------------
Hello list,
This is a thread summarizing and brainstorming various denial-of-service
defences for onion services, after an in-depth discussion with David Goulet.
We've been thinking about denial of service defences for onion services
lately. This is a recurrent topic that keeps coming up every once in
a while: the last time we had to tackle it was back in early 2018, when
we had to design a DoS mitigation subsystem because the network was crumbling
down (https://trac.torproject.org/projects/tor/ticket/24902).
Unfortunately, while the DoS mitigation subsystem improved the health of the
network and stopped the DoS attacks back then, it did not address the total
space of possible attacks, and onion services and the network are still open to
various attacks. The main DoS attack right now is the naive attack of flooding
the service with too many introduction requests, and this is the attack that
this post is going to deal with.
We don't like DoS attacks because they cause two issues to Tor:
a) They damage the health of the Tor network impacting every user
b) They kill availability of legitimate onion services.
In this thread we will handle these two issues independently, as there is no
single solution that improves both areas at once. We have some pretty good
ideas on (a), but we would appreciate ideas on (b), so feel free to give us
your input.
== a) Minimizing the damage to the network caused by DoS attacks:
Most of the damage caused during DoS attacks is from the circuits created by
the attacker to introduce/rendezvous to the victim onion service, and also
by the circuits created by the victim onion service as it tries to
rendezvous with all those clients. An attacker can literally create tens of
thousands of introduction circuits in less than a minute, which get
amplified by the service launching that many rendezvous circuits. Not good.
Here are a few ways to reduce the damage to the network:
== 1) Rate limiting introduction circuits
There should be a way to rate-limit introductions so that services do not
get overwhelmed. There are various places where we can rate-limit: we
could rate-limit on the guard-layer, or on the intro-point layer or on
the service-layer.
We have already attempted rate-limiting on the guard-layer with
#24902, but it's hard to go deeper there because the guard does not know
if the circuit is a DoS attacker, or a busy onion service, or 150 Tor
users in an airport. We also think that rate-limiting on the
service-layer won't do much good since that's too far down the circuit,
and we are trying to reduce the operations it has to do so that it
doesn't get overwhelmed (see #15463 for various queue-management
approaches for rate-limiting on the service side).
So we've been thinking of rate-limiting on the introduction point layer,
since it's a nice soaking point that does not do much right now. See
#15516 (comment 28) for a concrete proposal by arma which results in far
less damage to the network (since evil traffic does not get carried
through to the service-side introduction circuit, and no extra rendezvous
circuits get launched), and also a swifter way for legit clients to know
that an onion-service circuit won't work.
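(As a purely illustrative sketch of the kind of mechanism this could be,
not the exact design from #15516 comment 28: a token bucket on the
intro-point side, with made-up parameters.)

  # Toy token-bucket sketch of intro-point-side rate limiting.
  import time

  class IntroRateLimiter:
      def __init__(self, rate_per_sec=25.0, burst=200):
          self.rate = rate_per_sec
          self.burst = burst
          self.tokens = float(burst)
          self.last = time.monotonic()

      def allow_introduce2(self):
          """Return True if an INTRODUCE2 may be relayed to the service."""
          now = time.monotonic()
          self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
          self.last = now
          if self.tokens >= 1.0:
              self.tokens -= 1.0
              return True
          return False             # drop (or NACK) the introduction request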
== 2) Stop needless circuit rotation on service-side
Right now, services will rotate their introduction circuits after a
certain number of introductions (#26294). This means that during an
attack, the service not only needs to handle thousands of fake
introduction circuits, but also continuously tear down and recreate
introduction circuits and publish new descriptors. See comment 8 on that
ticket for a short-term proposal on how to improve the situation here,
by not continuously rotating introduction points.
== 3) Optimize CPU performance on the service-side
Right now, onion services during an attack are actually CPU bound. See
#30221 for various improvements we can do to improve the performance of
services. However, improving CPU performance might have the opposite effect,
since processing cells quicker means that the service will make even more
rendezvous circuits.
== 4) Make sure attackers don't take shortcuts around the protocol
We should make sure that attackers don't take shortcuts around the Tor
protocol to launch their attacks. Examples here involve requiring a
proof-of-rendezvous from clients (#25066), and not allowing single-hop
proxies to do introductions (#22689).
The above suggestions (maybe in priority order) are ways we can reduce the
damage dealt to the network by DoS attackers. But that still does not make
DoS attacks less effective. So here follows the section about improving
service availability:
== b) Improve service availability during DoS attacks
Unfortunately, it's really hard to accurately stop DoS attacks in the Tor
protocol. There is just no good way to distinguish between innocent clients
trying to access content, and a bad actor trying to disable an onion service.
Here is the main way we've thought of addressing this issue:
== 1) Binding the application-layer with the Tor introduction-layer
We think that the Tor protocol layer might not be the right place for
handling DoS attacks. There are literally million-dollar companies trying
hard to tackle this issue on the application-layer, where it's easier
since you can do machine learning, give out captchas, zone out users,
etc. And that's why we think that the solution to this issue lies on the
application-layer and not on the Tor protocol layer.
In particular, a plausible solution here might involve for the client to
embed application-layer information (e.g. a username/password) in its
INTRODUCE1 cell, which then gets passed to the service. The service can
then check whether the given username/password should be allowed to
connect (see "rendezvous approver" concept at #16059), and allow or reject
the connection as it wishes. This way onion service operators can have
complicated application-layer software that analyzes the activity of users
and decide whether users should be allowed in or not (based on the number
of introductions, or their application-layer (web) activity).
   +===========================================+
   |                Tor network                |
   +===========================================+
        ^                                  ^
        |             +-----+              |
        +------------>| Tor |--------------+
           INTRO2     | HS  |   rendezvous circuit
           with       +-----+   only if approved
           user/pass     ^
                         |
                         |
                         v
                    +----------+     +-------+
                    |Rendezvous|<--->|sqlite?|
                    |approver  |     +-------+
                    +----------+
We think that this is a solution that could allow onion services to
continue existing under high-load scenarios, since no rendezvous circuits
would be established during DoS scenarios (and we know that rendezvous
circuits is what causes the most CPU/network/availability damage).
However, this is a very complicated solution from an engineering
perspective given that it requires changes on the client-side (to enhance
INTRO1 cells with application-layer data), and also involves various
enhancements on the service-side (various control port commands to
interact with the (nonexistent) "rendezvous approver" software, which in
turn needs to interact with other application-layer software, e.g. sql
databases to manage membership).
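As a toy sketch of what the approver's decision logic could look like (the
control port plumbing it would need does not exist, and the schema and
limits below are made up):

  # Toy "rendezvous approver" decision logic: given a username/password
  # extracted from an introduction request, decide whether the service
  # should launch a rendezvous circuit.
  import sqlite3

  db = sqlite3.connect("members.sqlite")
  db.execute("CREATE TABLE IF NOT EXISTS members (user TEXT PRIMARY KEY, "
             "secret TEXT, intros_last_hour INTEGER DEFAULT 0)")

  def approve_rendezvous(user, secret, max_intros_per_hour=60):
      row = db.execute("SELECT secret, intros_last_hour FROM members "
                       "WHERE user = ?", (user,)).fetchone()
      if row is None or row[0] != secret:
          return False             # unknown user or bad credential
      if row[1] >= max_intros_per_hour:
          return False             # this user is introducing too often
      db.execute("UPDATE members SET intros_last_hour = intros_last_hour + 1 "
                 "WHERE user = ?", (user,))
      db.commit()
      return True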
There are also serious UX concerns about how this would look on the
client side. Also, how does this interact with client auth? And how does
this interact with intro-point-level rate limiting proposed above
(onions should be given the option to disable intro-layer rate limiting)?
How is this related to #17254?
All in all, we feel like we have pretty good options for reducing the
damage that DoS attacks cause on our network, but we are still lacking
easy and practical solutions for ensuring availability of onion services
that are under DoS. For the next months, we plan to focus on reducing
the damage on the network, since the damage on the network has a
cumulative effect as circuits fail and get endlessly retried, where
nothing ends up working right. At the same time, we will be thinking of
good solutions for keeping a high availability on services that receive
DoS attacks.
We would love your feedback and suggestions.
Thanks!
Hi all,
We finished our first working version of sbws in March, and deployed
it to a directory authority. We're now working on deploying it to a
few more directory authorities:
https://trac.torproject.org/projects/tor/ticket/29290
We're also working on archiving and analysing the bandwidth files
produced by sbws and Torflow:
https://trac.torproject.org/projects/tor/ticket/21378
During this work, we've discovered some missing sbws features:
https://trac.torproject.org/projects/tor/ticket/30255
We need a better process for proposing and reviewing sbws changes.
At the moment, I am spending a lot of time reviewing and designing
sbws changes. And that's not sustainable. We need a process that works
when I go on leave, or get busy with other tasks.
I suggest that we use the tor proposals process:
https://gitweb.torproject.org/torspec.git/tree/proposals/001-process.txt
We can submit small changes as diffs to the bandwidth file spec:
https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-spec.txt
But large changes, controversial changes, and changes with multiple
competing options should have their own proposal. Then, once we decide
what to do, we can integrate those changes into the spec.
T
--
teor
----------------------------------------------------------------------
Hi,
Nick asked me to send a status email about PrivCount, before I go on
leave for a few weeks.
Plan
We want to add the following counters to a PrivCount Proof of Concept:
* check counters (zero, relay count, time in seconds)
* consumed bandwidth
Nick also suggested adding connection counts. That seems like a good
counter, but we want to make sure we do bandwidth in the first release,
because it's a high-risk statistic.
Status
In March and April, I deferred PrivCount tasks to work on chutney for
one of our other sponsors.
I also delayed these tasks, because I was waiting for #29017 and #29018
to merge:
* #29017 PaddingStatistics should be disabled when ExtraInfoStatistics is 0
* #29018 Make all statistics depend on ExtraInfoStatistics
Tickets
The top-level ticket is:
PrivCount proof of concept with existing statistics
https://trac.torproject.org/projects/tor/ticket/27908
I was mainly working on code for these tickets:
PrivCount proof of concept: implement check counters
https://trac.torproject.org/projects/tor/ticket/29004
PrivCount proof of concept: implement consumed bandwidth counters
https://trac.torproject.org/projects/tor/ticket/29005
Make relays report bandwidth usage more often in test networks
https://trac.torproject.org/projects/tor/ticket/29019
Code
I have incomplete branches for #29004, #29005, and #29019 here:
https://github.com/teor2345/tor/tree/ticket29004-wip
https://github.com/teor2345/tor/tree/ticket29005
https://github.com/teor2345/tor/tree/ticket29019
I think all the necessary code is present in these branches.
(But maybe it's not???)
But it needs some cleanup:
* rebase onto the current master,
* put the commits on the right branches,
* make sure it does what these tickets say it should do.
I'm happy to do that after I come back from leave.
I am also happy if Nick wants to clean up this code.
See also my previous email about BridgeDB and PrivCount. Maybe we can
save ourselves some effort by using PrivCount's obfuscation on
BridgeDB's statistics.
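As a rough illustration of the "add noise, then bin" part of that
obfuscation (the real PrivCount protocol also blinds counters and splits
them across share keepers, and chooses its noise distribution and
parameters carefully; the values below are made up):

  # Illustration only: obfuscate a single counter before publication.
  import random

  def obfuscate_count(true_count, sigma=10.0, bin_size=8):
      noisy = true_count + random.gauss(0.0, sigma)     # add calibrated noise
      noisy = max(0.0, noisy)
      return int(round(noisy / bin_size)) * bin_size    # round to a bin boundary

  print(obfuscate_count(12345))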
T
--
teor
----------------------------------------------------------------------