I've been working on a major OnionShare release that, among other things, will use v3 onion services by default. But it appears that either something in stem or in Tor deals with v3 onions differently than v2 onions, and causes a critical bug in OnionShare. It took a lot of work to track down exactly how to reproduce this bug, and I haven't opened an upstream issue for either stem or tor because I feel like I don't understand it enough yet.
Here is the OnionShare issue [1]. Does anyone know if this is an issue with stem or tor, and how to go about mitigating it? Here's some background:
When you share files with OnionShare it starts a local web server and then makes it an ephemeral onion service. Someone else loads this onion service in Tor Browser and downloads your files. If the setting "Stop sharing after first download" is checked (this is the default behavior), then as soon as the download completes, OnionShare stops the onion service and web server.
If I share a 1mb file and a Tor Browser client makes an HTTP request to download it, OnionShare will respond with a 1mb HTTP response. As soon as it's done sending the 1mb, OnionShare stops the server and alerts the user that the transfer is complete -- however, it's not actually complete yet. If you look at the client in Tor Browser, it's still downloading. It depends on the Tor circuit, but it might be only ~32% done, and you have to wait a few seconds for it to finish. This is because when the final byte of the 1mb file leaves the OnionShare web server, it takes a few seconds for that final byte to make it through the onion circuit and into the Tor Browser download.
This works fine with v2 onion services.
But with v3 onion services, as soon as the OnionShare web server finishes sending the full HTTP response, the torified HTTP client stops downloading. I made a small python script, onion-bug.py, that reproduces the issue that you can test [2].
This script connects to the default Tor control port for Tor Browser, so open Tor Browser in the background. It then starts an HTTP server and creates an onion service, and if you make a GET request to the server it responds with 1mb of "A" characters, and then immediately stops the web server and onion service. You have to pass in either "v2" or "v3", depending on which type of onion service to make. Like OnionShare, this script uses stem, and specifically the Controller.create_ephemeral_hidden_service method [3].
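For readers who don't want to open the gist, here is a rough sketch of how a script like this could be put together. It is not the actual onion-bug.py from [2]: the names `PAYLOAD`, `KEY_ARGS`, `OneShotHandler`, `serve_once` and `main` are made up for illustration, and the key arguments are simply the ones printed in the script output in this thread. It assumes stem is installed and that Tor Browser is running with its control port on 9151.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"A" * (1024 * 1024)  # 1mb of "A" characters

# key_type/key_content pairs, as printed by the script output in this thread
KEY_ARGS = {
    "v2": {"key_type": "NEW", "key_content": "RSA1024"},
    "v3": {"key_type": "NEW", "key_content": "ED25519-V3"},
}


class OneShotHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(PAYLOAD)
        # shutdown() must not be called from the serving thread, so use a helper.
        threading.Thread(target=self.server.shutdown, daemon=True).start()


def serve_once(port=8080):
    """Start the HTTP server in a thread; it stops after answering one GET."""
    httpd = HTTPServer(("127.0.0.1", port), OneShotHandler)
    thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    thread.start()
    return httpd, thread


def main(version):
    # Imported here so the HTTP part can be tried without stem installed.
    from stem.control import Controller

    httpd, thread = serve_once(8080)
    with Controller.from_port(port=9151) as controller:  # Tor Browser control port
        controller.authenticate()
        service = controller.create_ephemeral_hidden_service(
            {80: 8080}, **KEY_ARGS[version]
        )
        print("http://%s.onion/" % service.service_id)
        thread.join()  # returns once the first GET has been answered
        # Removing the service immediately is the step that cuts off v3 downloads.
        controller.remove_ephemeral_hidden_service(service.service_id)
    httpd.server_close()
```

Calling `main("v2")` or `main("v3")` would then correspond to the two runs shown below.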
I just ran "./onion-bug.py v2" in one terminal and waited for it to start:
```
$ ./onion-bug.py v2
http service: http://127.0.0.1:8080/
starting onion service with: key_type='NEW', key_content='RSA1024'
http://d7vomh45i7ryhhfw.onion/
```
Then in a second terminal, I made the GET request:
```
$ torify curl http://d7vomh45i7ryhhfw.onion/ > out1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1024k    0 1024k    0     0   125k      0 --:--:--  0:00:08 --:--:--  245k
```
The script in the first terminal finished and quit, while curl in the second terminal kept downloading for a few more seconds before it finished:
```
127.0.0.1 - - [24/Nov/2018 21:12:52] "GET / HTTP/1.1" 200 -
shutting down http server
```
The file, out1, is exactly 1mb.
Now I tried doing the same, but with a v3 onion service:
```
$ ./onion-bug.py v3
http service: http://127.0.0.1:8080/
starting onion service with: key_type='NEW', key_content='ED25519-V3'
http://htrzngqgb7ogyifsvah6qz7t6spe3rr7bikdws7d3rigkpwy5kuyrlqd.onion/
127.0.0.1 - - [24/Nov/2018 21:14:22] "GET / HTTP/1.1" 200 -
shutting down http server
```
And in the other terminal:
```
$ torify curl http://htrzngqgb7ogyifsvah6qz7t6spe3rr7bikdws7d3rigkpwy5kuyrlqd.onion/ > out2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 29830    0 29830    0     0   3399      0 --:--:--  0:00:08 --:--:--  3399
```
It only downloaded 29830 bytes before the download connection stopped, so out2 is only 30kb. I just tried it again, and this time only got 12kb downloaded. And I tried it again, and got 41kb downloaded. It appears to depend on the speed of the Tor circuit.
The download only stops when the v3 onion service is shut down -- if you uncomment the time.sleep(5) before the call to remove_ephemeral_hidden_service at the end of the script, you'll be able to download the full 1mb using the v3 onion service.
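As a stopgap, the same idea could be wrapped in a small helper. This is only a sketch: `remove_after_grace` is a hypothetical name, and the 5 second default just mirrors the time.sleep(5) above, so a slow circuit could still need longer.

```python
import time


def remove_after_grace(remove_service, service_id, grace_seconds=5, sleep=time.sleep):
    """Wait grace_seconds so in-flight data can drain, then remove the service.

    remove_service is any callable that takes the service id, for example
    stem's controller.remove_ephemeral_hidden_service.
    """
    sleep(grace_seconds)
    remove_service(service_id)
```

With stem this would be called as `remove_after_grace(controller.remove_ephemeral_hidden_service, service.service_id)`.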
I'm pretty stumped on how to fix this and I'd appreciate any help.
Until it's fixed, we'll either have to not include support for v3 onion services in OnionShare in the next major release (version 2.0), or include support but with the "Stop sharing after first download" only available for v2 onion services -- so probably we'd default to using v2 onion services, and users could optionally use v3 if they don't need the autostop feature. Of course, I'd much prefer to find a way to solve this and release with v3 onion services by default.
[1] https://github.com/micahflee/onionshare/issues/812
[2] https://gist.github.com/micahflee/bade960e96d35007bc5c182a0ca61b56
[3] https://stem.torproject.org/api/control.html#stem.control.Controller.create_...
On 2018-11-25 05:30, Micah Lee wrote:
> I've been working on a major OnionShare release that, among other things, will use v3 onion services by default. But it appears that either something in stem or in Tor deals with v3 onions differently than v2 onions, and causes a critical bug in OnionShare. It took a lot of work to track down exactly how to reproduce this bug, and I haven't opened an upstream issue for either stem or tor because I feel like I don't understand it enough yet.
Hi Micah and all,
Thanks for the heads-up!
I write here only to confirm that I can reproduce the issue without stem (using bulb) [1]. It seems that the underlying issue should be in little-t-tor and not in stem. Or yes, maybe it's not even a bug (though it seems weird to me).
[1] https://github.com/nogoegst/onion-abort-issue
-- Ivan Markin
On 24 Nov (21:30:16), Micah Lee wrote:
[snip]
Greetings Micah!
> But with v3 onion services, as soon as the OnionShare web server finishes sending the full HTTP response, the torified HTTP client stops downloading. I made a small python script, onion-bug.py, that reproduces the issue that you can test [2].
This is definitely on "tor" side. Here is what is going on:
When a DEL_ONION command is received, the v3 subsystem will close _all_ related circuits including the rendezvous circuit (where the data is being transferred).
This is something the v2 subsystem does *not* do so there is your difference between the two versions. Not closing the rendezvous circuit has the side effect that the connected client can still talk to the .onion as long as the application behind it is still running. In the case of OnionShare, I don't think it matters since the Web server is simply gone by then.
That being said, I see that your Python script waits until everything has been given to "tor" before sending a DEL_ONION (correct me if I'm wrong). So then the question is how can the circuit die _before_ everything was sent to the client if tor has received everything?
This is due to how tor handles cells. A DESTROY cell (which closes the circuit down the path) can be sent even if cells still exist on the circuit queue. In other words, once a DESTROY cell is issued, it will be high priority and thus can leave behind some data cells. There are reasons for that so this isn't a "weird behavior" but by design.
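To make the effect concrete, here is a toy model of a circuit queue. It is not tor's real code and the cell counts are arbitrary; `ToyCircuit` and its methods are invented for illustration. It only shows how data that has already been handed to tor can still be lost when a DESTROY overtakes it.

```python
from collections import deque


class ToyCircuit:
    """Toy model of one circuit's outbound cell queue."""

    def __init__(self):
        self.queue = deque()
        self.delivered = []
        self.closed = False

    def queue_data(self, n_cells):
        """The application hands n_cells worth of data to the circuit."""
        self.queue.extend(["DATA"] * n_cells)

    def flush(self, n_cells):
        """Send up to n_cells queued cells toward the client."""
        while n_cells > 0 and self.queue and not self.closed:
            self.delivered.append(self.queue.popleft())
            n_cells -= 1

    def destroy(self):
        """DESTROY goes out ahead of queued cells; the rest is dropped."""
        self.closed = True
        dropped = len(self.queue)
        self.queue.clear()
        return dropped


circ = ToyCircuit()
circ.queue_data(2000)    # the whole response is "given to tor"
circ.flush(60)           # only part of it has left the queue so far
dropped = circ.destroy() # DEL_ONION arrives and the circuit is torn down
print(len(circ.delivered), dropped)  # 60 1940
```

The slower the circuit drains, the fewer cells get out before the DESTROY, which matches the download size varying from run to run.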
The solution here is to make v3 act like v2 does, that is, to close everything except the already established RP (rendezvous) circuit(s), so that any transfer in progress can be finalized, and to let the application on the other side close connections or simply stop serving.
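The difference between the two behaviours can be sketched as a choice of which circuits get closed on DEL_ONION. This is an illustration only, not tor's implementation: the `Circuit` class, the purpose labels and `circuits_to_close` are all made up.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    circ_id: int
    purpose: str  # "intro" or "rendezvous" (labels invented for this sketch)


def circuits_to_close(circuits, keep_rendezvous):
    """Return the circuits that would be closed when the service is removed.

    keep_rendezvous=False models the v3 behaviour described above (close all).
    keep_rendezvous=True models the v2 behaviour and the proposed fix.
    """
    if keep_rendezvous:
        return [c for c in circuits if c.purpose != "rendezvous"]
    return list(circuits)


circuits = [Circuit(1, "intro"), Circuit(2, "intro"), Circuit(3, "rendezvous")]
print([c.circ_id for c in circuits_to_close(circuits, keep_rendezvous=False)])  # [1, 2, 3]
print([c.circ_id for c in circuits_to_close(circuits, keep_rendezvous=True)])   # [1, 2]
```

With the rendezvous circuit left open, the client can finish reading whatever is still queued, and the circuit goes away once the application closes its side.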
I've opened this and marked it for backport: https://trac.torproject.org/projects/tor/ticket/28619
Big thanks to everyone on that OnionShare ticket for the thorough report!
David
On 11/26/18 7:55 AM, David Goulet wrote:
> I've opened this and marked it for backport: https://trac.torproject.org/projects/tor/ticket/28619
>
> Big thanks to everyone on that OnionShare ticket for the thorough report!
> David
Thank you!