Dear Tim,
See? This is why we need documentation like Aaron Johnson asked for. It was Aaron Gibson's project, and several months ago Leif, Donncha and I participated in design discussions with Aaron.
It's been months since I worked on this project but I can tell you that our design discussions didn't have the shitty design where you hose the tor relays with 7000 sequential connections. When Daira Hopwood came to the onion space here in Berlin I specifically asked her to help us by designing an algorithm for lazily generating unordered/randomly-ordered lists of 2 hop circuits with a partitioning scheme so that the total list is made up of several list partitions which can be computed in parallel.
Which is why we put this in the CHANGELOG: """ Daira Hopwood - Wrote the algorithm for lazily generating tor circuit permutations, with parallelizeable partitioning scheme optimized for low CPU and memory consumption. """
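To make the intent concrete, here is a rough sketch of that kind of generator. It is not Daira's algorithm and not the bwscanner code; it swaps in a simple affine permutation of the pair indices, and the function name, parameters and seed are made up for illustration. Each partition walks its own slice of the index space, so the partitions can run in parallel without sharing state, and nothing is materialised in memory.

```python
import math
import random


def lazy_circuit_pairs(relays, partition, num_partitions, seed=0):
    """Lazily yield (first_hop, second_hop) pairs in a scrambled order.

    Hypothetical sketch: an affine map i -> (a*i + b) % total is used as a
    cheap permutation of all pair indices. It is scrambled, not uniformly
    random, and it is only an assumption about what the design might look like.
    """
    n = len(relays)
    if n < 2:
        return  # no 2-hop circuit is possible with fewer than two relays
    total = n * n
    rng = random.Random(seed)
    # A multiplier coprime to `total` makes the affine map a bijection.
    a = rng.randrange(1, total)
    while math.gcd(a, total) != 1:
        a = rng.randrange(1, total)
    b = rng.randrange(total)
    # This partition handles indices partition, partition + num_partitions, ...
    for i in range(partition, total, num_partitions):
        first, second = divmod((a * i + b) % total, n)
        if first != second:  # skip circuits that reuse the same relay
            yield relays[first], relays[second]


# Usage: three partitions that together cover every ordered pair exactly once.
relays = ["A", "B", "C", "D"]
for k in range(3):
    for pair in lazy_circuit_pairs(relays, k, 3, seed=7):
        pass  # build and test the circuit for `pair` here
```

All partitions must use the same seed, otherwise they no longer describe the same permutation and pairs can be repeated or missed.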
I'm sorry the code is sloppy and buggy right now. It's perfectly understandable that you would write this e-mail to me because some of the code is obviously not written correctly... and we have no docs so how would you have known we don't intend to hose the tor network ;-p
In a few days I'm going to Croatia to go sailing with some friends. I'll be gone for a few weeks. When I get back I'll try to work on this some more. Donncha and Meejah can also merge pull requests for this project.
Tim, thanks for pointing out that trac ticket btw. I was unaware of it.
Cheers, David
On Sun, Jul 10, 2016 at 09:14:45AM +1000, Tim Wilson-Brown - teor wrote:
On 9 Jul 2016, at 07:03, dawuud <dawuud@riseup.net> wrote:
Hey Aaron,
there's this; it's a work in progress https://github.com/TheTorProject/bwscanner
I want to detect various types of attacks on the tor network.
The "detect partitions" script looks interesting: https://github.com/TheTorProject/bwscanner/blob/develop/bwscanner/detect_par...
It seems similar to this trac ticket: https://trac.torproject.org/projects/tor/ticket/19068
And it seems to suffer from some of the issues I just described on that ticket:
- network load - running these tests as fast as you can puts significant load on the network
I think detect_partitions.py has a 0.2-second delay. I used 1 second when I did it.
- what if a relay is down, rather than blocked?
It might be possible to detect this using multiple runs, or using the consensus (after an hour). Of course, if the script itself brings the relay down...
- making ~7000 connections through a single relay might overload it, particularly if it's low-bandwidth, file-descriptor limited, or behind a NAT box
I think detect_partitions.py uses the same first relay for all ~7000 second relays, then switches to another. This might cause connections through that relay to fail after a few hundred or thousand rapid attempts. But there's a "key" variable that could be used to permute this order.
Also, you might want to consider using only "Fast" relays, to avoid overwhelming low-end boxes, middleboxes, or networks.
- connection testing is directional - sometimes relay A can initiate a connection to relay B, but relay B can't initiate a connection to relay A. But once they're connected, they can both exchange cells.
I have no idea how to find out if a connection only works one way. Particularly when connections that are already open are bidirectional. Do multiple tests on different days with different orders? I think detect_partitions.py always uses the same order - it would be interesting to see if you get different results with the order inverted.
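A few of the suggestions above could be sketched roughly as follows. This is an illustration only, not code from detect_partitions.py: the relay dictionaries with a "flags" list, the result mapping, and all three helper names are assumptions. The last helper only flags candidates for a one-way problem; as noted, an already-open connection can mask the direction, so repeated runs in different orders would still be needed.

```python
import time


def fast_only(relays):
    """Keep relays whose (hypothetical) 'flags' list includes 'Fast'."""
    return [r for r in relays if "Fast" in r["flags"]]


def paced(items, delay=1.0, sleep=time.sleep):
    """Yield items, sleeping `delay` seconds between consecutive ones.

    `sleep` is injectable so the pacing can be checked without waiting.
    """
    first = True
    for item in items:
        if not first:
            sleep(delay)
        first = False
        yield item


def asymmetric_candidates(results):
    """Return pairs where A->B worked but B->A was tested and failed.

    `results` is an assumed mapping of (first, second) -> True/False
    collected from one or more test runs.
    """
    return sorted(
        (a, b)
        for (a, b), ok in results.items()
        if ok and results.get((b, a)) is False
    )


# Usage with made-up data.
relays = [
    {"name": "A", "flags": ["Fast", "Running"]},
    {"name": "B", "flags": ["Running"]},
]
print([r["name"] for r in fast_only(relays)])
results = {("A", "B"): True, ("B", "A"): False, ("A", "C"): True, ("C", "A"): True}
print(asymmetric_candidates(results))
```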
Tim