I think that many of my previous scans were not useful and showed inaccurate results because the IP address I was scanning from might have been blacklisted by the dir-auths, or perhaps blocked by many relays through Tor's anti-denial-of-service mechanisms. I got rid of that virtual server and lost the use of its IP address, so we'll never know.
Katharina and I are interested in doing much more thorough scans of the Tor network, rather than the limited methodology I've been using.
What are the guidelines to avoid getting blocked by the Tor network? Is it possible to check the consensus to see if a client IP has been blocked?
On Fri, Apr 27, 2018 at 09:12:59PM +0000, dawuud wrote:
Greetings,
( Meejah and I made txtorcon report the reason for circuit build failures here: https://github.com/meejah/txtorcon/pull/299 My scanner now uses this txtorcon feature: https://github.com/david415/tor_partition_scanner )
I used a collector consensus file: 2018-04-27-19-00-00-consensus
wget https://collector.torproject.org/recent/relay-descriptors/consensuses/2018-0...
and extracted the 100 relays with the highest consensus weights that have both the Stable and Fast flags.
./helpers/query_fingerprints_from_consensus_file.py 2018-04-27-19-00-00-consensus > top100.relays
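For readers without the helper script, the selection step described above can be sketched as follows. This is a minimal illustration, not the code in query_fingerprints_from_consensus_file.py: the function name top_relays and its parameters are hypothetical, and it assumes the standard consensus layout of "r" (router), "s" (flags) and "w Bandwidth=" lines.

```python
import base64


def top_relays(consensus_text, n=100, required_flags=("Fast", "Stable")):
    """Return hex fingerprints of the n highest-weight relays with all required flags.

    Hypothetical helper: it reads only the "r", "s" and "w" lines of a consensus.
    """
    relays = []
    current = None
    for line in consensus_text.splitlines():
        if line.startswith("r "):
            # Third field is the base64 identity digest with '=' padding stripped.
            identity = line.split()[2]
            padded = identity + "=" * (-len(identity) % 4)
            fingerprint = base64.b64decode(padded).hex().upper()
            current = {"fingerprint": fingerprint, "flags": set(), "weight": 0}
            relays.append(current)
        elif line.startswith("directory-footer"):
            # Relay entries end here; ignore everything that follows.
            current = None
        elif current is not None and line.startswith("s "):
            current["flags"] = set(line.split()[1:])
        elif current is not None and line.startswith("w "):
            for item in line.split()[1:]:
                key, _, value = item.partition("=")
                if key == "Bandwidth":
                    current["weight"] = int(value)
    wanted = set(required_flags)
    eligible = [r for r in relays if wanted <= r["flags"]]
    eligible.sort(key=lambda r: r["weight"], reverse=True)
    return [r["fingerprint"] for r in eligible[:n]]
```

Calling top_relays(open("2018-04-27-19-00-00-consensus").read()) would then give a list comparable to top100.relays, assuming the helper ranks by the same "w Bandwidth=" value.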
and then performed the scan, building 9900 2-hop tor circuits:
detect_partitions.py --tor-control unix:/var/run/tor/control --log-dir ./ --status-log ./status_log \
    --relay-list top100.relays --secret secretTorEmpireOfRelays --partitions 1 --this-partition 0 \
    --build-duration .25 --circuit-timeout 60 --log-chunk-size 1000 --max-concurrency 100
This resulted in only 307 circuit build failures:
echo "select reason from scan_log where status = 'failure';" | sqlite3 scan1.db | wc -l
307
Of these failures, 301 had a circuit build failure REASON of TIMEOUT, as reported by little-t tor:
echo "select reason from scan_log where status = 'failure';" | sqlite3 scan1.db | grep -i timeout | wc -l
301
Here are the non-timeout REASONs for these circuit build failures:
echo "select reason from scan_log where status = 'failure';" | sqlite3 scan1.db | grep -vi timeout
DESTROYED, FINISHED
DESTROYED, FINISHED
DESTROYED, CHANNEL_CLOSED
DESTROYED, CHANNEL_CLOSED
DESTROYED, CHANNEL_CLOSED
DESTROYED, CHANNEL_CLOSED
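The same breakdown can be produced in one pass instead of several shell pipelines. The sketch below is only an illustration: it assumes a scan_log table with status and reason columns, as used in the queries above, and the function name tally_failure_reasons is hypothetical.

```python
import sqlite3
from collections import Counter


def tally_failure_reasons(conn):
    """Count failure reasons, folding anything that mentions a timeout into TIMEOUT."""
    rows = conn.execute("SELECT reason FROM scan_log WHERE status = 'failure'")
    counts = Counter()
    for (reason,) in rows:
        if reason and "timeout" in reason.lower():
            counts["TIMEOUT"] += 1
        else:
            # Keep the other reasons verbatim; NULL or empty becomes UNKNOWN.
            counts[reason or "UNKNOWN"] += 1
    return counts


if __name__ == "__main__":
    connection = sqlite3.connect("scan1.db")
    for reason, count in tally_failure_reasons(connection).most_common():
        print(count, reason)
```

The case-insensitive match mirrors the grep -i timeout step, so both approaches should agree on which rows count as timeouts.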
I'm curious to try this scan at different times of day to see if results vary.
Cheers,
David
On Tue, Mar 13, 2018 at 11:48:30PM +0000, dawuud wrote:
I did another scan, this time with 3 seconds between circuit builds and max connections set to 50, with results similar to yesterday's:
9354 failure
2 timeout
544 success
Most of the circuit build failures happened in under a second:
echo "select (end_time - start_time) / 1000 as duration from scan_log where duration < 1 AND status = 'failure';" | sqlite3 scan1.db | wc -l
9344
txtorcon does expose both the 'reason' and the 'remote_reason' flags returned by the failure messages. In fact, it returns all flags that Tor sent during stream or circuit failures.
The **kwargs in stream_closed, circuit_closed or circuit_failed notifications should all include "REASON" and many times will also include "REMOTE_REASON" (e.g. if the "other" relay closed the connection). For convenience, txtorcon also includes lower-cased versions of all the flags.
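As a rough illustration of consuming those flags, the sketch below records REASON and, when present, REMOTE_REASON from a circuit_failed notification. It deliberately does not import txtorcon: the class FailureRecorder and the function describe_failure are hypothetical names, and attaching such an object to a live txtorcon state is not shown here.

```python
def describe_failure(**kwargs):
    """Build a short label from the flags passed along with a failure notification."""
    reason = kwargs.get("REASON", "UNKNOWN")
    remote_reason = kwargs.get("REMOTE_REASON")
    if remote_reason is None:
        return reason
    return "%s (remote: %s)" % (reason, remote_reason)


class FailureRecorder(object):
    """Hypothetical listener that stores one label per failed circuit."""

    def __init__(self):
        self.failures = []

    def circuit_failed(self, circuit, **kwargs):
        # The circuit object is stored as-is; only the keyword flags are inspected.
        self.failures.append((circuit, describe_failure(**kwargs)))
```

Reading the upper-case keys keeps the labels aligned with the names tor itself uses; the lower-cased copies mentioned above would work equally well.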
Ah, OK! I will take a look at this. I'd like to do another scan while collecting this additional information.
Would it be better, then, to pick one first hop and scan (sequentially) every second-hop using that first hop? (And maybe have say 5 or 10 such things going on at once?)
Maybe it's ok to make 7,000+ tor circuits sequentially from the same relay if it's done very slowly?
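To get a feel for what such a schedule costs, here is a small sketch that groups the work by first hop and estimates a lower bound on wall-clock time. It is only an illustration under stated assumptions: the names plan_by_first_hop and estimated_seconds are hypothetical, each worker is assumed to handle its first hops one at a time with a fixed delay between builds, and the time a circuit takes to build or time out is ignored.

```python
import math


def plan_by_first_hop(relays):
    """Map each first hop to the ordered list of second hops to try through it."""
    return {first: [r for r in relays if r != first] for first in relays}


def estimated_seconds(num_relays, delay_seconds, workers):
    """Lower bound on scan time when first hops are split evenly across workers."""
    first_hops_per_worker = math.ceil(num_relays / workers)
    return first_hops_per_worker * (num_relays - 1) * delay_seconds


if __name__ == "__main__":
    relays = ["relay%03d" % i for i in range(100)]
    plan = plan_by_first_hop(relays)
    print(sum(len(seconds) for seconds in plan.values()), "circuits in total")
    print(estimated_seconds(len(relays), 3, 5), "seconds with 5 workers, 3 s apart")
```

For 100 relays this gives the same 9900 ordered pairs as the scan above, so the total number of circuits does not change; only the order and pacing do.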
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev