Hello,
In [1] David describes his preliminary results from scanning a portion of the tor network to detect connectivity problems (partitions) in the presumed tor mesh network (where every relay is expected to be able to reach every other relay).
I believe this information (How far off are we from a complete mesh network?) is crucial for the anonymity properties of the tor network and collecting the data to answer that question should be an integral and continuous part of tor.
Actively scanning the tor network for connectivity issues is resource intensive. What if we could collect this data without actively scanning for it?
This could be achieved by having relays passively collect reachability information themselves and upload it via extra-info descriptors.
This data could help reduce the overall scanning effort. Instead of scanning _all_ relays, we could limit scanning to relays that some threshold number of other relays consistently report as hard to reach, in order to confirm their measurements.
Due to the mutual nature of the information collection, a single lying relay or a small number of them would not be a big issue.
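To make the threshold idea concrete, here is a rough Python sketch (not tor code). The function name, the input layout (reporter fingerprint mapped to the set of destinations it flagged) and the default of 3 reporters are all assumptions made purely for illustration.

```python
from collections import defaultdict


def relays_to_scan(reports, min_reporters=3):
    """Pick destinations that enough distinct relays flagged as hard to reach.

    reports: dict mapping a reporting relay's fingerprint to the set of
             destination fingerprints it reported as hard to reach
             (hypothetical layout, not an existing descriptor format).
    min_reporters: assumed threshold; a real value would need measurement.
    """
    reporters_per_target = defaultdict(set)
    for reporter, targets in reports.items():
        for target in targets:
            if target != reporter:  # ignore self-reports
                reporters_per_target[target].add(reporter)
    # Only destinations confirmed by enough independent reporters
    # become candidates for an active confirmation scan.
    return sorted(
        target
        for target, reporters in reporters_per_target.items()
        if len(reporters) >= min_reporters
    )
```

With a threshold like this, a single lying relay can add at most one vote per destination, so on its own it cannot push any relay onto the scan list.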
Relays could:
- aggregate reachability issues over the past 24 hours (or week?) per outbound destination relay (which relay failed what percentage of circuit build attempts)
- if a relay failed more than some threshold value, check whether the relay in question:
  - had its uptime reset during that timeframe (reachability problem expected)
  - dropped out of the consensus during that timeframe
- only include it in the uploaded data if it did not drop out of the consensus and did not reset its uptime
(using a week instead of a day would help with reducing the amount of data that we need to process after collecting it)
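The per-relay aggregation and filtering could look roughly like the following Python sketch. Everything named here (DestinationStats, reportable_failures, the 0.5 threshold) is hypothetical, and the min_attempts cutoff is an extra assumption of mine to avoid reporting on tiny samples; none of this reflects existing tor code.

```python
from dataclasses import dataclass


@dataclass
class DestinationStats:
    """Circuit build attempts towards one destination relay in the window."""
    attempts: int = 0
    failures: int = 0

    def failure_rate(self):
        return self.failures / self.attempts if self.attempts else 0.0


def reportable_failures(stats, restarted, left_consensus,
                        threshold=0.5, min_attempts=10):
    """Return {fingerprint: failure rate} for destinations worth reporting.

    stats: dict fingerprint -> DestinationStats for the aggregation window
    restarted: fingerprints whose uptime was reset during the window
    left_consensus: fingerprints that dropped out of the consensus
    threshold, min_attempts: assumed values, for illustration only
    """
    report = {}
    for fingerprint, s in stats.items():
        if s.attempts < min_attempts:
            continue  # too few samples to say anything
        rate = s.failure_rate()
        if rate <= threshold:
            continue  # destination looks reachable enough
        if fingerprint in restarted or fingerprint in left_consensus:
            continue  # failure is expected, do not report it
        report[fingerprint] = round(rate, 2)
    return report
```

Only the resulting dict would need to go into the uploaded data, which keeps the amount of published information small.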
In addition to that passive collection, relays could, when idle and not overloaded, actively attempt to create circuits to measure their reachability to relays for which they did not collect any passive data (maybe limited to fast and stable destination relays).
To reduce the load on the tor network we could limit the active outbound connection tests based on relay flags:
- Guard-only relays are more likely to establish outbound connections to non-exit relays
- middle-only relays (no guard and exit flag) are more likely to establish connections to exits than to guards
- exits are more likely to get inbound connections from non-guards (maybe skip active probes from exits since they are a bottleneck already?)
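One possible reading of these flag-based limits, as a small Python sketch. The function should_probe and its arguments are hypothetical; "Guard", "Exit", "Fast" and "Stable" are the usual consensus flag names. Treating the "more likely" statements as hard rules is a simplification for illustration, not a proposal detail.

```python
def should_probe(prober_flags, dest_flags, have_passive_data):
    """Decide whether a relay should actively probe a destination relay.

    prober_flags, dest_flags: sets of consensus flag names
    have_passive_data: True if passive measurements already exist
    """
    if have_passive_data:
        return False  # active probes only fill gaps in passive data
    if not {"Fast", "Stable"} <= dest_flags:
        return False  # limit probes to fast and stable destinations
    if "Exit" in prober_flags:
        return False  # exits are a bottleneck already, skip probing
    if "Guard" in prober_flags:
        return "Exit" not in dest_flags  # guards mostly connect to non-exits
    return "Exit" in dest_flags  # middle-only relays mostly connect to exits
```

A relay would run a check like this per candidate destination and only when it is idle, so the probes follow the connection patterns that occur in normal circuits anyway.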
Reachability issues could also be shown to the relay operator via log entries, warning them about the potential problem so they can actively work on debugging it themselves.
There is no doubt that this information is also valuable for (powerful) adversaries (it could help them reduce their effort when they know weak spots in the network). So you would have to decide what data you collect and what you would publish (collector.tpo) - even if that means I might never get to see that interesting data ;)
This is a medium/long term goal, with the usual steps:
- proposal
- implementation
- deployment
- data collection (the amount of data could be huge)
- data analysis
If there is a consensus that this makes sense and someone would actually implement it, I would be happy to help work on a proposal.
This entire idea would be an opt-in torrc setting at the beginning and an opt-out feature once we are more confident about its implications and safety.
Please let me know what you think about this idea.
regards, nusenu
[1] https://lists.torproject.org/pipermail/tor-project/2017-October/001492.html
related trac tickets:
https://trac.torproject.org/projects/tor/ticket/12131
https://trac.torproject.org/projects/tor/ticket/19068
Are there plans to implement PeerFlow in Tor? Connectivity information like this would be an automatic byproduct.
- Ian
On 19 Oct 2017, at 00:04, Ian Goldberg iang@cs.uwaterloo.ca wrote:
> Are there plans to implement PeerFlow in Tor?
As far as I am aware, we are planning on:
* stabilising the current system
* working out how to test changes
* specifying what a good bandwidth allocation system has to do
* incrementally replacing old scanners with suitable alternatives
I'd expect that dawuud's bwscanner, isis' bridge bandwidth scanner, and PeerFlow are all contenders. But I'm not aware of any follow-up research or implementation on PeerFlow.
> Connectivity information like this would be an automatic byproduct.
Connectivity information is an automatic byproduct of the current bandwidth scanners (TorFlow), and any future replacements.
T