I am working on a Python-based exit relay scanner which should detect malicious and misbehaving exits. The design should have a reasonable balance between being fast/parallel and stressing the network as little as possible.
I came up with the following three steps:
1. Spawn a "parent" Tor process to get an up-to-date consensus.
2.1 For every selected exit relay, spawn a lightweight Tor process.
2.2 The consensus is copied from the "parent" process to the lightweight process' data directory. That way, the consensus has to be downloaded only once.
2.3 Every lightweight Tor process has the following configuration:
---
SOCKSPort auto
ControlPort 0
__DisablePredictedCircuits 1
UseEntryGuards 0
FetchServerDescriptors 0
DataDirectory <data_directory>
ExitNodes <exit_relay>
---
Entry guards are disabled in order to spread the first-hop load across many relays. Predicted circuits are disabled to avoid the expensive creation of circuits that would never be used. In addition, I am considering adding "EntryNodes" or "Bridge" to concentrate the first hop's load on machines under my control.
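A rough Python sketch of steps 2.2 and 2.3, to make the idea concrete. The helper names (seed_data_directory, build_torrc) are made up for illustration, and the list of cached file names is an assumption: which cached-* files Tor needs in order to skip the download probably depends on the Tor version and on whether microdescriptors are in use.

```python
import shutil
from pathlib import Path

# Assumption: file names Tor commonly uses for cached directory data.
# Adjust to whatever the parent's data directory actually contains.
CACHED_FILES = ("cached-consensus", "cached-microdesc-consensus")


def seed_data_directory(parent_dir, child_dir, filenames=CACHED_FILES):
    """Copy cached consensus files from the parent's data directory.

    Returns the names of the files that were actually copied.
    """
    parent, child = Path(parent_dir), Path(child_dir)
    child.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in filenames:
        source = parent / name
        if source.is_file():
            shutil.copy2(source, child / name)
            copied.append(name)
    return copied


def build_torrc(data_directory, exit_relay):
    """Return the torrc text for one lightweight Tor process."""
    options = [
        ("SOCKSPort", "auto"),
        ("ControlPort", "0"),
        ("__DisablePredictedCircuits", "1"),
        ("UseEntryGuards", "0"),
        ("FetchServerDescriptors", "0"),
        ("DataDirectory", str(data_directory)),
        ("ExitNodes", exit_relay),
    ]
    return "".join(f"{key} {value}\n" for key, value in options)
```

The torrc text would be written to a file in (or next to) the data directory before the lightweight process is started.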
3. torsocks is then used to establish decoy connections over the respective exit relay. After that, the process is terminated.
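Step 3 might look roughly like the sketch below. It assumes a torsocks version that accepts the -a (address) and -P (port) options, which I believe torsocks 2.x does; older versions are configured through torsocks.conf instead. The function names are hypothetical, and how the scanner learns the port chosen by "SOCKSPort auto" is left out here.

```python
import subprocess


def torsocks_command(socks_port, decoy_command, socks_address="127.0.0.1"):
    """Build the argument list that runs decoy_command through torsocks."""
    # Assumption: torsocks supports -a/-P to select the SOCKS listener.
    return ["torsocks", "-a", socks_address, "-P", str(socks_port), *decoy_command]


def run_decoy(socks_port, decoy_command, timeout=60):
    """Run one decoy connection and return the exit status of the command."""
    result = subprocess.run(
        torsocks_command(socks_port, decoy_command),
        capture_output=True,
        timeout=timeout,
        check=False,
    )
    return result.returncode
```

A call such as run_decoy(port, ["wget", "-q", "-O", "-", "http://example.com/"]) would then fetch a decoy page through the selected exit, after which the lightweight Tor process can be terminated.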
Any thoughts on how to further improve the design or ideas for a better one?
Cheers, Philipp