Found a serious bug in the updateFallbackDirs.py script used to generate the fallback relay candidate list. The OnionOO retrieval used to pull flag history returns the entire history of each relay rather than the 120 days requested.
As a result, 145 relays are left off the list because too-old history is averaged into the percentages.
While this bug appears to be in either OnionOO or in the constructed request URL (not sure which), I opted to add a maximum history age to the script. The result is 558 relays instead of the original 413. (My relay Binnacle now easily makes the cut with guard flag at 97.4 percent.)
In addition, several data validation bugs were discovered empirically and corrected.
Changes:
* support maximum history age in _avg_generic_history()
* create and apply MAX_AGE_DAYS=120 in calls to _avg_generic_history()
* fix division-by-zero trap in _avg_generic_history()
* skip missing (i.e. null/None) intervals in _avg_generic_history()
* replace timedelta.total_seconds(), which is not available in Python 2.6, with an equivalent expression
* set DEBUG logging level to make relay exclusion reasons visible
* move CUTOFF_GUARD test to end in order to expose more exclusion reasons
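
A rough sketch of how the first five changes fit together. This is not the patch itself: the function name avg_history(), its argument layout (a list of (age, value) pairs), and the sample data are assumptions made up for illustration. Only MAX_AGE_DAYS is taken from the list above.

```python
from datetime import timedelta

MAX_AGE_DAYS = 120  # name taken from the change list; the rest is hypothetical


def total_seconds(delta):
    """Stand-in for timedelta.total_seconds(), which Python 2.6 lacks."""
    return (delta.microseconds
            + (delta.seconds + delta.days * 24 * 3600) * 10**6) / float(10**6)


def avg_history(samples, max_age_days=MAX_AGE_DAYS):
    """Average the samples that are no older than max_age_days.

    samples is an assumed list of (timedelta age, value-or-None) pairs.
    Returns None when nothing usable is left, rather than dividing by zero.
    """
    max_age = total_seconds(timedelta(days=max_age_days))
    total = 0.0
    count = 0
    for age, value in samples:
        if value is None:
            # missing interval: skip it instead of counting it as zero
            continue
        if total_seconds(age) > max_age:
            # older than the cutoff: keep it out of the average
            continue
        total += value
        count += 1
    if count == 0:
        return None
    return total / count


samples = [
    (timedelta(days=1), 1.0),
    (timedelta(days=60), None),   # missing interval
    (timedelta(days=90), 0.5),
    (timedelta(days=400), 0.0),   # too old, ignored
]
print(avg_history(samples))  # 0.75
```

Without the age check the 400-day-old sample would pull the average down to 0.5, which is the kind of effect that was dropping relays from the list.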
Attached to this message are:
1) revised script
2) script patch
Posted to PasteBin (one-month expiry):
3) fallback_dirs.inc as of 2016/01/11 http://pastebin.com/8cxKCEP6
4) debug output showing causes of relay exclusions http://pastebin.com/raw/3SBpgECm
Follow-up:
Looking at the URL, it appears that the selection criteria determine which records to select, NOT the range of history to return. So having the script filter by MAX_AGE (per the patch) is correct. A quick examination of https://onionoo.torproject.org/protocol.html reveals no obvious way to restrict the history time range returned.
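
Since the request cannot limit the range, the trimming has to happen after the download. A minimal sketch of that step, assuming a history object with the "first", "interval" and "values" fields as I read them on the protocol page. The helper name trim_history() and the sample data are made up for illustration, and this is not the code from the patch.

```python
from datetime import datetime, timedelta


def trim_history(history, now, max_age_days=120):
    """Return only the values whose timestamp is within max_age_days of now.

    history is assumed to look like an Onionoo history object:
      "first"    - timestamp of the first data point, "YYYY-MM-DD hh:mm:ss"
      "interval" - seconds between two data points
      "values"   - list of numbers, with None for missing data points
    """
    first = datetime.strptime(history["first"], "%Y-%m-%d %H:%M:%S")
    step = timedelta(seconds=history["interval"])
    cutoff = now - timedelta(days=max_age_days)
    kept = []
    for i, value in enumerate(history["values"]):
        # data point i was recorded at first + i * interval
        if first + i * step >= cutoff:
            kept.append(value)
    return kept


history = {
    "first": "2016-01-01 00:00:00",
    "interval": 86400,  # one data point per day
    "values": [1, 2, 3, 4, 5, 6, 7, 8, None, 10],
}
print(trim_history(history, datetime(2016, 1, 11), max_age_days=3))
# [8, None, 10]
```

Missing data points are passed through as None here, so whatever averages the result still has to skip them.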
On 12 Jan 2016, at 10:14, starlight.2015q3@binnacle.cx wrote:
> Found a serious bug in the updateFallbackDirs.py script used to generate the fallback relay candidate list. The OnionOO retrieval used to pull flag history returns the entire history of each relay rather than the 120 days requested.
> This results in 145 relays left off the list as too-old history is averaged into the percentages.
Thanks, logged as #18035 https://trac.torproject.org/projects/tor/ticket/18035
Tim
Tim Wilson-Brown (teor)
teor2345 at gmail dot com PGP 968F094B
teor at blah dot im OTR CAD08081 9755866D 89E2A06F E3558B7F B5A9D14F
On 12 Jan 2016, at 12:11, Tim Wilson-Brown - teor teor2345@gmail.com wrote:
> On 12 Jan 2016, at 10:14, starlight.2015q3@binnacle.cx wrote:
>> Found a serious bug in the updateFallbackDirs.py script used to generate the fallback relay candidate list. The OnionOO retrieval used to pull flag history returns the entire history of each relay rather than the 120 days requested.
>> This results in 145 relays left off the list as too-old history is averaged into the percentages.
> Thanks, logged as #18035 https://trac.torproject.org/projects/tor/ticket/18035
I've reviewed this patch and it looks good. I updated it for the latest version of the script in the tor git repository, as it was based on an old version.
I've also logged the more general issue with OnionOO and date ranges as #18036. https://trac.torproject.org/projects/tor/ticket/18036
Tim
tor-relays@lists.torproject.org