On Sat, Apr 5, 2014 at 8:58 AM, Karsten Loesing karsten@torproject.org wrote:
Right now, the script sums up all graphs contained in Onionoo's bandwidth, clients, uptime, and weights documents. It also limits the range of the new graphs to max(first) to max(last) of given input graphs.
For example, assume we want to know the total bandwidth provided by the following 2 relays participating in the relay challenge:
datetime: 0, 1, 2, 3, 4, 5, ...
relay 1: [5, 4, 5, 6] relay 2: [4, 3, 5, 4]
combined: [8, 9, 9, 6]
This is not perfect for various reasons, but it's the best I came up with yesterday. Also, as we all know, perfect is the enemy of good.
(If you're curious, reason #1: the graph goes down at the end, and we can't say whether it's because relay 2 disappeared or did not report data yet; reason #2: we're weighting both relays' B/s equally, though relay 1 might have been online 24/7 and relay 2 only long enough that Onionoo doesn't put in null; there may be more reasons.)
For the relay challenge, wouldn't you want to include the entire period that data is available for (i.e., min(first) to max(last))? Otherwise, if you are looking at a month's worth of data and a new relay arrives on the last day, your graph would only contain that day.
Also, I think you would want to do datetime.strptime(max(first), ...) here: https://github.com/kloesing/challenger/blob/master/challenge.py#L177-L178 Otherwise you're just taking the last relay's first and last values as the new_first and new_last.
Cheers, - Nikita