On 05/04/14 16:42, Nikita Borisov wrote:
On Sat, Apr 5, 2014 at 8:58 AM, Karsten Loesing karsten@torproject.org wrote:
Right now, the script sums up all graphs contained in Onionoo's bandwidth, clients, uptime, and weights documents. It also limits the range of the new graphs to max(first) to max(last) of given input graphs.
For example, assume we want to know the total bandwidth provided by the following 2 relays participating in the relay challenge:
datetime: 0, 1, 2, 3, 4, 5, ...
relay 1: [5, 4, 5, 6] relay 2: [4, 3, 5, 4]
combined: [8, 9, 9, 6]
This is not perfect for various reasons, but it's the best I came up with yesterday. Also, as we all know, perfect is the enemy of good.
(If you're curious, reason #1: the graph goes down at the end, and we can't say whether it's because relay 2 disappeared or did not report data yet; reason #2: we're weighting both relays' B/s equally, though relay 1 might have been online 24/7 and relay 2 only long enough that Onionoo doesn't put in null; there may be more reasons.)
For the relay challenge, wouldn't you want to include the entire period that data is available for (i.e., min(first) to max(last))? Otherwise, if you are looking at a month's worth of data and a new relay arrives on the last day, your graph would only contain that day.
Very good point!
The reason why I didn't include everything from min(first) to max(last) is that any graph covers the last $time_period of the relay or bridge being online and reporting data. So, the "3_days" graph of a specific relay could show a 3-day period weeks ago, and we wouldn't want to merge that with other 3-day periods which are more recent. Of corse, you're right that a new relay covering only a few hours in their "3_days" graph would reduce our combined graph to just that. Oops.
So, I guess what we want to do is include everything from $(now - 3 days) to $now in the combined graph. Will fix.
Also, I think you would want to do datetime.strptime(max(first), ...) here: https://github.com/kloesing/challenger/blob/master/challenge.py#L177-L178 Otherwise you're just taking the last relay's first and last values as the new_first and new_last.
Another very good point. Will fix.
Thanks for the review!
All the best, Karsten