On 05/04/14 12:19, Lukas Erlacher wrote:
Hi Karsten,
On 04/05/2014 09:58 AM, Karsten Loesing wrote:
On second thought, and after sleeping on this, I'm less convinced that we should use an external library for the caching. We should rather start with a simple dict in memory and flush it based on some simple rules. That would allow us to tweak the caching specifically for our use case. And it would mean avoiding a dependency. We can think about moving to onion-py at a later point. That gives you the opportunity to unspaghettize your code, and once that is done we'll have a better idea what caching needs we have for the challenger tool to decide whether to move to onion-py or not. Would you still want to help write the simple caching code for challenger?
I cleaned up the caching code and added to onion-py a simple in-memory dict caching provider that has no further dependencies. (It also has no provisions for eviction/flushing at all, but I will add that next. Right now everything is cached forever, but of course a new response from OnionOO replaces an old one.)
Yeah, I think we'll want to define a maximum lifetime of cache entries, or the poor cache will explode pretty soon.
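A rough sketch of what such a dict cache with a maximum entry lifetime could look like. This is not code from onion-py or challenger; the class name SimpleCache, its methods, and the max_age parameter are all made up for illustration, and the injectable clock only exists to make the expiry logic easy to test.

```python
import time


class SimpleCache(object):
    """Hypothetical in-memory dict cache; entries expire after max_age seconds."""

    def __init__(self, max_age=3600.0, clock=time.time):
        self.max_age = max_age
        self._clock = clock      # injectable so tests can fake the time
        self._entries = {}       # key -> (stored_at, value)

    def put(self, key, value):
        # A newer response simply replaces the older one.
        self._entries[key] = (self._clock(), value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self.max_age:
            # Entry outlived its maximum lifetime: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def flush_expired(self):
        """Remove all expired entries; return how many were removed."""
        now = self._clock()
        expired = [key for key, (stored_at, _) in self._entries.items()
                   if now - stored_at > self.max_age]
        for key in expired:
            del self._entries[key]
        return len(expired)
```

Expired entries are dropped lazily on get(), and flush_expired() could be called periodically so that entries nobody asks for again do not pile up.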
I can write the OnionOO API code and caching code for challenger, if I can use Python 3 and the requests library. (See below)
Great, your help would be much appreciated! Want to send me a pull request whenever you have something to merge?
See my response regarding Python 3 below.
Of course I'd really like to actually have a user for onion-py, since it would help getting the necessary feedback and polish to push the library to version 1.0, but I understand if that isn't appropriate for this project.
My hope with challenger is that it's written quickly, works quietly for a year, and then disappears without anybody noticing. I'd rather not maintain yet another thing. So, maybe Weather is a better candidate for using onion-py than challenger.
I don't really understand what the code does. What is meant by "combining" documents? What exactly are we trying to measure? Once I know that and have thought of a sensible way to integrate it into onion-py I'm confident I can in fact write that glue code :)
Right now, the script sums up all graphs contained in Onionoo's bandwidth, clients, uptime, and weights documents. It also limits the range of the new graphs to the span from max(first) to max(last) of the given input graphs.
For example, assume we want to know the total bandwidth provided by the following 2 relays participating in the relay challenge:
datetime: 0, 1, 2, 3, 4, 5, ...
relay 1:    [5, 4, 5, 6]
relay 2: [4, 3, 5, 4]
combined:   [8, 9, 9, 6]
This is not perfect for various reasons, but it's the best I came up with yesterday. Also, as we all know, perfect is the enemy of good.
(If you're curious, reason #1: the graph goes down at the end, and we can't say whether it's because relay 2 disappeared or did not report data yet; reason #2: we're weighting both relays' B/s equally, though relay 1 might have been online 24/7 and relay 2 only long enough that Onionoo doesn't put in null; there may be more reasons.)
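The summing described above can be sketched in a few lines. This is only an illustration, not the actual challenger script: the function name combine_graphs and the (first, values) pair representation are assumptions, and time is simplified to integer steps instead of Onionoo's timestamps and intervals. The sketch also assumes relay 2's graph starts at datetime 0 and relay 1's at datetime 1, which is the alignment that reproduces the combined values in the example.

```python
def combine_graphs(graphs):
    """Sum graphs given as (first, values) pairs on a shared integer time axis.

    The result covers max(first) to max(last) of the inputs. Missing or
    None data points are counted as 0. Expects at least one graph.
    """
    firsts = [first for first, _ in graphs]
    lasts = [first + len(values) - 1 for first, values in graphs]
    start, end = max(firsts), max(lasts)
    combined = []
    for t in range(start, end + 1):
        total = 0
        for first, values in graphs:
            i = t - first
            if 0 <= i < len(values) and values[i] is not None:
                total += values[i]
        combined.append(total)
    return start, combined


relay1 = (1, [5, 4, 5, 6])  # assumed to start at datetime 1
relay2 = (0, [4, 3, 5, 4])  # assumed to start at datetime 0
print(combine_graphs([relay1, relay2]))  # (1, [8, 9, 9, 6])
```

The drop to 6 at the end comes from relay 2 having no data point at datetime 4, which is exactly the ambiguity described in reason #1.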
Ah, I see! :) So for scalar attributes of relays (such as consensus_weight_fraction) it's just a sum, and for histories it's the graphs combined as you just outlined. That makes sense, thank you!
Right. Though details documents are not included, so just graphs, no scalar attributes.
I'm also not sure about Python 3. Whatever we write needs to run on Debian Wheezy with whatever libraries are present there. If they're all Python 3, great. If not, can't do.
I would strongly prefer to use Python 3. I understand wanting to use Debian stable (I use it myself), but Python 3 is 6 years old and Python 2 is completely dead and its use for new projects is not recommended. The only mandatory dependency for onion-py, and for me, is requests (I really dislike using urllib* directly - if you want to know why, check https://gist.github.com/kennethreitz/973705), and the python3-requests package in Wheezy is from 2012, and there is no python3-flask. :-(
Is there anything standing against using pip (python3-pip package) to install requests and flask from PyPI?
If there's a way to build it only with packages coming out of Wheezy's apt-get, our sysadmins will like us more, and that's a good thing.
Installing packages using Python-specific package managers is going to make our sysadmins sad, so we should have a very good reason for wanting such a package. In general, we don't need the latest and greatest package. Unless we do.
All the best, Karsten