On 6/4/12 8:47 AM, Fabio Pietrosanti (naif) wrote:
I would suggest rewriting Onionoo in Python on the basis of Stem (or as an extension of txtorcon):
https://gitweb.torproject.org/stem.git by atagar
It now has extensive parsers for Tor's cached-consensus, and I think it could quickly provide the REST interface that Atlas needs: https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/server_des... https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/extrainfo_...
I agree that somebody should rewrite Onionoo in Python. I expect that to take a few months though. I'm going to keep maintaining the Java Onionoo until we have a stable and maintained Python replacement. I'm happy to guide people through the tricky parts of writing an Onionoo kind of thing.
For example, being able to parse Tor descriptors is a great start for writing a Python Onionoo. But you'll also have to aggregate bandwidth data in an efficient way. The Java Onionoo uses flat files to store bandwidth data and aggregates them more coarsely the further back in the past they lie. That code [0] wasn't exactly trivial to write, and it's also not the most beautiful piece of code I've ever written. But it's efficient, which is key. Of course, you could start by porting the Java code 1:1 to Python and think about optimizations later.
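To make the idea of "aggregate more the older the data is" concrete, here is a minimal Python sketch. It is not a port of the Java code in [0]; the tier boundaries, the bucket widths, and the names TIERS, bucket_width, and compact are all assumptions made up for illustration.

```python
from collections import defaultdict

DAY = 86400

# Hypothetical retention tiers: (maximum age in seconds, bucket width in seconds).
# These numbers are illustrative assumptions, not taken from the Java Onionoo.
TIERS = [
    (3 * DAY, 15 * 60),    # up to 3 days old: 15-minute buckets
    (7 * DAY, 60 * 60),    # up to 1 week old: 1-hour buckets
    (31 * DAY, 4 * 3600),  # up to 1 month old: 4-hour buckets
    (None, DAY),           # anything older: 1-day buckets
]


def bucket_width(age):
    """Return the bucket width (seconds) for data that is `age` seconds old."""
    for max_age, width in TIERS:
        if max_age is None or age <= max_age:
            return width


def compact(history, now):
    """Merge (start, end, num_bytes) intervals into age-dependent buckets.

    Each interval is assigned to the bucket containing its start time; the
    bucket width is chosen from the age of the interval's end. Returns a
    sorted list of (bucket_start, bucket_end, total_bytes) tuples.
    """
    totals = defaultdict(int)
    for start, end, num_bytes in history:
        width = bucket_width(now - end)
        bucket_start = start - start % width
        totals[(bucket_start, width)] += num_bytes
    return sorted((s, s + w, b) for (s, w), b in totals.items())


now = 10 * DAY
history = [
    (now - 900, now, 100),                              # recent
    (now - 1800, now - 900, 200),                       # recent
    (now - 5 * DAY, now - 5 * DAY + 900, 10),           # 5 days old
    (now - 5 * DAY + 900, now - 5 * DAY + 1800, 20),    # 5 days old
]
compacted = compact(history, now)
```

In this example the two recent intervals stay in their own 15-minute buckets, while the two five-day-old intervals are merged into a single one-hour bucket. The sketch preserves total bytes but does not align tier boundaries, so a coarse bucket can overlap a finer one right at a boundary; a real implementation would need to handle that, and would also have to read and write the flat files.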
That way you could have a "single package" with the Python app + Atlas, without the Java overhead that Onionoo brings.
I'd prefer keeping the (Python) Onionoo and Atlas in two distinct packages. Or rather, the REST interface should stay as open and independent from Atlas as it is now. The Atlas website isn't the only client that could use Onionoo's data. Think of non-JavaScript-based websites that cache Onionoo data rather than letting the client do all the work, mobile clients, Vidalia/arm extensions, tray icons, social network site plugins, command-line tools, etc.
Best, Karsten
[0] https://gitweb.torproject.org/onionoo.git/blob/HEAD:/src/org/torproject/onio...