Damian Johnson <atagar@torproject.org> writes:
(If this type of mail isn't appropriate for tor-dev please let me know...)
> On a side note, do you think that any txtorcon/stem work would be appropriate? They're both aiming to be a library that does largely the same things. The twisted/threading differences mean that our controller classes are incompatible, but other bits of the parsing and such should be interchangeable. For instance, I've invested an immense amount of effort into parsing (and tests) for descriptor content...
I was thinking about this a little last week -- it would certainly be nice to abstract more of the "general parsing stuff". There are a few gotchas, since the threaded and event-based ways of getting information from the protocol are pretty different. For authentication, for example, SAFECOOKIE is a two-part affair: you have to wait for a response halfway through, and that wait looks quite different in an event-based API than in a threaded one.
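To make that difference concrete, here is a rough sketch of the same two-step SAFECOOKIE exchange written both ways. FakeTor and its send/send_async methods are made up for illustration (they are not txtorcon or stem APIs), the fake reply leaves out the SERVERHASH that a real client should verify, and the HMAC key string is the one given in control-spec.txt, so check it against the spec before relying on it.

```python
import hashlib
import hmac
import os

# Key for the controller-to-server hash, per control-spec.txt (verify there).
CLIENT_KEY = b"Tor safe cookie authentication controller-to-server hash"


def client_hash(cookie, client_nonce, server_nonce):
    """HMAC-SHA256 over cookie + both nonces, hex-encoded."""
    msg = cookie + client_nonce + server_nonce
    return hmac.new(CLIENT_KEY, msg, hashlib.sha256).hexdigest().upper()


class FakeTor:
    """Hypothetical stand-in for a control connection (not a real API)."""

    def __init__(self, cookie):
        self.cookie = cookie
        self.server_nonce = os.urandom(32)
        self.client_nonce = None

    def send(self, line):
        # Blocking style: the reply is the return value.
        if line.startswith("AUTHCHALLENGE SAFECOOKIE "):
            self.client_nonce = bytes.fromhex(line.split()[2])
            # Simplified: a real reply also carries SERVERHASH=...
            return "250 AUTHCHALLENGE SERVERNONCE=" + self.server_nonce.hex().upper()
        if line.startswith("AUTHENTICATE "):
            expected = client_hash(self.cookie, self.client_nonce, self.server_nonce)
            return "250 OK" if line.split()[1] == expected else "515 Authentication failed"
        return "510 Unrecognized command"

    def send_async(self, line, callback):
        # Event style: the reply is handed to a callback instead.
        callback(self.send(line))


def authenticate_blocking(tor, cookie):
    nonce = os.urandom(32)
    # The calling thread simply waits here for the challenge reply.
    reply = tor.send("AUTHCHALLENGE SAFECOOKIE " + nonce.hex().upper())
    server_nonce = bytes.fromhex(reply.split("SERVERNONCE=")[1])
    return tor.send("AUTHENTICATE " + client_hash(cookie, nonce, server_nonce))


def authenticate_evented(tor, cookie, on_done):
    nonce = os.urandom(32)

    def got_challenge(reply):
        # Step two lives in a callback, because nothing may block the loop.
        server_nonce = bytes.fromhex(reply.split("SERVERNONCE=")[1])
        tor.send_async("AUTHENTICATE " + client_hash(cookie, nonce, server_nonce), on_done)

    tor.send_async("AUTHCHALLENGE SAFECOOKIE " + nonce.hex().upper(), got_challenge)
```

The hashing in client_hash is the shareable part; the two authenticate_* functions are the part each controller library would still have to write in its own style.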
I've tried to imagine a threaded-friendly wrapper around at least txtorcon.TorControlProtocol which might not be hard for the simple command-response things (but see below).
Certainly at least the parsing should be able to be shared somehow. Following on from naif's email, I would imagine this would be most useful as a "Python utilities for Tor" library. The only thing I can really imagine abstracting from txtorcon is the simple descriptors, like what "getinfo ns/all" returns. Most of the other parsing is pretty protocol-specific, IMO.
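As an illustration of the kind of parsing that needs no networking at all, here is a minimal sketch that turns the "r" and "s" lines of router-status entries (the sort of thing "getinfo ns/all" returns) into plain tuples. The RouterStatus type and parse_router_statuses name are invented for this example, the field order assumed for "r" lines is nickname, identity, digest, publication date and time, IP, ORPort, DirPort, and the sample entry is fabricated test data rather than a real relay.

```python
from collections import namedtuple

# Hypothetical container; neither txtorcon nor stem defines this exact type.
RouterStatus = namedtuple(
    "RouterStatus", "nickname identity address or_port dir_port flags")

# Made-up entry in the assumed format (192.0.2.1 is a documentation address).
SAMPLE = """\
r ExampleRelay AAAAAAAAAAAAAAAAAAAAAAAAAAA BBBBBBBBBBBBBBBBBBBBBBBBBBB 2012-07-01 12:00:00 192.0.2.1 9001 9030
s Fast Running Stable Valid
"""


def parse_router_statuses(text):
    """Parse 'r' and 's' lines; other line types are ignored."""
    routers = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r" and len(fields) >= 9:
            # r nickname identity digest date time IP ORPort DirPort
            routers.append(RouterStatus(
                nickname=fields[1], identity=fields[2], address=fields[6],
                or_port=int(fields[7]), dir_port=int(fields[8]), flags=()))
        elif fields[0] == "s" and routers:
            # Flags apply to the most recent 'r' line.
            routers[-1] = routers[-1]._replace(flags=tuple(fields[1:]))
    return routers
```

Since the function takes a string and returns values, it does not care whether the text arrived through a callback or a blocking read.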
The main issue with abstracting more than that in a controller is that at some point there will be a need in the API to wait for something from Tor -- and at that point, you have to make the API event-based or threaded. txtorcon.TorState is so far pretty de-coupled from the underlying networking library. txtorcon.TorConfig is less so. As things like TorState generate callbacks (e.g. stream added, deleted, etc) via listeners, there's also probably a slight issue that these callbacks would need to execute "fast" (i.e. can't wait for disk/net IO) and this would probably be surprising to threaded implementors.
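One way a threaded user could live with the "callbacks must be fast" rule is to have the listener do nothing but enqueue, and let a worker thread do the slow part. This is only a sketch: QueueingListener and its method names are illustrative and not txtorcon's actual listener interface.

```python
import queue
import threading


class QueueingListener:
    """Hypothetical listener: callbacks only enqueue, a worker does slow work."""

    def __init__(self, slow_handler):
        self._events = queue.Queue()
        self._slow_handler = slow_handler
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def stream_new(self, stream_id):
        # Called from the event loop, so it must return quickly.
        self._events.put(("new", stream_id))

    def stream_closed(self, stream_id):
        self._events.put(("closed", stream_id))

    def wait_until_idle(self):
        # Block until every queued event has been handled.
        self._events.join()

    def _drain(self):
        while True:
            event = self._events.get()
            try:
                self._slow_handler(event)  # disk/network IO is fine here
            finally:
                self._events.task_done()
```

Something like QueueingListener(log_to_disk), with log_to_disk being whatever slow function the caller has, keeps the event loop responsive at the cost of handling events slightly after they happen.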
> so that things like "GETINFO desc/*" will provide usefully parsed information. We could probably also share connection and authentication code.
Like I said, the main issue will be "how do I wait for things I need from the protocol"? For example, I can imagine a Twisted / event-based "low-level" TorControlProtocol class being wrapped by a threading-friendly API of some sort (which just pauses the caller thread until Twisted gets back with the answer). The "nicer" classes (TorState, etc) would be layered on top; they could take either one and hence be implemented in a threaded or event-based fashion, as they like.
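A minimal, stdlib-only sketch of that wrapping idea follows. EventedProtocol is a hypothetical stand-in for a callback-based class running on its own loop thread (it returns canned replies instead of talking to Tor), and BlockingProtocol pauses the caller until the callback fires. With Twisted itself, twisted.internet.threads.blockingCallFromThread does roughly this job, assuming the reactor runs in a different thread from the caller.

```python
import queue
import threading


class EventedProtocol:
    """Hypothetical callback-based protocol; stands in for a Twisted-style class."""

    def __init__(self):
        self._pending = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def queue_command(self, command, callback):
        # Returns immediately; the callback fires later on the loop thread.
        self._pending.put((command, callback))

    def _loop(self):
        while True:
            command, callback = self._pending.get()
            callback("250 OK " + command)  # canned reply instead of network IO


class BlockingProtocol:
    """Threading-friendly wrapper: pauses the caller until the reply arrives."""

    def __init__(self, evented):
        self._evented = evented

    def command(self, command, timeout=5.0):
        done = threading.Event()
        result = []

        def on_reply(reply):
            result.append(reply)
            done.set()

        self._evented.queue_command(command, on_reply)
        if not done.wait(timeout):
            raise TimeoutError("no reply to %r" % command)
        return result[0]
```

Higher-level classes could then be written against either queue_command or command, which is the layering described above.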
I don't really see that this gains a whole bunch, though: then you're depending on Twisted but not using the event-based stuff "outside". One big "pro" that I see for a threaded version like stem is that it has only standard-lib dependencies. Besides, anyone excited about a Twisted dependency probably wants Deferreds returned, not a threaded API... ;)
So, I see a use for a good Python utility + parsing library which stem + txtorcon (+ whatever) could use to do their heavy lifting, and the network/protocol details would be "all" that's in the controller libs.
*Ideally*, such a library could leverage the parsing code in Tor itself -- if at least the "utility" methods in Tor could be published as a shared library, a "ctypes" wrapper could easily be made with a more-Pythonic interface around that. Then, there's only one chunk of "parse descriptors" (for example) code, and it would be used by Tor and the controller, so no chance of being out of sync. Perhaps there are other reasons not to do shared libraries...and I haven't actually looked at these C methods very hard; but routerparse.c has 5200+ lines of code that'd be nice to leverage.
Another thing I think would be really nice is to be able to get grouping and documentation information about config options from Tor, or from the tor-spec file (i.e. by parsing it). This would keep documentation that users see consistent across Tor control protocol clients, and make it easier to more-automatically generate GUIs (i.e. with grouping and maybe ordering information). Anyway, just brainstorming here.
I'm mostly-away until around the 18th, but perhaps we could meet on #tor-dev after that and discuss further? Are there specific things besides descriptors that you think could be easily abstracted out of stem (and/or useful for txtorcon)?
p.s. may I encourage you to consider the way-more-standard 4 spaces for indenting...? I've never seen Python code with 2-space indenting before.
-- mike