Hi Ravi. This is a nice first draft. Please keep in mind that I'm pretty green with PathSupport (I've never used it myself), so feel free to push back on any suggestions.
The high level approach that you seem to be taking is to copy PathSupport into stem, then refactor and test it. Is that right? If so then a few questions...
* Did you get Mike's permission for that? TorCtl is under the BSD license (I think) and stem is LGPLv3.
* Is this the design that we want? PathSupport is modeled as a narrow object hierarchy built upon TorCtl.EventHandler. We have the opportunity to make any API we want so, as a user, what would you find to be the most intuitive?
My suggestion for starting tasks would be to...
1. Write a simple script to use PathSupport to, say, run wget from a target locale ('./my_script FR http://www.torproject.org/'). See where the pain points were in using PathSupport and what, as a user, you would rather that it did differently.
My understanding is that PathSupport is highly focused on experimentation since that is what Mike needed for his work. However, that is just one consumer and I'm most interested in providing an elegant, simple API that handles basic use cases (like the wget example) easily and can be *extended* for experiments.
2. Talk with the users of PathSupport to figure out their use cases. We should either include those capabilities in our PathSupport counterpart *or* provide what they need to easily make it themselves (if it's a specialized use case). Only three people or places to contact come to mind...
* Mike for SoaT and the bandwidth authorities
* Sebastian for TorBEL
* tor-dev@ for researchers and other developers using PathSupport, Roger might have some suggestions
3. Part of why I was dubious about this being a quick and easy project is that Stem currently lacks the controller capabilities that you need. You mention using stem.control.BaseController at several points, which makes sense since it... well, exists. However, as its pydocs say, this is not the class you are looking for...
"Don't use this directly - subclasses provide higher level functionality."
... or they will once we have them. Part of this project would be to start the general controller class to provide the capabilities that you need (plus tests of course). At first glance the things that a PathSupport copy would need are...
* Event handling for, at least, NEWCONSENSUS and NEWDESC.
* A Network Status class. This would be similar to stem.descriptor.server_descriptor but *far* easier (there's only around three network status lines).
These are easy and I'm happy to work on them with you. We will, of course, need more before actually migrating any clients.
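To give a feel for how small the Network Status class could be, here is a minimal sketch. It assumes the standard (non-microdescriptor) v3 consensus entry layout, where the "r" line carries nickname, identity, digest, publication date and time, address, ORPort and DirPort, the "s" line lists flags, and the "w" line carries "Bandwidth=". The class name RouterStatusEntry and its attributes are made up for illustration and are not an existing stem API.

```python
import base64
import binascii
from datetime import datetime


class RouterStatusEntry:
    """Hypothetical container for one router's entry in a network status."""

    def __init__(self, lines):
        self.nickname = None
        self.fingerprint = None
        self.published = None
        self.address = None
        self.or_port = None
        self.dir_port = None
        self.flags = []
        self.bandwidth = None

        for line in lines:
            keyword, _, value = line.strip().partition(" ")

            if keyword == "r":
                self._parse_r_line(value)
            elif keyword == "s":
                self.flags = value.split()
            elif keyword == "w":
                self._parse_w_line(value)

    def _parse_r_line(self, value):
        # nickname identity digest date time address ORPort DirPort
        fields = value.split()

        if len(fields) < 8:
            raise ValueError("malformed r line: %r" % value)

        self.nickname = fields[0]

        # The identity is base64 with its trailing '=' padding stripped, so
        # restore the padding before decoding it into a hex fingerprint.
        identity = fields[1] + "=" * (-len(fields[1]) % 4)
        raw_identity = base64.b64decode(identity)
        self.fingerprint = binascii.hexlify(raw_identity).decode("ascii").upper()

        self.published = datetime.strptime(
            fields[3] + " " + fields[4], "%Y-%m-%d %H:%M:%S")
        self.address = fields[5]
        self.or_port = int(fields[6])
        self.dir_port = int(fields[7])

    def _parse_w_line(self, value):
        # Only the Bandwidth entry is read, other key=value pairs are ignored.
        for item in value.split():
            key, _, number = item.partition("=")

            if key == "Bandwidth":
                self.bandwidth = int(number)
```

A real class would also need validation and the remaining optional lines, but this is roughly the amount of parsing involved.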
Their feedback will ensure that the API will be usable.
Don't count on it. This will give a nice first draft but expect to rewrite things quite a few times as we go along. Actually using your API for real clients will certainly reveal some things that we could do better. ;)
I also will communicate with my mentor about my progress and hopefully, will have an intuitive, easy to use API design ready before the coding period starts.
I would like to see a rough first draft of an API as part of the application, which we could then incrementally refine. Maybe a trac subpage under stem would be the best place for this?
Implementation implies writing the code, tests and the documentation.
Yay!
An amalgamation of the PathSupport.PathBuilder and the PathSupport.ConsensusTracker classes.
I understand why Mike made them separate. A few things to think about...
a. The ConsensusTracker is useful as a standalone class by providing the current consensus and descriptors. I used this for a short time with arm but stopped due to 'b'.
b. Loading all of the consensus and descriptor data is... a lot.
atagar@morrigan:~$ du -h ~/.tor/cached-consensus ~/.tor/cached-descriptors
672K    /home/atagar/.tor/cached-consensus
3.1M    /home/atagar/.tor/cached-descriptors
When I did this with arm a couple years ago it choked the application for several seconds and caused high memory usage. I've heard that this is better, but still we should figure out what is really necessary for the PathSupport functionality that we want.
c. This will be moot, of course, if we go with a different design.
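On point 'b', one possible way to avoid holding everything in memory is to stream the cached consensus and keep only the fields a given use case needs. The following is a rough sketch under the assumption that the file is a plain-text v3 consensus in which each router entry starts with an "r " line. The function names and the SAMPLE fragment are invented for illustration.

```python
import io

# Made-up consensus fragment, just enough to exercise the functions below.
SAMPLE = """network-status-version 3
r alpha AAAA BBBB 2012-04-01 00:00:00 10.0.0.1 9001 0
s Exit Fast Running
w Bandwidth=100
r beta CCCC DDDD 2012-04-01 00:00:00 10.0.0.2 9001 9030
s Fast Running
w Bandwidth=50
directory-footer
directory-signature made-up
"""


def iter_router_entries(consensus_file):
    """Yields the lines of one router entry at a time, never the whole file."""
    entry = []

    for line in consensus_file:
        line = line.rstrip("\n")

        # Everything from the footer onward is signatures, not router entries.
        if line.startswith(("directory-footer", "directory-signature")):
            break

        if line.startswith("r "):
            if entry:
                yield entry

            entry = [line]
        elif entry:
            entry.append(line)

    if entry:
        yield entry


def nicknames_with_flag(consensus_file, flag):
    """Collects only the nicknames of routers that have the given flag."""
    matches = []

    for entry in iter_router_entries(consensus_file):
        nickname = entry[0].split()[1]

        for line in entry[1:]:
            if line.startswith("s ") and flag in line.split()[1:]:
                matches.append(nickname)

    return matches


exits = nicknames_with_flag(io.StringIO(SAMPLE), "Exit")
```

Whether this is enough depends on which PathSupport features we keep, since some selection rules may need the full descriptors.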
TorCtl.PathSupport.PathBuilder uses a TorCtl.PathSupport.SelectionManager. A helper class for handling (router) configuration updates. I will merge a part of this into stem.path.PathController too
Not quite following. I thought that the SelectionManager was an argument for the configuration the user wanted to run PathSupport with. Keeping those separate conceptually seems like a good idea, though again I haven't actually tried it in practice.
Is a direct subclass of stem.control.BaseController
Why?
A major change would be to make PathController fully thread-safe instead of an event/queue system.
Slight correction, stem uses almost the exact same event/queue based model as TorCtl. The difference is that it also adds read/write locks to provide more complete thread safety.
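As a rough sketch of that general pattern (this is not stem's or TorCtl's actual code, and every class and method name here is made up for illustration): events are pushed onto a queue and delivered by a worker thread, while locks guard the pieces that several threads might touch at once.

```python
import queue
import threading


class EventQueueSketch:
    """Hypothetical illustration of an event queue combined with locks."""

    def __init__(self, send_func):
        # send_func stands in for writing to the control socket.
        self._send_func = send_func
        self._send_lock = threading.Lock()

        self._listeners = []
        self._listeners_lock = threading.Lock()

        self._event_queue = queue.Queue()
        self._worker = threading.Thread(target=self._event_loop)
        self._worker.daemon = True
        self._worker.start()

    def add_listener(self, listener):
        with self._listeners_lock:
            self._listeners.append(listener)

    def send(self, message):
        # Only one caller at a time may talk to the socket.
        with self._send_lock:
            return self._send_func(message)

    def enqueue_event(self, event):
        # Would be called by whatever thread reads from the socket.
        self._event_queue.put(event)

    def close(self):
        # None is the sentinel that tells the worker to stop once it has
        # delivered everything queued before it.
        self._event_queue.put(None)
        self._worker.join()

    def _event_loop(self):
        while True:
            event = self._event_queue.get()

            if event is None:
                break

            # Copy under the lock so listeners run without holding it.
            with self._listeners_lock:
                listeners = list(self._listeners)

            for listener in listeners:
                listener(event)
```

Listeners run on the worker thread, so a slow listener delays later events but never blocks callers of send().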
The following will be ported to use Stem:
- Torflow
Woah, bad idea. Torflow = SoaT + Bandwidth Authorities. That is both way bigger than you want to take on, and probably the last things that will migrate (if they ever do at all). Doesn't TorBEL manually construct circuits? If so then that would be a far better client.
That said, I see where you're getting this from and I might be completely misunderstanding how TorBEL works...
04:34 < logan> please recommend some TorCtl clients which use the PathSupport module
04:42 < Sebastian> logan: I think there's just torflow
04:44 < logan> what about torbel ?
04:47 < logan> and SoaT ?
04:50 < Sebastian> soat is a part of torflow
04:50 < Sebastian> torbel doesn't use it
04:51 < Sebastian> torbel uses TorCtl.Router and TorCtl.TorUtil
There are some unimplemented parts of the general controller class that are required for the implementation of PathSupport, such as the Router class. atagar is currently working on this.
Oh, good that you spotted this. In an ideal world I'd be working on this but, if the last couple months are any guide, I wouldn't count on it.
I will help with implementing these so that they will be ready before the coding period begins.
Great. The top slot on my dance card usually goes to anything that has people actively offering to help. At the moment that's mostly around descriptor parsing, but I'm happy to swap back to the controller if you want to work on it with me.
Port Torflow to use Stem. This will consume a part of week 11,
/me chokes, realizing that ten days are being allocated to this
... er, ambitious
I have written a few patches for some Tor Project projects, #1667 (Tor), #5032 (Thandy). Two to Stem, which have been committed to the repository #5199 and #5472.
Many thanks for those, btw. :)
Do you have any standalone code samples (preferably python) that you've written? Possibly for school?
I have exams until the 29th of April, so I will be missing a few days of the community bonding period...
No problem.
Stem, like all libraries implementing an API for a moving target, requires maintenance. I will co-maintain Stem in the future. By the time I'm done with the SoC program, I would've also gained familiarity with other related projects such as Torflow, TorBEL and Arm. I'll be in a position where I can help out with those if there is a need.
Great, we're always glad when people stick around after GSoC. It's unpleasantly rare, but always good to hope for.
I will keep people informed about my progress by sending (probably monthly, or as often as required) reports to the mailing list.
Last year we did bi-weekly status updates. I think that I'd like to work directly with whoever is selected rather than just having code tossed over the fence, but we'll see if that works out (it's not everyone's cup of tea). If you'd rather work on things more independently then let me know.
I'm a little uncomfortable with how nebulous the individual PathSupport tasks are. Please say more concretely what they include and what your approach is. Alternatively, feel free to make this a "semi-PathSupport and other stem tasks" proposal, taking on some general stem tasks (like Safe Cookie, metrics-lib migration, general controller work, etc) plus _exploratory_ work on PathSupport.
* The advantage of that approach would be better defined tasks without the unknowns that often derail projects.
* The disadvantage is that you'd finish lots of small, useful features rather than a big one (personally I count this as a plus, but some people like just having a single big goal).
Completely up to you. Feel free to continue focusing your application on PathSupport if you want, the above is just a potential alternative.
Cheers!
-Damian