Hi Fabio:
For the "output", I'd suggest producing a set of patches against Tor (development version), including modifications to:
- build system
- documentation (man pages)
- unit-tests
all following Tor's coding guidelines, with the goal (or hope) of being integrated.
I have not developed much Unix code (mostly at university, for Linux, some time ago), so I am doing this mostly for Windows. As a matter of fact, I compiled Tor using a Visual Studio project (I am used to it, and it makes tracing and debugging easy for me). I am going to leave the VS project in place in order to attract developers from the Windows world, but I'll try to get into the build system and such.
I was working on top of 2.4-maint but had to drop back to 2.3-maint (and to libevent 1.4) because it was crashing: there was a null pointer access in the microdescriptor cache, coming from who knows where. I plan to use the most stable build and integrate stable releases over time, keeping up with Tor's stable releases. The problem with unstable releases is that it could be hard to determine whether a bug was caused by code I modified or was introduced in Tor itself. The activity in git is frenetic, and it would be too difficult for me to work on moving ground. Fortunately, the two versions don't seem to be too different.
I think unit testing could be done better with a library as well.
It would be very valuable to define the "use case" of your library, in particular what you are going to support and for which context of use. For example, are you going to simply support "outgoing anonymous connections", or are you going to support "caching descriptors to avoid excessive load on the Tor network", or "Tor Hidden Service exposure to receive inbound connections"?
My aim is just to go for outgoing anonymous connections, but I don't see any reason not to support the rest. Maybe bring in anonymous connections first, since that is the most common use case, to get the ball rolling, and then add support for the rest. Aiming for inclusion would have to be done in the long run.
Maybe, step by step, stack the SOCKS/transparent/control server code on top of the library, in order to clean up the code and make it simpler by separating concerns.
At the moment I can create streams and get notifications when data arrives and when they are closed. I am still looking at how to write to streams. It doesn't seem easy, and I am worried about concurrency. Tor seems to use threads only for CPU workers and was not built with much multithreading in mind.
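To make the concurrency worry concrete, here is a rough sketch (in Python, only for brevity) of one common way to deal with it: application threads never touch Tor state directly, they only queue write requests, and the single event-loop thread drains the queue. This is not what my code does today. `StreamWriteQueue`, `request_write` and `drain` are hypothetical names, and `do_write` stands in for whatever the real write call turns out to be.

```python
import queue
import threading


class StreamWriteQueue:
    """Hypothetical helper: collects write requests from any thread."""

    def __init__(self):
        # queue.Queue is thread-safe, so no extra locking is needed here.
        self._pending = queue.Queue()

    def request_write(self, stream_id, data):
        """Callable from any application thread; only enqueues the request."""
        self._pending.put((stream_id, data))

    def drain(self, do_write):
        """Call only from the event-loop thread; returns the writes performed."""
        performed = 0
        while True:
            try:
                stream_id, data = self._pending.get_nowait()
            except queue.Empty:
                return performed
            do_write(stream_id, data)  # stand-in for the real stream write
            performed += 1


# Four application threads each request one write.
writes = StreamWriteQueue()
workers = [
    threading.Thread(target=writes.request_write, args=(i, b"hello"))
    for i in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# The (simulated) event-loop thread performs all writes in one place.
written = []
performed = writes.drain(lambda sid, data: written.append((sid, data)))
```

The trade-off is that a write does not happen at the moment it is requested, only the next time the event loop drains the queue.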
For control, replace control commands with library calls. So, for instance, Tor's control handling code could call the API after parsing network data, and could be plugged in or unplugged. I also abstracted logging. This way it could be printed to the console, stored in a database, shown in an app window, sent to syslog, or whatever you choose.
I am looking to simplify Tor and abstract it somehow, making it easier to audit, modify and enhance. So far I've seen very ugly patches that implement the DNS resolver and transparent proxy by stacking on top of part of the SOCKS handling. A more abstract interface using streams and circuits seems cleaner to me.
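As an illustration of the logging abstraction, a minimal sketch of the idea follows (again in Python for brevity, not my actual code). The library hands every message to a dispatcher, and the application decides where it goes by registering sinks. `LogDispatcher`, `console_sink` and `make_memory_sink` are hypothetical names, and the in-memory list stands in for a database or an app window.

```python
from typing import Callable, List, Tuple

# Hypothetical sink signature: (severity, message) -> None.
LogSink = Callable[[str, str], None]


class LogDispatcher:
    """Fans each log message out to every registered sink."""

    def __init__(self) -> None:
        self._sinks: List[LogSink] = []

    def add_sink(self, sink: LogSink) -> None:
        self._sinks.append(sink)

    def remove_sink(self, sink: LogSink) -> None:
        self._sinks.remove(sink)

    def log(self, severity: str, message: str) -> None:
        # Iterate over a copy so a sink may unregister itself while logging.
        for sink in list(self._sinks):
            sink(severity, message)


def console_sink(severity: str, message: str) -> None:
    """Prints the message to the console."""
    print(f"[{severity}] {message}")


def make_memory_sink(store: List[Tuple[str, str]]) -> LogSink:
    """Returns a sink that appends to `store` (stand-in for a database)."""
    def sink(severity: str, message: str) -> None:
        store.append((severity, message))
    return sink


dispatcher = LogDispatcher()
records: List[Tuple[str, str]] = []
dispatcher.add_sink(console_sink)
dispatcher.add_sink(make_memory_sink(records))
dispatcher.log("notice", "bootstrapped")
```

A syslog or GUI sink would be just another function with the same signature, so the library itself never needs to know where its output ends up.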
After a first prototype is achieved, it would be best to: a) define the API, b) document the API, c) submit the API for review (to the tor-dev mailing list).
OK, I'll do that. I would like to get peer review from Tor developers. My idea is to keep it simple: maybe get the version, initialize the library, connect, get notified when an event happens, and shut down the library.
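To give an idea of how small that surface could be, here is a stub of its shape (in Python for brevity). Nothing here is a defined API yet: `LibTorSketch`, the method names, the event names and the version string are all placeholders, and the stub does no networking at all. It only shows the call order and the callback style.

```python
from typing import Callable, Dict, List

# Hypothetical event callback: (event_name, details) -> None.
EventCallback = Callable[[str, Dict], None]


class LibTorSketch:
    """Stub with the proposed call surface; performs no real networking."""

    VERSION = "0.0-sketch"  # placeholder, not a real version number

    def __init__(self) -> None:
        self._initialized = False
        self._callbacks: List[EventCallback] = []
        self._next_stream_id = 1

    def get_version(self) -> str:
        return self.VERSION

    def on_event(self, callback: EventCallback) -> None:
        """Registers a callback that is notified of every event."""
        self._callbacks.append(callback)

    def initialize(self) -> None:
        self._initialized = True
        self._emit("initialized", {})

    def connect(self, host: str, port: int) -> int:
        """Opens a (simulated) stream and returns its id."""
        if not self._initialized:
            raise RuntimeError("initialize() must be called first")
        stream_id = self._next_stream_id
        self._next_stream_id += 1
        self._emit("stream_opened",
                   {"stream_id": stream_id, "host": host, "port": port})
        return stream_id

    def shutdown(self) -> None:
        self._initialized = False
        self._emit("shutdown", {})

    def _emit(self, name: str, details: Dict) -> None:
        for callback in list(self._callbacks):
            callback(name, details)


# Usage: record event names in the order they arrive.
seen: List[str] = []
tor = LibTorSketch()
tor.on_event(lambda name, details: seen.append(name))
tor.initialize()
stream = tor.connect("example.com", 80)
tor.shutdown()
```

Data-arrived and stream-closed notifications would be further event names delivered through the same callback.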
Additionally, given the paranoia level of the Tor environment, it would be useful to have a document describing a "threat model", with the set of risks posed by using Tor as a library and how they are managed/mitigated.
Good point to consider. I think Christopher's comments will be useful here. If anyone wants to add a possible risk, please do so.
If you need to modify some existing core pieces of Tor code here and there, always open a relevant ticket on http://trac.torproject.org explaining why the modification is relevant/useful, with a well-documented commit of the patch (to stimulate/facilitate integration).
Good idea. I am using a macro: when it is defined you get the library, and when it is undefined you get the original Tor. But I guess I will leave this for the future. My need is just to get anonymous connections out; helping Tor evolve for the better may be harder to achieve.
As a prototype example, I'd suggest you provide an example "Python binding" that uses your "libtor", given the heavy use of Python within the Tor environment, showing how to embed Tor within a Python application.
I am not much inclined toward Python, but I know some. OK, thanks for the recommendation. I'll keep this as a future task in order to attract developers from/to the Tor community.
Regards Waldo