On Wed, Jul 24, 2013 at 7:20 PM, Damian Johnson <atagar@torproject.org> wrote: [...]
On a side note the appearance of your project has kinda funny timing. Just last week I was thinking "Gah! Why does tor's reference implementation need to be C?". In my not-so-humble opinion that's dragging the application down in terms of maintainability and continued development...
- Tor has only three people (mostly just Nick) routinely touching the
core codebase. This means effectively no code reviews and little collaboration.
This part isn't actually true. We review each other's code, and don't merge stuff without reviewing it. Further, Andrea is full-time on the tor codebase, just like me. The code review slows us down a fair bit, but we do do it.
- Mocking is a pain with C. Nick had some ideas six months back to get
around this, but I'm not sure if they ever really took off or whether the result is maintainable.
I merged it. It's in master.
- C is simply difficult to get right. Besides the risk of stack
overflows and memory leaks, there are countless pitfalls that necessitate years of C development experience before touching a line of code.
Agreed.
- Tor really doesn't *need* to be in C. Descriptors, controller,
consensus voting, and much of its other functionality would do better with a higher level language, with small C modules for networking and crypto parts that truly need Libevent and such. This would be fine with Java's JNI, Python, Ruby, or any of a handful of languages.
All this said, Nick no doubt could list a dozen reasons why this is a terrible idea, not the least being the monumental amount of work and wanting a tor executable without the need for an interpreter. Oh well, I can still dream.
Actually, I think we have a path to get to a less-pure-C Tor implementation. For sandboxing reasons, we'll want to move Tor to work as a set of multiple processes that communicate over well-defined IPC interfaces via a master process. Once we get there, it's no longer too much to think about doing some of those processes in a language other than C.
(What I'm *not* thrilled about is the idea of using an embedded interpreter for this kind of stuff, or embarking on any direction that requires us to rewrite too much of the program at once. That way, in my opinion, lies long-term destabilization.)
The main obstacle for most of these cases is that Tor wasn't written with modularity in mind from the start, so some of the parts of Tor that we would do well to disentangle into implementations in other languages are not easily split off from the rest of the codebase. There's interest in doing this for some particular modules, though, and I suspect that once we get started, we'll be able to do it more easily for others.
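To make the multi-process idea above a bit more concrete, here is a rough sketch of a master process talking to one worker over a pipe. Everything in it is an assumption made for illustration, not anything Tor does or plans: the framing (a 4-byte big-endian length followed by a JSON body), the `count_lines` operation, and the names `send_msg`, `recv_msg`, `run_master`, and `WORKER_SOURCE` are all hypothetical. The worker happens to be Python here, but since it only sees bytes on stdin and stdout, it could be written in any language that can parse the same framing.

```python
import json
import struct
import subprocess
import sys

# Hypothetical wire format: 4-byte big-endian length, then a JSON body.
def send_msg(stream, obj):
    body = json.dumps(obj).encode("utf-8")
    stream.write(struct.pack(">I", len(body)) + body)
    stream.flush()

def recv_msg(stream):
    header = stream.read(4)
    if len(header) < 4:
        return None  # peer closed the pipe
    (length,) = struct.unpack(">I", header)
    return json.loads(stream.read(length).decode("utf-8"))

# Stand-in worker process. It reads framed requests from stdin and
# writes framed replies to stdout until the master closes the pipe.
WORKER_SOURCE = r'''
import json, struct, sys
inp, out = sys.stdin.buffer, sys.stdout.buffer
while True:
    header = inp.read(4)
    if len(header) < 4:
        break
    (n,) = struct.unpack(">I", header)
    req = json.loads(inp.read(n).decode("utf-8"))
    if req.get("op") == "count_lines":
        resp = {"ok": True, "lines": len(req["text"].splitlines())}
    else:
        resp = {"ok": False, "error": "unknown op"}
    body = json.dumps(resp).encode("utf-8")
    out.write(struct.pack(">I", len(body)) + body)
    out.flush()
'''

def run_master(requests):
    # The master owns the worker and is the only thing it talks to.
    worker = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        replies = []
        for req in requests:
            send_msg(worker.stdin, req)
            replies.append(recv_msg(worker.stdout))
        return replies
    finally:
        worker.stdin.close()  # EOF tells the worker to exit
        worker.wait()
        worker.stdout.close()
```

Calling `run_master([{"op": "count_lines", "text": "a\nb\nc"}])` sends one request and collects one reply. The point of the sketch is only that once the interface is a narrow, well-defined message format, the process on the other end of the pipe no longer has to share a language or an address space with the master.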
best wishes,