On Fri, Jul 31, 2015 at 04:22:19PM -0400, l.m wrote:
I know I've already mentioned some thoughts on this subject. I would be interested in your thoughts on the types of challenging questions such a hypothetical DSL might answer. I've already put some effort into this (forking metrics-lib), but I'm still new to working with Tor network data. There's around a terabyte of it and I can't possibly imagine every interesting scenario at this point. Right now I'm trying to be as flexible (and general) as possible in the implementation. Besides what you've already mentioned, what other types of uses do you envision? I'm interested in being able to answer questions which can only be answered by looking at macroscopic-level details over time. Things like how to draw interesting facts from performance data, and how to improve collection (signalling, messaging, new metrics, etc.) towards making attacks more visible.
My work on finding Sybil groups has brought me to this problem, which is why I usually find myself asking questions such as:
- "Which relays satisfy a given pattern, e.g., ORPort = n, DirPort = n+1?"
- "Are these n relays run by the same operator? What are the similarities between their descriptors?"
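To make the second question a bit more concrete, here is a minimal sketch, assuming the descriptors have already been parsed into plain dictionaries. The function name, field names, and sample values are all made up for illustration; none of them come from metrics-lib or from real descriptors.

```python
def shared_fields(descriptors):
    """Return the fields (and values) on which all given descriptors agree."""
    if not descriptors:
        return {}
    first, rest = descriptors[0], descriptors[1:]
    return {
        key: value
        for key, value in first.items()
        # A field counts as shared only if every other descriptor
        # contains it with exactly the same value.
        if all(key in d and d[key] == value for d in rest)
    }


# Hypothetical sample data: three relays that differ only in their nickname.
relays = [
    {"nickname": "relayA", "platform": "Tor 0.2.6.10 on Linux",
     "or_port": 9001, "dir_port": 9002},
    {"nickname": "relayB", "platform": "Tor 0.2.6.10 on Linux",
     "or_port": 9001, "dir_port": 9002},
    {"nickname": "relayC", "platform": "Tor 0.2.6.10 on Linux",
     "or_port": 9001, "dir_port": 9002},
]

common = shared_fields(relays)
```

Exact equality is the crudest possible similarity measure. It only hints at a common operator and would miss near-matches such as nicknames that share a prefix.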
A database might be sufficiently good at answering many of these questions. See the discussion I recently had with Karsten on #tor-project: http://meetbot.debian.net/tor-project/2015/tor-project.2015-07-29-14.01.log.html
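As a rough illustration of the database approach, the first question above maps onto a single SQL query. This is only a sketch, assuming a hypothetical `relays` table with `fingerprint`, `or_port`, and `dir_port` columns; the schema and the sample rows are invented, and a real import of descriptor archives would look different.

```python
import sqlite3


def relays_matching_port_pattern(conn):
    """Return relays whose DirPort is exactly ORPort + 1."""
    return conn.execute(
        "SELECT fingerprint, or_port, dir_port FROM relays "
        "WHERE dir_port = or_port + 1 "
        "ORDER BY fingerprint"
    ).fetchall()


# Hypothetical in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relays ("
    "fingerprint TEXT PRIMARY KEY, or_port INTEGER, dir_port INTEGER)"
)
conn.executemany(
    "INSERT INTO relays VALUES (?, ?, ?)",
    [
        ("AAAA", 9001, 9002),  # matches: DirPort = ORPort + 1
        ("BBBB", 443, 80),     # no match
        ("CCCC", 9001, 9030),  # no match
        ("DDDD", 8080, 8081),  # matches
    ],
)

matches = relays_matching_port_pattern(conn)
```

With the sample rows above, only the relays labelled AAAA and DDDD are returned.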
Ignoring my own interests, I could imagine several other parties being interested in your language. Relay operators might want to learn things about their relays that Onionoo is unable to provide. Researchers might want to use it to empirically justify design decisions.
I'm probably stating the obvious here, but keep in mind that the usefulness of your language also depends on how easy it is to use. Not everybody will be willing to learn a new language, regardless of its flexibility, if they might as well write their own scripts to achieve the same task in a few hours.
Cheers, Philipp