On Fri, Jul 31, 2015 at 10:00:27AM -0700, Damian Johnson wrote:
> Hi Philipp, sorry about the delay! Spread pretty thin right now. Would you mind discussing the use cases a bit more and giving a mockup of what this new domain-specific language would look like in practice?
> My first thought is "would such a language be useful enough to be worth investing time to learn?". I've made lots of things that flopped because they didn't serve a true need, and while a domain-specific language for descriptors sounds neat I'm not sure I see a need for it.
I'm not quite sure yet myself. After talking to Karsten, a simple database might be good enough. Or simply reorganising the directory structure of the archived data so that we can efficiently find the consensuses a given relay fingerprint shows up in. Either way, thanks for your thoughts!
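As a rough sketch of the second idea, an index mapping fingerprints to the consensus files they appear in could be built with Stem along these lines; consensus_dir and index_path are just placeholder names:

import os
import pickle
from collections import defaultdict

import stem.descriptor

def build_fingerprint_index(consensus_dir, index_path):
    # Map each relay fingerprint to the consensus files it shows up in.
    index = defaultdict(list)

    for file_name in sorted(os.listdir(consensus_dir)):
        with open(os.path.join(consensus_dir, file_name), 'rb') as consensus_file:
            for router in stem.descriptor.parse_file(consensus_file, 'network-status-consensus-3 1.0',
                                                     document_handler = stem.descriptor.DocumentHandler.ENTRIES):
                index[router.fingerprint].append(file_name)

    # Persist the index so later lookups don't have to reparse anything.
    with open(index_path, 'wb') as index_file:
        pickle.dump(dict(index), index_file)

A lookup for a given fingerprint then becomes a single dictionary access instead of a pass over every archived consensus.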
>> Ideally, zoossh should do the heavy lifting as it's implemented in a compiled language.
> This is assuming zoossh is dramatically faster than Stem by virtue of being compiled. I know we've discussed this before but I forget the results - with the latest tip of Stem (i.e. with lazy loading), how do they compare? I'd expect the time to be mostly bound by disk I/O, so little to no difference.
zoossh's test framework says that it takes 36,364,357 nanoseconds (roughly 36 milliseconds) to lazily parse a consensus that is cached in memory (to eliminate the I/O bottleneck). That amounts to approximately 27 consensuses a second.
I used the following simple Python script to get a similar number for Stem:
import stem.descriptor

with open(file_name) as consensus_file:
    for router in stem.descriptor.parse_file(consensus_file, 'network-status-consensus-3 1.0',
                                             document_handler = stem.descriptor.DocumentHandler.ENTRIES):
        pass
This script manages to parse 24 consensus files in ~13 seconds, which amounts to 1.8 consensuses a second. Let me know if there's a more efficient way to do this in Stem.
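For a closer apples-to-apples comparison with the zoossh number above, the same parse could be timed against an in-memory copy of a single consensus, roughly like this (file_name as above; reading the file up front is only there to take disk I/O out of the measurement):

import io
import time

import stem.descriptor

with open(file_name, 'rb') as consensus_file:
    raw_consensus = consensus_file.read()

start = time.time()

# Only parsing is measured here since the consensus is already in memory.
for router in stem.descriptor.parse_file(io.BytesIO(raw_consensus), 'network-status-consensus-3 1.0',
                                         document_handler = stem.descriptor.DocumentHandler.ENTRIES):
    pass

print('parsed one consensus in %.3f seconds' % (time.time() - start))

That should measure roughly the same thing as the zoossh benchmark.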
Cheers, Philipp