Hi Damian,
I'm digging out this old thread, because I think it's still relevant.
I started writing some performance evaluations for metrics-lib and got some early results. All examples read a monthly tarball from CollecTor and do something trivial with each contained descriptor that requires parsing them. Here are the average processing times by type:
server-descriptors-2015-11.tar.xz:    0.334261 ms
server-descriptors-2015-11.tar:       0.285430 ms
extra-infos-2015-11.tar.xz:           0.274610 ms
extra-infos-2015-11.tar:              0.215500 ms
consensuses-2015-11.tar.xz:         255.760446 ms
consensuses-2015-11.tar:            246.713092 ms
microdescs-2015-11.tar.xz[*]:         0.099397 ms
microdescs-2015-11.tar[*]:            0.066566 ms
[*] The microdescs* tarballs contain microdesc consensuses and microdescriptors, but I only cared about the latter, so I extracted the tarballs, deleted the microdesc consensuses, and re-created and re-compressed the tarballs.
These evaluations were all run on a 2 GHz Core i7 using an SSD as storage.
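For anyone who wants to reproduce this kind of measurement outside metrics-lib, here is a minimal sketch of a timing loop in Python. It is not the code used for the numbers above. `parse_descriptor` is a hypothetical stand-in for a real parser, the sketch assumes one descriptor per file in the tarball, and the measured time includes decompression and I/O as well as parsing.

```python
import io
import tarfile
import time


def parse_descriptor(raw: bytes) -> int:
    """Hypothetical stand-in for a real parser: counts non-annotation lines."""
    return sum(1 for line in raw.splitlines() if line and not line.startswith(b"@"))


def average_parse_ms(tarball, parse=parse_descriptor):
    """Return (average ms per descriptor, descriptor count) for one tarball.

    `tarball` may be a path or a seekable binary file object. Mode "r:*"
    lets tarfile detect xz, gzip, bzip2, or no compression by itself.
    """
    kwargs = {"name": tarball} if isinstance(tarball, str) else {"fileobj": tarball}
    count = 0
    start = time.perf_counter()
    with tarfile.open(mode="r:*", **kwargs) as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            raw = archive.extractfile(member).read()
            parse(raw)
            count += 1
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return (elapsed_ms / count if count else 0.0), count
```

Calling `average_parse_ms("server-descriptors-2015-11.tar.xz")` and then the same on the decompressed `.tar` would give a rough idea of how much of the per-descriptor time goes to decompression rather than parsing.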
Any surprises in these results so far?
Would you want to move forward with the comparison and also include Stem? (And, Philipp, would you want to include Zoossh?)
All the best, Karsten
On 01/10/15 09:28, Karsten Loesing wrote:
Hello Philipp and iwakeh, hello list,
Damian and I sat down yesterday at the dev meeting to talk about doing a comparison of the various descriptor-parsing libraries with respect to capabilities, run-time performance, memory usage, etc.
We put together a list of things we'd like to compare and tests we'd like to run, and we thought we'd share it with you. Damian and I will both be working on these for metrics-lib for a short while and then switch to Stem. Please feel free to join us in this effort. The result is supposed to live on Stem's home page unless somebody comes up with a better place.
Thanks!
All the best, Damian and Karsten
On 30/09/15 10:57, Karsten Loesing wrote:
- capabilities
  - supported descriptor types
    - all the ones on CollecTor's formats.html
    - hidden service descriptors (have an agreed @type for that)
  - getting/producing descriptors
    - reading from file/directory
    - reading from tarballs
    - reading from CollecTor's .xz-compressed tarballs
    - fetching from CollecTor
    - downloading from directories (authorities or mirrors)
    - generating (for unit test)
  - recognizing @type annotation
  - inferencing from file name
  - keeping reading history
  - user documentation
  - validation (format, crypto, successful sanitization)
  - packages available
  - how much usage by (large) applications
- performance (CPU time, memory overhead)
  - compression: .xz-compressed tarballs/decompressed tarballs/plain-text
  - descriptor type: consensus, server descriptor, extra-info descriptor, microdescriptors
  - validation: on or off (allows lazy loading)
- tests by descriptor type
  - @type server-descriptor 1.0
    - Stem's "List Outdated Relays"
    - average advertised bandwidth
    - fraction of relays that can exit to port 80
  - @type extra-info 1.0
    - sum of all written and read bytes from write-history/read-history
    - number of countries from which v3 requests were received
  - @type network-status-consensus-3
    - average number of relays with Exit flag
  - @type network-status-vote-3
    - Stem's "Votes by Bandwidth Authorities"
  - @type dir-key-certificate-3
  - @type network-status-microdesc-consensus-3 1.0
  - @type microdescriptor 1.0
    - look at single microdesc cons and microdescs, compile list of extended families
    - fraction of relays that can exit to port 80
  - @type network-status-2 1.0
  - @type directory 1.0
  - @type bridge-network-status
  - @type bridge-server-descriptor
  - @type bridge-server-descriptor 1.0
  - @type bridge-extra-info 1.3
  - @type bridge-pool-assignment
  - @type tordnsel 1.0
  - @type torperf 1.0
- action items
  - get in touch with Dererk for packaging metrics-lib for Debian
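
To make one of the server-descriptor tests listed above concrete, "average advertised bandwidth" could be sketched roughly as follows in plain Python, without any of the libraries under comparison. This is only an illustration under stated assumptions: each descriptor carries a line of the form `bandwidth <avg> <burst> <observed>`, and "advertised" is taken to be the minimum of the three values. The function names are hypothetical.

```python
def advertised_bandwidth(descriptor_text):
    """Return min(avg, burst, observed) from the "bandwidth" line, or None.

    Assumption: advertised bandwidth is the minimum of the three values.
    """
    for line in descriptor_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "bandwidth":
            return min(int(value) for value in parts[1:])
    return None


def average_advertised_bandwidth(descriptors):
    """Average the advertised bandwidth over an iterable of descriptor strings.

    Descriptors without a usable "bandwidth" line are skipped.
    """
    values = [bw for bw in map(advertised_bandwidth, descriptors) if bw is not None]
    return sum(values) / len(values) if values else 0.0
```

A real comparison would let each library do the parsing and only compute the average on top of it, so that the measured time reflects the library and not this hand-rolled line matching.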