Hi,
I'm planning to store relay data in a database for analysis. I assume others have done so as well, so before going ahead and designing a db schema I'd like to make sure I didn't miss pre-existing db schemas one could build on.
Data to be stored:
- (most) descriptor fields
- everything that onionoo provides in a details record (geoip, asn, rdns, tordnsel, cw, ...)
- historic records
I haven't found anything matching so far, so I'll go ahead, but if you know of other existing relay db schemas I'd like to hear about them.
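To make the three kinds of data listed above concrete, here is a minimal sketch of one possible layout, using Python's built-in sqlite3 module. It is not taken from any of the existing schemas mentioned in this thread. All table, column, and function names (relay, relay_snapshot, open_db, record_snapshot, latest_snapshot) are hypothetical, and only a handful of the descriptor and details fields are shown. The assumption is that history can be kept as one row per relay per observation time, with the "last known" view derived by picking the newest row.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an illustrative
# assumption, not an existing Tor or Onionoo schema.
SCHEMA = """
CREATE TABLE relay (
    fingerprint TEXT PRIMARY KEY,
    first_seen  TEXT NOT NULL          -- set by the first insert for this relay
);
CREATE TABLE relay_snapshot (
    fingerprint      TEXT NOT NULL REFERENCES relay(fingerprint),
    valid_after      TEXT NOT NULL,    -- UTC timestamp, 'YYYY-MM-DD HH:MM:SS'
    nickname         TEXT,
    country          TEXT,             -- geoip
    as_number        TEXT,             -- asn
    consensus_weight INTEGER,          -- cw
    PRIMARY KEY (fingerprint, valid_after)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and install the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_snapshot(conn, fingerprint, valid_after, nickname=None,
                    country=None, as_number=None, consensus_weight=None):
    """Store one historic observation of a relay."""
    # Create the relay row only if this fingerprint has not been seen before.
    conn.execute(
        "INSERT OR IGNORE INTO relay (fingerprint, first_seen) VALUES (?, ?)",
        (fingerprint, valid_after),
    )
    conn.execute(
        "INSERT INTO relay_snapshot (fingerprint, valid_after, nickname,"
        " country, as_number, consensus_weight) VALUES (?, ?, ?, ?, ?, ?)",
        (fingerprint, valid_after, nickname, country, as_number,
         consensus_weight),
    )


def latest_snapshot(conn, fingerprint):
    """Return the newest stored row for a relay, or None if there is none."""
    # Timestamps in the fixed format above sort correctly as plain text.
    return conn.execute(
        "SELECT valid_after, nickname, country, as_number, consensus_weight"
        " FROM relay_snapshot WHERE fingerprint = ?"
        " ORDER BY valid_after DESC LIMIT 1",
        (fingerprint,),
    ).fetchone()
```

Keeping every observation as its own row makes historic queries simple at the cost of storage; a real design would likely deduplicate unchanged fields and add indexes for the queries it actually needs.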
thanks, nusenu
"GSoC2013: Searchable Tor descriptor archive" (Kostas Jakeliunas)
https://www.google-melange.com/gsoc/project/details/google/gsoc2013/wfn/5866452879933440
https://lists.torproject.org/pipermail/tor-dev/2013-May/004923.html
https://lists.torproject.org/pipermail/tor-dev/2013-September/005357.html
https://github.com/wfn/torsearch (btw, does anyone know the license of this?)
This is true: the summary/details documents (just like in Onionoo proper) deal with the *last* known info about relays.
ernie https://gitweb.torproject.org/metrics-db.git/plain/doc/manual.pdf (didn't find db/tordir.sql mentioned in the pdf)
"Instructions for setting up relay descriptor database" https://lists.torproject.org/pipermail/tor-dev/2010-March/001783.html
"Set up descriptor database for other researchers" https://trac.torproject.org/projects/tor/ticket/1643
On 15/04/15 21:18, nusenu wrote:
Hi,
I'm planning to store relay data in a database for analysis. I assume others have done so as well, so before going ahead and designing a db schema I'd like to make sure I didn't miss pre-existing db schemas one could build on.
Data to be stored:
- (most) descriptor fields
- everything that onionoo provides in a details record (geoip, asn, rdns, tordnsel, cw, ...)
- historic records
I haven't found anything matching so far, so I'll go ahead, but if you know of other existing relay db schemas I'd like to hear about them.
thanks, nusenu
"GSoC2013: Searchable Tor descriptor archive" (Kostas Jakeliunas)
https://www.google-melange.com/gsoc/project/details/google/gsoc2013/wfn/5866452879933440
https://lists.torproject.org/pipermail/tor-dev/2013-May/004923.html
https://lists.torproject.org/pipermail/tor-dev/2013-September/005357.html
https://github.com/wfn/torsearch (btw, does anyone know the license of this?)
Cc'ing Kostas for this question.
This is true: the summary/details documents (just like in Onionoo proper) deal with the *last* known info about relays.
ernie https://gitweb.torproject.org/metrics-db.git/plain/doc/manual.pdf (didn't find db/tordir.sql mentioned in the pdf)
That file lives here now:
https://gitweb.torproject.org/metrics-web.git/tree/modules/legacy/db/tordir....
A better schema might be the following one though. It's smaller, but it's better documented:
https://gitweb.torproject.org/exonerator.git/tree/db/exonerator.sql
"Instructions for setting up relay descriptor database" https://lists.torproject.org/pipermail/tor-dev/2010-March/001783.html
That's five years old. I'd say ignore that one.
"Set up descriptor database for other researchers" https://trac.torproject.org/projects/tor/ticket/1643
Also five years old. Better ignore.
Hope that helps.
All the best, Karsten
On Thu, Apr 16, 2015 at 4:53 PM, Karsten Loesing karsten@torproject.org wrote:
https://github.com/wfn/torsearch (btw, does anyone know the license of this?)
Cc'ing Kostas for this question.
Hi nusenu,
I've been going through old mail, and on 2015-04-16 you asked about a license (see above).
Just added a LICENSE file - can't hurt (standard BSD 3-clause).
If you're still by any chance collating (ha) and/or want to talk about schema design for descriptors, feel free to get in touch. I personally would not lose hope for RDBMSes for large datasets, not until one gets into *actually* big data (say, terabytes at least), but of course it gets nuanced real fast.
--
Kostas.
0x0e5dce45 @ pgp.mit.edu