-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
Hi Karsten,
can one download historic onionoo documents (details.json) archived somewhere, or would one have to set up onionoo + feed old data into it to achieve that?
thanks, Nusenu
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Hi Nusenu,
On 05/04/15 23:35, Nusenu wrote:
can one download historic onionoo documents (details.json) archived somewhere, or would one have to set up onionoo + feed old data into it to achieve that?
There are no archives of Onionoo's documents.
But of course there are CollecTor's archives which you'd feed into Onionoo. Though it's very likely easier to parse those directly (possibly using Stem) rather than setting up an Onionoo instance for the exact time you're interested in.
Speaking of, what historic data are you looking for? Maybe it's something that we should add to Onionoo itself?
All the best, Karsten
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
On 05/04/15 23:35, Nusenu wrote:
can one download historic onionoo documents (details.json) archived somewhere, or would one have to set up onionoo + feed old data into it to achieve that?
There are no archives of Onionoo's documents.
But of course there are CollecTor's archives which you'd feed into Onionoo. Though it's very likely easier to parse those directly (possibly using Stem) rather than setting up an Onionoo instance for the exact time you're interested in.
Speaking of, what historic data are you looking for? Maybe it's something that we should add to Onionoo itself?
I've a few use cases for onionoo data; one of them uses onionoo to find groups of relays run by one entity. Combining first_seen with last_restarted has proven to be a rather good data point for that. Using an array of all restarts (instead of just one) would likely reduce false positives even more. That is one field, but I'll consider (historic) changes to other fields (contactinfo, orport, dirport, ...) as well.
Although I could imagine atlas displaying data like "these were previously provided contact details: ...", most of my use cases are too uncommon to add to onionoo.
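The first_seen/last_restarted heuristic described above can be sketched roughly as follows. This is only an illustration, not the script used in this thread: it assumes the relay objects come from the "relays" array of a details document, that timestamps use the "YYYY-MM-DD hh:mm:ss" format, and the function name `group_relays` and the 300-second restart window are made up for the example.

```python
from collections import defaultdict
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format of the documents


def group_relays(relays, restart_window_seconds=300):
    """Group relays first seen on the same day that restarted close together.

    Returns a list of groups, each a list of fingerprints. The window size is
    an arbitrary assumption and would need tuning against real data.
    """
    buckets = defaultdict(list)
    for relay in relays:
        if "first_seen" not in relay or "last_restarted" not in relay:
            continue  # skip relays that lack one of the two fields
        first_seen_day = relay["first_seen"][:10]
        restarted = datetime.strptime(relay["last_restarted"], TIME_FORMAT)
        buckets[first_seen_day].append((restarted, relay["fingerprint"]))

    groups = []
    for day in sorted(buckets):
        entries = sorted(buckets[day])
        current = [entries[0]]
        for entry in entries[1:]:
            gap = (entry[0] - current[-1][0]).total_seconds()
            if gap <= restart_window_seconds:
                current.append(entry)
            else:
                if len(current) > 1:
                    groups.append([fingerprint for _, fingerprint in current])
                current = [entry]
        if len(current) > 1:
            groups.append([fingerprint for _, fingerprint in current])
    return groups
```

Relays that are alone in their bucket are dropped, so the result only lists candidate groups of two or more. With a list of past restarts per relay, the same comparison could be repeated per restart to cut down on coincidental matches.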
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 07/04/15 17:39, Nusenu wrote:
On 05/04/15 23:35, Nusenu wrote:
can one download historic onionoo documents (details.json) archived somewhere, or would one have to set up onionoo + feed old data into it to achieve that?
There are no archives of Onionoo's documents.
But of course there are CollecTor's archives which you'd feed into Onionoo. Though it's very likely easier to parse those directly (possibly using Stem) rather than setting up an Onionoo instance for the exact time you're interested in.
Speaking of, what historic data are you looking for? Maybe it's something that we should add to Onionoo itself?
I've a few use cases for onionoo data; one of them uses onionoo to find groups of relays run by one entity. Combining first_seen with last_restarted has proven to be a rather good data point for that. Using an array of all restarts (instead of just one) would likely reduce false positives even more.
So, the problem here is that an array of restarts doesn't scale, so it would have to be limited to the last 10 restarts or so. But even that is not trivial to implement in Onionoo and, as you note below, it's quite specific and not very useful for the average Onionoo client.
What you could try is evaluate uptime documents and see if two relays had similar uptime patterns over time:
https://onionoo.torproject.org/protocol.html#uptime
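A rough way to compare two uptime histories, as suggested above, might look like this. It is a sketch under assumptions: the two "values" arrays are taken from the same history period of two uptime documents, cover the same intervals, hold normalized integers with null (None in Python) for missing data, and the `up_threshold` cut-off is an arbitrary choice for the example.

```python
def uptime_similarity(values_a, values_b, up_threshold=500):
    """Return the fraction of intervals in which both relays were in the same state.

    Intervals where either relay has no data are ignored. Returns None if
    no interval could be compared.
    """
    compared = 0
    matching = 0
    for a, b in zip(values_a, values_b):
        if a is None or b is None:
            continue  # no data for at least one relay in this interval
        compared += 1
        if (a >= up_threshold) == (b >= up_threshold):
            matching += 1
    return matching / compared if compared else None
```

A score close to 1.0 over a long period only says that two relays went up and down together; relays in the same data center will also look similar, so this is a hint to combine with other criteria rather than proof.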
That is one field but I'll consider (historic) changes to other fields (contactinfo, orport, dirport, ...) as well.
Although I could imagine atlas displaying data like "these were previously provided contact details: ...", most of my use cases are too uncommon to add to onionoo.
I think I agree with you here. Also, you're much more flexible by using descriptors directly and adapting what you're extracting from them rather than having to wait for me to add a new field to an Onionoo document.
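To illustrate what extracting fields from descriptors directly could look like, here is a minimal sketch that uses only the standard library. It assumes a single server descriptor in the usual keyword-line format ("router", "published", "fingerprint", "contact"); the function names are hypothetical, and a real tool would more likely rely on Stem, which handles the many cases this ignores.

```python
def extract_fields(descriptor_text):
    """Pull a few fields of interest out of one server descriptor."""
    fields = {}
    for line in descriptor_text.splitlines():
        keyword, _, rest = line.partition(" ")
        if keyword == "router":
            parts = rest.split()
            if len(parts) < 5:
                continue  # malformed router line, skip it
            # router <nickname> <address> <ORPort> <SOCKSPort> <DirPort>
            fields["nickname"] = parts[0]
            fields["address"] = parts[1]
            fields["or_port"] = int(parts[2])
            fields["dir_port"] = int(parts[4])
        elif keyword == "published":
            fields["published"] = rest
        elif keyword == "contact":
            fields["contact"] = rest
        elif keyword == "fingerprint":
            fields["fingerprint"] = rest.replace(" ", "")
    return fields


def changed_fields(old, new):
    """Return the sorted names of fields that differ between two descriptors."""
    return sorted(key for key in set(old) | set(new) if old.get(key) != new.get(key))
```

Running `extract_fields` over consecutive descriptors of the same relay and passing the results to `changed_fields` would give a simple history of contact or port changes.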
By the way, are you aware of Philipp Winter's work on a better Sybil attack detector? He's cc'ed in case you want to brainstorm about good criteria for comparing relays.
All the best, Karsten
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
By the way, are you aware of Philipp Winter's work on a better Sybil attack detector?
If you mean http://notebooks.nymity.ch/detecting_sybils.html then yes, I've seen it.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
Hi Karsten,
Though it's very likely easier to parse those directly (possibly using Stem) rather than setting up an Onionoo instance for the exact time you're interested in.
can you say something about what amount of minimal memory and disk space one would probably need for a non-public onionoo instance?
thanks, nusenu
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 21/04/15 22:06, nusenu wrote:
Hi Karsten,
Though it's very likely easier to parse those directly (possibly using Stem) rather than setting up an Onionoo instance for the exact time you're interested in.
can you say something about what amount of minimal memory and disk space one would probably need for a non-public onionoo instance?
I'd say 8G RAM and 100G disk space could work, though 16G RAM and 250G disk would save you some trouble during the initialization phase when you feed tons of descriptors into it.
If you want to give this a try, I'd want to take that opportunity and improve the documentation and maybe also the process for setting up Onionoo, if you're interested in helping with that.
All the best, Karsten
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Improved documentation and an improved process for setting up Onionoo would be welcomed by more people, including myself. I'm busy setting up Compass, and with Atlas and Globe mirrors already active, the cherry on top would be my own Onionoo instance (if needed, also as a backup for onionoo.torproject.org).
At the moment, having onionoo running at thecthulhu.com is a nice backup; I do not know whether more such mirrors would be acceptable or desirable.
On 21-4-2015 22:16, Karsten Loesing wrote:
On 21/04/15 22:06, nusenu wrote:
Hi Karsten,
Though it's very likely easier to parse those directly (possibly using Stem) rather than setting up an Onionoo instance for the exact time you're interested in.
can you say something about what amount of minimal memory and disk space one would probably need for a non-public onionoo instance?
I'd say 8G RAM and 100G disk space could work, though 16G RAM and 250G disk would save you some trouble during the initialization phase when you feed tons of descriptors into it.
If you want to give this a try, I'd want to take that opportunity and improve the documentation and maybe also the process for setting up Onionoo, if you're interested in helping with that.
All the best, Karsten _______________________________________________ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
-- Tim Semeijn Babylon Network pgp 0x5B8A4DDF
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Hello Tim,
On 21/04/15 23:01, Tim Semeijn wrote:
Improved documentation and process of setting up Onionoo would be welcomed by more people, including myself. Busy setting up Compass and with Atlas and Globe mirrors active the cherry on the pie would be an own Onionoo instance (if needed also as backup for onionoo.torproject.org).
At the moment having onionoo running at thecthulhu.com is a nice backup, I do not know if any more of such mirrors are acceptable or preferable.
Having another mirror or two sure would not hurt. I know that iwakeh was trying to set up a mirror, but I'm not sure whether they succeeded. I have also been thinking about adding a feature to Onionoo where we configure a list of fallback mirrors that are returned in case of a server problem. This would become more relevant if we actually had such fallback mirrors in place.
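Until such a feature exists on the server side, a client could keep its own list of mirrors and try them in order. The following is a sketch of that client-side variant, not of the feature proposed above; the mirror URLs a caller passes in and the function names are assumptions for illustration.

```python
import json
import urllib.request


def fetch_json(url, timeout=30):
    """Fetch a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def fetch_with_fallback(path, base_urls, fetch=fetch_json):
    """Try each base URL in order and return the first successful response.

    Network errors (OSError, which includes URLError) and undecodable
    responses (ValueError) make the client move on to the next mirror.
    """
    errors = []
    for base_url in base_urls:
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except (OSError, ValueError) as error:
            errors.append((url, error))
    raise RuntimeError("all mirrors failed: %r" % errors)
```

The `fetch` parameter exists so the fallback logic can be tried out with a stand-in function instead of real network requests.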
So, yes, if you'd like to set up another Onionoo mirror, that would be really cool!
I wrote down some instructions based on Onionoo's INSTALL file but in more detail. It's supposed to become the new INSTALL file. Would you want to refine these instructions (or add more questions) while trying to get an Onionoo mirror running?
https://pad.riseup.net/p/YG9eWPopYDJM
(If other people on this list would want to help make these instructions better, please feel free.)
Let me know if you're running into any problems.
All the best, Karsten
Hi,
Actually I've been meaning to ask a question related to this. I've been wondering if, during the development of Onionoo, you considered any other frameworks? I'm not familiar with the history of Onionoo so I don't know if you made the choice based on some constraint. I read the design doc which made me curious.
--leeroy
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 25/04/15 22:33, l.m wrote:
Actually I've been meaning to ask a question related to this. I've been wondering if, during the development of Onionoo, you considered any other frameworks? I'm not familiar with the history of Onionoo so I don't know if you made the choice based on some constraint. I read the design doc which made me curious.
Most design choices were made in favor of making the web front-end part scale. It's a response to building some services like the (discontinued) relay search that started with reasonable performance and degraded a lot over time. It's why, in Onionoo, responses are written to disk by the hourly updater and not put together on-the-fly. And it's why all requests are handled by an in-memory index of all documents rather than by a database. I'm not saying that no other design can achieve the same performance, but I find that much harder, in particular with respect to performance variance.
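The idea of pre-rendering responses and answering requests from an in-memory index can be sketched as below. This is a simplified illustration in Python, not Onionoo's actual Java implementation: the class and method names are invented, and documents are kept as strings in memory rather than written to disk.

```python
import json


class DocumentIndex:
    """Serve pre-rendered documents from dictionaries instead of a database."""

    def __init__(self):
        self._by_fingerprint = {}
        self._by_nickname = {}

    def rebuild(self, relays):
        """Render all documents once, as a periodic updater would do."""
        by_fingerprint, by_nickname = {}, {}
        for relay in relays:
            fingerprint = relay["fingerprint"].upper()
            by_fingerprint[fingerprint] = json.dumps(relay, sort_keys=True)
            nickname = relay.get("nickname", "").lower()
            by_nickname.setdefault(nickname, []).append(fingerprint)
        # Swap in the new index in one step so lookups never see a half-built one.
        self._by_fingerprint, self._by_nickname = by_fingerprint, by_nickname

    def lookup(self, fingerprint):
        """Return the pre-rendered document for a fingerprint, or None."""
        return self._by_fingerprint.get(fingerprint.upper())

    def search(self, nickname):
        """Return the pre-rendered documents of all relays with this nickname."""
        fingerprints = self._by_nickname.get(nickname.lower(), [])
        return [self._by_fingerprint[fingerprint] for fingerprint in fingerprints]
```

Because each request is a dictionary lookup plus returning an existing string, response time does not depend on how expensive the document was to build, which is the performance-variance argument made above.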
Not sure what frameworks you have in mind. But I'm happy to hear more about frameworks that would make Onionoo easier to extend and not perform worse (or even better) than now. If you have something in mind, please say so.
All the best, Karsten
Hi Karsten,
Not sure what frameworks you have in mind. But I'm happy to hear more about frameworks that would make Onionoo easier to extend and not perform worse (or even better) than now. If you have something in mind, please say so.
Thanks for the clarification. I'm not against the choice of Java, nor am I claiming there are better choices. I have fond memories of Java. In particular, I've been working a lot with Django recently. I didn't want to redo work that may already have been done. I was thinking of some recreational uses of a server. I started looking at the onionoo documentation and my curiosity was piqued, precisely because the first thing I thought of was reusing a cloned server for, well, an onionoo clone.
The JSON formatted files could be used as fixtures for setup. The two apps could be run separately as you've already mentioned.
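Turning a details document into a Django fixture could look roughly like this. It is a sketch only: the model label "relays.relay" and the choice of the fingerprint as primary key are hypothetical, and it assumes a model whose field names match the keys of the relay objects.

```python
import json


def relays_to_fixture(details, model="relays.relay"):
    """Convert the "relays" array of a details document into fixture entries.

    Each entry has the "model", "pk" and "fields" keys that Django's JSON
    fixture format expects.
    """
    fixture = []
    for relay in details.get("relays", []):
        fields = {key: value for key, value in relay.items() if key != "fingerprint"}
        fixture.append({"model": model, "pk": relay["fingerprint"], "fields": fields})
    return fixture


def write_fixture(details, path, model="relays.relay"):
    """Write the converted entries to a file that loaddata could read."""
    with open(path, "w") as fixture_file:
        json.dump(relays_to_fixture(details, model), fixture_file, indent=2)
```

List-valued keys in the relay objects would need a suitable field type or a related model on the Django side, which this sketch leaves open.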
The other development specifics are: nginx + gunicorn (greenlets/aiohttp) and postgresql + pgbouncer.
Is it an experiment worth pursuing? Your thoughts are appreciated. Thanks in advance.
--leeroy
Hi,
Django (and by implication, Python) is an accepted technology at Tor, but as much as I wish it were different, the Tor web infrastructure is still based on Python 2.7 (basically, you can only depend on whatever is in wheezy and wheezy-backports if you want something to run on Tor's infrastructure). Of course, if you don't intend for your project to ever replace Tor's own onionoo deployment, that doesn't matter.
Best, Luke
PS: I'm also going to take this opportunity to plug my onionoo client library that you can use to check that your onionoo clone performs to spec ;-) https://github.com/duk3luk3/onion-py
Hi Luke,
Django (and by implication, python) are an accepted technology at tor, but as much as I wish it would be different, the tor web infrastructure is still based on python 2.7 (basically, you can only depend on whatever is in wheezy and wheezy-backports if you want something to run on tor's infrastructure). Of course if you don't intend for your project to ever replace tor's own onionoo deployment, that doesn't matter.
Thanks for pointing that out. I think that won't be a big problem as this isn't intended to be a replacement. If it ends up being a fruitful experiment then it's a success. It's a success if it demonstrates some improvement over the currently deployed design. If I stay away from python3 then the main difference is the use of postgresql+pgbouncer/pgpool. My instincts are telling me that python3 is needed for aiohttp to demonstrate that asynchronous io, lightweight concurrency, and various database optimizations can yield improvements. If the results end up meriting reproduction then virtual environments can be used for testing without breaking existing infrastructure.
PS: I'm also going to take this opportunity to plug my onionoo client library that you can use to check that your onionoo clone performs to spec ;-) https://github.com/duk3luk3/onion-py
I saw that. I'll definitely keep it in mind for comparison. Thanks again.
--leeroy
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 27/04/15 23:39, l.m wrote:
Hi Karsten,
Not sure what frameworks you have in mind. But I'm happy to hear more about frameworks that would make Onionoo easier to extend and not perform worse (or even better) than now. If you have something in mind, please say so.
Thanks for the clarification. I'm not against the choice of Java, nor claiming better choices. I have fond memories of Java. In particular I've been working a lot with Django recently. I didn't want to redo works that may have already been performed. I was thinking of some recreational uses of a server. I started looking at the onionoo documentation and my curiosity was piqued. Precisely because the first thing I thought of was reusing a cloned server for, well, a onionoo-clone.
The JSON formatted files could be used as fixtures for setup. The two apps could be run separately as you've already mentioned.
The other development specifics are: nginx-gunicorn(greenlets/aiohttp) postgresql-pgbouncer
Is it an experiment worth pursuing? Your thoughts are appreciated. Thanks in advance.
Regarding Python, we already tried rewriting Onionoo in Python a few years back and failed. It's a larger project than it seems, and the possible benefits probably won't justify that. (Just think of the many new features we could write while rewriting existing ones.)
Using Java for the back-end and Python for the front-end is a bit ugly, but could work if there's a true benefit in that. Though we might be able to re-use the concepts from the Python experiment and incorporate them in the current Java implementation. I very much doubt that performance advantages would be attributed to Python vs. Java, but here I am starting to argue about programming languages, which I shouldn't.
I think the best way to improve things is to look into switching to a SQL database. I already started experimenting with that in the past two weeks and just wrote down my findings here:
https://trac.torproject.org/projects/tor/ticket/15844
If you're interested in some database performance hacking, you'll love this ticket! Much appreciated!
All the best, Karsten
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Hey Karsten,
Good to hear more onionoo mirrors are welcome. The instructions are nicely detailed, so I will most likely be able to try setting up onionoo next weekend. I will provide you with feedback on the instructions.
A feature with a list of fallback mirrors available for the main onionoo instance sounds like something awesome but of course multiple mirrors have to be available.
Best regards,
On 25-4-2015 21:37, Karsten Loesing wrote:
Hello Tim,
On 21/04/15 23:01, Tim Semeijn wrote:
Improved documentation and process of setting up Onionoo would be welcomed by more people, including myself. Busy setting up Compass and with Atlas and Globe mirrors active the cherry on the pie would be an own Onionoo instance (if needed also as backup for onionoo.torproject.org).
At the moment having onionoo running at thecthulhu.com is a nice backup, I do not know if any more of such mirrors are acceptable or preferable.
Having another mirror or two sure would not hurt. I know that iwakeh was trying to set up a mirror, but I'm not sure whether they succeeded. I have also been thinking about adding a feature to Onionoo where we configure a list of fallback mirrors that are returned in case of a server problem. This would become more relevant if we actually had such fallback mirrors in place.
So, yes, if you'd like to set up another Onionoo mirror, that would be really cool!
I wrote down some instructions based on Onionoo's INSTALL file but in more detail. It's supposed to become the new INSTALL file. Would you want to refine these instructions (or add more questions) while trying to get an Onionoo mirror running?
https://pad.riseup.net/p/YG9eWPopYDJM
(If other people on this list would want to help make these instructions better, please feel free.)
Let me know if you're running into any problems.
All the best, Karsten
-- Tim Semeijn Babylon Network pgp 0x5B8A4DDF
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 27/04/15 23:04, Tim Semeijn wrote:
Hey Karsten,
Good to hear more onionoo mirrors are welcome. The instructions are nicely detailed so I will be able to try set up onionoo most likely next weekend. I will provide you with feedback on the instructions.
Great! Let me know how that goes.
A feature with a list of fallback mirrors available for the main onionoo instance sounds like something awesome but of course multiple mirrors have to be available.
Well, we already have two mirrors, and usually one of them works. I just created a ticket for this:
https://trac.torproject.org/projects/tor/ticket/15843
All the best, Karsten
Best regards,
On 25-4-2015 21:37, Karsten Loesing wrote:
Hello Tim,
On 21/04/15 23:01, Tim Semeijn wrote:
Improved documentation and process of setting up Onionoo would be welcomed by more people, including myself. Busy setting up Compass and with Atlas and Globe mirrors active the cherry on the pie would be an own Onionoo instance (if needed also as backup for onionoo.torproject.org).
At the moment having onionoo running at thecthulhu.com is a nice backup, I do not know if any more of such mirrors are acceptable or preferable.
Having another mirror or two sure would not hurt. I know that iwakeh was trying to set up a mirror, but I'm not sure whether they succeeded. I have also been thinking about adding a feature to Onionoo where we configure a list of fallback mirrors that are returned in case of a server problem. This would become more relevant if we actually had such fallback mirrors in place.
So, yes, if you'd like to set up another Onionoo mirror, that would be really cool!
I wrote down some instructions based on Onionoo's INSTALL file but in more detail. It's supposed to become the new INSTALL file. Would you want to refine these instructions (or add more questions) while trying to get an Onionoo mirror running?
(If other people on this list would want to help make these instructions better, please feel free.)
Let me know if you're running into any problems.
All the best, Karsten