Re: [tor-dev] Onionoo protocol/implementation nuances / Onionoo-connected metrics project stuff

31 Jul 2013


      A couple of thoughts to add from my experience working with DB's:
-Consider a partitioning strategy for the data if it makes sense. Queries
which will hit only a single partition will be much faster in general.
-Check out advanced indexing strategies like Solr. this can sometimes speed
up certain queries by orders of magnitude, and works well with batch
systems. Often it is easier to setup & maintain than advanced DB tuning.
-I second Karsten's comment on multi-column indexes. Designing the right
indexes & columns to include isn't simple though - you have to know what is
going to be searched.
-If you know in advance what kind's of where clauses you will see, consider
implementing partial indexes.
If you have specific problem queries w/ explain plans, feel free to post
them and I would be glad to spend a few minutes looking them over.
Charlie
On Tue, Jul 30, 2013 at 3:35 AM, Karsten Loesing karsten@torproject.orgwrote:
...
On 7/29/13 10:15 PM, Kostas Jakeliunas wrote:
...
It should also be possible to do efficient *estimated* COUNTs (using
reltuples [1, 2], provided the DB can be regularly VACUUMed + ANALYZEd
(postgres-specific awesomeness)) - i.e. if everything is set up right,
doing COUNTs would be efficient. This would be nice not only because one
could run very quick queries asking e.g. "how many consensuses include
nickname LIKE %moo% between [daterange1, daterange2]?" (if e.g. full text
search is set up) but also, if we have to resort to sometimes returning
an
...
arbitrary subset of results (or sorted however we wish, but the sorting
being done already on a small subset of results, if that makes sense),
we'd
...
be able to also supply info how many other results matching these
particular criteria there are, and so on. The usefulness of all this
really
...
depends on intended use cases, and I suppose here some discussion could
be
...
had who / how would an Onionoo system covering all / most of all the
descriptor+consensus archives and hopefully having an extended set of
filter / result options be used?
I can see how estimated counts could be valuable information.  Or not.
Do you want to first specify what type of queries you're planning to
support?
(I didn't spot anything in the other two mails that requires a reply.
If there are still open questions, please let me know.)
Best,
Karsten
...

tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Re: [tor-dev] Onionoo protocol/implementation nuances / Onionoo-connected metrics project stuff