Hi Karsten,
(Not sure whom to CC and whom not to; I have a couple of fairly specific technical questions / comments (which, again, I should have delved into earlier), but then again, maybe the scope of the tor-dev mailing list includes such cases.)
@tor-dev: This is in regard to the Searchable metrics archive project, intersected with Onionoo stuff.
I originally wanted to ask two questions, but by the time I reached the middle of the email, I was no longer sure they were questions, so I think these are simply my comments / observations about a couple of specific points, and I just want to make sure I'm making general sense. :)
..So it turns out that it is very much possible to avoid Postgres sequential scans - to construct queries/cases which are best executed using indexes (the query planner thinks so, and it makes sense as far as I can tell) and whatever efficient (hm, sub-O(n)?) search algorithms are deemed best - for the metrics search backend/database; the two things that seem potentially problematic to me are:
- when doing ORDER BY, making sure that the resulting SELECT covers / would potentially return a relatively small number of rows. From what I've been reading and trying out, whenever a SELECT may cover more than ~10-15% of all the table's rows, a sequential scan is preferred, as using indexes would result in even more disk I/O, whereas the sequential scan can read larger chunks of data from disk because it's, well, sequential. This means that doing "ORDER BY nickname" (or fingerprint, or any other non-unique column whose values cover a relatively small part of the whole network status table) is much faster than doing e.g. "ORDER BY validafter" (where validafter refers to the consensus document's "valid after" field) - which makes sense, of course, when you think about it: ordering a massive number of rows, even with a proper LIMIT etc., is insane. We construct separate indexes (e.g. even if 'fingerprint' is part of a composite (validafter + fingerprint) primary key), and all seems to be well.
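A toy sketch of the planner behaviour I mean, with Python's sqlite3 standing in for Postgres (the table is a made-up miniature of statusentry, so this only illustrates the index-vs-sequential-scan choice, not Postgres' actual cost model):

```python
import sqlite3

# Hypothetical miniature of the statusentry table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statusentry "
             "(validafter TEXT, fingerprint TEXT, nickname TEXT)")
conn.execute("CREATE INDEX idx_nickname ON statusentry (nickname)")
conn.executemany(
    "INSERT INTO statusentry VALUES (?, ?, ?)",
    [("2013-07-%02d" % d, "fp%04d" % i, "nick%d" % (i % 50))
     for d in range(1, 31) for i in range(100)])

def plan(query):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail).
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))

# A selective equality predicate is answered via the index...
print(plan("SELECT * FROM statusentry WHERE nickname = 'nick7'"))
# ...while ordering every row can only be done with a full sequential scan.
print(plan("SELECT * FROM statusentry ORDER BY validafter"))
```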
I've been looking into the Onionoo implementation, particularly into ResourceServlet.java [1], to see how ordering etc. is done there. I'm new to that code, but am I right in saying that as of now, the results (at least for /summary and /details) are generally fairly unsorted (except for the possibility of "order by consensus_weight"), with "valid_after" fields appearing in an unordered manner? This of course makes sense in this case, as Onionoo is able to return *all* the results for given search criteria (or, if none given, all available results) at once.
The obvious problem with the archival metrics search project (argh, we are in need of a decent name, I daresay :) maybe I'll think of something.. no matter) is then, of course, the fact that we can't return all results at once. So far I've been assuming that it would therefore make sense to return them, whenever possible (and if not requested otherwise, should an "order" parameter be implemented later), in "valid_after" descending order. I suppose this makes sense? This would be ideal, methinks. So far, it seems that we can do that, as long as we have a WHERE clause that is restrictive enough.
select * from (
    select distinct on (fingerprint) fingerprint, validafter, nickname
    from statusentry
    where nickname like 'moria1'
    order by fingerprint, validafter desc
) as subq order by validafter desc limit 100;
This, for example, works out nicely in terms of efficiency / query plan, Postgres tells me. (The double ORDER BY is needed, as Postgres' DISTINCT ON requires an ORDER BY, and that ORDER BY's leftmost criterion has to match the DISTINCT ON (x) expression.)
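DISTINCT ON is Postgres-specific; just to make the intent concrete, here's a hypothetical, portable re-statement of the same "newest status entry per fingerprint" idea in Python + sqlite3 (toy data, not the real schema), with GROUP BY + MAX playing the role of DISTINCT ON:

```python
import sqlite3

# Toy data standing in for statusentry; illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statusentry "
             "(validafter TEXT, fingerprint TEXT, nickname TEXT)")
conn.executemany("INSERT INTO statusentry VALUES (?, ?, ?)", [
    ("2013-07-01", "fpA", "moria1"),
    ("2013-07-02", "fpA", "moria1"),
    ("2013-07-01", "fpB", "moria1"),
])

# One row per fingerprint, carrying its newest validafter,
# then the small result set is ordered by validafter descending.
rows = conn.execute("""
    SELECT fingerprint, MAX(validafter) AS validafter, nickname
    FROM statusentry
    WHERE nickname LIKE 'moria1'
    GROUP BY fingerprint
    ORDER BY validafter DESC
    LIMIT 100
""").fetchall()
print(rows)  # one row per fingerprint, newest first
```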
Now, you mentioned / we talked about what the (large, archival metrics) backend should do about limiting/cutoff - especially if no search criteria are specified, for example. Ideally, it would return a top-most list of statuses (which in /details include info from server descriptors), sorted by "last seen" / "valid after"? The thing is, querying a database for 100 (or 1000) items with no ORDER BY is really cheap; introducing ORDER BYs which would still produce tons of results is considerably less so. I'm now looking into this (and you did tell me this, i.e. that, I now think, a large part of DB/backend robustness gets tested at these particular points; this should have been more obvious to me).
But in any case, I have the question: what *should* the backend return when no search parameters/filters are specified, or only very loose ones (nickname LIKE "a") are?
What would be cheap / doable: ORDER BY fingerprint (or digest (== hashed descriptor), etc.) It *might* (well, should) work to order by fingerprint, limit results, and *then* reorder by validafter - with no guarantee that the topmost results would have the highest absolute validafter values. I mean, Onionoo does this kind of reordering / limiting itself, but that makes sense there, as it can handle all the data at once (or I'm missing something, i.e. the file I linked to already interacts with a subset of data; granted, I haven't thoroughly read through it). In our case, it makes sense to try and outsource all relational logic to the ORM (but of course, if it turns out we can cheaply get 100 arbitrary results and more easily juggle them ourselves on the (direct) Python side, then sure).
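In code, the fallback I have in mind would look roughly like this (plain Python; the rows are made up, standing in for the result of a hypothetical SELECT ... ORDER BY fingerprint LIMIT 100):

```python
# Pretend these came back from: SELECT fingerprint, validafter
# FROM statusentry ORDER BY fingerprint LIMIT 100;
rows = [
    ("fp0001", "2013-07-03"),
    ("fp0002", "2013-07-29"),
    ("fp0003", "2013-07-11"),
]

# Re-order the limited slice by validafter, newest first. Note: these are
# only the newest *within the slice*, not the globally newest entries.
rows.sort(key=lambda r: r[1], reverse=True)
print(rows)
```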
But would such arbitrary returned results make sense? It would look just like Onionoo results, but - a (small) subset of them.
Ramble #2: Onionoo does more or less direct (well, compiled, so efficient) regexps on fingerprint, nickname, etc. By default, "LIKE %given_nickname_part%" is again (sequentially) expensive; Postgres does offer full text search extensions (we would need to build additional tables, etc.), and I think this makes sense to do; it would cover all our "can supply a subset of :param" bases. I'll look into this. (It's also possible to construct functional indexes, e.g. on a regexp, but one needs to know the template in advance - I wonder if using Onionoo's regular expressions would work / make sense; will see.) For now, LIKE / = with an exact string is very nice, but of course the whole idea is that it'd be possible to supply substrings. Of note is the fact that in this case, LIKE %substring% is O(n) in the sense that query time correlates with row count, afaict. As of now, full text search extensions would solve the problem, I think, even if they may look a bit like overkill at first.
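(One concrete Postgres option here, besides full text search proper, might be a trigram index from the pg_trgm extension, which can serve LIKE '%substring%' from an index; the index name and table below are just my working names, so take this as a sketch only:)

```sql
-- Sketch: a pg_trgm trigram index lets LIKE '%substring%' use a
-- (bitmap) index scan instead of a full sequential scan.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX statusentry_nickname_trgm
    ON statusentry USING gin (nickname gin_trgm_ops);
-- After ANALYZE, queries like this become indexable:
SELECT fingerprint, nickname FROM statusentry WHERE nickname LIKE '%moo%';
```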
End of an email without a proper direction/point. :)
[1]: https://gitweb.torproject.org/onionoo.git/blob/HEAD:/src/org/torproject/onio...
But would such arbitrary returned results make sense? It would look just like Onionoo results, but - a (small) subset of them.
Meant,
But would such arbitrary returned results make sense? It would look just like Onionoo results, but - a (small) subset of *all* the results in the new-Onionoo backend, in our case.
i.e., it'd be an arbitrary subset of a very large set of possible results.
It should also be possible to do efficient *estimated* COUNTs (using reltuples [1, 2]), provided the DB can be regularly VACUUMed + ANALYZEd (Postgres-specific awesomeness) - i.e. if everything is set up right, doing COUNTs would be efficient. This would be nice not only because one could run very quick queries asking e.g. "how many consensuses include nickname LIKE %moo% between [daterange1, daterange2]?" (if e.g. full text search is set up), but also because, if we have to resort to sometimes returning an arbitrary subset of results (or sorted however we wish, but with the sorting done on an already-small subset of results, if that makes sense), we'd be able to also supply info on how many other results match these particular criteria, and so on. The usefulness of all this really depends on intended use cases, and I suppose here some discussion could be had: who / how would an Onionoo system covering all / most of the descriptor+consensus archives (and hopefully having an extended set of filter / result options) be used?
[1]: http://www.varlena.com/GeneralBits/120.php
[2]: http://wiki.postgresql.org/wiki/Slow_Counting
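For reference, the estimated count itself is just a lookup in the planner's statistics, something like this (table name is my working one):

```sql
-- Approximate row count from planner statistics (kept fresh by
-- autovacuum / ANALYZE); fast, but only an estimate:
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'statusentry';
```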
Last link of an unbroken email flood chain -- my last sentence (and perhaps only that one) may be aimed at a more general audience; reiterating:
The usefulness of all this really depends on intended use cases, and I suppose here some discussion could be had who / *how would an Onionoo system covering all* [...] *the descriptor+consensus archives* [and hopefully having an extended set of filter / result options] *be used*?
As far as my GSoC project is concerned, the most obvious use case is the proposed frontend (described in general terms, but with, IMO, some crucial specific points, in the original proposal [1]). I suppose it makes the most sense to keep the originally planned use cases in mind (what this frontend would require from the backend, and so on), and that's what I'm aiming for at the moment.
On 7/29/13 10:15 PM, Kostas Jakeliunas wrote:
It should also be possible to do efficient *estimated* COUNTs (using reltuples [1, 2]), provided the DB can be regularly VACUUMed + ANALYZEd (Postgres-specific awesomeness) - i.e. if everything is set up right, doing COUNTs would be efficient. This would be nice not only because one could run very quick queries asking e.g. "how many consensuses include nickname LIKE %moo% between [daterange1, daterange2]?" (if e.g. full text search is set up), but also because, if we have to resort to sometimes returning an arbitrary subset of results (or sorted however we wish, but with the sorting done on an already-small subset of results, if that makes sense), we'd be able to also supply info on how many other results match these particular criteria, and so on. The usefulness of all this really depends on intended use cases, and I suppose here some discussion could be had: who / how would an Onionoo system covering all / most of the descriptor+consensus archives (and hopefully having an extended set of filter / result options) be used?
I can see how estimated counts could be valuable information. Or not. Do you want to first specify what type of queries you're planning to support?
(I didn't spot anything in the other two mails that requires a reply. If there are still open questions, please let me know.)
Best, Karsten
A couple of thoughts to add from my experience working with DBs:
- Consider a partitioning strategy for the data if it makes sense. Queries which hit only a single partition will be much faster in general.
- Check out advanced indexing systems like Solr. This can sometimes speed up certain queries by orders of magnitude, and works well with batch systems. Often it is easier to set up & maintain than advanced DB tuning.
- I second Karsten's comment on multi-column indexes. Designing the right indexes & columns to include isn't simple, though - you have to know what is going to be searched.
- If you know in advance what kinds of WHERE clauses you will see, consider implementing partial indexes.
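To make the last point concrete, a partial index in Postgres might look like this (hypothetical sketch; table, column, and cutoff date are made up), if, say, most lookups target recent consensuses:

```sql
-- Partial index: only rows matching the WHERE clause are indexed,
-- keeping the index small and cheap to maintain.
CREATE INDEX statusentry_recent_nickname
    ON statusentry (nickname)
    WHERE validafter >= '2013-01-01';
-- Queries whose WHERE clause implies the predicate can use it:
SELECT fingerprint FROM statusentry
WHERE nickname = 'moria1' AND validafter >= '2013-06-01';
```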
If you have specific problem queries w/ explain plans, feel free to post them and I would be glad to spend a few minutes looking them over.
Charlie
On 7/29/13 9:45 PM, Kostas Jakeliunas wrote:
..So it turns out that it is very much possible to avoid Postgres sequential scans - to construct queries/cases which are best executed using indexes (the query planner thinks so, and it makes sense as far as I can tell) and whatever efficient (hm, sub-O(n)?) search algorithms are deemed best - for the metrics search backend/database; the two things that seem potentially problematic to me are:
- when doing ORDER BY, making sure that the resulting SELECT covers / would potentially return a relatively small number of rows. From what I've been reading and trying out, whenever a SELECT may cover more than ~10-15% of all the table's rows, a sequential scan is preferred, as using indexes would result in even more disk I/O, whereas the sequential scan can read larger chunks of data from disk because it's, well, sequential. This means that doing "ORDER BY nickname" (or fingerprint, or any other non-unique column whose values cover a relatively small part of the whole network status table) is much faster than doing e.g. "ORDER BY validafter" (where validafter refers to the consensus document's "valid after" field) - which makes sense, of course, when you think about it: ordering a massive number of rows, even with a proper LIMIT etc., is insane. We construct separate indexes (e.g. even if 'fingerprint' is part of a composite (validafter + fingerprint) primary key), and all seems to be well.
I've been looking into the Onionoo implementation, particularly into ResourceServlet.java [1], to see how ordering etc. is done there. I'm new to that code, but am I right in saying that as of now, the results (at least for /summary and /details) are generally fairly unsorted (except for the possibility of "order by consensus_weight"), with "valid_after" fields appearing in an unordered manner? This of course makes sense in this case, as Onionoo is able to return *all* the results for given search criteria (or, if none given, all available results) at once.
This is correct. Onionoo only allows ordering by consensus_weight, because I only added features when we came up with new use cases.
The obvious problem with the archival metrics search project (argh, we are in need of a decent name, I daresay :) maybe I'll think of something.. no matter) is then, of course, the fact that we can't return all results at once. So far I've been assuming that it would therefore make sense to return them, whenever possible (and if not requested otherwise, should an "order" parameter be implemented later), in "valid_after" descending order. I suppose this makes sense? This would be ideal, methinks. So far, it seems that we can do that, as long as we have a WHERE clause that is restrictive enough.
select * from (
    select distinct on (fingerprint) fingerprint, validafter, nickname
    from statusentry
    where nickname like 'moria1'
    order by fingerprint, validafter desc
) as subq order by validafter desc limit 100;
This, for example, works out nicely in terms of efficiency / query plan, Postgres tells me. (The double ORDER BY is needed, as Postgres' DISTINCT ON requires an ORDER BY, and that ORDER BY's leftmost criterion has to match the DISTINCT ON (x) expression.)
Now, you mentioned / we talked about what the (large, archival metrics) backend should do about limiting/cutoff - especially if no search criteria are specified, for example. Ideally, it would return a top-most list of statuses (which in /details include info from server descriptors), sorted by "last seen" / "valid after"? The thing is, querying a database for 100 (or 1000) items with no ORDER BY is really cheap; introducing ORDER BYs which would still produce tons of results is considerably less so. I'm now looking into this (and you did tell me this, i.e. that, I now think, a large part of DB/backend robustness gets tested at these particular points; this should have been more obvious to me).
But in any case, I have the question: what *should* the backend return when no search parameters/filters are specified, or only very loose ones (nickname LIKE "a") are?
What would be cheap / doable: ORDER BY fingerprint (or digest (== hashed descriptor), etc.) It *might* (well, should) work to order by fingerprint, limit results, and *then* reorder by validafter - with no guarantee that the topmost results would have the highest absolute validafter values. I mean, Onionoo does this kind of reordering / limiting itself, but that makes sense there, as it can handle all the data at once (or I'm missing something, i.e. the file I linked to already interacts with a subset of data; granted, I haven't thoroughly read through it). In our case, it makes sense to try and outsource all relational logic to the ORM (but of course, if it turns out we can cheaply get 100 arbitrary results and more easily juggle them ourselves on the (direct) Python side, then sure).
But would such arbitrary returned results make sense? It would look just like Onionoo results, but - a (small) subset of them.
Can you rephrase this question, say, in a sentence or two? I'm having a hard time getting what you're asking for.
Making a few random statements here, in case one of them addresses the issue you're having:
- Ordering is important even if no WHERE statement is given or if the WHERE statement matches a lot of rows. Think of the relay-search application when you look for all relays with nickname U* (like in "Unnamed" which is very frequent). You want to be sure to find the last relays in the database, not a random subset.
- You might want to look into multi-key indexes, e.g., an index on (validafter, nickname). It could be that PostgreSQL uses them instead of a sequential search.
- It might make sense to write a set of test selects and check them against your expected query strategies. Maybe there are tools for this, but if there are no tools, simply run each query using EXPLAIN ANALYZE and parse the output for keywords indicating sequential scan, index scan. Then create indexes and re-run your test script to find out if they improve things or not.
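A minimal sketch of such a test script (using Python's sqlite3 and its EXPLAIN QUERY PLAN as a stand-in; against Postgres one would instead run EXPLAIN ANALYZE, e.g. via psycopg2, and look for "Seq Scan" in the output; schema and queries below are hypothetical):

```python
import sqlite3

# Hypothetical miniature schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statusentry "
             "(validafter TEXT, fingerprint TEXT, nickname TEXT)")
conn.execute("CREATE INDEX idx_nickname ON statusentry (nickname)")

def uses_seq_scan(query):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail); a detail
    # starting with "SCAN" means a full sequential scan of the table.
    details = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
    return any(d.startswith("SCAN") for d in details)

test_queries = [
    "SELECT * FROM statusentry WHERE nickname = 'moria1'",  # indexed lookup
    "SELECT * FROM statusentry ORDER BY validafter",        # no usable index
]
for q in test_queries:
    print(("SEQ SCAN: " if uses_seq_scan(q) else "index ok: ") + q)
```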
- Also look into insertion speed when you create lots of indexes. That is, once you're happy with query speed, import another month of data to see if that still happens in reasonable time.
Ramble #2: Onionoo does more or less direct (well, compiled, so efficient) regexps on fingerprint, nickname, etc. By default, "LIKE %given_nickname_part%" is again (sequentially) expensive; Postgres does offer full text search extensions (we would need to build additional tables, etc.), and I think this makes sense to do; it would cover all our "can supply a subset of :param" bases. I'll look into this. (It's also possible to construct functional indexes, e.g. on a regexp, but one needs to know the template in advance - I wonder if using Onionoo's regular expressions would work / make sense; will see.) For now, LIKE / = with an exact string is very nice, but of course the whole idea is that it'd be possible to supply substrings. Of note is the fact that in this case, LIKE %substring% is O(n) in the sense that query time correlates with row count, afaict. As of now, full text search extensions would solve the problem, I think, even if they may look a bit like overkill at first.
Again, not sure if I understand the question. A few possible answers:
- Feel free to restrict searches to things you can implement efficiently. It's better to exclude an expensive query and force the user to think about a better way to query, than to offer that search option and let the user sit waiting for a long time.
- You can prepare columns for searching to make queries much faster. For example, you could store LOWER(nickname) in addition to nickname if you're going to do lower-case searches for nickname anyway; that's what relay search does. Or you can store the first 16 bits of IP addresses in a separate column to narrow down search results quickly; that's what ExoneraTor in metrics-web.git does. Or you can store descriptors in monthly tables to keep indexes small; that's what relay search does.
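As a sketch of the first option (table and column names taken from the examples earlier in the thread; note that an expression index is an alternative to materializing the extra column):

```sql
-- Either store LOWER(nickname) in its own indexed column, or index
-- the expression directly; both keep case-insensitive lookups indexable.
CREATE INDEX statusentry_nickname_lower
    ON statusentry (LOWER(nickname));
SELECT fingerprint FROM statusentry WHERE LOWER(nickname) = 'moria1';
```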
End of an email without a proper direction/point. :)
Well, now looking into your other mails in this thread.
(Hint: in general, if you want me to reply quickly, please write a single mail with clear comments or questions, not one that you need to correct/extend three times within an hour. Maybe simply leave it sitting in your drafts folder for an hour until you're sure it's what you want to say. Thanks!)