Order of columns in a compound index in PostgreSQL (and query order)
As a part of my work I maintain a fairly large PostgreSQL database (around 120gb on disk, several multi-million-row tables) and have collected a few tricks on how to speed up the queries. First some comments on your assumptions:
- Yes, order is important, but it's only the first one that is really different, the rest are second class indexes.
- I'm not sure it will always use both, my guess is that the query planner will use #1, then do something clever with the rest.
- I have no experience with GIST.
- Yes, add all indexes first, see what is used the most and what gives the best performance.
- I would sugest that you try both and measure what works best. Try rewriting the sql with different subqueries, maybe country and time in one, then join with the intersect-query. I have not noticed any performance problem with IN-clauses, as long as the IN-list is not thousands of elements long. My guess is that a few different queries specifically tuned depending on the input criteria available will give the best results.
I would suggest against making a 4-way index. Try creating one and then check the size, they can get really huge. In my experience, four 1-key indexes have been almost as fast as a single 4-way index. A trick that does work nicely for some specific queries are partial indexes, ie something like this:
CREATE INDEX ON table_x (key1, key2, key3) WHERE some_x_column = 'XXXX';
I have created aliases in my .psqlrc-file with queries to help find what indexes to add or remove. Feel free to have a look at them over at GitHub: .psql
I use the :seq_scans and :bigtables a lot, and then \d table_name to get details about the table. Don't forget to reset the statistics after you've done some changes, select pg_stat_reset();