Slow query in PostgreSQL selecting a single row from between a range defined in two columns
This is a slightly different solution to those already offered, which involved using spatial indexes to do some tricks.
Instead, it's worth remembering that with IP addresses you cannot have overlapping ranges. That is, A -> B cannot intersect X -> Y in any way. Knowing this, you can change your SELECT query slightly and take advantage of this trait. In doing so you do not need any "clever" indexing at all; in fact, you only need to index your ip_from column.
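For reference, here is a minimal sketch of the kind of table being discussed. The column names match the EXPLAIN output further down; the exact types are my assumptions:

-- Assumed minimal layout; column names are taken from the EXPLAIN
-- output below, but the types are guesses.
CREATE TABLE public.ip2location (
    ip_from      bigint NOT NULL,  -- first address of the range, as an integer
    ip_to        bigint NOT NULL,  -- last address of the range, as an integer
    country_code character(2),
    country_name text,
    region_name  text,
    city_name    text,
    latitude     real,
    longitude    real,
    zip_code     text,
    time_zone    text
);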
Previously, the query being analyzed was:
SELECT * FROM ip2location_db11 WHERE 2538629520 BETWEEN ip_from AND ip_to;
Let's assume that the range that 2538629520 falls into happens to be 2538629512 to 2538629537.
Note: it really does not matter what the range is; this is just to help demonstrate the pattern we can take advantage of.
From this we can assume that the next ip_from value is 2538629538. We don't actually need to worry about any records from this ip_from value upwards; all we actually care about is the range where ip_from equals 2538629512.
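To make that concrete, the relevant slice of the table might look like this (hypothetical rows; only the first range and the 2538629538 boundary come from the example above, the rest is invented):

-- Hypothetical neighbouring rows around our IP (2538629520):
--   ip_from      ip_to
--   2538629512   2538629537   <- 2538629520 falls in this range
--   2538629538   2538629545   <- next range starts right after ip_to
INSERT INTO public.ip2location (ip_from, ip_to, country_code, country_name)
VALUES (2538629512, 2538629537, 'XX', 'Example'),
       (2538629538, 2538629545, 'XX', 'Example');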
Knowing this fact, our query actually becomes (in English):

Find me the maximum ip_from value that my IP address is greater than or equal to, and show me the record where you find this value.

Or in other words: find me the ip_from value just at or before my IP address and give me that record.
Because we never have overlapping ranges of ip_from to ip_to, this holds true and allows us to write the query as:
SELECT *
FROM ip2location
WHERE ip_from = (
SELECT MAX(ip_from)
FROM ip2location
WHERE ip_from <= 2538629520
)
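As an aside, the same single-seek lookup is often written with ORDER BY ... LIMIT 1, which the planner can satisfy with the same index seek; and if your dataset can contain gaps between ranges, a final check on ip_to stops an IP that falls in a gap from matching the preceding range. Both variants below are my additions, not part of the answer's original query:

-- Equivalent formulation: one backwards index seek, then stop.
SELECT *
FROM ip2location
WHERE ip_from <= 2538629520
ORDER BY ip_from DESC
LIMIT 1;

-- If ranges can have gaps, re-check the upper bound so an IP between
-- two ranges returns no row instead of the wrong row.
SELECT *
FROM ip2location
WHERE ip_from <= 2538629520
  AND ip_to >= 2538629520
ORDER BY ip_from DESC
LIMIT 1;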
Back to the indexing to take advantage of all this. All we're actually searching on is ip_from, and we are doing integer comparisons. The MAX(ip_from) subquery has PostgreSQL find the single boundary record we need, which is good because it can seek straight to that record and then not worry about any other records at all.
All we really need is an index like:
CREATE UNIQUE INDEX CONCURRENTLY ix_ip2location_ipFrom ON public.ip2location (ip_from);
We can make the index unique because we will not have overlapping records. I would even make this column the primary key myself.
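If you do go the primary-key route, PostgreSQL can promote the existing unique index instead of building a new one (the constraint name here is just an example):

-- Reuses the unique index as the primary key; requires ip_from to be
-- NOT NULL. PostgreSQL renames the index to match the constraint name.
ALTER TABLE public.ip2location
    ADD CONSTRAINT ip2location_pkey PRIMARY KEY USING INDEX ix_ip2location_ipFrom;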
With this index and this query, the explain plan is:
Index Scan using ix_ip2location_ipfrom on public.ip2location  (cost=0.90..8.92 rows=1 width=69) (actual time=0.530..0.533 rows=1 loops=1)
  Output: ip2location.ip_from, ip2location.ip_to, ip2location.country_code, ip2location.country_name, ip2location.region_name, ip2location.city_name, ip2location.latitude, ip2location.longitude, ip2location.zip_code, ip2location.time_zone
  Index Cond: (ip2location.ip_from = $1)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.46..0.47 rows=1 width=8) (actual time=0.452..0.452 rows=1 loops=1)
          Output: $0
          InitPlan 1 (returns $0)
            ->  Limit  (cost=0.43..0.46 rows=1 width=8) (actual time=0.443..0.444 rows=1 loops=1)
                  Output: ip2location_1.ip_from
                  ->  Index Only Scan using ix_ip2location_ipfrom on public.ip2location ip2location_1  (cost=0.43..35440.79 rows=1144218 width=8) (actual time=0.438..0.438 rows=1 loops=1)
                        Output: ip2location_1.ip_from
                        Index Cond: ((ip2location_1.ip_from IS NOT NULL) AND (ip2location_1.ip_from >= '2538629520'::bigint))
                        Heap Fetches: 0
To give you an idea of the improvement in query performance with this approach, I tested it on my Raspberry Pi. The original approach took approx 4 seconds; this approach takes approx 120 ms. The big win comes from a single row seek versus scans. The original query suffers badly for low range values, as more of the table has to be considered for the result, whereas this query exhibits consistent performance across the whole range of values.
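If you want to reproduce the comparison on your own data, timing both forms with EXPLAIN ANALYZE is enough (table names as used earlier in this answer):

-- Original range predicate: work grows as the target IP gets lower.
EXPLAIN (ANALYZE, VERBOSE)
SELECT * FROM ip2location_db11 WHERE 2538629520 BETWEEN ip_from AND ip_to;

-- Single-seek rewrite: roughly constant work for any target IP.
EXPLAIN (ANALYZE, VERBOSE)
SELECT *
FROM ip2location
WHERE ip_from = (SELECT MAX(ip_from) FROM ip2location WHERE ip_from <= 2538629520);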
Hope this helps and my explanation makes sense to you all.