SQL NOT IN [list of ids] (performance)

Here is wacky rewrite of that query that might perform a little better

SELECT * FROM foos
LEFT JOIN
(
    SELECT 2 id UNION
    SELECT 4    UNION
    SELECT 5    UNION
    SELECT 6    UNION
    SELECT 7
) NOT_IDS
USING (id) WHERE NOT_IDS.id IS NULL;

The NOT_IDS subquery does work as shown by the following:

mysql> SELECT * FROM
    -> (
    ->     SELECT 2 id UNION
    ->     SELECT 4    UNION
    ->     SELECT 5    UNION
    ->     SELECT 6    UNION
    ->     SELECT 7
    -> ) NOT_IDS;
+----+
| id |
+----+
|  2 |
|  4 |
|  5 |
|  6 |
|  7 |
+----+
5 rows in set (0.00 sec)

mysql>

Just for fun, and given your update, I'm going to suggest a different strategy:

You could join across tables like so ...

insert into db1.foos (cols) 
  select cols
    from db2.foos src
  left join db1.foos dst
    on src.pk = dst.pk
  where dst.othercolumn is null

I'm not sure how the optimizer will handle this or if it's going to be faster (depends on your indexing strategy, I guess) than what you're doing.


Yes, the amount of IDs in the list will impact performance. A network packet is only so big, for example, and the database has to parse all that noise and turn it into a series of:

WHERE foo.ID <> 2
AND foo.ID <> 4
AND foo.ID <> 5
AND ...

You should consider other ways to let your query know about this set.

Tags:

Mysql