SQL NOT IN [list of ids] (performance)
Here is wacky rewrite of that query that might perform a little better
SELECT * FROM foos
LEFT JOIN
(
SELECT 2 id UNION
SELECT 4 UNION
SELECT 5 UNION
SELECT 6 UNION
SELECT 7
) NOT_IDS
USING (id) WHERE NOT_IDS.id IS NULL;
The NOT_IDS subquery does work as shown by the following:
mysql> SELECT * FROM
-> (
-> SELECT 2 id UNION
-> SELECT 4 UNION
-> SELECT 5 UNION
-> SELECT 6 UNION
-> SELECT 7
-> ) NOT_IDS;
+----+
| id |
+----+
| 2 |
| 4 |
| 5 |
| 6 |
| 7 |
+----+
5 rows in set (0.00 sec)
mysql>
Just for fun, and given your update, I'm going to suggest a different strategy:
You could join across tables like so ...
insert into db1.foos (cols)
select cols
from db2.foos src
left join db1.foos dst
on src.pk = dst.pk
where dst.othercolumn is null
I'm not sure how the optimizer will handle this or if it's going to be faster (depends on your indexing strategy, I guess) than what you're doing.
Yes, the amount of IDs in the list will impact performance. A network packet is only so big, for example, and the database has to parse all that noise and turn it into a series of:
WHERE foo.ID <> 2
AND foo.ID <> 4
AND foo.ID <> 5
AND ...
You should consider other ways to let your query know about this set.