Is there a performance difference between BETWEEN and IN with MySQL or in SQL in general?
BETWEEN should outperform IN in this case (but do measure and check execution plans, too!), especially as n grows and as long as statistics are still accurate. Let's assume:
m is the size of your table
n is the size of your range
Index can be used (n is tiny compared to m)
In theory:
BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, which then traverses at most n index leaf nodes. The complexity will be O(n + log m).
IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will always be O(n * log m), which is always worse (negligible for very small tables m or very small ranges n).
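The two access patterns above can be sketched in Python, using a sorted list as a stand-in for the primary key index. This is only an illustration under stated assumptions: the names between_lookup and in_lookup are hypothetical, keys are assumed unique, and a real database engine works on B-tree pages rather than a flat list.

```python
from bisect import bisect_left

def between_lookup(index, lo, hi):
    """One seek to the lower bound, then a sequential walk: O(log m + n)."""
    start = bisect_left(index, lo)          # single O(log m) descent
    end = start
    while end < len(index) and index[end] <= hi:
        end += 1                            # walk neighbouring "leaf" entries
    return index[start:end], 1              # matching keys, number of seeks

def in_lookup(index, values):
    """One seek per listed value: O(n * log m)."""
    hits, seeks = [], 0
    for v in values:
        pos = bisect_left(index, v)         # a fresh O(log m) descent per value
        seeks += 1
        if pos < len(index) and index[pos] == v:
            hits.append(v)
    return hits, seeks

# Hypothetical sorted key column that contains even ids only.
index = list(range(0, 1_000_000, 2))

rows_between, seeks_between = between_lookup(index, 100, 120)
rows_in, seeks_in = in_lookup(index, range(100, 121))
```

Both calls return the same keys, but the BETWEEN sketch descends the index once while the IN sketch descends it once per listed value.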
Index cannot be used (n is a significant portion of m)
In either case, you'll get a full table scan and evaluate the predicate on each row:
BETWEEN needs to evaluate two predicates: one for the lower and one for the upper bound. The complexity is O(m).
IN needs to evaluate at most n predicates. The complexity is O(m * n), which is again always worse, or perhaps O(m) if the database can optimise the IN list to be a hashmap rather than a list of predicates.
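A rough Python sketch of the full-scan case, assuming a hypothetical table of 1000 integer rows: the list-based IN check probes up to n values per row, while the set-based variant stands in for a database that turns the IN list into a hash lookup. Whether a given engine actually does this is something to confirm from its execution plan.

```python
# Hypothetical table: one integer column with the values 0..999.
rows = list(range(1000))

wanted_list = list(range(200, 300))   # the IN list, n = 100 values
wanted_set = frozenset(wanted_list)   # the same values, hashed once

# BETWEEN: two comparisons per row, O(m) overall.
by_between = [r for r in rows if 200 <= r <= 299]

# IN as a list of predicates: up to n comparisons per row, O(m * n).
by_in_list = [r for r in rows if r in wanted_list]

# IN backed by a hash set: one average O(1) probe per row, O(m) overall.
by_in_set = [r for r in rows if r in wanted_set]
```

All three produce the same rows; only the work done per row differs.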
a between b and c is a macro that expands to b <= a and a <= c.
a in (b,c,d) is a macro that expands to a=b or a=c or a=d.
Assuming your n and nk are integers, both should end up meaning the same. The between variant should be much faster because it's only two compares, versus up to nk - n + 1 compares for the in variant.
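The equivalence for integer bounds can be checked with a small script. This is a sketch that uses Python's built-in sqlite3 module with an in-memory SQLite database rather than MySQL, and the table t and column a are hypothetical; it only shows that the four forms select the same rows, not how fast each one runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (a) VALUES (?)", [(i,) for i in range(1, 11)])

def fetch(where):
    """Return the sorted values of column a that match the given WHERE clause."""
    sql = "SELECT a FROM t WHERE " + where + " ORDER BY a"
    return [row[0] for row in conn.execute(sql)]

between = fetch("a BETWEEN 3 AND 5")
expanded = fetch("3 <= a AND a <= 5")       # what BETWEEN expands to
in_list = fetch("a IN (3, 4, 5)")
or_chain = fetch("a = 3 OR a = 4 OR a = 5") # what IN expands to
```

To compare the actual cost on your own server, run each form under EXPLAIN and time it against realistic data.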