In Django, what is the most efficient way to check for an empty query set?
exists() is generally faster than count(), though not always (see test below). count() can be used to check for both existence and length.
Only use qs[0]
if you actually need the object. It's significantly slower if you're just testing for existence.
On Amazon SimpleDB, 400,000 rows:
- bare
qs
: 325.00 usec/pass qs.exists()
: 144.46 usec/passqs.count()
144.33 usec/passqs[0]
: 324.98 usec/pass
On MySQL, 57 rows:
- bare
qs
: 1.07 usec/pass qs.exists()
: 1.21 usec/passqs.count()
: 1.16 usec/passqs[0]
: 1.27 usec/pass
I used a random query for each pass to reduce the risk of db-level caching. Test code:
import timeit
base = """
import random
from plum.bacon.models import Session
ip_addr = str(random.randint(0,256))+'.'+str(random.randint(0,256))+'.'+str(random.randint(0,256))+'.'+str(random.randint(0,256))
try:
session = Session.objects.filter(ip=ip_addr)%s
if session:
pass
except:
pass
"""
query_variatons = [
base % "",
base % ".exists()",
base % ".count()",
base % "[0]"
]
for s in query_variatons:
t = timeit.Timer(stmt=s)
print "%.2f usec/pass" % (1000000 * t.timeit(number=100)/100000)
query.exists()
is the most efficient way.
Especially on postgres count()
can be very expensive, sometimes more expensive then a normal select query.
exists()
runs a query with no select_related, field selections or sorting and only fetches a single record. This is much faster then counting the entire query with table joins and sorting.
qs[0]
would still includes select_related, field selections and sorting; so it would be more expensive.
The Django source code is here (django/db/models/sql/query.py RawQuery.has_results):
https://github.com/django/django/blob/60e52a047e55bc4cd5a93a8bd4d07baed27e9a22/django/db/models/sql/query.py#L499
def has_results(self, using):
q = self.clone()
if not q.distinct:
q.clear_select_clause()
q.clear_ordering(True)
q.set_limits(high=1)
compiler = q.get_compiler(using=using)
return compiler.has_results()
Another gotcha that got me the other day is invoking a QuerySet in an if statement. That executes and returns the whole query !
If the variable query_set may be None
(unset argument to your function) then use:
if query_set is None:
#
not:
if query_set:
# you just hit the database