cursor.fetchall() vs list(cursor) in Python
If you are using the default cursor, a MySQLdb.cursors.Cursor, the entire result set will be stored on the client side (i.e. in a Python list) by the time cursor.execute() has completed.
Therefore, even if you use
for row in cursor:
you will not get any reduction in memory footprint. The entire result set has already been stored in a list (see self._rows in MySQLdb/cursors.py).
However, if you use an SSCursor or SSDictCursor:
import MySQLdb
import MySQLdb.cursors as cursors
conn = MySQLdb.connect(..., cursorclass=cursors.SSCursor)
then the result set stays on the server, in mysqld. Now you can write
cursor = conn.cursor()
cursor.execute('SELECT * FROM HUGETABLE')
for row in cursor:
    print(row)
and the rows will be fetched one by one from the server, so Python never has to build a huge list of tuples first, which saves memory.
Otherwise, as others have already stated, cursor.fetchall() and list(cursor) are essentially the same.
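As a rough illustration of that equivalence, here is a minimal sketch. It assumes the standard-library sqlite3 module as a stand-in for MySQLdb so that it runs without a server; both follow the DB-API 2.0 cursor interface. The table name t and its contents are made up for the example, and the concrete container type returned by fetchall() may differ between drivers.

```python
import sqlite3

# sqlite3 is used here only as a stand-in for MySQLdb (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t (name) VALUES (?)", [("a",), ("b",), ("c",)])

cursor = conn.cursor()

# Variant 1: ask the cursor for every remaining row at once.
cursor.execute("SELECT id, name FROM t ORDER BY id")
via_fetchall = cursor.fetchall()

# Variant 2: let list() drain the cursor through the iterator protocol.
cursor.execute("SELECT id, name FROM t ORDER BY id")
via_list = list(cursor)

# Both variants hold the same rows in the same order.
same_rows = list(via_fetchall) == via_list
print(same_rows)

conn.close()
```

Either way the full result set ends up in memory at once, which is why neither variant helps with a huge table.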
cursor.fetchall() and list(cursor) are essentially the same. The other option is to not retrieve a list at all, and instead just loop over the bare cursor object:
for result in cursor:
This can be more efficient if the result set is large, as it doesn't have to fetch the entire result set and keep it all in memory; it can just incrementally get each item (or batch them in smaller batches).
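The "smaller batches" idea can be sketched with the DB-API fetchmany() method. This is only an illustration: iter_in_batches is a hypothetical helper, the numbers table is invented, and sqlite3 again stands in for MySQLdb so the example is runnable. Whether batching actually saves memory depends on the driver and cursor class; with a cursor that has already buffered everything on the client, it only changes how the rows are handed to your code.

```python
import sqlite3


def iter_in_batches(cursor, batch_size=1000):
    """Yield rows one at a time, pulling them from the cursor in batches.

    Hypothetical helper for illustration; not part of any library.
    """
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            # An empty sequence means the result set is exhausted.
            break
        yield from batch


# sqlite3 is a stand-in for MySQLdb here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (n INTEGER)")
conn.executemany("INSERT INTO numbers (n) VALUES (?)", [(i,) for i in range(10)])

cursor = conn.cursor()
cursor.execute("SELECT n FROM numbers ORDER BY n")

# Aggregate while streaming; at most batch_size rows are held by the helper.
total = sum(row[0] for row in iter_in_batches(cursor, batch_size=4))
print(total)  # 0 + 1 + ... + 9 = 45

conn.close()
```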
list(cursor) works because a cursor is an iterable; you can also use cursor in a loop:
for row in cursor:
    # ...
A good database adapter implementation will fetch rows in batches from the server, saving on the memory footprint required as it will not need to hold the full result set in memory. cursor.fetchall() has to return the full list instead.
There is little point in using list(cursor) over cursor.fetchall(); the end effect is indeed the same, but you have wasted an opportunity to stream the results instead.