Is COUNT faster than pulling the records and counting in code?
Two things should be considered.
QUERY #1
SELECT COUNT(DISTINCT userid) FROM users;
This query will go a whole lot faster with an index on userid. If you do not have an index on userid, and none of the indexes you already have begin with userid, then run this:
ALTER TABLE users ADD INDEX (userid);
This lets the Query Optimizer read through the index rather than touch the table, because the index already holds every userid value the query needs.
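One way to check that the optimizer really switches to the index is to compare the query plan before and after creating it. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database in place of MySQL (where you would use EXPLAIN instead of EXPLAIN QUERY PLAN), and the extra columns name and bio, the index name idx_users_userid, and the helper names are all made up for the example.

```python
import sqlite3


def plan_for(conn, sql):
    """Return the plan text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


def demo_distinct_index():
    """Build a throwaway users table and compare plans around CREATE INDEX."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, userid INTEGER, name TEXT, bio TEXT)"
    )
    # 1000 rows spread over 100 distinct userid values (hypothetical data).
    conn.executemany(
        "INSERT INTO users (userid, name, bio) VALUES (?, ?, ?)",
        [(i % 100, "name%d" % i, "x" * 50) for i in range(1000)],
    )
    query = "SELECT COUNT(DISTINCT userid) FROM users"
    before = plan_for(conn, query)
    conn.execute("CREATE INDEX idx_users_userid ON users (userid)")
    after = plan_for(conn, query)
    distinct = conn.execute(query).fetchone()[0]
    conn.close()
    return before, after, distinct


if __name__ == "__main__":
    before, after, distinct = demo_distinct_index()
    print("plan without index:", before)
    print("plan with index:   ", after)
    print("distinct userids:  ", distinct)
```

The exact plan wording differs between SQLite versions and will look different again in MySQL, so treat the output as a hint about what to look for rather than something to match literally.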
QUERY #2
SELECT * FROM users;
Why bother to fetch every column in each row just to count the rows?
You can replace that with
SELECT COUNT(id) FROM users;
where id is the PRIMARY KEY or
SELECT COUNT(1) FROM users;
You will have to benchmark which query is faster, SELECT COUNT(id) or SELECT COUNT(1).
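A benchmark along those lines could be sketched as follows. This is a minimal example, not a definitive measurement: it assumes Python's sqlite3 module and an in-memory SQLite table instead of your real MySQL server, and the table contents, the name column, and the helper names (build_users, time_query, compare_counts) are invented for illustration. Results on your own engine and data may well differ, which is the point of measuring.

```python
import sqlite3
import timeit


def build_users(n_rows=10000):
    """Create an in-memory users table with n_rows rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)",
        [("user%d" % i,) for i in range(n_rows)],
    )
    return conn


def time_query(conn, sql, repeats=20):
    """Return (result, total seconds for `repeats` runs) of a one-value query."""
    result = conn.execute(sql).fetchone()[0]
    seconds = timeit.timeit(
        lambda: conn.execute(sql).fetchone(), number=repeats
    )
    return result, seconds


def compare_counts(n_rows=10000):
    """Time both COUNT variants against the same table."""
    conn = build_users(n_rows)
    queries = ("SELECT COUNT(id) FROM users", "SELECT COUNT(1) FROM users")
    results = {sql: time_query(conn, sql) for sql in queries}
    conn.close()
    return results


if __name__ == "__main__":
    for sql, (count, seconds) in compare_counts().items():
        print(f"{sql}: count={count}, {seconds:.4f}s")
```

Run it several times and against a realistic row count before drawing conclusions; a single short run mostly measures noise.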
EPILOGUE
Unless you actually need the data while counting, let the counting happen in the server.
If you know you need the data, go ahead and pull it and count it in code. However, if you only need the count, it is significantly faster to pull the count from the database than it is to actually retrieve rows. Also it is standard practice to only pull what you need.
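To make the two approaches concrete, here is a small sketch that counts the same table both ways. It assumes Python's sqlite3 module with an in-memory database standing in for a real server, and the table layout and function names are hypothetical. With an in-memory database there is no network involved, so the cost gap you would see against a remote MySQL server is understated here.

```python
import sqlite3


def make_demo_db(n_rows=1000):
    """Create an in-memory users table (hypothetical columns and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)")
    conn.executemany(
        "INSERT INTO users (name, bio) VALUES (?, ?)",
        [("user%d" % i, "x" * 200) for i in range(n_rows)],
    )
    return conn


def count_in_server(conn):
    """Let the database count; a single integer comes back."""
    return conn.execute("SELECT COUNT(1) FROM users").fetchone()[0]


def count_in_code(conn):
    """Pull every column of every row, then count the rows in Python."""
    rows = conn.execute("SELECT * FROM users").fetchall()
    return len(rows)


if __name__ == "__main__":
    conn = make_demo_db()
    print("server-side count:", count_in_server(conn))
    print("client-side count:", count_in_code(conn))
    conn.close()
```

Both functions return the same number; the difference is how much data has to be read and shipped to get it.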
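For instance, if you are counting all the rows in a table, some database implementations do not need to look at any rows, because the table knows how many rows it has. MySQL's MyISAM engine stores an exact row count; InnoDB does not and scans an index instead, which is still far cheaper than shipping every row to the client. If the query has filters in the WHERE clause and it can use an index, it again will not need to look at the actual rows' data; it just counts the matching entries in the index.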
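The filtered case can be illustrated with a query plan as well. The following is a sketch under assumptions, not a statement about any particular server: it uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN in place of MySQL's EXPLAIN, and the country column, the index name idx_users_country, and the sample data are hypothetical.

```python
import sqlite3


def filtered_count_plan():
    """Return (plan text, count) for a filtered COUNT served from an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT, name TEXT)"
    )
    # 400 rows, every fourth one tagged 'US' (hypothetical data).
    conn.executemany(
        "INSERT INTO users (country, name) VALUES (?, ?)",
        [("US" if i % 4 == 0 else "DE", "name%d" % i) for i in range(400)],
    )
    conn.execute("CREATE INDEX idx_users_country ON users (country)")
    sql = "SELECT COUNT(1) FROM users WHERE country = 'US'"
    # The last column of each plan row holds the readable description.
    plan = "; ".join(
        row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)
    )
    count = conn.execute(sql).fetchone()[0]
    conn.close()
    return plan, count


if __name__ == "__main__":
    plan, count = filtered_count_plan()
    print("plan: ", plan)
    print("count:", count)
```

If the plan names the index and does not mention a separate table lookup, the count is being answered from the index entries alone.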
And all of this is before counting the savings from transferring less data.
A rule of thumb about database speeds is to go ahead and try it for yourself. General rules are not always a good indicator. For instance, if the table had 10 rows and only a few columns, I might just pull the whole thing anyway on the off chance I needed it, since two round trips to the database would cost more than the query itself.