Optimizing WHERE condition for TIMESTAMP field in MySQL SELECT statement
For the specific query:
select sum(diff_ms)
from writetest_table
where time_on > '2015-07-13 15:11:56' ; -- use single quotes, not double
an index on (time_on, diff_ms)
would be the best option. So, if the query runs often enough or its efficiency is crucial to your application, add this index:
ALTER TABLE writetest_table
ADD INDEX time_on__diff_ms__ix -- pick a name for the index
(time_on, diff_ms) ;
(Not related to the question)
And really, change the table's engine to InnoDB. It's 2015 and MyISAM's funeral was some years ago.
(/rant)
I think I am starting to understand.
When I asked you to run
SELECT time_on FROM writetest_table ORDER BY time_on LIMIT 1;
You said it was 2015-07-13 15:11:56
which you have in your WHERE
clause
When you did the query
select sum(diff_ms) from writetest_table;
It performed a full table scan of 35.8 million rows.
When you did the query
select sum(diff_ms) from writetest_table where time_on > ("2015-07-13 15:11:56");
It performed a full index scan of 35.8 million rows.
It totally makes sense that the query without the WHERE clause is faster. Why ?
The table scan would read 35.8 million rows in one linear pass.
The EXPLAIN on the query with the WHERE also turned up 35.8 million rows. An index scan would behave a little different. While the BTREE keeps the order of the keys, it is horrible for doing range scans. In your particular case, you are performing the worst possible range scan, which would have the same number of BTREE entries as there are rows in the table. MySQL has to traverse the BTREE pages (at least across the leaf nodes) to read the values. In addition, the time_on
column has to be compared along the way in the order dictated by the index. Therefore, non-leaf BTREE nodes must be traversed as well.
Please see my posts on BTREEs
Aug 06, 2013
: In MySQL if column X has unique values what's the difference between UNIQUE index and B-Tree indexJun 28, 2012
: Benefits of BTREE in MySQL
If the query was as of midnight today
select sum(diff_ms) from writetest_table where time_on >= ("2015-07-14 00:00:00");
or even noon today
select sum(diff_ms) from writetest_table where time_on >= ("2015-07-14 12:00:00");
it should take less time.
MORAL OF THE STORY : Do not use a WHERE clause that does an ordered range scan equal to the number of rows in the target table.