Why is this MySQL query taking so long?
Instead of
'2020-04-30' = date_add(op.dataDate, INTERVAL 14 DAY);
Use
op.dataDate = date_sub('2020-04-30, INTERVAL 14 DAY);
Your first statement will be interpreted as "add 14 days to all dataDate
and return when that is 2020-04-30." This will require a full scan of the table.
The second statement will evaluate to: "return records where the dataDate
is 2020-04-16." This allows the engine to perform a seek on your index that begins with dataDate
.
Do whatever weird stuff you want to do to expDate
since that won't factor much into how the query engine will optimize.
The formulation is the main problem. After rewriting the query, the index you have can be used for the entire WHERE
clause.
SELECT *
FROM op.prices
WHERE ticker = 'AAPL'
AND expDate >= '2020-04-30'
AND expDate < '2020-04-30' + INTERVAL 3 MONTH
AND dataDate = '2020-04-30' - INTERVAL 14 DAY
See Sargeable in Wikipedia. Phrased differently "Don't hide a column in a function call -- it may not be usable by an index."
Shrinking the file size will help some:
- Consider switching from 8-byte
DOUBLE
(about 16 significant digits) to 4-byteFLOAT
(about 7 significant digits) for the metrics. - A 1-byte
ENUM('put','call')
would save 3 bytes per row. - Normalizing the Option (a long string).
There may be more tips. To see how important shrinking is, answer these: How much RAM do you have? What is the value of innodb_buffer_pool_size
? How big (GB) is the table?
You really should have a PRIMARY KEY
.