Why is this query not using my nonclustered index, and how can I make it?

Expressing the query using different syntax can sometimes help communicate to the optimizer your wish to use a non-clustered index. You should find the form below gives you the plan you want:

SELECT
    [ID],
    [DeviceID],
    [IsPUp],
    [IsWebUp],
    [IsPingUp],
    [DateEntered]
FROM [dbo].[Heartbeats]
WHERE
    [ID] IN
(
    -- Keys
    SELECT TOP (1000)
        [ID]
    FROM [dbo].[Heartbeats]
    WHERE 
        [DateEntered] >= CONVERT(datetime, '2011-08-30', 121)
        AND [DateEntered]  < CONVERT(datetime, '2011-08-31', 121)
);

Query Plan

Compare that plan with the one produced when the non-clustered index is forced with a hint:

SELECT TOP (1000) 
    * 
FROM [dbo].[Heartbeats] WITH (INDEX(CommonQueryIndex))
WHERE 
    [DateEntered] BETWEEN '2011-08-30' and '2011-08-31';

Forced Index Hint Plan

The plans are essentially the same (a Key Lookup is nothing more than a seek on the clustered index). Both plan forms will only ever perform one seek on the non-clustered index and a maximum of 1000 lookups into the clustered index.

The important difference is in the position of the Top operator. Positioned between the two seeks, the Top prevents the optimizer from replacing the two seeking operations with a logically-equivalent scan of the clustered index. The optimizer works by replacing parts of a logical plan with equivalent relational operations. Top is not a relational operator, so the rewrite prevents the transformation to a clustered index scan. If the optimizer were able to reposition the Top operator, it would still prefer the scan over the seek + lookup because of the way cost estimation works.

Costing of scans and seeks

At a very high level, the optimizer's cost model for scans and seeks is quite simple: it estimates that 320 random seeks cost the same as reading 1350 pages in a scan. This probably bears little resemblance to the hardware capabilities of any particular modern I/O system, but it does work reasonably well as a practical model.
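
As a rough sketch of that break-even arithmetic (the per-operation numbers below are the commonly cited internal costing constants; treat them as an assumption rather than a documented guarantee):

-- Sketch only: assumed cost-model constants, not documented guarantees.
SELECT
    320  * 0.003125    AS ApproxCostOf320RandomSeeks,       -- ~1.0 cost unit
    1350 * 0.000740741 AS ApproxCostOf1350SequentialPages;  -- ~1.0 cost unit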

The model also makes a number of simplifying assumptions, a major one being that every query is assumed to start with no data or index pages already in cache. The implication is that every I/O will result in a physical I/O - though this will rarely be the case in practice. Even with a cold cache, pre-fetching and read-ahead mean that the pages needed are actually quite likely to be in memory by the time the query processor needs them.

Another consideration is that the first request for a row that is not in memory will cause the whole page to be fetched from disk. Subsequent requests for rows on the same page will very likely not incur a physical I/O. The costing model does contain logic to take some account of effects like this, but it is not perfect.

All these things (and more) mean the optimizer tends to switch to a scan earlier than it probably should. Random I/O is only 'much more expensive' than 'sequential' I/O if a physical operation results - accessing pages in memory is very fast indeed. Even where a physical read is required, a scan may not result in sequential reads at all due to fragmentation, and seeks may be collocated such that the pattern is essentially sequential. Add to that the changing performance characteristics of modern I/O systems (especially solid-state) and the whole thing starts to look very shaky.

Row Goals

The presence of a Top operator in a plan modifies the costing approach. The optimizer is smart enough to know that finding 1000 rows using a scan will likely not require scanning the whole clustered index - it can stop as soon as 1000 rows have been found. It sets a 'row goal' of 1000 rows at the Top operator and uses statistical information to work back from there to estimate how many rows it expects to need from the row source (a scan in this case). I wrote about the details of this calculation here.
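
If you want to see how much the row goal is influencing the estimates, one option (on SQL Server 2016 SP1 and later; trace flag 4138 is the older equivalent) is to compile the same statement with the row goal disabled and compare the plans:

-- For experimentation only. USE HINT ('DISABLE_OPTIMIZER_ROWGOAL') requires
-- SQL Server 2016 SP1 or later; on older builds OPTION (QUERYTRACEON 4138)
-- has a similar effect.
SELECT TOP (1000)
    *
FROM [dbo].[Heartbeats]
WHERE
    [DateEntered] BETWEEN '2011-08-30' AND '2011-08-31'
OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL'));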

The images in this answer were created using SQL Sentry Plan Explorer.


The first query does a table scan, based on the threshold I explained earlier in: Is it possible to increase query performance on a narrow table with millions of rows?

(Most likely your query without the TOP 1000 clause would return more than 46k rows, or somewhere between 35k and 46k. The grey area ;-) )

The second query must return ordered results. Since your NC index is ordered in the order you want, it's cheaper for the optimiser to use that index and then do the bookmark lookups into the clustered index to get the missing columns, compared to doing a clustered index scan and then having to sort the output.

Reverse the order of the columns in the ORDER BY clause and you are back to a clustered index scan, since the NC index is then useless. An illustration follows below.
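
For illustration only, the ordered form being discussed looks something like this (it assumes the ORDER BY matches the leading key column of CommonQueryIndex, presumably [DateEntered]; substitute the real column list from your own query):

SELECT TOP (1000)
    *
FROM [dbo].[Heartbeats]
WHERE
    [DateEntered] BETWEEN '2011-08-30' AND '2011-08-31'
ORDER BY
    [DateEntered]; -- matches the NC index key order, so seek + lookups beat scan + sort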

Edit: I forgot the answer to your second question, why you DON'T want this.

Using a non-clustered, non-covering index means that a row ID is looked up in the NC index and then the missing columns have to be looked up in the clustered index (the clustered index contains all columns of a table). IOs to look up the missing columns in the clustered index are random IOs.

The key here is RANDOM: for every row found in the NC index, the access methods have to go look up a new page in the clustered index. This is random, and therefore very expensive.

Now, on the other hand, the optimiser could also go for a clustered index scan. It can use the allocation maps to look up scan ranges and just start reading the clustered index in large chunks. This is sequential and much cheaper (as long as your table isn't fragmented :-) ). The downside is that the WHOLE clustered index needs to be read. That is bad for your buffer pool and potentially a huge amount of IO, but still sequential IO.

In your case, the optimiser decides that somewhere between 35k and 46k rows it becomes less expensive to do a full clustered index scan. Yeah, it's wrong. And in a lot of cases with narrow non-clustered indexes, not-too-selective WHERE clauses, or large tables for that matter, this goes wrong. (Your table is worse off, because it's also a very narrow table.)

Now, adding the ORDER BY makes it more expensive to scan the full clustered index and then sort the results. Instead, the optimiser assumes it's cheaper to use the already ordered NC index and then pay the random-IO penalty for the bookmark lookups.

So your ORDER BY is a perfect "query hint" kind of solution. BUT, at a certain point, once your query results are big enough, the penalty for the bookmark-lookup random IO will be so large that the query becomes slower. I assume the optimiser will change plans back to the clustered index scan before that point, but you never know for sure.

In your case, as long as your inserts are ordered by DateEntered, as discussed in chat and the previous question (see link), you are better off creating the clustered index on the DateEntered column.
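
A minimal sketch of that change (the index name is made up here, and since a table can have only one clustered index you would first have to drop or recreate the existing clustered index/primary key as nonclustered; check the exact DDL against your real table definition):

-- Sketch only: made-up index name; assumes the current clustered index has
-- already been dropped or converted to nonclustered.
CREATE CLUSTERED INDEX [CIX_Heartbeats_DateEntered]
    ON [dbo].[Heartbeats] ([DateEntered]);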