Why is my EXISTS query doing an index scan instead of an index seek?
If you want good results from the query optimizer, it pays to be careful about data types.
Your variables are typed as datetime2:
DECLARE @OrderStartDate datetime2 = '27 feb 2016';
DECLARE @OrderEndDate datetime2 = '28 feb 2016';
But the column these are compared to is typed smalldatetime (as the sdtm prefix suggests!):
[sdtmOrdCreated] SMALLDATETIME NOT NULL
The type incompatibility makes it hard for the optimizer to work out the resulting cardinality estimate through a type conversion, as shown in the execution plan xml:
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes([@OrderStartDate],NULL,(22))">
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes(NULL,[@OrderEndDate],(10))">
The current estimate may or may not be accurate (probably not). Fixing the type incompatibility may or may not completely solve your plan selection problem, but it is the first (easy!) thing I would fix before looking deeper into the issue:
DECLARE @OrderStartDate smalldatetime = CONVERT(smalldatetime, '20160227', 112);
DECLARE @OrderEndDate smalldatetime = CONVERT(smalldatetime, '20160228', 112);
Always check the accuracy of cardinality estimates, and the reason for any discrepancy before deciding to rewrite the query or use hints.
See my SQLblog.com article, "Dynamic Seeks and Hidden Implicit Conversions" for more details on the dynamic seek.
Update: Fixing the data type got you the seek plan you wanted. The cardinality estimation errors caused by the type conversion before gave you the slower plan.
SQL Server is doing an index scan since it thinks that is cheaper than seeking to each required row. Most likely, SQL Server is correct, given the choices it has in your setup.
Be aware SQL Server may actually be doing a range scan on the index, as opposed to scanning the entire index.
If you provide the DDL for both tables, along with the other indexes you may have, we may be able to help you make this much less resource intensive.
As a side note, never ever use date literals like that. Instead of:
DECLARE @OrderStartDate DATETIME2 = '27 feb 2016';
DECLARE @OrderEndDate DATETIME2 = '28 feb 2016';
use this:
DECLARE @OrderStartDate DATETIME2 = '2016-02-27T00:00:00.0000';
DECLARE @OrderEndDate DATETIME2 = '2016-02-28T00:00:00.0000';
Aaron's post may help clarify that.
To add to Max's answer, I would probably try to split your query into two parts:
DECLARE @OrderStartDate DATETIME2 = {d '2016-02-27'};
DECLARE @OrderEndDate DATETIME2 = {d '2016-02-28'};
--- Work variable declarations:
DECLARE @minOrderNo varchar(20), @maxOrderNo varchar(20);
--- Find the lowest and highest order number respectively for
--- your date range:
SELECT @minOrderNo=MIN(strBxOrderNo),
@maxOrderNo=MAX(strBxOrderNo)
FROM dbo.tblBOrder o
WHERE o.sdtmOrdCreated >= @OrderStartDate AND
o.sdtmOrdCreated < @OrderEndDate;
--- Join orders and order items on their respective clustering keys.
SELECT o.strBxOrderNo
, o.sintOrderStatusID
, o.sintOrderChannelID
, o.sintOrderTypeID
, o.sdtmOrdCreated
, o.sintMarketID
, o.strOrderKey
, o.strOfferCode
, o.strCurrencyCode
, o.decBCShipFullPrice
, o.decBCShipFinal
, o.decBCShipTax
, o.decBCTotalAmount
, o.decWrittenTotalAmount
, o.decBCWrittenTotalAmount
, o.decBCShipOfferDisc
, o.decBCShipOverride
, o.decTotalAmount
, o.decShipTax
, o.decShipFinal
, o.decShipOverride
, o.decShipOfferDisc
, o.decShipFullPrice
, o.lngAccountParticipantID
, CONVERT(DATE, o.sdtmOrdCreated, 120) as OrderCreatedDateConverted
FROM dbo.tblBOrder AS o
INNER /*MERGE*/ JOIN dbo.tblBOrderItem AS oi ON
o.strBxOrderNo>=@minOrderNo AND --- OrderNo filter on "orders"
o.strBxOrderNo<=@maxOrderNo AND
oi.strBxOrderNo=o.strBxOrderNo AND --- Equijoin
oi.strBxOrderNo>=@minOrderNo AND --- OrderNo filter on "order items"
oi.strBxOrderNo<=@maxOrderNo AND
oi.decCatItemPrice > 0 --- Item price filter on "order items"
OPTION (RECOMPILE);
This query will (a) eliminate the Sort operator (which is expensive because it is blocking and requires a memory grant), (b) create a Merge Join (which you could force with a join hint, but it should happen automatically with enough data). As a bonus, it'll also (c) eliminate the Index Scan.
All in all, the MIN/MAX query uses a highly optimal index on the Orders table to identify a range of order numbers (included in the clustering key) from a non-clustered index on the date column:
Then, you can Merge Join the two tables on their respective clustered indexes:
Obviously, I don't have your data to test with, but I imagine this should be a really well-performing solution.