Filtered index only used when the filtered part is in the JOIN not the WHERE

For the optimizer to match a predicate to an index (filtered or otherwise), the predicate must appear adjacent to the Get operation in the logical query tree. To facilitate this, predicates are generally pushed as close as possible to the leaves of the logical tree before optimization begins.

To greatly simplify, the physical index strategy implementation does this:

Predicate + Logical Get -> Physical Get (using Index)

The query you are interested in starts with the predicate above an outer join:

Predicate on T2 --+-- LOJ -- Get (T1)
                       |
                       +---- Get (T2)

This shape does not match the index strategy rule because the predicate is not adjacent to the Get. So, the first part of the answer is that filtered index matching will fail unless the predicate can be pushed past the outer join.

The second part is simply that the optimizer does not contain the necessary exploration rule to move a predicate past an outer join on the preserved side, because the transformation is so rarely valid. It is a general feature of the optimizer that only the most frequently useful rules are implemented.

As a result, matching the filtered index fails in this case. To be clear, the rewrite would be valid in the very specific case you cite (second query).

For the first query form (with different semantics), the predicate is associated with the join from the start, and the predicate push-down logic can move this the short distance to the Get because it does not have to move past an outer join as explained above.

Background and further information:

Query Optimizer Deep Dive (in four parts)

These are not semantically the same queries, since one can filter before the join the other can filter after. Let me illustrate with a simpler example:

CREATE TABLE dbo.Lefty(LeftyID INT PRIMARY KEY);

CREATE TABLE dbo.Righty(LeftyID INT, SomeList INT);

INSERT dbo.Lefty(LeftyID) VALUES(1),(2),(3);

INSERT dbo.Righty(LeftyID, SomeList) VALUES(1,1),(1,NULL),(2,2);

Query 1 returns all three rows:

SELECT l.LeftyID, r.SomeList
FROM dbo.Lefty AS l
LEFT OUTER JOIN dbo.Righty AS r
ON l.LeftyID = r.LeftyID
AND r.SomeList IS NULL;

Query 2, however, leaves out LeftyID 2:

SELECT l.LeftyID, r.SomeList
FROM dbo.Lefty AS l
LEFT OUTER JOIN dbo.Righty AS r
ON l.LeftyID = r.LeftyID
WHERE r.SomeList IS NULL;

SQLfiddle proof

If you're trying perform an anti-semi join, the tested column needs to be not nullable. Moving criteria between ON and WHERE makes no logical difference when you're dealing with INNER joins only, but with OUTER there is a significant difference. And you should care more that your results are correct than whether or not a filtered index could even be used.

The two queries are different - in meaning and results. Here is a rewriting, so it's more obvious what the two queries are doing:

-- 1st query
SELECT  
    a.LIST_ID
FROM    
      ( SELECT LIST_ID 
        FROM   DBO.PATIENT_LISTS
        WHERE  LIST_ID = @LIST_ID
      ) AS a
    LEFT JOIN  
      ( SELECT LIST_ID                    -- the filtered index
        FROM   DBO.PATIENT_LIST_BESPOKE   -- can be used
        WHERE  END_DTTM IS NULL           -- for the subquery
      ) AS b
    ON  a.LIST_ID = b.LIST_ID ;           -- and the join

and 2nd:

-- 2nd query
SELECT  
    a.LIST_ID
FROM    
      ( SELECT LIST_ID 
        FROM   DBO.PATIENT_LISTS
        WHERE  LIST_ID = @LIST_ID
      ) AS a
    JOIN  
      ( SELECT LIST_ID                    -- the filtered index
        FROM   DBO.PATIENT_LIST_BESPOKE   -- can be used
        WHERE  END_DTTM IS NULL           -- for the subquery
      ) AS b
    ON  a.LIST_ID = b.LIST_ID             -- and the join

UNION ALL

SELECT  
    a.LIST_ID
FROM    
      ( SELECT LIST_ID 
        FROM   DBO.PATIENT_LISTS
        WHERE  LIST_ID = @LIST_ID
      ) AS a
WHERE NOT EXISTS  
      ( SELECT *
        FROM   DBO.PATIENT_LIST_BESPOKE AS b
        WHERE  a.LIST_ID = b.LIST_ID         -- but not for this
      ) ;

I think it's now pretty obvious that for the 2nd part of the 2nq query, the filtered index cannot be used.

In detail, regarding these queries, there are 4 types of LIST_ID values in the first table:

(a) values that have matching rows in the second table, all with END_DTTM IS NULL.
(b) values that have matching rows in the second table, both with END_DTTM IS NULL and with END_DTTM IS NOT NULL .
(c) values that have matching rows in the second table, all with END_DTTM IS NOT NULL.
(d) values that have no matching rows in the second table.

Now, the 1st query will return all values of type (a) and (b) possibly many times (as many as they have a matching row in second table with END_DTTM IS NULL) and all rows of type (c) and (d) exactly once (that's the non-matching part of the outer join).

The 2nd query will return all values of type (a) and (b) possibly many times (as many as they have a matching row in second table with END_DTTM IS NULL) and all rows of type (d) exactly once.
It will not return any value of type (c) because the join will find matching rows in second table (but these will have END_DTTM IS NOT NULL) and they will be removed by the subsequent WHERE clause.

Filtered index only used when the filtered part is in the JOIN not the WHERE

Tags:

Performance

Sql Server

Query Performance

Filtered Index

Index

T Sql

Related

Recent Posts