Why doesn’t this recursive CTE with a parameter use an index when it does with a literal?
Randi Vertongen's answer correctly addresses how you can get the plan you want with the parameterized version of the query. This answer supplements that by addressing the title of the question in case you are interested in the details.
SQL Server rewrites tail-recursive common table expressions (CTEs) as iteration. Everything from the Lazy Index Spool down is the runtime implementation of the iterative translation. I wrote a detailed account of how this section of an execution plan works in answer to Using EXCEPT in a recursive common table expression.
You want to specify a predicate (filter) outside the CTE and have the query optimizer push this filter down inside the recursion (rewritten as iteration) and have it applied to the anchor member. This will mean the recursion starts with only those records that match ParentId = @Id
.
This is quite a reasonable expectation, whether a literal value, variable, or parameter is used; however, the optimizer can only do things for which rules have been written. Rules specify how a logical query tree is modified to achieve a particular transformation. They include logic to make sure the end result is safe - i.e. it returns exactly the same data as the original query specification in all possible cases.
The rule responsible for pushing predicates on a recursive CTE is called SelOnIterator
- a relational selection (= predicate) on an iterator implementing recursion. More precisely, this rule can copy a selection down to the anchor part of recursive iteration:
Sel(Iter(A,R)) -> Sel(Iter(Sel(A),R))
This rule can be disabled with the undocumented hint OPTION(QUERYRULEOFF SelOnIterator)
. When this is used, the optimizer can no longer push predicates with a literal value down to the anchor of a recursive CTE. You don't want that, but it illustrates the point.
Originally, this rule was limited to working on predicates with literal values only. It could also be made to work with variables or parameters by specifying OPTION (RECOMPILE)
, since that hint enables the Parameter Embedding Optimization, whereby the runtime literal value of the variable (or parameter) is used when compiling the plan. The plan is not cached, so the downside of this is a fresh compilation on each execution.
At some point, the SelOnIterator
rule was improved to also work with variables and parameters. To avoid unexpected plan changes, this was protected under the 4199 trace flag, database compatibility level, and query optimizer hotfix compatibility level. This is quite a normal pattern for optimizer improvements, which are not always documented. Improvements are normally good for most people, but there is always a chance that any change will introduce a regression for someone.
I want to put the CTE in a reusable view
You could use an inline table-valued function instead of a view. Provide the value you want to push down as a parameter, and place the predicate in the recursive anchor member.
If you prefer, enabling trace flag 4199 globally is also an option. There are many optimizer changes covered by this flag, so you would need to carefully test your workload with it enabled, and be prepared to handle regressions.
Though at the moment I don't have the title of the actual hotfix, the better query plan will be used when enabling the query optimizer hotfixes on your version (SQL Server 2012).
Some other methods are:
- Using
OPTION(RECOMPILE)
so the filtering happens earlier, on the literal value. - On SQL Server 2016 or higher the hotfixes before this version are applied automatically and the query should also run equivalent to the better execution plan.
Query optimizer Hotfixes
You can enable these fixes with
- Traceflag 4199 before SQL Server 2016
ALTER DATABASE SCOPED CONFIGURATION SET QUERY_OPTIMIZER_HOTFIXES=ON;
starting from SQL Server 2016. (not needed for your fix)
The filtering on @id
is applied earlier to both the recursive and anchor members in the execution plan with the hotfix enabled.
The traceflag can be added at the query level:
OPTION(QUERYTRACEON 4199)
When running the query on SQL Server 2012 SP4 GDR or SQL Server 2014 SP3 with Traceflag 4199 the better query plan is chosen:
ALTER PROCEDURE #c (@Id BIGINT) AS BEGIN;
WITH descendants AS (SELECT
t.ParentId Id
,t.Id DescendantId
FROM #tree t
WHERE t.ParentId IS NOT NULL
UNION ALL
SELECT
d.Id
,t.Id DescendantId
FROM descendants d
JOIN #tree t ON d.DescendantId = t.ParentId)
SELECT d.*
FROM descendants d
WHERE d.Id = @Id
ORDER BY d.Id, d.DescendantId
OPTION( QUERYTRACEON 4199 );
END;
GO
EXEC #c 24;
Query Plan on SQL Server 2014 SP3 With traceflag 4199
Query Plan on SQL Server 2012 SP4 GDR With traceflag 4199
Query Plan on SQL Server 2012 SP4 GDR without traceflag 4199
The main consensus is to enable traceflag 4199 globally when using a version before SQL Server 2016. Afterwards it is open for discussion whether to enable it or not. A Q/A on that here.
Compatibility level 130 or 140
When testing the parameterized query on a database with compatibility_level
= 130 or 140, the filtering happens earlier:
Due to the fact that the 'old' fixes from traceflag 4199 are enabled on SQL Server 2016 and higher.
OPTION(RECOMPILE)
Even though a procedure is used, SQL Server will be able to filter on the literal value when adding OPTION(RECOMPILE);
.
ALTER PROCEDURE #c (@Id BIGINT) AS BEGIN;
WITH descendants AS (SELECT
t.ParentId Id
,t.Id DescendantId
FROM #tree t
WHERE t.ParentId IS NOT NULL
UNION ALL
SELECT
d.Id
,t.Id DescendantId
FROM descendants d
JOIN #tree t ON d.DescendantId = t.ParentId)
SELECT d.*
FROM descendants d
WHERE d.Id = @Id
ORDER BY d.Id, d.DescendantId
OPTION(
RECOMPILE )
END;
GO
Query Plan on SQL Server 2012 SP4 GDR With OPTION(RECOMPILE)