Concatenation Physical Operation: Does it guarantee order of execution?
No, there is no documentation from Microsoft guaranteeing the behavior, therefore it is not guaranteed.
Additionally, assuming that the Simple Talk article is correct, and that the Concatenation physical operator always processes inputs in the order shown in the plan (very likely to be true), then without a guarantee that SQL Server will always generate plans that keep the same the order between the query text and the query plan, you're only slightly better off.
We can investigate this further though. If the query optimizer was able to reorder the Concatenation operator input, there should exist rows in the undocumented DMV, sys.dm_exec_query_transformation_stats
corresponding to that optimization.
SELECT * FROM sys.dm_exec_query_transformation_stats
WHERE name LIKE '%CON%' OR name LIKE '%UNIA%'
On SQL Server 2012 Enterprise Edition, this produces 24 rows. Ignoring the false matches for transformations related to constants, there is one transformation related to the Concatenation Physical Operator UNIAtoCON
(Union All to Concatenation). So, at the physical operator level, it appears that once a concatenation operator is selected, it will be processed in the order of the logical Union All operator it was derived from.
In fact that is not quite true. Post-optimization rewrites exist that can reorder the inputs to a physical Concatenation operator after cost-based optimization has completed. One example occurs when the Concatenation is subject to a row goal (so it may be important to read from the cheaper input first). See UNION ALL
Optimization by Paul White for more details.
That late physical rewrite was functional up to and including SQL Server 2008 R2, but a regression meant it no longer applied to SQL Server 2012 and later. A fix has been issued that reinstates this rewrite for SQL Server 2014 and later (not 2012) with query optimizer hotfixes enabled (e.g. trace flag 4199).
But about the Logical Union All operator (UNIA
)? There is a UNIAReorderInputs
transformation, which can reorder the inputs. There are also two physical operators that can be used to implement a logical Union All, UNIAtoCON
and UNIAtoMERGE
(Union All to Merge Union).
Therefore it appears that the query optimizer can reorder the inputs for a UNION ALL
; however, it doesn't appear to be a common transformation (zero uses of UNIAReorderInputs
on the SQL Servers I have readily accessible. We don't know the circumstances that would make the optimizer use UNIAReorderInputs
; though it is certainly used when a plan guide or use plan hint is used to force a plan generated using the row goal physical reordered inputs mentioned above.
Is there a way to have the engine process more than one input at a time?
The Concatenation physical operator can exist within a parallel section of a plan. With some difficulty, I was able to produce a plan with parallel concatenations using the following query:
SELECT userid, regdate FROM ( --Users table is around 3mil rows
SELECT userid, RegDate FROM users WHERE userid > 1000000
UNION
SELECT userid, RegDate FROM users WHERE userid < 1000000
UNION all
SELECT userid, RegDate FROM users WHERE userid < 2000000
) d ORDER BY RegDate OPTION (RECOMPILE)
So, in the strictest sense, the physical Concatenation operator does seem to always process inputs in a consistent fashion (top one first, bottom second); however, the optimizer could switch the order of the inputs before choosing the physical operator, or use a Merge union instead of a Concatenation.
According to Craig Freedman the order of execution for the concatenation operator is guaranteed.
From his blog post Viewing Query Plans on MSDN Blogs:
Note that when an operator has more than one child, the order of the children matters. The topmost child is the first child while the bottommost child is the second. The concatenation operator processes the children in this order.
And from books online Showplan Logical and Physical Operators Reference
The Concatenation physical operator has two or more inputs and one output. Concatenation copies rows from the first input stream to the output stream, then repeats this operation for each additional input stream.
Community wiki answer:
I don't know if you can prove that any observed behavior is always guaranteed, one way or the other, unless you can manufacture a counter-example. In the absence of that, the way to fix the order that results are returned, of course, is to add an ORDER BY
.
I don't know if there is a "fix", or that there exists a need for a fix, if you can demonstrate that in some scenarios the queries are processed in a different order.
The lack of any explicit, official documentation suggests to me that you should not depend on this. This is exactly the kind of thing that got people into trouble with ORDER BY
in a view, and GROUP BY
without ORDER BY
, 8 years ago when SQL Server 2005's optimizer was released.
With all of the new features in the newer versions of SQL Server (with more coming), even if you think you can guarantee a specific behavior today, I wouldn't expect it to hold true (until it is documented to do so).
Even if you're not depending on this behavior, what are you going to do with the results? Anyway, I wouldn't call a Simple Talk article by an outsider official. For all we know this is just a guess based on observation.
Microsoft is never going to publish official documentation saying 'x' is not guaranteed to do 'y'. This is one of the reasons we still, almost a decade later, have trouble convincing people that they can't rely on observed ordering without ORDER BY
- there is no documentation that states "it is not guaranteed."