Does ORDER BY on a clustered key effect performance?
It almost certainly will affect performance.
If you just do a query like
Select *
From Table
Order by PrimaryKey
It likely won't affect anything at all.
Bear in mind, though, that this only determines the order of the rows at the leaf level of the clustered index. If you do JOIN
s, or use other indexes that avoid key lookups, then the ORDER BY
will generate additional work.
If every index and every table that you will ever JOIN
or reference when querying this table are all ordered by the same key (in the same direction) then it probably won't impact performance. Outside that very limited scenario, you will see a hit.
Only use ORDER BY
when the order of the results actually matters. In my experience it's seldom necessary.
For a complex query, SQL Server may decide that it needs to sort portions of data in your query, and may even require multiple sort operators based on which indexes are available and the complexity of your query.
These sorts can definitely impact the performance of your query especially if you are working with large tables, and in some cases you'll find that adding an Order By statement will significantly change the execution plan SQL Server generates for your query.
If you're interested in seeing the impact of Sorting the queries you're tuning, run the Actual Plan with SQL Sentry Plan Explorer (free) or using SSMS with "Include Actual Execution Plan" turned on and look at the Ordered Indicator by hovering over the individual data steps in the plan. In Plan Explorer you can also look at the Ordered column in the Plan Tree or Top Operations view.
For something as simple as:
Select *
From Table
Order by PrimaryKey
You should find that while the Order By doesn't seem to change the query plan of your query, that the Ordered indicator in the execution plan does change from False to True when you add the Order By.
You'll may have heard that "the order of the results returned from a SELECT statement cannot be guaranteed without an ORDER BY clause"; One reason for this is a feature of the Enterprise Edition of SQL Server that allows SQL Server to piggyback one table scan on top of another, sometimes called a "Merry-go-round" scan and also called "Advanced Scanning".
For further research, I highly recommend both of the following books which are currently available as free ebooks from RedGate:
- SQL Server Execution Plans By Grant Fritchey
- Inside the SQL Server Query Optimizer By Benjamin Nevarez
Hope this helps!
Look at the query plan for both queries, you will see that the "order by" clause will result in a sort of the data that is returned by the remainder of the query (which should be in memory, but could be paged if memory insufficient). The time that sort takes is related to the amount of data (it has to walk it at least once) and how well ordered the data already is (it may be correctly ordered if you are sorting on an indexed column, or data joined on a indexed column)