ORDER BY clause is allowed over column that is not in SELECT list?

It's not always possible:

If DISTINCT is specified, the expression in the ORDER BY clause have access only to the virtual table returned by the previous step (VT5)

(VT5 is virtual table returned generated by SELECT)

If DISTINCT is not specified expressions in the ORDER BY clause can access both the input and the output virtual tables of the SELECT phase.

Explanation is taken from "Inside SQLServer 2008 T-SQL Querying" - I'm not sure if this particular chapter is available online (Step 6: The Presentation ORDER BY Phase, p.16.)

Thus,

SELECT distinct field2 FROM table1 order by field1 

gives an error:

Msg 145, Level 15, State 1, Line 1
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.


What happens in this scenario is the column that is sorted on will now be part of the Output of the scan/seek operation. For instance, take this example:

CREATE TABLE [dbo].[TestTable](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [some_int] [int] NOT NULL,
    [some_string] [char](128) NOT NULL,
    [some_bigint] [bigint] NOT NULL,
);
go

ALTER TABLE [dbo].[TestTable] 
ADD  CONSTRAINT [PK_TestTable_Id] 
PRIMARY KEY CLUSTERED 
(
    [id] ASC
);
GO

Now take this query (no sort, just retrieving two columns' data):

select
    some_int,
    some_bigint
from TestTable;

enter image description here

We will have an execution plan that, for the clustered index scan, has an output list of some_int and some_bigint. Just as you'd expect. Now let's alter this query and sort on a column, but don't include it in the selected column data:

select
    some_int,
    some_bigint
from TestTable
order by id;

enter image description here

Now we have an output list on the clustered index scan that includes id, some_int, and some_bigint.

Tags:

Sql Server