Are these two queries the same - GROUP BY vs. DISTINCT?

Your example #2 had me scratching me head for a while - I thought to myself: "You can't DISTINCT a single column, what would that mean?" - until I realised what is going on.

When you have

SELECT DISTINCT(t.ItemNumber)

you are not, despite appearances, actually asking for distinct values of t.ItemNumber! Your example #2 actually gets parsed the same as

SELECT DISTINCT
  (t.ItemNumber)
  ,
  (SELECT TOP 1 ItemDescription
   FROM Transactions
   WHERE ItemNumber = t.ItemNumber
   ORDER BY DateCreated DESC) AS ItemDescription
FROM Transactions t

with syntactically-correct but superfluous parentheses around t.ItemNumber. It is to the result-set as a whole that DISTINCT applies.

In this case, since your GROUP BY groups by the column that actually varies, you get the same results. I'm actually slightly surprised that SQL Server doesn't (in the GROUP BY example) insist that the subqueried column is mentioned in the GROUP BY list.


Same results but the second one seems to have a more expensive sort step to apply the DISTINCT on my quick test.

Both were beaten out of sight by ROW_NUMBER though...

with T as
(
SELECT ItemNumber, 
       ItemDescription,
       ROW_NUMBER() OVER ( PARTITION BY ItemNumber ORDER BY DateCreated DESC) AS RN
FROM Transactions
)
SELECT * FROM T
WHERE RN=1

edit ...which in turn was thumped by Joe's solution on my test setup.

Plans

Test Setup

CREATE TABLE Transactions
(
ItemNumber INT not null,
ItemDescription VARCHAR(50) not null,
DateCreated DATETIME not null
)

INSERT INTO Transactions
SELECT 
number, NEWID(),DATEADD(day, cast(rand(CAST(newid() as varbinary))*10000 
  as int),getdate()) 
FROM master.dbo.spt_values

ALTER TABLE dbo.Transactions ADD CONSTRAINT
    PK_Transactions PRIMARY KEY CLUSTERED 
    (ItemNumber,DateCreated) 

If you're running at least 2005 and can use a CTE, this is a little cleaner IMHO.

EDIT: As pointed out in Martin's answer, this also performs much better.

;with cteMaxDate as (
    select t.ItemNumber, max(DateCreated) as MaxDate
        from Transactions t
        group by t.ItemNumber
)
SELECT t.ItemNumber, t.ItemDescription
    FROM cteMaxDate md
        inner join Transactions t
            on md.ItemNumber = t.ItemNumber
                and md.MaxDate = t.DateCreated