SQL Server discarding SPACE during GROUP BY
It's actually doing the opposite, but the observable effects are the same.
When comparing two strings of unequal length, one of the rules of SQL (the standard, not just SQL Server) is that the shorter string is padded with spaces until it's the same length, and then the comparison is performed.
If you want to avoid being surprised, you'll need to add a non-space character at the end of each string.
In fact, checking the standard text, it appears that there are two options:
4.6 Type conversions and mixing of data types
...
When values of unequal length are compared, if the collating sequence for the comparison has the
NO PAD
attribute and the shorter value is equal to a prefix of the longer value, then the shorter value is considered less than the longer value. If the collating sequence for the comparison has thePAD SPACE
attribute, for the purposes of the comparison, the shorter value is effectively extended to the length of the longer by concatenation of <space>s on the right.
But all of the SQL Server collations I'm aware of are PAD SPACE
.
This is easier to see:
declare @t table (Name varchar(100), Age int)
insert @t values('A', 20),('B', 30),('C', 40),('D ', 25)
,(' A', 21),('A ', 32),(' A ', 28),('D ',10);
select Name, Replace(Name,' ','-'),
count(*) Count
from @t
group by Name
--
NAME COLUMN_1 COUNT
A -A 2
A A- 2
B B 1
C C 1
D D-- 2
Notice the space between A and dot. It chose the 1-space version over the 0-space.
Notice also that the D group chooses the one with 2 trailing spaces over the 4.
So, no it's not performing an RTRIM. It's somewhat of a soft bug however, because it's arbitrarily choosing one of the two columns (the one it came across first) as the result of the GROUP BY which could possibly throw you off if spaces mattered.