What is the correct result for this query?
Per the standard:
SELECT 1 FROM r HAVING 1=1
means
SELECT 1 FROM r GROUP BY () HAVING 1=1
Citation ISO/IEC 9075-2:2011 7.10 Syntax Rule 1 (Part of the definition of the HAVING clause):
Let HC be the <having clause>. Let TE be the <table expression> that immediately contains HC. If TE does not immediately contain a <group by clause>, then “GROUP BY ()” is implicit. Let T be the descriptor of the table defined by the <group by clause> GBC immediately contained in TE and let R be the result of GBC.
Ok so that much is pretty clear.
Assertion: 1=1 is a true search condition. I will provide no citation for this.
Now
SELECT 1 FROM r GROUP BY () HAVING 1=1
is equivalent to
SELECT 1 FROM r GROUP BY ()
Citation ISO/IEC 9075-2:2011 7.10 General Rule 1:
The <search condition> is evaluated for each group of R. The result of the <having clause> is a grouped table of those groups of R for which the result of the <search condition> is True.
Logic: Since the search condition is always true, the result is R, which is the result of the group by expression.
The following is an excerpt from the General Rules of 7.9 (the definition of the GROUP BY clause):
1) If no <where clause> is specified, then let T be the result of the preceding <from clause>; otherwise, let T be the result of the preceding <where clause>.
2) Case:
a) If there are no grouping columns, then the result of the <group by clause> is the grouped table consisting of T as its only group.
Thus we can conclude that FROM r GROUP BY () results in a grouped table consisting of one group with zero rows (since the table r is empty).
An excerpt from the General Rules of 7.12, which defines a Query Specification (a.k.a. a SELECT statement):
1) Case:
a) If T is not a grouped table, then [...]
b) If T is a grouped table, then
Case:
i) If T has 0 (zero) groups, then let TEMP be an empty table.
ii) If T has one or more groups, then each <value expression> is applied to each group of T yielding a table TEMP of M rows, where M is the number of groups in T. The i-th column of TEMP contains the values derived by the evaluation of the i-th <value expression>. [...]
2) Case:
a) If the <set quantifier> DISTINCT is not specified, then the result of the <query specification> is TEMP.
Therefore, since the grouped table has one group, the result must have one row.
Thus
SELECT 1 FROM r HAVING 1=1
should return a one-row result set.
Q.E.D.
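A quick way to check a particular engine against this derivation is to run both forms on an empty table. This is only a sketch: the table name r_empty is made up for illustration, and not every engine accepts the explicit GROUP BY () syntax, so the second query may be rejected even where the first one runs.

```sql
-- Hypothetical empty table used only for this check.
CREATE TABLE r_empty (b INT);

-- Implicit form: per the reading of the standard above, one row is expected.
SELECT 1 FROM r_empty HAVING 1=1;

-- Explicit form of the same query (syntax not accepted by every engine).
SELECT 1 FROM r_empty GROUP BY () HAVING 1=1;
```

If an engine returns zero rows for the first query, it is treating the empty table as having no groups rather than one empty group.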
When there is a HAVING clause, without a GROUP BY clause:
SELECT 1 FROM r HAVING 1=1;
... then GROUP BY () is implicit. So, the query should be equivalent to:
SELECT 1 FROM r GROUP BY () HAVING 1=1;
... which should group all rows of the table into one group (even if the table has no rows at all - it's still one group of 0 rows) and return 1 row. The HAVING with the True condition should have no effect at all after that.
From a different angle, how many rows should a query like this return?
SELECT COUNT(*), MAX(b) FROM r;
One, zero or "zero or one, depending on if the table is empty or not"?
I think one row, no matter how many rows r has.
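To make the aggregate case concrete, here is a minimal sketch on an empty table. The table name t_empty is an assumption introduced only for this example.

```sql
-- Hypothetical empty table; the name t_empty is an assumption for this sketch.
CREATE TABLE t_empty (b INT);

-- No GROUP BY, only aggregates: the whole (empty) table forms a single group.
-- Expected: exactly one row, with COUNT(*) = 0 and MAX(b) = NULL.
SELECT COUNT(*), MAX(b) FROM t_empty;
```

The single row comes from the single group, not from any row of the table, which is the same mechanism the HAVING query relies on.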
From what I see, it looks like SQL Server and PostgreSQL don't bother looking into the table at all:
CREATE TABLE r (b INT);
insert into r(b) values (1);
insert into r(b) values (2);
SELECT 1 FROM r HAVING 1=1;
also returns just one row. Even though the SQL Server docs say
When GROUP BY is not used, HAVING behaves like a WHERE clause.
that is not true in this case: WHERE 1=1 instead of HAVING returns the proper number of rows. I'd say it's an optimizer bug (or at least a documentation bug)...
The SQL Server plan shows 'Constant scan' in the case of HAVING and 'table scan' for WHERE...
Oracle and MySQL behaviour seems more logical and correct to me...
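The comparison described above can be rerun side by side. This is a sketch that assumes the two-row table r created earlier in this answer; the HAVING result noted in the comment is the one reported above for SQL Server and PostgreSQL, and other engines may differ.

```sql
-- WHERE filters rows: with two rows in r, this returns two rows.
SELECT 1 FROM r WHERE 1=1;

-- HAVING filters groups: reported above to return a single row
-- on SQL Server and PostgreSQL.
SELECT 1 FROM r HAVING 1=1;
```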