What is the difference between COUNT(*) and COUNT(*) OVER()
COUNT(*) Returns all rows ignoring nulls right?
I'm not sure what you mean by "ignoring nulls" here. It returns the number of rows irrespective of any NULL
s
SELECT COUNT(*)
FROM (VALUES (CAST(NULL AS INT)),
(CAST(NULL AS INT))) V(C)
Returns 2
.
Altering the above query to COUNT(C)
would return 0
as when using COUNT
with an expression other than *
only NOT NULL
values of that expression are counted.
Suppose the table in your question has the following source data
+---------+---------------+
| Name | MaritalStatus |
+---------+---------------+
| Albert | Single |
| Bob | Single |
| Charles | Single |
| David | Single |
| Edward | Married |
| Fred | Married |
| George | NULL |
+---------+---------------+
The query
SELECT MaritalStatus,
COUNT(*) AS CountResult
FROM T
GROUP BY MaritalStatus
Returns
+---------------+-------------+
| MaritalStatus | CountResult |
+---------------+-------------+
| Single | 4 |
| Married | 2 |
| NULL | 1 |
+---------------+-------------+
Hopefully it is obvious how that result relates to the original data.
What does
COUNT(*) OVER()
do?
Adding that into the SELECT
list for the previous query produces
+---------------+-------------+-----------------+
| MaritalStatus | CountResult | CountOverResult |
+---------------+-------------+-----------------+
| Single | 4 | 3 |
| Married | 2 | 3 |
| NULL | 1 | 3 |
+---------------+-------------+-----------------+
Notice that the result set has 3 rows and CountOverResult is 3. This is not a coincidence.
The reason for this is because it logically operates on the result set after the GROUP BY
.
COUNT(*) OVER ()
is a windowed aggregate. The absence of any PARTITION BY
or ORDER BY
clause means that the window it operates on is the whole result set.
In the case of the query in your question the value of CountOverResult
is the same as the number of distinct MaritalStatus
values that exist in the base table because there is one row for each of these in the grouped result.