Count NULL values per row
1. You know the column names ...
For Postgres 9.6 or later, use num_nulls()
WITH t(a, b, c, d, e) AS (
VALUES
(NULL ,'hi',2,NULL,'null')
, ('' ,'hi',2,3,'test')
, (NULL ,'hi',2,3,'null')
)
SELECT num_nulls(a,b,c,d,e)
FROM t;
Returns your desired result exactly (2, 0, 1 for the sample rows) and works for any mix of data types.
The manual:
num_nulls(VARIADIC "any")
... returns the number of null arguments
For Postgres 9.5 or older, cast every column to text, strip NULLs with array_remove(arr, NULL), and use the remaining array length for an exact count:
SELECT 5 - cardinality(array_remove(ARRAY[a::text,b::text,c::text,d::text,e::text], null))
FROM t;
Any type can be cast to text. The cast is redundant for text columns, of course. array_remove() requires Postgres 9.3 or later. cardinality() requires Postgres 9.4 or later; substitute array_length(arr, 1) in older versions.
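A minimal sketch of that substitute for Postgres 9.3 follows. Note one corner case the prose glosses over: array_length(arr, 1) returns NULL (not 0) for an empty array, so a row where all five columns are NULL needs COALESCE, while cardinality() returns 0 directly:

```sql
-- Postgres 9.3: array_length() instead of cardinality().
-- COALESCE covers the all-NULL row, where array_remove()
-- leaves an empty array and array_length(arr, 1) yields NULL.
SELECT 5 - COALESCE(
         array_length(
           array_remove(ARRAY[a::text, b::text, c::text, d::text, e::text], NULL)
         , 1)
       , 0) AS num_nulls
FROM   t;
```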
2. You don't know column names, but Postgres does
When building on actual tables (or other registered objects like a view or a materialized view), we can retrieve column names from the system catalog pg_attribute
to fully automate with dynamic SQL. Like:
CREATE OR REPLACE FUNCTION f_num_nulls(_tbl regclass)
  RETURNS SETOF int AS
$func$
BEGIN
   RETURN QUERY EXECUTE format(
      'SELECT num_nulls(%s) FROM %s'
    , (SELECT string_agg(quote_ident(attname), ', ')  -- column list
       FROM   pg_attribute
       WHERE  attrelid = _tbl
       AND    NOT attisdropped  -- no dropped (dead) columns
       AND    attnum > 0)       -- no system columns
    , _tbl
   );
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_num_nulls('myschema.tbl');
Returns the count for each row, in current physical order. Nothing else, to be absolutely generic.
Related:
- Table name as a PostgreSQL function parameter
- psql: SELECT * ... except one column
We could also pass each row to the function and return a single count for it, using a polymorphic function. Related:
- Refactor a PL/pgSQL function to return the output of various SELECT queries
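A minimal sketch of such a polymorphic function, building on the jsonb technique shown in section 3 below. The function name f_num_nulls_row is my invention for this sketch; it requires Postgres 9.5 or later for to_jsonb():

```sql
-- Hypothetical polymorphic function: accepts a row of any type.
-- SQL NULLs become JSON null in to_jsonb(), so counting those
-- gives the number of NULL columns in the row.
CREATE OR REPLACE FUNCTION f_num_nulls_row(_row anyelement)
  RETURNS bigint AS
$func$
SELECT count(*) FILTER (WHERE value = jsonb 'null')
FROM   jsonb_each(to_jsonb(_row))
$func$ LANGUAGE sql STABLE;
```

Call (table name tbl assumed):

SELECT t.*, f_num_nulls_row(t) AS num_nulls FROM tbl t;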
3. You don't know anything: anonymous records
In the unlikely event that column names are unknown even to Postgres (like from a VALUES expression in your example), convert each row to a document type (json, jsonb, xml, hstore) to get a handle, like demonstrated by ypercube (comment deleted by now) and Evan.
But anonymous records do not have primary keys or any other unique attribute by definition. Count within each LATERAL subquery to defend against false aggregates over duplicate rows. Demo with jsonb:
SELECT *
FROM (
VALUES
(NULL ,'hi',2,NULL,'null')
, (NULL ,'hi',2,NULL,'null') -- duplicate row !!!
, ('' ,'hi',2,3,'test')
, (NULL ,'hi',2,3,'null')
) t -- column names unknown
, LATERAL (
SELECT count(*) FILTER (WHERE j.value = jsonb 'null') AS num_nulls
FROM jsonb_each(to_jsonb(t)) j
) c;
Or with json: probably a bit faster, because the conversion is cheaper. Demonstrating 3 different ways:
SELECT *
FROM (
VALUES
(NULL ,'hi',2,NULL,'null')
, (NULL ,'hi',2,NULL,'null')
, ('' ,'hi',2,3,'test')
, (NULL ,'hi',2,3,'null')
) t -- column names unknown
, to_json(t) j
, LATERAL (
SELECT count(*) FILTER (WHERE j1.value::text = 'null') AS num_nulls1
FROM json_each(to_json(t)) j1
) c1
, LATERAL (
SELECT count(*) FILTER (WHERE j->>k IS NULL) AS num_nulls2
FROM json_object_keys(j) k
) c2
, LATERAL (
SELECT count(*) - count(j->>k ) AS num_nulls3
FROM json_object_keys(j) k
) c3;
db<>fiddle here