Count the number of attributes that are NULL for a row
Possible without spelling out columns. Counter-pivot columns to rows and count.
The aggregate function count(<expression>)
only counts non-null values, while count(*)
counts all rows. The shortest and fastest way to count NULL values for more than a few columns is count(*) - count(col)
...
Works for any table with any number of columns of any
data types.
In Postgres 9.3+ with built-in JSON functions:
SELECT *, (SELECT count(*) - count(v)
FROM json_each_text(row_to_json(t)) x(k,v)) AS ct_nulls
FROM tbl t;
What is x(k,v)
?
json_each_text()
returns a set of rows with two columns. Default column names are key
and value
as can be seen in the manual where I linked. I provided table and column aliases so we don't have to rely on default names. The second column is named v
.
Or, in any Postgres version since at least 8.3 with the additional module hstore
installed, even shorter and a bit faster:
SELECT *, (SELECT count(*) - count(v) FROM svals(hstore(t)) v) AS ct_nulls
FROM tbl t;
This simpler version only returns a set of single values. I only provide a simple alias v
, which is automatically taken to be table and column alias.
- Best way to install hstore on multiple schemas in a Postgres database?
Since the additional column is functionally dependent I would consider not to persist it in the table at all. Rather compute it on the fly like demonstrated above or create a tiny function with a polymorphic input type for the purpose:
CREATE OR REPLACE FUNCTION f_ct_nulls(_row anyelement)
RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT (count(*) - count(v))::int FROM svals(hstore(_row)) v';
(PARALLEL SAFE
only for Postgres 9.6 or later.)
Then:
SELECT *, f_ct_nulls(t) AS ct_nulls
FROM tbl t;
You could wrap this into a VIEW
...
SQL Fiddle demonstrating all.
This should also answer your second question:
... the table name is obtained from argument, I don't know the schema of a table beforehand. That means I need to update the table with the input table name.
In Postgres, you can express this as:
select t.*,
((name is null)::int +
(age is null)::int +
(sex is null)::int
) as numnulls
from table t;
In order to implement this on an unknown table, you will need to use dynamic SQL and obtaining a list of columns (say from information_schema.columns)
).