How to md5 all columns regardless of type
There is a much more elegant solution for this.
In Postgres, using the table name in the SELECT list is permitted, and it has type ROW. If you cast this to type TEXT, it gives all columns concatenated together in a single string. That string is not JSON but Postgres's composite-value text format, for example (1,foo,"bar baz").
Having this, you can get the md5 of all columns as follows:
SELECT md5(mytable::TEXT)
FROM mytable
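To see what the cast actually produces before hashing it, here is a minimal sketch for Postgres. The table demo_rows and its columns are made up for illustration and are not part of the original question.

```sql
-- Hypothetical throwaway table, used only for this illustration.
CREATE TEMP TABLE demo_rows (id int, name text, note text);
INSERT INTO demo_rows VALUES (1, 'alice', 'hello world'), (2, 'bob', NULL);

-- row_text shows the string that md5() receives:
--   (1,alice,"hello world")   values containing spaces are double-quoted
--   (2,bob,)                  NULL appears as nothing between the commas
SELECT demo_rows::text      AS row_text,
       md5(demo_rows::text) AS row_md5
FROM demo_rows
ORDER BY id;
```

Note that the text form of some types (dates and timestamps, for example) can depend on session settings such as DateStyle, so hashes are only comparable between sessions that format values the same way.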
If you want to use only some columns, use the ROW constructor and cast it to TEXT:
SELECT md5(ROW(col1, col2, col3)::TEXT)
FROM mytable
Another nice property of this solution is that the md5 will be different for NULL vs. an empty string.
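The NULL vs. empty string difference can be checked directly in Postgres. This is a small sketch using literal values rather than a real table:

```sql
-- A NULL field is rendered as nothing, an empty string as "",
-- so the two row strings (and therefore their hashes) differ.
SELECT ROW(NULL::text)::text AS null_row,    -- ()
       ROW(''::text)::text   AS empty_row,   -- ("")
       md5(ROW(NULL::text)::text) <> md5(ROW(''::text)::text) AS hashes_differ;  -- true
```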
Obligatory SQLFiddle.
You can also use something similar to mvp's solution. The ROW() function itself cannot be used on Amazon Redshift, because it is not supported there:
Invalid operation: ROW expression, implicit or explicit, is not supported in target list;
My proposal is to use the NVL2 and CAST functions to convert columns of different types to CHAR, since this type is compatible with all Redshift data types according to the documentation. Below is an example of how to get a null-proof MD5 in Redshift.
-- md5 takes a single string, so the per-column values are concatenated with ||
SELECT md5(NVL2(col1,col1::char,'') ||
           NVL2(col2,col2::char,'') ||
           NVL2(col3,col3::char,''))
FROM mytable
This might work without casting the second NVL2 argument to char, but it would definitely fail if you tried to get the md5 of a date column with a null value. I hope this is helpful for someone.
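A possible variation on the Redshift approach, offered as a sketch rather than a tested recipe: it joins the values with a separator so that ('ab', 'c') and ('a', 'bc') do not produce the same input string. The '|' separator and the cast to varchar instead of char are my assumptions. I have not confirmed how a bare ::char cast behaves on Redshift; in PostgreSQL it keeps only the first character.

```sql
-- Assumption: '|' never occurs in the column values; pick another separator if it can.
-- Assumption: ::varchar is used so that longer values are not cut short.
SELECT md5(
         NVL2(col1, col1::varchar, '') || '|' ||
         NVL2(col2, col2::varchar, '') || '|' ||
         NVL2(col3, col3::varchar, '')
       ) AS row_md5
FROM mytable;
```

Unlike the ROW cast in Postgres, this maps NULL and the empty string to the same text, so the two cases hash identically.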