SQL to read XML from file into PostgreSQL database
Similar to this answer to a previous question, and if you don't want the restrictions of pg_read_file()
(in short: pg_read_file
can't read files outside the database directory, and reads text in the current session's character encoding).
This function works for any path, but needs to be created as superuser:
create or replace function stack.bytea_import(p_path text, p_result out bytea)
language plpgsql as $$
declare
l_oid oid;
begin
select lo_import(p_path) into l_oid;
select lo_get(l_oid) INTO p_result;
perform lo_unlink(l_oid);
end;$$;
lo_get
was introduced in 9.4 so for older versions you would need:
create or replace function stack.bytea_import(p_path text, p_result out bytea)
language plpgsql as $$
declare
l_oid oid;
r record;
begin
p_result := '';
select lo_import(p_path) into l_oid;
for r in ( select data
from pg_largeobject
where loid = l_oid
order by pageno ) loop
p_result = p_result || r.data;
end loop;
perform lo_unlink(l_oid);
end;$$;
then:
select convert_from(stack.bytea_import('/tmp/test.xml'), 'utf8')::xml;
The pg_read_binary_file
function can do this.
It has limitations: new in PostgreSQL 9.1 or above; must be a session owned by database superuser; must read a file within the database directory or below. Those are acceptable in my use case.
So the following will work to create a native XML
value from a file:
-- PostgreSQL 9.1 or later.
SELECT
XMLPARSE(DOCUMENT convert_from(
pg_read_binary_file('foo.xml'), 'UTF8'));
In PostgreSQL 8.3 – 9.0, the pg_read_file
function can be used, with the additional limitation that you can't specify a file-specific encoding (it reads the file as text in the current session's encoding).
-- PostgreSQL earlier than 9.1.
SELECT
XMLPARSE(DOCUMENT pg_read_file('foo.xml'));
I have posted a complete implementation of what you are asking for in a recent answer on SO.
The key features are the xpath()
function, pg_read_file()
, array handling, plpgsql functions, ..