how can I create new XML file from existing database in PostgreSQL database using java
I have a working implementation where I do everything inside PostgreSQL without additional libraries.
Auxiliary parsing function
CREATE OR REPLACE FUNCTION f_xml_extract_val(text, xml)
RETURNS text AS
$func$
SELECT CASE
WHEN $1 ~ '@[[:alnum:]_]+$' THEN
(xpath($1, $2))[1]
WHEN $1 ~* '/text()$' THEN
(xpath($1, $2))[1]
WHEN $1 LIKE '%/' THEN
(xpath($1 || 'text()', $2))[1]
ELSE
(xpath($1 || '/text()', $2))[1]
END;
$func$ LANGUAGE sql IMMUTABLE;
Handle multiple values
The above implementation doesn't handle multiple attributes at one xpath. Here is an overloaded version of f_xml_extract_val()
for that. With the 3rd parameter you can pick one
(the first), all
or dist
(distinct) values. Multiple values are aggregated to a comma-separated string.
CREATE OR REPLACE FUNCTION f_xml_extract_val(_path text, _node xml, _mode text)
RETURNS text AS
$func$
DECLARE
_xpath text := CASE
WHEN $1 ~~ '%/' THEN $1 || 'text()'
WHEN lower($1) ~~ '%/text()' THEN $1
WHEN $1 ~ '@\w+$' THEN $1
ELSE $1 || '/text()'
END;
BEGIN
-- fetch one, all or distinct values
CASE $3
WHEN 'one' THEN RETURN (xpath(_xpath, $2))[1]::text;
WHEN 'all' THEN RETURN array_to_string(xpath(_xpath, $2), ', ');
WHEN 'dist' THEN RETURN array_to_string(ARRAY(
SELECT DISTINCT unnest(xpath(_xpath, $2))::text ORDER BY 1), ', ');
ELSE RAISE EXCEPTION
'Invalid $3: >>%<<', $3;
END CASE;
END
$func$ LANGUAGE plpgsql;
COMMENT ON FUNCTION f_xml_extract_val(text, xml, text) IS '
Extract element of an xpath from XML document
Overloaded function to f_xml_extract_val(..)
$3 .. mode is one of: one | all | dist'
Call:
SELECT f_xml_extract_val('//city', x, 'dist');
Main part
Name of target table: tbl
; prim. key: id
:
CREATE OR REPLACE FUNCTION f_sync_from_xml()
RETURNS boolean AS
$func$
DECLARE
datafile text := 'path/to/my_file.xml'; -- only relative path in db dir
myxml xml := pg_read_file(datafile, 0, 100000000); -- arbitrary 100 MB
BEGIN
-- demonstrating 4 variants of how to fetch values for educational purposes
CREATE TEMP TABLE tmp ON COMMIT DROP AS
SELECT (xpath('//some_id/text()', x))[1]::text AS id -- id is unique
, f_xml_extract_val('//col1', x) AS col1 -- one value
, f_xml_extract_val('//col2/', x, 'all') AS col2 -- all values incl. dupes
, f_xml_extract_val('//col3/', x, 'dist') AS col3 -- distinct values
FROM unnest(xpath('/xml/path/to/datum', myxml)) x;
-- 1.) DELETE?
-- 2.) UPDATE
UPDATE tbl t
SET ( col_1, col2, col3) =
(i.col_1, i.col2, i.col3)
FROM tmp i
WHERE t.id = i.id
AND (t.col_1, t.col2, t.col3) IS DISTINCT FROM
(i.col_1, i.col2, i.col3);
-- 3.) INSERT NEW
INSERT INTO tbl
SELECT i.*
FROM tmp i
WHERE NOT EXISTS (SELECT 1 FROM tbl WHERE id = i.id);
END
$func$ LANGUAGE plpgsql;
Important notes
This implementation checks on a primary key if the inserted row exists already and updates in this case. Only new rows are inserted.
I use a temporary staging table to speed up the procedure.
Tested with Postgres 8.4, 9.0 and 9.1.
XML has to be well-formed.
pg_read_file()
has restrictions to it. The manual:Use of these functions is restricted to superusers.
And:
Only files within the database cluster directory and the
log_directory
can be accessed.
So you have to put your source file there - or create a symbolic link to your actual file/directory.
Or you can provide the file via Java in your case (I did it all inside Postgres).
Or you can import the data into 1 column of 1 row of a temporary table and take it from there.
Or you can use lo_import
like demonstrated in this related answer on dba.SE.
- SQL to read XML from file into PostgreSQL database
This blog post by Scott Bailey helped me.
Postgres has (thanks to Daniel Lyons for pointing it out) native XML support which you can use to store your table. If you however want to shred your XML data manually, there are different possibilities to represent XML data in a database. The first question should be, if you want a very generic solution, that will be able to store any XML document or one that is specific to your domain (i.e. only allows XML documents of a certain structure). Depending on that, you will have a very flexible, universal representation which is however harder to query (the SQL needed will be quite complicated). If you have a more specific approach, the queries will be simpler, but you will need to create new tables or add new attributes to existing talbes every time you want to store another type of document or add a field to an existing document; so changing the schema will be harder (which is one major advantage of XML). This presentation should give you some ideas what are the different possibilities.
Also, you might consider to switch to some DB that supports Xquery, like DB2. The ability to natively query using XQuery, a language targeted at processing XML, will simplify things a lot.
UPDATE: Given your comment, your XML data (that you linked to) is perfectly relational. It can be mapped 1:1 to the following table:
CREATE TABLE mynt (
ID SERIAL ,
myntnafn CHAR(3) ,
myntheiti Varchar(255) ,
kaupgengi Decimal(15,2) ,
midgengi Decimal(15,2) ,
solugengi Decimal(15,2) ,
dagsetning TimeStamp
)
So any mynt
tag would be a record in the table and the corresponding sub-tags the attributes. The data types I gathered from your data, they might be wrong. The main problem is, IMO, that there is no natural primary key, so I added an autogenerated one.
PostgreSQL has an XML datatype. There are lots of XML specific functions you can use to query and modify the data, such as with xpath.
From the Java side, you can pretend you're just working with strings, but know that the data is well-formed on the way out and it won't let you store non-well-formed data.