Python PostgreSQL COPY command used to INSERT or UPDATE (not just INSERT)
Not directly through the COPY command.
What you can do, however, is create a temporary table, populate it with the COPY command, and then do your insert and update from that.
-- Clone table structure of target table
create temporary table __copy as (select * from my_schema.my_table limit 0);
-- Copy command goes here...
-- Update existing records
update
    my_schema.my_table
set
    column_2 = __copy.column_2
from
    __copy
where
    my_table.column_1 = __copy.column_1;
-- Insert new records
insert into my_schema.my_table (
    column_1,
    column_2
) (
    select
        __copy.column_1,
        __copy.column_2  -- qualify the columns: both tables have column_2
    from
        __copy
        left join my_schema.my_table using (column_1)
    where
        my_table.column_1 is null
);
You might consider creating an index on __copy(column_1) after populating it with data, to speed up the update query.
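If you are driving this from Python, a minimal sketch of the whole flow with psycopg2 might look like the following. The connection string, the CSV file name data.csv, and the two-column layout are assumptions for illustration; the table and column names match the example above.
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # assumed connection details
cur = conn.cursor()

# 1. Stage the data in a temporary table shaped like the target table
cur.execute("create temporary table __copy as (select * from my_schema.my_table limit 0);")

# 2. Bulk-load the CSV into the staging table (the "copy command goes here" step)
with open("data.csv") as f:
    cur.copy_expert("COPY __copy (column_1, column_2) FROM STDIN WITH (FORMAT CSV)", f)

# 3. Optionally index the join column to speed up the update
cur.execute("create index on __copy (column_1);")

# 4. Update existing rows, then insert the new ones (same statements as above)
cur.execute("""
    update my_schema.my_table
    set column_2 = __copy.column_2
    from __copy
    where my_table.column_1 = __copy.column_1;
""")
cur.execute("""
    insert into my_schema.my_table (column_1, column_2) (
        select __copy.column_1, __copy.column_2
        from __copy
        left join my_schema.my_table using (column_1)
        where my_table.column_1 is null
    );
""")

conn.commit()
cur.close()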
Consider using a temp table as a staging table that receives the CSV file data. Then, run an insert into the final table using Postgres' ON CONFLICT (colname) DO UPDATE ..., available in version 9.5+ (see the INSERT docs). Do note that the special excluded table is used to reference values originally proposed for insertion.
Also, assuming you use psycopg2, consider using sql.Identifier() to safely compose identifiers like table or column names. However, you would need to decompose colStr to wrap the individual items:
from psycopg2 import sql
...
cursor.execute("DELETE FROM tempTable")
conn.commit()
# Compose the COPY command, quoting the table and column identifiers safely
cmd = sql.SQL("COPY {0} ({1}) FROM STDIN WITH (FORMAT CSV, NULL '_|NULL|_')").format(
    sql.Identifier(temptableName),
    sql.SQL(', ').join([sql.Identifier('col1'),
                        sql.Identifier('col2'),
                        sql.Identifier('col3')]))
cursor.copy_expert(cmd, io)
# Build the upsert statement; avoid naming the variable `sql`,
# which would shadow the psycopg2 sql module imported above.
upsert_cmd = "INSERT INTO finalTable (id_column, Col1, Col2, Col3)" + \
             " SELECT id_column, Col1, Col2, Col3 FROM tempTable t" + \
             " ON CONFLICT (id_column) DO UPDATE SET Col1 = EXCLUDED.Col1," + \
             " Col2 = EXCLUDED.Col2," + \
             " Col3 = EXCLUDED.Col3;"  # ... extend for any remaining columns
cursor.execute(upsert_cmd)
conn.commit()
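As an aside, the io object handed to copy_expert() just needs to be a file-like object that the CSV rows can be read from. Assuming the data sits in a local file such as data.csv (a placeholder name), it could be supplied like this:
# Assumed example of the file-like object copy_expert() reads the CSV rows from
with open('data.csv') as io:
    cursor.copy_expert(cmd, io)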