How to 'insert if not exists' in MySQL?
use INSERT IGNORE INTO table
see http://bogdan.org.ua/2007/10/18/mysql-insert-if-not-exists-syntax.html
there's also INSERT … ON DUPLICATE KEY UPDATE
syntax, you can find explanations on dev.mysql.com
Post from bogdan.org.ua according to Google's webcache:
18th October 2007
To start: as of the latest MySQL, syntax presented in the title is not possible. But there are several very easy ways to accomplish what is expected using existing functionality.
There are 3 possible solutions: using INSERT IGNORE, REPLACE, or INSERT … ON DUPLICATE KEY UPDATE.
Imagine we have a table:
CREATE TABLE `transcripts` ( `ensembl_transcript_id` varchar(20) NOT NULL, `transcript_chrom_start` int(10) unsigned NOT NULL, `transcript_chrom_end` int(10) unsigned NOT NULL, PRIMARY KEY (`ensembl_transcript_id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Now imagine that we have an automatic pipeline importing transcripts meta-data from Ensembl, and that due to various reasons the pipeline might be broken at any step of execution. Thus, we need to ensure two things:
repeated executions of the pipeline will not destroy our database
repeated executions will not die due to ‘duplicate primary key’ errors.
Method 1: using REPLACE
It’s very simple:
REPLACE INTO `transcripts` SET `ensembl_transcript_id` = 'ENSORGT00000000001', `transcript_chrom_start` = 12345, `transcript_chrom_end` = 12678;
If the record exists, it will be overwritten; if it does not yet exist, it will be created. However, using this method isn’t efficient for our case: we do not need to overwrite existing records, it’s fine just to skip them.
Method 2: using INSERT IGNORE Also very simple:
INSERT IGNORE INTO `transcripts` SET `ensembl_transcript_id` = 'ENSORGT00000000001', `transcript_chrom_start` = 12345, `transcript_chrom_end` = 12678;
Here, if the ‘ensembl_transcript_id’ is already present in the database, it will be silently skipped (ignored). (To be more precise, here’s a quote from MySQL reference manual: “If you use the IGNORE keyword, errors that occur while executing the INSERT statement are treated as warnings instead. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted.”.) If the record doesn’t yet exist, it will be created.
This second method has several potential weaknesses, including non-abortion of the query in case any other problem occurs (see the manual). Thus it should be used if previously tested without the IGNORE keyword.
Method 3: using INSERT … ON DUPLICATE KEY UPDATE:
Third option is to use
INSERT … ON DUPLICATE KEY UPDATE
syntax, and in the UPDATE part just do nothing do some meaningless (empty) operation, like calculating 0+0 (Geoffray suggests doing the id=id assignment for the MySQL optimization engine to ignore this operation). Advantage of this method is that it only ignores duplicate key events, and still aborts on other errors.As a final notice: this post was inspired by Xaprb. I’d also advise to consult his other post on writing flexible SQL queries.
Solution:
INSERT INTO `table` (`value1`, `value2`)
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL
WHERE NOT EXISTS (SELECT * FROM `table`
WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1)
Explanation:
The innermost query
SELECT * FROM `table`
WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1
used as the WHERE NOT EXISTS
-condition detects if there already exists a row with the data to be inserted. After one row of this kind is found, the query may stop, hence the LIMIT 1
(micro-optimization, may be omitted).
The intermediate query
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL
represents the values to be inserted. DUAL
refers to a special one row, one column table present by default in all Oracle databases (see https://en.wikipedia.org/wiki/DUAL_table). On a MySQL-Server version 5.7.26 I got a valid query when omitting FROM DUAL
, but older versions (like 5.5.60) seem to require the FROM
information. By using WHERE NOT EXISTS
the intermediate query returns an empty result set if the innermost query found matching data.
The outer query
INSERT INTO `table` (`value1`, `value2`)
inserts the data, if any is returned by the intermediate query.
on duplicate key update, or insert ignore can be viable solutions with MySQL.
Example of on duplicate key update update based on mysql.com
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
UPDATE table SET c=c+1 WHERE a=1;
Example of insert ignore based on mysql.com
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[ ON DUPLICATE KEY UPDATE
col_name=expr
[, col_name=expr] ... ]
Or:
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name
SET col_name={expr | DEFAULT}, ...
[ ON DUPLICATE KEY UPDATE
col_name=expr
[, col_name=expr] ... ]
Or:
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE
col_name=expr
[, col_name=expr] ... ]