Use of the PostgreSQL Upsert (INSERT ON CONFLICT DO) Function

Background

Upsert (INSERT ON CONFLICT DO) is a new function of PostgreSQL 9.5. When a constraint error occurs during data insertion, data insertion is rolled back or changed to update. The syntax for the same is as follows:

Command:     INSERT  
Description: create new rows in a table
Syntax:
[ WITH [ RECURSIVE ] with_query [, ...] ]
INSERT INTO table_name [ AS alias ] [ ( column_name [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
[ ON CONFLICT [ conflict_target ] conflict_action ]
[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]

where conflict_target can be one of:

( { index_column_name | ( index_expression ) } [ COLLATE collation ] [ opclass ] [, ...] ) [ WHERE index_predicate ]
ON CONSTRAINT constraint_name

and conflict_action is one of:

DO NOTHING
DO UPDATE SET { column_name = { expression | DEFAULT } |
( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] ) |
( column_name [, ...] ) = ( sub-SELECT )
} [, ...]
[ WHERE condition ]

For versions earlier than PostgreSQL 9.5, use functions or the with syntax to implement functions similar to upsert.

Upsert Usage Example for Versions Later Than 9.5

Execute the following command to create a test table with one field as the unique key or primary key.

create table test(id int primary key, info text, crt_time timestamp);

Step 1) If a table does not exist, insert it. Otherwise, update it.

test03=# insert into test values (1,'test',now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;  
INSERT 0 1

test03=# select * from test;
id | info | crt_time
----+------+----------------------------
1 | test | 2017-04-24 15:27:25.393948
(1 row)

test03=# insert into test values (1,'hello digoal',now()) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time;
INSERT 0 1

test03=# select * from test;
id | info | crt_time
----+--------------+----------------------------
1 | hello digoal | 2017-04-24 15:27:39.140877
(1 row)

Step 2) If the data does not exist, insert it. Otherwise, do not perform any operation.

test03=# insert into test values (1,'hello digoal',now()) on conflict (id) do nothing;  
INSERT 0 0
test03=# insert into test values (1,'pu',now()) on conflict (id) do nothing;
INSERT 0 0
test03=# insert into test values (2,'pu',now()) on conflict (id) do nothing;
INSERT 0 1
test03=# select * from test;
id | info | crt_time
----+--------------+----------------------------
1 | hello digoal | 2017-04-24 15:27:39.140877
2 | pu | 2017-04-24 15:28:20.37392
(2 rows)

Upsert Usage Example for Versions Earlier Than 9.5

Use different methods to meet the actual requirements.

1) Functions

test03=# create or replace function f_upsert(int,text,timestamp) returns void as $$  
declare
res int;
begin
update test set info=$2,crt_time=$3 where id=$1;
if not found then
insert into test (id,info,crt_time) values ($1,$2,$3);
end if;
exception when SQLSTATE '23505' then
-- return; -- on conflict (id) do nothing
update test set info=$2,crt_time=$3 where id=$1; -- insert into test values ($1,$2,$3) on conflict (id) do update set info=excluded.info,crt_time=excluded.crt_time
end;
$$ language plpgsql strict;
CREATE FUNCTION

test03=# select f_upsert(1,'digoal',now()::timestamp);
f_upsert
----------

(1 row)

test03=# select * from test;
id | info | crt_time
----+--------+----------------------------
2 | pu | 2017-04-24 15:28:20.37392
1 | digoal | 2017-04-24 15:31:29.254325
(2 rows)

test03=# select f_upsert(1,'digoal001',now()::timestamp);
f_upsert
----------

(1 row)

test03=# select * from test;
id | info | crt_time
----+-----------+---------------------------
2 | pu | 2017-04-24 15:28:20.37392
1 | digoal001 | 2017-04-24 15:31:38.0529
(2 rows)

test03=# select f_upsert(3,'hello',now()::timestamp);
f_upsert
----------

(1 row)

test03=# select * from test;
id | info | crt_time
----+-----------+---------------------------
2 | pu | 2017-04-24 15:28:20.37392
1 | digoal001 | 2017-04-24 15:31:38.0529
3 | hello | 2017-04-24 15:31:49.14291
(3 rows)

Use functions to upsert data. To write data in batches, create variables as shown below.

create table batch (id int primary key, info text, crt_time timestamp);create or replace function merge_batch(VARIADIC i batch[])
returns void as $$
declare var batch;
begin
foreach var in array i loop
update batch set info=var.info,crt_time=var.crt_time where id=var.id;
if not found then
insert into batch values (var.*);
end if;
end loop;
exception when others then
return;
end;
$$ language plpgsql strict;
select merge_batch((1,'abc','2000-01-01'),(1,'abcd','2017-02-02'));

2) With Syntax — Method 1

create table test(id int primary key, info text, crt_time timestamp);

If the exists, update it. Otherwise, insert it.

with upsert as (update test set info=$info,crt_time=$crt_time where id=$id returning *) insert into test select $id,$info,$crt_time where not exists (select 1 from upsert where id=$id);

Replace the variables and perform a test.

with upsert as (update test set info='test',crt_time=now() where id=1 returning *) insert into test select 1,'test',now() where not exists (select 1 from upsert where id=1);

When a nonexistent value is inserted, only one session has the successfully inserted value, whereas the other session returns a primary key constraint error.

3) With Syntax — Method 2

Ensure concurrency even if the table does not have a primary key or a unique constraint.

create table test(id int, info text, crt_time timestamp);

3.1) Data Inserted to Only One Session.

When the same data item is updated, the first established session locks the record, whereas the session that is established later enters the waiting state.

with     
w1 as(select ('x'||substr(md5('$id'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info=$info,crt_time=$crt_time where id=$id returning *)
insert into test select $id, $info, $crt_time from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=$id);

Next, replace the variables and perform a test.

with     
w1 as(select ('x'||substr(md5('1'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info='digoal0123',crt_time=now() where id=1 returning *)
insert into test select 1, 'digoal0123', now() from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=1);

INSERT 0 0

test03=# select * from test;
id | info | crt_time
----+------------+---------------------------
2 | pu | 2017-04-24 15:28:20.37392
3 | hello | 2017-04-24 15:31:49.14291
1 | digoal0123 | 2017-04-24 15:31:38.0529
(3 rows)

with
w1 as(select ('x'||substr(md5('4'),1,16))::bit(64)::bigint as tra_id),
upsert as (update test set info='digoal0123',crt_time=now() where id=4 returning *)
insert into test select 4, 'digoal0123', now() from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=4);

INSERT 0 1

test03=# select * from test;
id | info | crt_time
----+------------+----------------------------
2 | pu | 2017-04-24 15:28:20.37392
3 | hello | 2017-04-24 15:31:49.14291
1 | digoal0123 | 2017-04-24 15:31:38.0529
4 | digoal0123 | 2017-04-24 15:38:39.801908
(4 rows)

3.2) Data Inserted to Only One Session.

When the same data item is updated, the session that is established first updates the data, whereas the session that is established later directly returns an error.

with w1 as(select ('x'||substr(md5('$id'),1,16))::bit(64)::bigint as tra_id),    
upsert as (update test set info=$info,crt_time=$crt_time from w1 where pg_try_advisory_xact_lock(tra_id) and id=$id returning *)
insert into test select $id,$info,$crt_time from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=$id);

Next, replace the variables and perform a test.

with w1 as(select ('x'||substr(md5('1'),1,16))::bit(64)::bigint as tra_id),    
upsert as (update test set info='test',crt_time=now() from w1 where pg_try_advisory_xact_lock(tra_id) and id=1 returning *)
insert into test select 1,'test',now() from w1
where pg_try_advisory_xact_lock(tra_id) and not exists (select 1 from upsert where id=1);

INSERT 0 0

test03=# select * from test;
id | info | crt_time
----+------------+----------------------------
2 | pu | 2017-04-24 15:28:20.37392
3 | hello | 2017-04-24 15:31:49.14291
4 | digoal0123 | 2017-04-24 15:42:50.912887
1 | test | 2017-04-24 15:44:44.245167
(4 rows)

4) Rule Method

PostgreSQL has supported Rule syntax for a long time. Create a rule with Rule syntax. If the rule exists, update it. Otherwise, insert it. However, when using the volatile function, do not directly use exists. Otherwise, it will be processed as an immutable function.

If it exists, do not insert it (ignore it). Otherwise, update it.

Implement idempotent writing so no problems occur while writing during resumable uploads.

postgres=# create table e(id int primary key, info text);
CREATE TABLE

Execute the following code to create a volatile function.

CREATE OR REPLACE FUNCTION public.ff(integer)
RETURNS boolean
LANGUAGE sql
STRICT
volatile
AS $function$
select true from e where id=$1 limit 1;
$function$;

Create a rule as shown below.

postgres=# create rule r1 as on insert to e where ff(NEW.id) do instead update e set info=NEW.info where id=NEW.id;
CREATE RULE
postgres=# insert into e values (1,'test'),(1,'test');
INSERT 0 0
postgres=# insert into e values (2,'test'),(2,'test');
INSERT 0 1
postgres=# insert into e values (3,'test'),(3,'test');
INSERT 0 1
postgres=# truncate e;
TRUNCATE TABLE
postgres=# select * from b;
id | info
----+------
1 | a
1 | b
(2 rows)
postgres=# insert into e select * from b;
INSERT 0 1
postgres=# \d+ b
Table "public.b"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
info | text | | | | extended | |

Implement performance stress testing as shown below.

vi test.sql\set id random(1,1000000)
insert into e values (:id, md5(random()::text));
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 1000progress: 90.0 s, 132056.5 tps, lat 0.220 ms stddev 0.055
progress: 91.0 s, 131656.9 tps, lat 0.220 ms stddev 0.040
progress: 92.0 s, 134941.0 tps, lat 0.215 ms stddev 0.090
progress: 93.0 s, 134324.5 tps, lat 0.216 ms stddev 0.076
progress: 94.0 s, 136699.4 tps, lat 0.212 ms stddev 0.070
progress: 95.0 s, 139291.4 tps, lat 0.208 ms stddev 0.067
progress: 96.0 s, 136073.2 tps, lat 0.213 ms stddev 0.076
progress: 97.0 s, 135804.6 tps, lat 0.214 ms stddev 0.076
progress: 98.0 s, 146037.6 tps, lat 0.199 ms stddev 0.069
progress: 99.0 s, 129619.5 tps, lat 0.224 ms stddev 0.049
progress: 100.0 s, 129230.0 tps, lat 0.224 ms stddev 0.047
progress: 101.0 s, 131048.4 tps, lat 0.221 ms stddev 0.055
progress: 102.0 s, 128808.0 tps, lat 0.225 ms stddev 0.048
progress: 103.0 s, 128954.6 tps, lat 0.225 ms stddev 0.048
progress: 104.0 s, 131227.9 tps, lat 0.221 ms stddev 0.042
progress: 105.0 s, 129604.0 tps, lat 0.224 ms stddev 0.057

Always take the following precautions for using the Rule syntax.

1) The use of exists in the Rule syntax is only suitable for a single insert statement (use the volatile function to solve this problem).

postgres=# create table e(id int, info text);
CREATE TABLE
postgres=# create rule r1 as on insert to e where exists (select 1 from e t1 where t1.id=NEW.id limit 1) do instead nothing;
CREATE RULE

在一个语句中插入多条,如果多条中有重复,则在RULE中判断条件时仅判断一次(类似immutable函数)。

postgres=# insert into e values (1,'test'),(1,'test');
INSERT 0 2
postgres=# select * from e;
id | info
----+------
1 | test
1 | test
(2 rows)

2) Rule syntax does not support copy statements. Repeated copy statements also cause problems.

In any case, it is recommended that tables requiring upsert have a primary key.

Original Source:

--

--

--

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Getting Started with PWN

INTRODUCTION TO CSS

Ansible 2.10 Installation on Mac OS X

How To Annoy A Software Engineer in 3 Easy Steps

FantomStaker.io Referral System

LambdaDays 2019 recap

Malware Detected in Wordpress !

Fun Fact Fridays 2017 #9 — Flame Wars

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Alibaba Cloud

Alibaba Cloud

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com

More from Medium

Dynamic SQL processing with Apache Flink

How can I manage my Kafka Artefacts?

Efficient fetching of data from Oracle database in Golang

Architecture of object-based storage and S3 standard specifications