How Does PostgreSQL Implement Batch Update, Deletion, and Insertion?
Background
This article aims to answer all the following questions:
- How do I insert multiple records at once?
- How do I update multiple records at once?
- How do I delete multiple records at once?
Batch operations help reduce the interactions between databases and applications and improve data processing throughput.
Batch Insertion
Batch Insertion 1
Use the insert into ... select
method as shown below.
postgres=# insert into tbl1 (id, info ,crt_time) select generate_series(1,10000),'test',now();
INSERT 0 10000
postgres=# select count(*) from tbl1;
count
-------
10001
(1 row)
Batch Insertion 2
Use the values(),(),...()
method as shown below.
postgres=# insert into tbl1 (id,info,crt_time) values (1,'test',now()), (2,'test2',now()), (3,'test3',now());
INSERT 0 3
Batch Insertion 3
Use the BEGIN; ... multiple insert statements... ; END
method as shown below.
Strictly speaking, this is not a batch operation, but it helps to reduce the synchronization wait time when transactions are submitted. It also improves performance.
postgres=# begin;
BEGIN
postgres=# insert into tbl1 (id,info,crt_time) values (1,'test',now());
INSERT 0 1
postgres=# insert into tbl1 (id,info,crt_time) values (2,'test2',now());
INSERT 0 1
postgres=# insert into tbl1 (id,info,crt_time) values (3,'test3',now());
INSERT 0 1
postgres=# end;
COMMIT
Batch Insertion 4
Unlike Insert protocol, the Copy protocol is more streamlined and makes insertion more efficient.
test03=# \d test
Table "public.test"
Column | Type | Modifiers
----------+-----------------------------+-----------
id | integer | not null
info | text |
crt_time | timestamp without time zone |
Indexes:
"test_pkey" PRIMARY KEY, btree (id)
test03=# copy test from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 8 'test' '2017-01-01'
>> 9 'test9' '2017-02-02'
>> \.
COPY 2
The corresponding copy interface varies with the language driver.
Batch Update
Implement a batch update as shown below.
test03=# update test set info=tmp.info from (values (1,'new1'),(2,'new2'),(6,'new6')) as tmp (id,info) where test.id=tmp.id;
UPDATE 3
test03=# select * from test;
id | info | crt_time
----+--------------+----------------------------
3 | hello | 2017-04-24 15:31:49.14291
4 | digoal0123 | 2017-04-24 15:42:50.912887
5 | hello digoal | 2017-04-24 15:57:29.622045
1 | new1 | 2017-04-24 15:58:55.610072
2 | new2 | 2017-04-24 15:28:20.37392
6 | new6 | 2017-04-24 15:59:12.265915
(6 rows)
Use other table names after ‘from’ to perform multi-table JOIN batch update.
Batch Deletion
Implement batch deletion as shown below.
test03=# delete from test using (values (3),(4),(5)) as tmp(id) where test.id=tmp.id;
DELETE 3
test03=# select * from test;
id | info | crt_time
----+---------+----------------------------
1 | new1 | 2017-04-24 15:58:55.610072
2 | new2 | 2017-04-24 15:28:20.37392
6 | new6 | 2017-04-24 15:59:12.265915
Use other table names after ‘using’ to perform multi-table JOIN batch deletion.
If you want to clear the table, we recommend usig truncate.
test03=# set lock_timeout = '1s';
SET
test03=# truncate test;
TRUNCATE TABLE
test03=# select * from test;
id | info | crt_time
----+------+----------
(0 rows)
Batch Update or Deletion -If Cartesian or One-to-Many or Many-to-Many JOIN Operations Occur
In a scenario where a Cartesian or one-to-many or many-to-many JOIN operations occur, the database might not know the target row for which the value needs to be updated.
postgres=# create table t1 (id int primary key, info text);
CREATE TABLE
postgres=# create table t2(id int, info text);
CREATE TABLE
postgres=# insert into t1 select generate_series(1,10), 't1';
INSERT 0 10
postgres=# insert into t2 values (1,'t2');
INSERT 32796 1
postgres=# insert into t2 values (1,'t2');
INSERT 32797 1
postgres=# insert into t2 values (1,'t3');
INSERT 32798 1
postgres=# insert into t2 values (1,'t4');
INSERT 32799 1
postgres=# update t1 set info=t2.info from t2 where t1.id=t2.id;
UPDATE 1
postgres=# select * from t1 where id=1;
id | info
----+------
1 | t2
(1 row)postgres=# delete from t1 using t2 where t1.id=t2.id;
DELETE 1postgres=# drop table t1;
DROP TABLE
postgres=# drop table t2;
DROP TABLE
postgres=# create table t1 (id int, info text);
CREATE TABLE
postgres=# create table t2 (id int, info text);
CREATE TABLE
postgres=# insert into t1 values (1,'t1');
INSERT 32814 1
postgres=# insert into t1 values (1,'t1');
INSERT 32815 1
postgres=# insert into t1 values (1,'t1');
INSERT 32816 1
postgres=# insert into t2 values (1,'t2');
INSERT 32817 1
postgres=# insert into t2 values (1,'t3');
INSERT 32818 1
postgres=# insert into t2 values (1,'t4');
INSERT 32819 1
postgres=# insert into t1 values (2,'t1');
INSERT 32820 1
postgres=# select * from t1;
id | info
----+------
1 | t1
1 | t1
1 | t1
2 | t1
(4 rows)postgres=# select * from t2;
id | info
----+------
1 | t2
1 | t3
1 | t4
(3 rows)postgres=# update t1 set info=t2.info from t2 where t1.id=t2.id;
UPDATE 3
postgres=# select ctid,* from t1;
ctid | id | info
-------+----+------
(0,4) | 2 | t1
(0,5) | 1 | t2
(0,6) | 1 | t2
(0,7) | 1 | t2
(4 rows)postgres=# delete from t1 using t2 where t1.id=t2.id;
DELETE 3
postgres=# select ctid,* from t1;
ctid | id | info
-------+----+------
(0,4) | 2 | t1
(1 row)
References
- https://jdbc.postgresql.org/documentation/publicapi/index.html
- https://www.postgresql.org/docs/9.6/static/libpq-copy.html