Real-Time Conversion in Data Loading — Triggers and Rules

Image for post
Image for post

Background

PostgreSQL is a powerful database that supports more data types and index interfaces than various open-source and commercial databases do. Some users migrate data from other data sources to PostgreSQL or have data from other data sources, most of which are strings and numbers. PostgreSQL allows converting the data to more accurate descriptions (data types), which improves performance or functionality.

Let’s take a quick look at the various examples.

1) Longitude and latitude, which may be two fields in other databases.

PostgreSQL supports the geometry type and GIS types such as point, line, plane, grid, and TOP.

2) Data scope, which may be two fields in other databases that respectively indicate the upper limit and lower limit.

PostgreSQL uses range to indicate data types such as numbers, time, and IP addresses.

3) IP address, which may be saved as strings in other databases.

PostgreSQL supports the network type.

4) JSON, which may be saved as strings in other databases.

PostgreSQL supports the JSON type.

5) Image feature value, line segment, polygon, circle, UUID, XML, and array, which may be saved as strings in other databases.

PostgreSQL supports all these types.

6) Full-text search, which may require the support of search engines in other databases.

PostgreSQL supports full-text search.

7) Enumeration type, which may be saved as strings in other databases.

PostgreSQL supports the enumeration type.

8) User data in JSON format, which needs to be saved as structured data in other databases.

PostgreSQL supports the JSON data type, indicating that it’s easy to save JSON data directly or save formatted data. Also, it allows converting JSON data into structured data during import.

Next, we need a way to convert string data into the types supported by PostgreSQL smoothly in real-time. PostgreSQL supports triggers and rules. You may use either method to smoothly convert data.

Example 1) Rule Conversion

The data source is the longitude and latitude represented by two fields, which need to be converted into the geometry type in real-time.

Step1: Create the source table structure.

postgres=# create table nt(id int, c1 numeric, c2 numeric);    
CREATE TABLE

Step2: Create the target table structure.

postgres=# create table nt_geo (id int, geo geometry);    
CREATE TABLE

Step3: Create a rule or trigger for the source table. An example command is shown below.

postgres=# create rule r1 as on insert to nt do instead insert into nt_geo values (NEW.id, ST_MakePoint(NEW.c1,NEW.c2));    
CREATE RULE

Step4: Use the source data structure to insert data into the source table.

postgres=# insert into nt values (1,1,1);    
INSERT 0 1

Data is automatically written to the target table. The source table is only a conversion entry and does not store data.

postgres=# select * from nt;    
id | c1 | c2
----+----+----
(0 rows)

postgres=# select * from nt_geo ;
id | geo
----+--------------------------------------------
1 | 0101000000000000000000F03F000000000000F03F
(1 row)

Example 2) Rule Conversion

Convert JSON data into structured data.

Step1: Consider the source table, JSONB unstructured

postgres=# create table t1 (id int, info text, j jsonb);  
CREATE TABLE

Step2: Consider the target table, structured.

postgres=# create table t2 (id int, info text, c1 int, c2 int, c3 text);  
CREATE TABLE

Step3: Create a rule for the source table to automatically convert JSONB unstructured data into structured data for insertion.

postgres=# create rule r1 as on insert to t1 do instead insert into t2 values (NEW.ID, NEW.INFO, ((NEW.J)->>'c1')::int, ((NEW.j)->>'c2')::int, (NEW.j)->>'c3');  
CREATE RULE

Step4: Perform the insert test to complete the conversion.

postgres=# insert into t1 values (1,'test',jsonb '{"c1":1, "c2":2, "c3":"text"}');  
INSERT 0 1
postgres=# select * from t1;
id | info | j
----+------+---
(0 rows)

postgres=# select * from t2;
id | info | c1 | c2 | c3
----+------+----+----+------
1 | test | 1 | 2 | text
(1 row)

In the copy test, rule conversion does not work for copy because the copy interface does not trigger rules.

postgres=# copy (select 1,'test',jsonb '{"c1":1, "c2":2, "c3":"text"}') to '/tmp/test';  
COPY 1
postgres=# copy t1 from '/tmp/test';
COPY 1
postgres=# select * from t1;
id | info | j
----+------+----------------------------------
1 | test | {"c1": 1, "c2": 2, "c3": "text"}
(1 row)

postgres=# select * from t2;
id | info | c1 | c2 | c3
----+------+----+----+------
1 | test | 1 | 2 | text
(1 row)

Example 3) Trigger Conversion

Step1: Delete a rule.

postgres=# drop rule r1 on t1;  
DROP RULE

Step2: Create a trigger function.

postgres=# create or replace function tg() returns trigger as $$  
postgres$# declare
postgres$# begin
postgres$# insert into t2 values (NEW.ID, NEW.INFO, ((NEW.J)->>'c1')::int, ((NEW.j)->>'c2')::int, (NEW.j)->>'c3');
postgres$# return null;
postgres$# end;
postgres$# $$ language plpgsql strict;
CREATE FUNCTION

Step3: Create a before trigger.

postgres=# create trigger tg before insert on t1 for each row execute procedure tg();  
CREATE TRIGGER

Step4: Clear data and insert it again. In both the insert and copy tests, data conversion is supported.

postgres=# truncate t1;  
TRUNCATE TABLE
postgres=# truncate t2;
TRUNCATE TABLE
postgres=# copy t1 from '/tmp/test';
COPY 0
postgres=# insert into t1 values (1,'test',jsonb '{"c1":1, "c2":2, "c3":"text"}');
INSERT 0 0
postgres=# select * from t2;
id | info | c1 | c2 | c3
----+------+----+----+------
1 | test | 1 | 2 | text
1 | test | 1 | 2 | text
(2 rows)

postgres=# select * from t1;
id | info | j
----+------+---
(0 rows)

That’s it! The update and delete operations are performed in a similar way. You only need to create update and delete rules.

Summary

If the copy function is not required, just use a rule. If both copy and insert are required, use a trigger.

Finally, input data in the original format and use UDF conversion for querying. To create an index, use expression indexes. For example, some input values of the array type are separated by commas (,), while the PG format is {a,b,c}.

postgres=# select regexp_split_to_array('a,b,c,d,e', ',');
regexp_split_to_array
-----------------------
{a,b,c,d,e}
(1 row)

Therefore, use this function to convert a multi-value string into an array and create an array function.‘

Original Source:

Written by

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store