Automatic Cleaning and Scheduling of PostgreSQL Rotate Tables — Constraints and Triggers

Background

Time series scenarios usually have the following requirements:

1) Expired historical data must automatically clear up as fast as possible to prevent write-ahead logging (WAL). Either the drop or truncate method can be used.

2) PG supports partition tables. It is better to use partitions. However, this requires the constant creation of partitions. Currently, pg_pathman automatically creates partitions. But partitions aren't automatically deleted.

So, now the critical question is how to proceed ahead?

Method

Let’s take a look at the proposed solution to fix the issue raised in the preceding section:

1) Specify a fixed number of partition tables (similar to list partitions) in a period. For example, there’s a need for 24 tables ending with hours, 7 weekly tables, and 60 minute-based tables.

2) Trigger and clean (truncate) old data based on data insertion time.

This method fixes the written code without necessitating any further changes and creates all the necessary data tables at one time.

Example

Step 1) Create all detailed partition tables required in a period. Create a total of six tables, each for 10 minutes as shown below.

Step 2) Create a constraint table or a scheduling table (because it’s not possible to trigger a truncate event for each record). Also, clean data generated more than 10 minutes ago at every 10-minute interval as shown below.

create table t_const(crt_time timestamp primary key);

Step 3) Create a trigger for the partition table, truncate data by period, and write the data into the constraint table.

Step 4) Execute the following commands to create a constraint table trigger to trigger the rules for cleaning detailed partitions.

With this the automatic cleanup is complete! Do give a try yourself.

The following snippet displays the automatic cleanup.

If data is written to the same partition later, automatic cleanup will not be triggered. This complies with requirements. Automatic cleanup is triggered only by the first piece of written data.

Summary

Data can be stored by partition. If you use a native partition table, a shared lock is added to the primary table and all its partitions during data writing, and hence the user cannot perform truncate. We recommend using the pg_pathman partition or directly write data into a partition.

Also, note that the truncate method prevents WAL log generation during data cleanup, and hence triggers are used to enable automatic data cleanup.

References

Note: The above articles are in Chinese

Original Source:

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.