Behavior and Audit Log Modeling: PostgreSQL Best Practice (1)

By Digoal

Many systems retain user behavior logs, which record information such as browsing, social, and operational behavior. Typical application scenarios include SQL database auditing and behavior auditing of bastion hosts within enterprises.

The volume of behavior and audit logs grows with the business volume and the number of operations. To meet enterprises' real-time query requirements, a search engine is usually needed, built for example on Elasticsearch (ES) or on PostgreSQL's full-text search feature.

Using PostgreSQL to build a search engine has several advantages and can meet the following requirements:

1. Store details. In addition to the fields that need indexing, the detail field itself can be stored in PostgreSQL.

2. Create an index, that is, create a full-text index on the log behavior field.

3. Create multidimensional indexes. Besides the index on the behavior field, other fields can be indexed as well, such as the time dimension and the property dimension. These indexes can be combined to meet multidimensional search requirements.

4. Search in real time. No synchronization with an external search engine is required.

Example

Testing Machine

Use large, inexpensive SATA disks, with an SSD serving as a bcache cache.

Create a data directory on each disk.
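As a rough sketch of this setup (all device names and paths here are hypothetical, not from the original article), the bcache attachment and per-disk directories might look like this:

    # Hypothetical devices: /dev/sdb../dev/sdm are the SATA disks, /dev/nvme0n1 is the SSD.
    # Create the SSD cache device and one backing device per SATA disk.
    make-bcache -C /dev/nvme0n1
    for dev in /dev/sd{b..m}; do
      make-bcache -B "$dev"
    done
    # (Attaching the cache set to each /dev/bcacheN, mkfs, and mount are omitted.)

    # One PostgreSQL data directory per disk, owned by the postgres OS user.
    for i in $(seq -w 1 12); do
      mkdir -p "/data$i/pg_root"
      chown postgres:postgres "/data$i/pg_root"
    done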

Operating System Configuration

Skipped

Compile PostgreSQL

Environment Variables

Initialize Clusters

Create 12 database clusters, one per disk, to fully utilize the disks' I/O.
vi init.sh
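The script itself is not shown in this excerpt; a minimal sketch of what init.sh might contain, assuming the /dataNN directories above and ports 1921-1932, is:

    # init.sh -- initialize one cluster per disk (paths and ports are assumptions).
    for i in $(seq 1 12); do
      port=$((1920 + i))
      dir=$(printf "/data%02d/pg_root/pg%s" "$i" "$port")
      initdb -D "$dir" -U postgres -E SQL_ASCII --locale=C
    done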

Configure Parameters

Start Clusters

Bind the database instances to different CPU cores.
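For example (the core assignments here are illustrative, not from the original), each instance could be started under taskset so that it is pinned to its own range of cores; numactl would work similarly:

    # Start each cluster pinned to a distinct range of CPU cores.
    for i in $(seq 1 12); do
      port=$((1920 + i))
      dir=$(printf "/data%02d/pg_root/pg%s" "$i" "$port")
      cores="$(( (i - 1) * 4 ))-$(( i * 4 - 1 ))"   # instance 1 -> cores 0-3, etc.
      taskset -c "$cores" pg_ctl start -D "$dir" -o "-p $port"
    done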

Stop Clusters

Modeling

Table Structure

Four fields: a PK (referencing the PKs of the original detail data), time, user ID, and user behavior (a tsvector field).
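A sketch of such a table (the table and column names are assumptions for illustration):

    psql -p 1921 <<'SQL'
    create table t_idx (
      pk       int8,        -- PK of the original detail record
      crt_time timestamp,   -- time the behavior occurred
      uid      int8,        -- user ID
      info     tsvector     -- tokenized user-behavior content
    );
    SQL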

Retrieval Requirements

Data can be retrieved by time interval, by user ID, and by full-text (word-segmentation) criteria.

Retention Period

The time period after which retained logs are purged (for example, one month).

Partition

Create multiple partitioned tables in each cluster (12 in this example).

If practical, we recommend creating one partition per hour, so that queries over a time interval can be answered by the partitions alone, without time indexes.

If a single user has a very large amount of data, we recommend also creating hash or LIST partitions by UID, so that queries by UID go straight to a partition rather than through an index. (This eliminates the need for an index on the UID field, or even for storing the UID field at all.) A partitioning and retention sketch follows.
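As an illustration (the table names are hypothetical), an hourly partition can simply be a table named by hour that copies the template's structure and indexes, and retention becomes a cheap metadata operation:

    # Create the partition for one hour, copying structure (and any indexes,
    # such as those defined in the next section) from t_idx.
    psql -p 1921 -c "create table t_idx_2017010100 (like t_idx including all);"
    # Retention: drop partitions that fall outside the one-month window.
    psql -p 1921 -c "drop table t_idx_2016120100;"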

Indexes

Use a full-text (GIN) index on the behavior field.

Use a B-tree index on the user ID field.

Use a BRIN index on the time field.
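A sketch of these three indexes on the template table from the modeling section (the index names are assumptions; gin_pending_list_limit is specified in kB, so 2048 corresponds to the 2 MB discussed in the summary):

    psql -p 1921 <<'SQL'
    create index idx_t_idx_info on t_idx using gin (info)
      with (gin_pending_list_limit = 2048);                  -- full-text index
    create index idx_t_idx_uid  on t_idx using btree (uid);  -- user ID
    create index idx_t_idx_time on t_idx using brin (crt_time);  -- time
    SQL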

Simulate Data

Time (generated as a time series).

User ID (randomly generated within a specific range).

User behavior data (strings about 512 characters long, split into a number of tokens; 40 tokens of varying lengths in this example).
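A possible generator for such data, assuming the t_idx schema sketched above (gen_rand_tsvector is a made-up helper; the token lengths are illustrative):

    psql -p 1921 <<'SQL'
    create or replace function gen_rand_tsvector(n int) returns tsvector as $$
      -- n random lexemes of 1-13 characters each, cut from md5 strings
      select array_to_tsvector(
               array_agg(substr(md5(random()::text), 1, 1 + (random() * 12)::int)))
      from generate_series(1, n);
    $$ language sql strict;
    SQL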

Initialize Data Tables

Test

Load the test data: for example, insert 200 million rows into each table, i.e., 2.4 billion rows into each 12-table database (around 6 TB). The total across all 12 databases is 28.8 billion rows (around 72 TB).

Load the data in batches of 10 rows.

The test data is highly random: each record contains about 40 elements and is no longer than 512 characters.
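A batched load might then look like this, using the generator above (in practice the business layer routes each batch to the proper partition, and many such batches run concurrently, e.g., via pgbench):

    psql -p 1921 <<'SQL'
    insert into t_idx (pk, crt_time, uid, info)
    select id,
           now(),                       -- time series: current time at insert
           (random() * 100000)::int8,   -- uid drawn from an assumed range
           gen_rand_tsvector(40)        -- about 40 tokens per record
    from generate_series(1, 10) t(id);
    SQL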

Issue a full-text retrieval request with four query criteria, returning the PKs as a stream.

We recommend using the streaming return interface because the result set may be very large.
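With psql, streaming can be sketched with a cursor; the four criteria here (time lower bound, time upper bound, UID, and tsquery) are placeholders:

    psql -p 1921 <<'SQL'
    begin;
    declare c cursor for
      select pk
      from t_idx
      where crt_time between '2017-01-01 00:00:00' and '2017-01-01 01:00:00'
        and uid = 12345
        and info @@ to_tsquery('abc & def');
    fetch 100 from c;   -- keep fetching batches until the cursor is exhausted
    close c;
    commit;
    SQL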

Stress Testing

Resource Usage

dstat

While data is being written, the CPU is almost fully consumed and the disks are under substantial load.

User-space code accounts for most of the CPU overhead (perf is used below to break this down).

Disk Usage

perf

The postgres process consumes most of the overhead.

top

TPS of Data Writing

The performance of writing data is about 65,000 rows per second on a single machine.

Write performance depends mainly on the number of elements in the tsvector field (and the cost of hashing them). Each record has 40 elements in this example; if the element count were halved, throughput would roughly double.

Evaluate How Many Index Entries Are Created per Second

1. Full-text search index entries

Each record has about 40 elements, so at 65,000 rows per second around 2.6 million full-text index entries are created per second (65,000 × 40).

2. The number of UID index entries is negligible.

3. The time field uses a BRIN index, whose entry count is negligible.

Creating full-text index entries has the most significant impact on performance and consumes the most resources.

Query Performance

Example

Performance of Data Writing to SSD with fsync=on

  1. Per-second write test

Up to 70,000 records are written per second, and around 2.8 million full-text index entries are created per second.

The performance is relatively smooth.

Performance of Data Writing to SATA and SSD Bcache with fsync=off

  1. Per-second write test

Up to 75,000 records are written per second, and around 3 million full-text index entries are created per second.

The performance is relatively smooth.

Summary

1. Query aggregation

We recommend that you use plproxy to aggregate queries because log data is distributed among multiple clusters and tables.
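A minimal plproxy sketch (the cluster name, function name, and signature are all assumptions; a function with the same signature must exist on every node):

    psql <<'SQL'
    -- Fan the query out to all 12 clusters and merge the returned PK streams.
    create or replace function get_pks(tsq text, v_uid int8, t1 timestamp, t2 timestamp)
    returns setof int8 as $$
      cluster 'logcluster';
      run on all;
    $$ language plproxy;
    SQL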

2. Sharded writes

Sharded writes can be handled in the business layer, which scatters the data across the instances and tables.

In actual application scenarios, you can create more partitions as needed.

3. Most of the overhead is incurred inside the postgres process. For a more detailed analysis, recompile PostgreSQL (for example, with debug symbols so that perf can resolve its internals).

4. GIN index optimization

https://www.postgresql.org/docs/9.6/static/sql-createindex.html

The purpose of gin_pending_list_limit is to defer the merging of new entries into the GIN index, because a single record may produce many GIN keys. If the index were updated in real time, a large number of GIN index entries would have to be written for every insert, hurting performance.

In this example, gin_pending_list_limit is set to 2 MB, and the TPS is relatively stable. With a larger value, significant fluctuation occurs when CPU resources are insufficient, because the deferred merges arrive in larger bursts.

You can set a proper gin_pending_list_limit value based on specific test requirements.
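The limit can be set per index (as a storage parameter, in kB) or per session (as a GUC); for example:

    # 2048 kB = the 2 MB used in this article (index names from the sketch above).
    psql -p 1921 -c "alter index idx_t_idx_info set (gin_pending_list_limit = 2048);"
    # Session-level GUC, shown only for illustration; it ends with the session.
    psql -p 1921 -c "set gin_pending_list_limit = '2MB'; show gin_pending_list_limit;"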

5. If PostgreSQL is used purely as an index database and data loss can be tolerated, you can set fsync to off. (fsync during checkpoints has a big impact on I/O performance; because SATA disks are used in this example, significant performance fluctuation occurs when fsync is on.) A configuration sketch follows this item.

The risk of data loss is much lower if the databases are designed for high availability. (However, after a server crash, rebuilding a standby database is not easy given the massive data volume.)

We recommend that you use more database instances and make the size of each instance controllable (for example, smaller than 2 TB), so that the time for rebuilding a standby database is relatively manageable.
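A hedged sketch of the relevant settings for such a loss-tolerant, index-only instance ($PGDATA stands for one instance's data directory; tune to your own durability requirements):

    cat >> "$PGDATA/postgresql.conf" <<'EOF'
    fsync = off                # index data only; loss is tolerated and rebuildable
    synchronous_commit = off   # further reduces commit latency
    full_page_writes = off     # only acceptable under the assumptions above
    EOF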

6. In order to achieve better response time (RT), we recommend that you store detail data and indexes separately, because writing detail data requires short RT while indexes can tolerate some latency. In addition, the reliability requirements for detail data and indexes are different.

Reference: https://www.alibabacloud.com/blog/behavior-and-audit-log-modeling-postgresql-best-practice-1_594865

Follow me to keep abreast of the latest technology news, industry insights, and developer trends.
