PostgreSQL Feed Systems Similar to Weibo — Design and Performance Metrics

By Digoal

Background

Frequently used features in systems similar to Weibo

This article describes scenarios that are related message pushing and message consumption.

Take a 2048-letter-long message for example.

Design

To ensure efficient pushing and consumption, when designing, we need to consider partitioning. Partitioning is also helpful for transparent database sharding later.

For example, we can perform hash partitioning by UID.

1. Create a Hash Partition Table

Create a message push table

Create a partial index because only records that are not consumed are a concern during consumption.

Create 1024 partitions

2. Write Data by Using UDFs

Currently the writing and querying efficiency of RDS PostgreSQL 10 partition tables is not very good. To ensure relatively ideal writing efficiency, it is recommended to use UDFs and dynamically join SQL.

Writing Performance

Consider 2 billion users. Enter a user randomly and push a message that is made up of 2048 English letters.

In PostgreSQL 10, a single instance can write 195,000 rows/s. The performance bottleneck occurs mainly due to writing LOCK of the WAL log.

Consume UDFs

Currently the writing and querying efficiency of RDS PostgreSQL 10 partition tables is not very good. To ensure relatively ideal writing efficiency, it is recommended to use UDFs and dynamically join SQL.

Consumption example

Consumption Performance

In order to make the actual consumption obvious and observable, at least 20 entries are consumed each time. Here we generate a batch of intensive data first and then test the performance.

Enter a user randomly and consume 20 rows each time. Perform consumption 27000 times per second.

Summary

Writing and querying performance for partition tables will be considerably improved in after PostgreSQL 11. In the future, partition tables can be directly used to avoid using UDFs and dynamic SQL to access partitions.

The built-in UDFs of PostgreSQL (such plpgsql, plpython, pljava, and plv8) support the expansion of any sorting algorithms.

Performance metrics for a single instance

Push | Consume

195,000 rows/s | 540,000 rows/s , 27,000 times/s (An average of 20 rows are consumed each time)

Original Source

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.