Learn How the Column-Store Meta Scan Accelerates Query Performance in AnalyticDB for PostgreSQL

Alibaba Cloud
8 min readJun 2, 2020

Alibaba Cloud AnalyticDB for PostgreSQL is a massively parallel processing based data warehouse service on Alibaba Cloud. Its kernel was evolved from the Greenplum open-source database based on the PostgreSQL engine, with improved analysis performance. The column-store meta scan function is key to improving database query performance.

AnalyticDB for PostgreSQL supports column stores and features high data compression performance and superior query performance. However, given that the database system reads data from the entire column or creates a B-Tree index when processing query criteria with a high filter rate, two constraints affect indexing. The first is that indexes are not compressed, which can lead to data bloat. The other constraint is that large result sets lead to index failure and a higher index cost than that of tablescan.

To combat these constraints, AnalyticDB for PostgreSQL provides the meta scan function, which features excellent filter performance and extremely small storage usage.

As a further enhancement of column store tables, the meta scan function collects the maximum and minimum values from the columns of a column store table to implement a filtering function similar to indexing based on the block-offset mechanism. The metadata stores the maximum value, minimum value, and offset, all of which are collected from each block of columns. The maximum and minimum values are used for conditional filtering. The block offset is used to skip the blocks that do not meet filter conditions. This minimizes the I/O of reading column stores and the CPU consumption of block decompression, ultimately improving query performance. In a TPC-H 1 TB test, query performance increased by 29% and can be increased by up to 500% in queries based on conditions with a high filter rate, such as Q06, Q12, Q14, and Q15 of TPC-H.

Implementation of the Meta Scan Function

The meta scan mechanism of Alibaba Cloud AnalyticDB for PostgreSQL relies on metadata that is similar to metadata in the open-source ORC and Parquet formats. The metadata is used to filter out the blocks of a column store table that do not meet specified conditions. In short, this helps improve query performance.

As shown in the figure above, AnalyticDB for PostgreSQL divides the data of each column store table into two parts. One part is the table data, and the other part is the metadata, which is collected into storage. When the table is scanned, the meta scan function reads the metadata, filters out the blocks that do not need to be read based on the minimum and maximum values, and directly reads the blocks that meet conditions based on the block offset. Then, it returns the tuple to the executor, which returns the calculation result to the client.

In AnalyticDB for PostgreSQL, metadata is stored in the following format:

Metadata is divided into row group and batch group metadata by row.

  • Row group metadata: The metadata of each table consists of multiple row group metadata records. The maximum and minimum values of each row group metadata record reflect the metadata of 10,000 consecutive rows. Each row group metadata record contains 10 row batch metadata records. As shown in the figure above, row group metadata uses this formula: (min, max) = (1, 4000).
  • Row batch metadata: The max and min values of each row batch metadata record reflect the metadata of 1,000 consecutive rows. Unlike row group metadata, row batch metadata also records the offset of the block where the first row to be overwritten resides.

The meta scan function reads row group metadata in sequence during the scan process. If the filter conditions meet the minimum and maximum values of the current row group metadata record, the meta scan function traverses each row batch metadata record. Moreover, if the filter condition meets the minimum and maximum values of the current row batch metadata record, then the meta scan function locates the block of the file based on the block offset. Otherwise, the meta scan function will scan the next row batch metadata record. Last, if the filter conditions do not meet the minimum and maximum values of the current row group metadata record, then the meta scan function reads the next row group metadata record until all data is traversed.

For the sake of simplicity as well as improved transaction attributes, metadata is stored in auxiliary tables in the form of heap tables. A meta table is also created when a column store table is created. As such, consistency and atomicity are guaranteed for metadata and table data through the multiversion concurrency control (MVCC) feature of heap tables and the transaction mechanism.

This requires the overwriting of all data with metadata. This means a table must update each of its rows to the latest metadata, which may be frequently updated if data is inserted by using a single INSERT statement. This reduces the write performance and the metadata scan performance during queries. To solve this problem, we divide each meta table into two tables: the primary meta and secondary meta tables. The metadata of the primary meta table is synchronously updated to the secondary meta table when data is inserted into the primary table or when the primary table is updated. The metadata of the secondary meta table is transferred to the primary meta table when each row group metadata record overwrites 10,000 rows. Query requests are only sent to the primary meta table. This prevents the primary meta table from bloating due to frequent insert and update operations involving a small amount of data. When metadata is maintained in this way, the meta scan function implements special processing on the rows that are not overwritten by data of the primary meta table based on the following logic:

Metadata is collected on a segfile basis. After all of the metadata of a segfile is read, the tuples at the end of the segfile are returned to the executor in sequence. If some segfiles are not recorded in the primary meta table, these segfiles are scanned in sequence after the primary meta table is scanned. As such, the meta scan function can scan all table data.

Currently, the meta scan function supports the following data types and operators:

  • Supported data types: int2, int4, and int8; float4 and float8; time, timetz, timestamp, and timestamptz; varchar, text, and bpchar; and cash.
  • Supported operators: =, <, <=, >, and >=.

AnalyticDB for PostgreSQL provides the sortkey function used to sort data in a single table partition based on specified columns. The meta scan function performs better when combined with the sortkey function. Despite effective filtering, the meta scan function does not perform well when column values are not evenly distributed in a data file. Only a few blocks are skipped when column values are scattered across many blocks. You can create a sortkey for the table so that the data in a single partition is sorted based on filter fields. As such, identical values can be located in consecutive blocks, allowing the meta scan function to skip many blocks during the scan process. This improves scan performance.

TPC-H Test

TPC-H is a standard test set for testing the performance of online analytical processing (OLAP) data warehouses. This test primarily measures the response time of each query, or more specifically, the time that elapses from query submission to the return of results. We tested the acceleration effect on TPC-H through the meta scan function in AnalyticDB for PostgreSQL. For the test, we created a sortkey for four tables and performed the TPC-H test on a hard disk drive (HDD) with 1 TB data.

The TPC-H test results are as follows.

The test results show that performance improved by up to about 500% for Q03, Q04, Q06, Q10, Q12, Q14, and Q15. The overall TPC-H performance improved by 22% using the Geomean function.

The meta scan function significantly improved the TPC-H performance for queries with strong filter conditions, but it did not improve the performance for full table scans and table join operations.

Upgrade Your Existing Instances with the Meta Scan Function

This section describes how to use the meta scan function in existing instances of AnalyticDB for PostgreSQL. The meta scan function supports forward compatibility with storage formats. After a minor version upgrade, the metadata of column store tables must be updated for AnalyticDB for PostgreSQL instances before using the meta scan function. You can use the following user-defined function (UDF) to update the metadata of a table:

tcount INT := 0;
IF tcount > 1 THEN
RAISE EXCEPTION 'found more than one table of name %', tname;
ELSEIF tcount = 0 THEN
RAISE EXCEPTION 'not found target table in pg_aocsmeta, table name:%', tname;

AnalyticDB for PostgreSQL provides the rds_enable_column_meta_scan parameter to enable and disable the meta scan function. You can use the following SQL statements to enable or disable the meta scan function for the current session:

-- disable metascan
set rds_enable_column_meta_scan = off;
-- enable metascan
set rds_enable_column_meta_scan = on;
-- show metascan is enable?
show rds_enable_column_meta_scan;

To enable or disable the meta scan function at the instance level, submit a ticket to contact an Alibaba Cloud customer services representative.


In AnalyticDB for PostgreSQL, the column-store meta scan function is used to filter out non-compliant blocks based on the maximum and minimum values of row group and batch group metadata so that only compliant blocks need to be read. This reduces the scan I/O and the CPU consumption during block decompression. Therefore, the meta scan function can significantly improve the performance of queries with strong filter conditions.

AnalyticDB for PostgreSQL is built on the open-source databases PostgreSQL and Greenplum, and is maintained by the ApsaraDB OLAP database team. In the near future, Alibaba Cloud will make all of the functions connected with AnalyticDB for PostgreSQL will be made open source as a contribution to the open-source community.

Original Source:



Alibaba Cloud

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com