Alibaba X-Engine Empowers Fast DDL Operations

By Wangde

X-Engine is a proprietary storage engine from Alibaba. As an optional engine of Alibaba Cloud ApsaraDB RDS for MySQL, X-Engine not only features high performance at a low cost, but also incorporates many new useful features. This article describes how MySQL (with X-Engine) completes quasi-instant Data Definition Language (DDL) operations, which may take several hours to complete using traditional databases.

Challenges for Database DDL Operations

As Internet services are developing rapidly, frequent changes in application models have become the norm. Correspondingly, database access patterns and schemas also change. But as we discussed before, DDL operations require a long execution period and are resource-consuming. They need additional disk space to create temporary tables, which compromises system throughput. Moreover, if an instance crashes during a DDL operation, the recovery process is very slow.

Let’s take a look at an example. The following steps are performed in MySQL by taking the “adding a column” operation as an example:

  1. Create an empty table in the new schema.
  2. Copy data to the new table, assign the values in the new column as default values, and update the index table. DML operations that the database receives are recorded in a temporary file.
  3. Add an exclusive lock to block writes, and apply the DML operations recorded in the temporary file to the new table. If the number of the DML operations is huge, this step takes a long time to complete.
  4. Delete the old table and name the new table after the old table.

Evidently, this process has a long locking time. Meanwhile, the data replication operation consumes system resources and temporary space, and involves high I/O traffic. To adapt to the changing services, Fast DDL operations that do not instantly modify the data at the storage layer and can be completed fast become a necessity. MySQL 8.0 introduced the instant add column feature, which supports quick modification of only table metadata in order to add a column. However, it does not support other types of DDL operations. X-Engine stores multiple versions of table schemas. By using it, each line of record is parsed at the engine layer, and format conversion is done based on the updated schema. Therefore, X-Engine supports multiple types of Fast DDL operations.

Fast DDL Implementation Solutions in the Industry

MySQL 8.0

Write: Record in the new format.

Read: Populate the new column with the default values stored in the system table.

Supported types:

• Change index option
• Rename table
• Set/drop default
• Modify column when the table is empty
• Add/drop virtual columns
• Add columns

MariaDB10.3

Supported types:

• Add column
• Drop column
• Extend VARCHAR maximum (Only if the physical format allows; not VARCHAR(255) to VARCHAR(256))

Aurora

The select request concatenates the old version of records into the new version of records.

Supported types:

• Only supports adding nullable columns, without default values

Multi-version Schemas of X-Engine

X-Engine adopts LSM-tree architecture. In this architecture, newly written data is appended to memtable in the memory. When the size of memtable reaches a threshold, it is switched to immutable memtable and stops receiving further modifications. The table is gradually flushed for persistent storage in the form of fixed-size extents. When the extents reach a certain amount, the system compacts multiple versions of extents with the same key. To ensure that each line of record can be parsed, the most intuitive and streamlined method is to append the metadata to the record. To implement independent record parsing from the system table, X-Engine stores the metadata in detail. However, appending the metadata to each line of record consumes a large amount of space. To cut storage costs, we ensure that the data schema of each memtable is consistent with that in each extent, and store the schema information in the memtable and extent.

The schema contains key information such as the number of columns, the column type, the column length, and default values. By using such information, X-Engine can parse the columns before returning the result. In addition, it returns only the result of querying the target columns. The following example shows how to return results when a table contains different schema versions of extents.

Fast DDL Implementation of X-Engine

Read Requests

Update the X-Engine Schema

Write Requests

Flush/Compaction

Summary

Learn more about Alibaba Cloud ApsaraDB RDS databases at https://www.alibabacloud.com/product/apsaradb-for-rds-mysql

Original Source:

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.