Table Store is Alibaba Cloud’s first distributed multi-model NoSQL database. At present, many application systems no longer rely solely on relational databases in the underlying layer, but choose databases according to the business scenario: key-value cache data goes into Redis, document data into MongoDB, and graph data into Neo4j, for example.
Looking back at the development of NoSQL: NoSQL was created in the era of Web 2.0. The rapid development of the Internet brought about an explosion of Internet data. Traditional relational databases cannot handle such massive amounts of data; a distributed database with high scalability is required. However, it is very challenging to build a highly available, scalable distributed database on top of the traditional relational data model. Most data on the Internet has a simple structure and does not require the relational model. If a simpler data model were used instead of the relational model, with weakened transactions and constraints and with high availability and scalability as the design goals, such a database would better meet these business requirements. This idea drove the development of NoSQL.
In summary, NoSQL grew out of the new business challenges and new database requirements of the Internet era, and this origin gives it several distinctive features:
- Multi-Data Model: To meet the needs of different data types, many different data models have been created, such as KeyValue, Document, Wide Column, Graph, and Time Series. This is one of the most notable features of NoSQL databases: they break through the constraints of the relational model and embrace diversity. The choice of data model is scenario-oriented, conforms more closely to the actual needs of the business, and leaves more room for optimization.
- High Concurrency and Low Latency: The development of NoSQL database is mainly driven by the demand of online business, and its design goal is more to provide high-concurrency and low-latency access for online business.
- High Availability and Scalability: To cope with the explosive growth of data volume, scalability is one of the core design goals, so a distributed architecture tends to be adopted from the very beginning of the design.
The development trends on DB-Engines indicate that various NoSQL databases have undergone significant development in recent years. As a distributed NoSQL database, Alibaba Cloud’s TableStore uses a multi-model architecture and supports both the Wide Column and Time Series models.
The Wide Column model is a classic model proposed by BigTable and widely adopted by other systems of the same type. At present, most of the world’s semi-structured and structured data is stored using this model. In addition to the Wide Column model, we have also introduced another new data model: time series, a new-generation model for message data, which is suitable for the storage and synchronization of messages in systems such as IM, feeds, and IoT message push, and is now widely used. The two models are described in detail below.
Wide Column Model
The above is a schematic diagram of the Wide Column model. To understand it better, compare it with the relational model. A relational model can be understood simply as a two-dimensional model consisting of rows and columns, with a fixed schema for each row; setting transactions and constraints aside, its essential features are two dimensions and a fixed schema. The Wide Column model adds a third dimension, time, to the two dimensions of row and column. The time dimension appears in the attribute columns: an attribute column holds multiple values, and each value carries a timestamp as its version. Each row is also schema-free, with no strong schema definition. So the Wide Column model differs from the relational model in three ways: it is three-dimensional, it is schema-free, and it simplifies transactions and constraints.
This model consists of:
- Primary key: Each row has a primary key, which consists of multiple (1–4) columns. The primary key is defined with a fixed schema, and is mainly used to uniquely identify a row of data.
- Partition key: The first column of the primary key is called the partition key, which is used to partition the table. Each partition is distributed to a different machine for service. Within the same partition key, cross-row transactions are provided.
- Attribute column: All columns except for the primary key column in a row are attribute columns. An attribute column corresponds to multiple values, different values correspond to different versions, and a row can store an unlimited number of attribute columns.
- Version: Each value of an attribute column is associated with a version, which is a timestamp. The timestamp records when the value was written and is used to compute the data’s time to live.
- Data type: Table Store supports a variety of data types, including String, Binary, Double, Integer, and Boolean.
- Time To Live: A time to live can be defined for each table. For example, if the time to live is configured as one month, data written to the table more than one month ago is automatically cleaned up. The write time of a value is determined by its version, which is usually assigned from the server time at write, but can also be specified by the application.
- MaxVersion: The maximum number of versions saved per column can be defined for each table, which controls how many versions a column retains. Versions beyond this limit are automatically cleaned up.
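As a concrete illustration, the row structure described above can be sketched as a minimal in-memory model. This is hypothetical Python for illustration only, not the TableStore SDK; the class and method names are invented:

```python
import time

class WideColumnRow:
    """In-memory sketch of a Wide Column row: a fixed-schema primary key
    plus schema-free attribute columns, each holding multiple
    timestamped versions."""

    def __init__(self, primary_key, max_versions=3):
        self.primary_key = primary_key          # e.g. {"user_id": "u001"}
        self.max_versions = max_versions
        self.columns = {}                       # name -> [(timestamp, value), ...]

    def put(self, column, value, timestamp=None):
        # Each value carries a timestamp as its version; by default the
        # version is the current (server) time in milliseconds.
        ts = timestamp if timestamp is not None else int(time.time() * 1000)
        versions = self.columns.setdefault(column, [])
        versions.append((ts, value))
        # Keep only the newest `max_versions` versions, like MaxVersion.
        versions.sort(key=lambda v: v[0], reverse=True)
        del versions[self.max_versions:]

    def get(self, column):
        """Return the latest version of a column, or None if absent."""
        versions = self.columns.get(column)
        return versions[0][1] if versions else None

    def expire(self, ttl_ms, now_ms=None):
        """Drop versions older than `ttl_ms`, like Time To Live."""
        now = now_ms if now_ms is not None else int(time.time() * 1000)
        for name in list(self.columns):
            kept = [v for v in self.columns[name] if now - v[0] < ttl_ms]
            if kept:
                self.columns[name] = kept
            else:
                del self.columns[name]
```

The sketch shows why the model is called three-dimensional: reading a cell requires a row key, a column name, and (implicitly) a version timestamp.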
The features of the Wide Column model are summarized as: three-dimensional structure (row, column, and time), wide row, multi-version data, and time-to-live management. Also, in terms of data operation, the Wide Column model provides two data access APIs, Data API and Stream API.
The Data API is a standard data API that provides online data read/write, including:
- PutRow: Insert a new row, and overwrite the same row if it exists.
- UpdateRow: Update a row, which can be used to add or delete the attribute columns in a row, or update the values of the existing attribute columns. If the row does not exist, a new row will be inserted.
- DeleteRow: Delete a row.
- BatchWriteRow: Update multiple rows of data in multiple tables in batch, which can combine PutRow, UpdateRow, and DeleteRow.
- GetRow: Read data from a row.
- GetRange: Scan data in a range, either in ascending or descending order.
- BatchGetRow: Read multiple rows in multiple tables in batch.
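To make the semantics of these operations concrete, here is a minimal in-memory table that mimics the single-row and range operations. This is a hypothetical sketch of the behavior described above, not the TableStore SDK; all names are invented:

```python
class MiniTable:
    """In-memory sketch of the Data API's row semantics,
    keyed by a tuple primary key."""

    def __init__(self):
        self.rows = {}  # primary-key tuple -> {column: value}

    def put_row(self, pk, attrs):
        # PutRow: insert a new row, overwriting any existing row.
        self.rows[pk] = dict(attrs)

    def update_row(self, pk, put=None, delete=()):
        # UpdateRow: add or update columns, or delete named columns.
        # Inserts the row if it does not exist.
        row = self.rows.setdefault(pk, {})
        row.update(put or {})
        for col in delete:
            row.pop(col, None)

    def delete_row(self, pk):
        # DeleteRow: remove the whole row.
        self.rows.pop(pk, None)

    def get_row(self, pk):
        # GetRow: read one row (None if absent).
        return self.rows.get(pk)

    def get_range(self, start_pk, end_pk, reverse=False):
        # GetRange: scan rows with start <= pk < end, in primary-key
        # order (descending when reverse=True).
        keys = sorted(k for k in self.rows if start_pk <= k < end_pk)
        if reverse:
            keys = keys[::-1]
        return [(k, self.rows[k]) for k in keys]
```

The batch operations (BatchWriteRow, BatchGetRow) would simply group several of these calls, possibly across multiple tables, into one request.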
In relational databases, there is no standard API for consuming the database’s incremental data (binlog), even though in many application scenarios this incremental data is indispensable. It is widely used inside Alibaba, where the DRC middleware is provided to fully utilize it. With the incremental data available, a lot can be done at the architecture level:
- Heterogeneous data source replication: MySQL data can be incrementally synchronized to NoSQL for cold data storage.
- Integration with StreamCompute: MySQL data can be analyzed in real time for dashboard-style display applications.
- Integration with the search system: The search system can be extended to a secondary index of MySQL to enhance the data retrieval in MySQL.
However, even though the incremental data of a relational database is this useful, the industry has no standard API definition for obtaining it. TableStore recognized the value of this data early on and provides a standard API, the Stream API, to make full use of it.
The Stream API generally includes:
- ListStream: Get the stream of the table and the ID of the range stream.
- DescribeStream: Get the details of the stream, and pull the shard list and the shard tree in the stream.
- GetShardIterator: Get the iterator for the current incremental data of the shard.
- GetStreamRecord: Get the incremental data in the shard according to shard iterator.
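The typical consumption pattern for these four calls is an iterator loop per shard. The following is a runnable stub that simulates that pattern; it is a hypothetical sketch of the call flow, not the real Stream API, and all class names are invented:

```python
class FakeStream:
    """Tiny stub simulating the Stream API's iterator pattern:
    each shard holds an ordered list of change records."""

    def __init__(self, shards):
        self.shards = shards  # shard_id -> [record, ...]

    def describe_stream(self):
        # DescribeStream: list the shards of the stream.
        return list(self.shards)

    def get_shard_iterator(self, shard_id):
        # GetShardIterator: iterator at the start of the shard's
        # incremental data (modeled as a (shard, position) pair).
        return (shard_id, 0)

    def get_stream_record(self, iterator, limit=2):
        # GetStreamRecord: read up to `limit` records from the iterator
        # position and return them plus the next iterator.
        shard_id, pos = iterator
        records = self.shards[shard_id][pos:pos + limit]
        return records, (shard_id, pos + len(records))

def consume_all(stream):
    """Drain every shard, preserving per-shard write order."""
    out = []
    for shard_id in stream.describe_stream():
        it = stream.get_shard_iterator(shard_id)
        while True:
            records, it = stream.get_stream_record(it)
            if not records:
                break
            out.extend(records)
    return out
```

In the real system the loop is more involved, because shards form a tree as partitions split and merge, and a parent shard must be drained before its children to preserve ordering.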
The implementation of TableStore Stream is much more complicated than MySQL binlog: TableStore has a distributed architecture, so Stream is also a distributed incremental-data consumption framework, and Stream data must be consumed in an order-preserving manner. The shards of a Stream correspond to the partitions of the table inside TableStore, and a table’s partitions may be split and merged. To keep consumption order-preserving across the old and new shards after a split or merge, we designed a fairly sophisticated mechanism. The design of TableStore Stream is not described here; we will provide more detailed design documents later.
Because the complexity of Stream’s internal architecture is also exposed to the consumption side, the Stream API is not easy to use directly. A new data consumption channel service, planned for this year, is coming soon; it will simplify Stream consumption and provide a simpler, easier-to-use API.
Time Series Model
The time series model is a new data model that we have created for the message data scenarios. It can meet the special requirements of message data scenarios for message order preserving, massive message storage, and real-time synchronization.
The above is a schematic diagram of the time series model, which abstracts the data in a large table into multiple time series. There is no limit on the number of time series in a table.
A time series consists of:
- Time series ID: The ID uniquely identifies a time series.
- Time series Meta: Time series metadata, which can contain any key-value pair attributes.
- Message Sequence: It carries all the messages in the time series. Messages are stored in order and are assigned monotonically increasing IDs in write order. A message sequence can carry an unlimited number of messages; a message ID can be used to randomly locate a message within the sequence, and the sequence can be scanned in either ascending or descending order.
- Message Entry: It contains the detailed content of the message and can contain any key-value pairs.
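The components above can be sketched as a minimal in-memory time series. This is hypothetical Python to illustrate the model’s semantics (sequential IDs, random positioning, bidirectional scan), not a TableStore API:

```python
import itertools

class TimeSeries:
    """Minimal sketch of one time series: messages receive
    monotonically increasing IDs in write order and can be scanned
    in either direction from any message ID."""

    def __init__(self, series_id, meta=None):
        self.series_id = series_id        # uniquely identifies the series
        self.meta = meta or {}            # arbitrary key-value attributes
        self._next_id = itertools.count(1)
        self.messages = []                # message sequence: [(id, body), ...]

    def append(self, body):
        # Assign the next sequential ID; ordering follows write order.
        msg_id = next(self._next_id)
        self.messages.append((msg_id, body))
        return msg_id

    def scan(self, from_id=1, reverse=False):
        # Randomly position at a message ID, then scan ascending
        # or descending from there.
        if reverse:
            return [m for m in reversed(self.messages) if m[0] <= from_id]
        return [m for m in self.messages if m[0] >= from_id]
```

In an IM scenario, for instance, each user’s inbox would be one such series, and a client that has synced up to message N would call something like `scan(from_id=N + 1)` to fetch only the new messages.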
The time series model is logically similar to a message queue, and a time series is similar to a topic in a message queue. The difference is that TableStore’s time series model focuses on the scale of topics. In instant messaging, each user’s inbox and outbox is a topic; in IoT messaging, each device corresponds to a topic, so the number of topics can reach the order of 10 million or even 100 million. Built on the underlying distributed engine, a single TableStore table can in theory support an unlimited number of time series (topics). The model offers a simplified Pub/Sub pattern over the sequence, along with message order preservation, random positioning, and ascending and descending scans, which better meets the requirements of scenarios with massive message data, such as instant messaging (IM), feeds, and IoT messaging systems.
The time series model was launched last year and is being continuously optimized. Based on it, we have helped DingTalk, Cainiao Smart Customer Service, Taopiaopiao Xiaojuchang, Smart Device Management, and other services build messaging systems for instant messaging, feeds, and IoT messages.
To learn more about Alibaba Cloud Table Store, visit www.alibabacloud.com/product/table-store