What Does the Future Hold for Next-Generation Cloud Database Technology in the Cloud Native Era?
We are now in an all-cloud era, full of new technologies, innovations, and challenges. More importantly, we now face some important questions that can redefine the way we deal with database technologies. What reforms will be made in the database market in this era? How can cloud service providers offer more efficient and cost-effective database solutions to help more enterprise users seize opportunities presented by cloud migration?
At the database session of the 2019 Alibaba Cloud Summit held in Beijing, Feifei Li, Vice President of Alibaba Group, Chief Database Scientist of Alibaba DAMO Academy, and Head of Database Business Group of Alibaba Cloud Intelligence, gave an insightful presentation on the next-generation cloud-native database technologies and the challenges they face.
Database Development and Technical Evolution
According to a database market analysis report released by DB-Engines in January 2019, relational database products still dominate the database market. Meanwhile, more market segments are forming, such as the graph database, document database, and other NoSQL database segments. Another trend is the continuous decline in the market share of the traditional commercial database giants Oracle, DB2, and Microsoft SQL Server. By contrast, the market share of open-source and third-party databases keeps expanding.
After over 40 years of evolution, database technology is still developing vigorously. Cloud computing vendors have reached a consensus that databases are an important component in the connection of IaaS and intelligent cloud applications. Therefore, the vendors need to improve their capabilities throughout the entire data lifecycle, including data production, storage, and consumption, enabling users to connect IaaS and intelligent applications.
Thanks to the constant development of database technology, we now have online transaction processing (OLTP) systems to record real-time transaction data and online analytical processing (OLAP) systems to analyze massive amounts of data in real time. Both OLTP and OLAP systems require the support of database services and management tools. In addition, NoSQL database solutions have been developed to store semi-structured and unstructured data.
From the late 1970s to early 1980s, relational databases came into being, and later the SQL query language and OLTP systems were developed. The explosive growth in data volumes and the demand for complex data analysis gave rise to data warehousing, as well as OLAP, extract-transform-load (ETL), and other data processing technologies. With the continuous increase of multi-source heterogeneous data, such as graphs, documents, spatial-temporal data, and time series data, non-relational NoSQL and NewSQL database systems have also emerged.
What Type of Database Do We Need in the Cloud Native Era?
Traditional databases typically use a single-node architecture, whereas cloud-native databases usually use a shared storage architecture. Alibaba Cloud PolarDB establishes a shared storage architecture over a high-speed network. This architecture separates storage from computing to enable the fast scale-out of computing nodes. In addition, PolarDB allows for the rapid scaling of storage and computing capabilities based on the actual needs of customers. Customers can use this shared storage database to complete a non-intrusive data migration without any change to the original business logic.
In addition to the cloud-native shared storage technology, a distributed architecture is required to handle highly concurrent access to massive amounts of data. For example, Alibaba is exploring the use of a distributed architecture to cope with the challenges posed by Double Eleven every year. Also, Alibaba Cloud wants to provide different query interfaces, such as SQL, to support queries of data in multiple models and states. Concerning the storage system, Alibaba Cloud hopes to allow users to store their data in different locations and use a unified interface like SQL to query all types of data. Alibaba Cloud Data Lake Analytics (DLA) is a cloud-native technology developed for this application scenario.
Traditional solutions avoid read-write conflicts by separating the two workloads: an OLTP system processes transactions, and an OLAP system analyzes huge volumes of data. In the cloud native era, Alibaba Cloud aims to minimize the cost of moving data between such systems by taking advantage of the technical benefits delivered by new hardware devices. This can be done by integrating transaction processing and data analytics features in one engine so that both needs are addressed seamlessly by one system.
Alibaba Cloud serves a large number of enterprises, which use our cloud resource pools based on a virtualized architecture that separates storage from computing. Therefore, we need to monitor and schedule all cloud resources intelligently to respond quickly to customer requirements and deliver optimal service quality. To achieve the necessary intelligence, we need to use machine learning and AI to enable automatic sensing, decision-making, recovery, and optimization in all areas, including data migration, data protection, and elastic scheduling.
Another important technology required in the cloud native era is integrated hardware and software design. New hardware devices provide many capabilities that benefit database systems, such as remote direct memory access (RDMA) networking, SSDs, NVM, GPUs, and FPGA acceleration. The Alibaba Cloud PolarDB shared storage service uses an RDMA network so that remote database nodes can be accessed as fast as local nodes.
Many Alibaba Cloud customers require high availability for their financial business systems. Alibaba Cloud databases run high-availability protocols and use a three-replica architecture to implement seamless real-time switching among local databases. In addition, disaster recovery can be implemented by using remote databases based on different customer requirements. The remote databases synchronize data by using Binlog technology to guarantee high availability of financial business. As cloud users attach great importance to data security, the Alibaba Cloud database service provides data encryption as soon as data is written onto disks, safeguarding data throughout its entire lifecycle.
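To make the three-replica idea concrete, here is a minimal Python sketch. The text above does not name the high-availability protocol, so the majority-acknowledgement rule used here is an assumption borrowed from common consensus-style replication, and the `Replica` class and `replicate` function are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One of the three copies of the data (hypothetical model)."""
    name: str
    healthy: bool = True
    log: list = field(default_factory=list)

    def append(self, entry: str) -> bool:
        """Persist the entry and acknowledge it, unless the replica is down."""
        if not self.healthy:
            return False
        self.log.append(entry)
        return True


def replicate(replicas, entry):
    """Send the entry to every replica; commit only on a majority of acks.

    Assumption: a write counts as committed when more than half of the
    replicas acknowledge it. A real protocol would also roll back or repair
    entries that reached only a minority; this sketch does not.
    """
    acks = sum(1 for replica in replicas if replica.append(entry))
    return acks >= len(replicas) // 2 + 1


replicas = [Replica("a"), Replica("b"), Replica("c")]

committed_all_up = replicate(replicas, "tx-1")    # 3 of 3 acks
replicas[2].healthy = False
committed_one_down = replicate(replicas, "tx-2")  # 2 of 3 acks, still a majority
replicas[1].healthy = False
committed_two_down = replicate(replicas, "tx-3")  # 1 of 3 acks, not committed
```

With three replicas, the service keeps accepting writes after one replica fails, which is the property that makes switching among local replicas possible without losing acknowledged data.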
Alibaba Cloud Database Service: A Full Range of Controllable Proprietary Products to Deliver Both Innovation and Commercial Value
Alibaba Cloud provides a series of database service tools, including data backup, data migration, data management, hybrid cloud data management, and intelligent diagnosis and optimization systems. These tools help customers quickly migrate data to the cloud and offer them a hybrid cloud solution.
Core engine products of the Alibaba Cloud database service not only include our proprietary products but also third-party and open-source products. We hope to offer more options with commercial databases and open-source products. Also, we will make use of the technical benefits of cloud computing in our proprietary database products and do more research to help customers eliminate the difficulties and problems they encounter in the application of third-party or open-source databases.
The core OLTP products of the Alibaba Cloud database service are PolarDB and its distributed edition, PolarDB X. In addition, Alibaba Cloud offers mainstream MySQL, PostgreSQL, and SQL Server database systems, along with a series of Oracle-compatible services such as PPAS.
Our core OLAP products include AnalyticDB, DLA for multi-source heterogeneous data, and Time Series and Spatial-Temporal Database (TSDB) for IoT applications. For NoSQL users, Alibaba Cloud offers a wide range of third-party database products, such as HBase, Redis, and MongoDB, as well as Alibaba Cloud Graph Database (GDB).
The Alibaba Cloud database management platform and full-link monitoring service provide intelligent data monitoring and analysis during the entire lifecycle, enabling Alibaba Cloud databases to provide optimal service. Alibaba Cloud builds a hybrid cloud data storage link that integrates online and offline data storage. During migration to the cloud, customers can choose Alibaba Cloud Data Transmission Service (DTS) for real-time data uploading and synchronization. After the migration, customers can use PolarDB or other cloud-native database products to store data, or use DLA, AnalyticDB, or other analytics products to analyze data. Data analysis for specific scenarios can be implemented by using the document database, graph database, or TSDB solution. Customers can use the DTS system for data synchronization and backup between online and offline databases, and use the Hybrid Cloud Database Management (HDM) system to manage all databases. In addition, the Alibaba Cloud database service provides a database management suite that allows customers to manage and develop databases, improving the efficiency of database management and development.
PolarDB and AnalyticDB
Here, we will introduce two cloud-native database products: PolarDB and AnalyticDB.
PolarDB provides efficient shared distributed storage over an RDMA network. The shared storage technology establishes a one-write-multiple-read model among multiple computing nodes. In this model, read-only nodes can be added quickly to match the actual workload and absorb the computing demand of peak hours. The shared storage technology also allows for the rapid scaling of storage nodes. Customers can pay for an appropriate amount of PolarDB resources based on their application scenarios and the peak and trough data volumes of their business. This billing model significantly improves database utilization and reduces cost. In short, PolarDB can be seen as an enhanced MySQL database. We will release PolarDB versions compatible with PostgreSQL and Oracle in the future.
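The one-write-multiple-read model over shared storage can be sketched in a few lines of Python. This is a toy model under stated assumptions, not PolarDB's implementation: `SharedStorage`, `ComputeNode`, and `Cluster` are hypothetical names, a plain dictionary stands in for the distributed storage layer, and replication lag between the primary and read-only nodes is ignored.

```python
import itertools


class SharedStorage:
    """Stands in for the shared distributed storage layer (hypothetical)."""

    def __init__(self):
        self.rows = {}


class ComputeNode:
    """A computing node attached to the shared storage."""

    def __init__(self, name, storage, writable=False):
        self.name = name
        self.storage = storage
        self.writable = writable

    def write(self, key, value):
        if not self.writable:
            raise PermissionError(f"{self.name} is read-only")
        self.storage.rows[key] = value

    def read(self, key):
        return self.storage.rows.get(key)


class Cluster:
    """One writable primary plus any number of read-only nodes."""

    def __init__(self, read_only_count=1):
        self.storage = SharedStorage()
        self.primary = ComputeNode("primary", self.storage, writable=True)
        self.readers = []
        for _ in range(read_only_count):
            self.add_reader()

    def add_reader(self):
        # No data is copied: the new node simply attaches to shared storage.
        node = ComputeNode(f"ro-{len(self.readers)}", self.storage)
        self.readers.append(node)
        self._round_robin = itertools.cycle(self.readers)
        return node

    def write(self, key, value):
        self.primary.write(key, value)

    def read(self, key):
        # Spread reads across the read-only nodes in round-robin order.
        node = next(self._round_robin)
        return node.name, node.read(key)


cluster = Cluster(read_only_count=1)
cluster.write("order:1", "paid")
new_node = cluster.add_reader()  # scale out for peak hours
served = [cluster.read("order:1") for _ in range(4)]
```

The point of the sketch is `add_reader`: because every node sees the same storage, adding read capacity does not require copying the data set, which is why scale-out can be fast.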
In some scenarios, users face high concurrency and massive data access needs that exceed the capacity of shared storage. The distributed PolarDB X system uses sharding to scale out storage capacity beyond what a single shared storage system can provide. PolarDB X will undergo public beta testing later and is currently available for trial use.
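As a rough illustration of sharding, the following Python sketch routes each row to one of several shards by hashing its key. The text does not describe how PolarDB X chooses a shard, so hash-based routing is an assumption, and `shard_for` and `ShardedTable` are hypothetical names.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash (assumed routing rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedTable:
    """A table whose rows are spread over independent shards."""

    def __init__(self, shard_count: int):
        # Each dict stands in for a separate storage node.
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key: str, row: dict) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = row

    def get(self, key: str):
        # A point lookup touches exactly one shard.
        return self.shards[shard_for(key, len(self.shards))].get(key)

    def count(self) -> int:
        # An aggregate has to visit every shard.
        return sum(len(shard) for shard in self.shards)


table = ShardedTable(shard_count=4)
for i in range(1000):
    table.put(f"user-{i}", {"id": i})

row = table.get("user-42")
```

Note that simple modulo routing forces most keys to move when the shard count changes. Production systems typically use a scheme that limits such movement, which is outside the scope of this sketch.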
Read and write conflicts may occur during the analysis of massive amounts of data, and reading and analyzing such data is a complex process. For these workloads, we recommend AnalyticDB, the real-time interactive analytical database system of Alibaba Cloud. AnalyticDB features high write throughput and provides a storage engine that supports both row- and column-based storage. As a result, it is capable of real-time interactive analysis and responds quickly to highly concurrent access to massive amounts of data. AnalyticDB is compatible with MySQL and can import data directly from MySQL databases. It can process a query over tens of billions of data entries in a matter of milliseconds and write data at millions of TPS.
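The difference between row- and column-based storage can be shown with a small Python sketch. It is only an illustration of the two layouts, not AnalyticDB's storage engine, and the sample table and function names are made up.

```python
# Row-based layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "north", "amount": 120},
    {"order_id": 2, "region": "south", "amount": 80},
    {"order_id": 3, "region": "north", "amount": 200},
]


def to_columns(rows):
    """Convert the row layout into a column layout: one list per field."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def lookup_order(rows, order_id):
    """Point lookup: the row layout returns the whole record at once."""
    return next((row for row in rows if row["order_id"] == order_id), None)


def total_amount_by_region(columns):
    """Aggregation: the column layout reads only the two fields it needs."""
    totals = {}
    for region, amount in zip(columns["region"], columns["amount"]):
        totals[region] = totals.get(region, 0) + amount
    return totals


order = lookup_order(rows, 2)
totals = total_amount_by_region(columns)
```

Fetching one order is natural in the row layout, while the aggregation never touches `order_id` in the column layout. An engine that keeps both layouts can serve both access patterns.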
Data Transmission Service (DTS)
In addition to core cloud-native database products, we also offer multiple database tools, such as DTS. DTS handles data transmission for customers who have not yet moved to the cloud. After these customers migrate, DTS synchronizes data in real time between their cloud and on-premises databases, or from transaction processing (TP) systems to analytical processing (AP) systems. DTS provides efficient synchronization of incremental data to guarantee real-time data consistency. In addition, DTS provides a data subscription capability and can connect to more data sources through different protocols and interfaces.
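Synchronization of incremental data is commonly built on an ordered change log that the target replays from a saved position. The Python sketch below shows that general idea only. It is not the DTS API, and `Change`, `apply_changes`, and the log format are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One entry of an ordered change log (hypothetical format)."""
    position: int
    op: str            # "upsert" or "delete"
    key: str
    value: object = None


def apply_changes(target: dict, changes, last_position: int) -> int:
    """Replay changes newer than last_position and return the new checkpoint.

    Entries at or below the checkpoint are skipped, so replaying the same
    batch twice leaves the target unchanged.
    """
    for change in sorted(changes, key=lambda c: c.position):
        if change.position <= last_position:
            continue  # already applied in an earlier run
        if change.op == "upsert":
            target[change.key] = change.value
        elif change.op == "delete":
            target.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")
        last_position = change.position
    return last_position


# The target already holds the full copy taken at position 1.
target = {"a": 1}
log = [
    Change(1, "upsert", "a", 1),
    Change(2, "upsert", "b", 2),
    Change(3, "delete", "a"),
    Change(4, "upsert", "b", 5),
]

checkpoint = apply_changes(target, log[:2], last_position=1)
checkpoint = apply_changes(target, log, last_position=checkpoint)
```

Saving the checkpoint alongside the target is what lets synchronization resume after an interruption without reapplying or losing changes.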
New Member of the Database Family: GDB
Alibaba Cloud GDB is a new database product currently in public beta testing on the Alibaba Cloud website. It is a reliable real-time online database that supports property graph models and handles the storage and querying of highly connected data by using many cloud-native technologies, such as separated storage and computing. Similar to mainstream graph database products, GDB supports standard graph query languages and is compatible with Gremlin syntax. Another key feature of GDB is its support for real-time updates and OLTP data consistency. This allows GDB to guarantee data consistency during the analysis and storage of very large property graphs. GDB has all the features of a cloud-native database product, such as high service availability and easy maintenance. Its typical scenarios include social networking, financial fraud detection, and real-time recommendations. In addition, GDB supports knowledge graphs, neural networks, and other network models.
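To show what a property graph and a fraud-detection style traversal look like, here is a small in-memory Python sketch. It does not use GDB or Gremlin. The `PropertyGraph` class, the `uses` edge label, and the sample accounts and devices are all invented for illustration.

```python
from collections import defaultdict


class PropertyGraph:
    """A tiny in-memory property graph: labeled vertices and edges with properties."""

    def __init__(self):
        self.vertices = {}
        self.out_edges = defaultdict(list)
        self.in_edges = defaultdict(list)

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = {"label": label, **props}

    def add_edge(self, src, label, dst, **props):
        self.out_edges[src].append((label, dst, props))
        self.in_edges[dst].append((label, src, props))

    def out(self, vid, label):
        """Vertices reached by following outgoing edges with this label."""
        return [dst for edge_label, dst, _ in self.out_edges[vid] if edge_label == label]

    def in_(self, vid, label):
        """Vertices that point to vid through edges with this label."""
        return [src for edge_label, src, _ in self.in_edges[vid] if edge_label == label]


def accounts_sharing_device(graph, account):
    """Two-hop traversal: account -> devices it uses -> other accounts on them."""
    linked = set()
    for device in graph.out(account, "uses"):
        linked.update(graph.in_(device, "uses"))
    linked.discard(account)
    return sorted(linked)


graph = PropertyGraph()
for account in ("acct-1", "acct-2", "acct-3"):
    graph.add_vertex(account, "account")
for device in ("dev-A", "dev-B"):
    graph.add_vertex(device, "device")

graph.add_edge("acct-1", "uses", "dev-A", first_seen="2019-01-02")
graph.add_edge("acct-2", "uses", "dev-A", first_seen="2019-01-05")
graph.add_edge("acct-3", "uses", "dev-B", first_seen="2019-01-07")

suspects = accounts_sharing_device(graph, "acct-1")
```

Queries of this shape, which follow relationships for a few hops, are where a graph model tends to be more direct than joining relational tables.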
Striving for Customer Empowerment: Sound Development at Lower Costs and Higher Efficiency
The Alibaba Cloud database team strives to offer enterprise-grade cloud-native database services. Based on a full range of controllable proprietary technologies, our team offers various services, such as quick cloud migration, centralized management of cloud and on-premises data, and data security. For example, we have deployed Alibaba Cloud databases in Hangzhou and other cities to support complex applications such as City Brain. Such applications require the storage of both structured and unstructured data and pose major challenges to OLTP, OLAP, and tool products. Alibaba Cloud AnalyticDB, PolarDB, and DTS use cloud-native technologies to seamlessly support these applications.
To give an example, PolarDB went through public beta testing in August 2018 and was put into commercial use at the end of 2018. Since then, PolarDB has seen rapid growth on public cloud platforms. This growth stems from Alibaba Cloud’s focus on solving the problems customers actually have. We do not use new technology to create new needs. Instead, we develop technology to address the current needs of our customers. PolarDB features cloud-native elastic scaling of storage and computing within minutes, high cost-effectiveness, flexible Pay-as-You-Go billing, and highly concurrent request processing. It supports fast scale-out of read-only nodes, provides a large capacity, and uses a distributed shared storage architecture to create an experience similar to single-node database access. In addition, it does not intrude into the business logic of customers and is highly compatible with MySQL.
AnalyticDB is a real-time interactive analytical system. Data produced locally or obtained from a big data system can be migrated to an AnalyticDB cluster by DTS. The data can then be used for business analysis, visual presentation, and interactive queries. AnalyticDB supports join queries across several hundred tables and delivers millisecond-level query services to users. Today, many users are using cloud-native databases like PolarDB and AnalyticDB on Alibaba Cloud. Cloud-native databases are helping them eliminate difficulties in their applications and realize greater business value.
In summary, a series of new technologies and challenges have emerged in the cloud native era. Facing these challenges, we must consolidate database kernel products, management platforms, and database tools to deliver the most efficient solutions with the highest commercial value. Alibaba Cloud sincerely invites you to experience our products and technologies and hopes to help more customers solve their problems. We also hope more developers and partners in the ecosystem will develop in-depth solutions for specific industries and fields based on our database services and products. Let’s work together to increase the prosperity of the database market in the cloud native era.