Exploration and Practice of Database Disaster Recovery in the DT Era

Databases can lose data or become unavailable for three broad kinds of reasons:

  1. Logical errors, including software bugs, virus attacks, and corrupted data blocks
  2. Physical damage, including server damage and disk damage
  3. Natural disasters, such as fires and earthquakes that can destroy data centers

Enterprise-Class Database Disaster Recovery System

Definition of Disaster Recovery

Disaster recovery covers two things: backup and disaster tolerance.

  1. Backup means preparing one or more copies of the important data generated by application systems, or of other critical original data.
  2. Disaster tolerance is to deploy two or more IT systems with the same functions in two separate locations that are distant from each other in the same or different cities. These systems monitor the health of each other and support switchover upon failures. In the event that a system stops working due to an accident (a natural or man-made disaster), the entire application system is switched to the other system so that the services are sustained.
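
The "monitor each other and switch over upon failure" behavior described above can be sketched as a heartbeat check. This is a minimal illustration, not how any particular product implements it; the `Site` class, the `choose_active` function, the site names, and the 10-second timeout are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One of the two peer systems (names are illustrative)."""
    name: str
    last_heartbeat: float  # time in seconds when the site last reported healthy


def choose_active(active: Site, standby: Site, now: float, timeout: float = 10.0) -> Site:
    """Return the site that should serve traffic at time `now`.

    The active site keeps serving unless its heartbeat is older than
    `timeout` seconds while the standby's heartbeat is still fresh.
    """
    active_alive = now - active.last_heartbeat <= timeout
    standby_alive = now - standby.last_heartbeat <= timeout
    if not active_alive and standby_alive:
        return standby  # switchover to the healthy peer
    return active  # no change, including when both look down


primary = Site("site-a", last_heartbeat=100.0)
secondary = Site("site-b", last_heartbeat=118.0)
print(choose_active(primary, secondary, now=105.0).name)  # site-a
print(choose_active(primary, secondary, now=120.0).name)  # site-b
```

A real deployment would also need to guard against split-brain (both sites believing they are active), which this sketch deliberately leaves out.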

Disaster Preparation Pain Points

  1. Backup pain points
    1. Backup failures
    2. Slow recovery
    3. Lossy recovery
    4. High remote backup costs
    5. Poor cost-effectiveness
  2. Disaster tolerance pain points
    1. A single disaster tolerance solution supports only a few scenarios and cannot meet the requirements of scenarios with different data sizes.
    2. The disaster tolerance solution has no global control over the system, because link monitoring and quick fault identification are missing.
    3. No inspection capability is available.
    4. Fault recovery is costly, and decisions about data verification, comparison, and correction are difficult to make.
    5. Switchover is hard to coordinate across multi-layer disaster recovery tools.
    6. The contingency plan lacks proper control, and the O&M process cannot be automated.

Deployment Solution

An enterprise-class database disaster recovery system should be selected based on business requirements, with four factors fully considered: recovery point objective (RPO), recovery time objective (RTO), costs, and scalability. The system must also meet the various requirements of database disaster recovery, including the setup of the disaster recovery environment, data synchronization, monitoring and alarming, drilling, failover, and data verification and correction.
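
Of the requirements listed above, data verification and correction is the easiest to show in a few lines. The sketch below compares two tables row by row using checksums and then repairs the standby copy. It is an illustration only: the in-memory dictionaries stand in for real tables, `repr()` stands in for a stable row serialization, and the function names are hypothetical rather than part of any Alibaba Cloud product.

```python
import hashlib


def row_digest(row: tuple) -> str:
    """Hash one row; repr() is a stand-in for a stable serialization."""
    return hashlib.sha256(repr(row).encode("utf-8")).hexdigest()


def diff_tables(source: dict, target: dict) -> dict:
    """Compare two tables given as {primary_key: row_tuple}.

    Reports keys missing on the target, keys that exist only on the
    target, and keys whose row contents differ.
    """
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    changed = sorted(
        k for k in source
        if k in target and row_digest(source[k]) != row_digest(target[k])
    )
    return {"missing": missing, "extra": extra, "changed": changed}


def correct(source: dict, target: dict) -> None:
    """Make the target match the source in place (the source is trusted)."""
    report = diff_tables(source, target)
    for key in report["missing"] + report["changed"]:
        target[key] = source[key]
    for key in report["extra"]:
        del target[key]


production = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
standby = {1: ("alice", 10), 2: ("bob", 25), 4: ("dave", 40)}
print(diff_tables(production, standby))
# {'missing': [3], 'extra': [4], 'changed': [2]}
correct(production, standby)
print(standby == production)  # True
```

Comparing digests rather than full rows matters once the two copies live in different locations, because only the hashes need to cross the network.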

Core Products for Enterprise-Class Database Disaster Recovery

After multiple rounds of iteration, the disaster recovery capabilities of Alibaba Cloud products have been well proven. The following core products can help enterprises develop database disaster recovery solutions for different scenarios and requirements:

  1. Database Backup Service (DBS) is a backup service that provides continuous protection for databases at a low cost. It protects data in various environments, including enterprise data centers and the clouds of other vendors. Database Backup Service provides a complete solution for data backup and operational recovery, and supports real-time incremental backup and data recovery within seconds. You can use Database Backup Service for data backup between databases in a database disaster recovery solution.
  2. Data Transmission Service (DTS) is a data streaming service provided by Alibaba Cloud to support the data exchange between different types of data sources. It provides data transmission capabilities such as data migration, real-time data subscription, and real-time data synchronization. In a database disaster recovery solution, you can use Data Transmission Service to implement data migration and real-time synchronization between various databases, laying a solid foundation for database disaster recovery.
  3. Hybrid Cloud Database Management (HDM) is a platform that helps enterprises connect the different components of a hybrid cloud database architecture. It also supports central management of multiple environments, quick and elastic data migration to the cloud, and failover. In the hybrid cloud disaster recovery scenario, you can use HDM to synchronize data from the local Internet data center (IDC) to the cloud and to conduct disaster recovery drills. When a fault occurs, you can perform failover on the HDM platform to maintain the availability of databases.

Typical Scenarios

Real-Time Backup

If you need continuous, real-time backup that does not affect business operations, you can purchase Database Backup Service to implement hot backups of your databases. This service supports real-time incremental backups and data recovery within seconds. The following figure shows the architecture of the solution:

  1. Deployment of key components:
    1. Two local databases are deployed: a production database for storing production data and a recovery database for data recovery.
    2. Storage is purchased in two Alibaba Cloud regions, for example, China (Shenzhen) and China (Qingdao). The storage service can be Object Storage Service (OSS) or Network Attached Storage (NAS).
    3. Database Backup Service is purchased for real-time hot backup of the local databases to the cloud.
  2. Backup of the off-cloud production data to the cloud (use either of the following methods):
    1. Deploy another local storage system, back up the production data to the storage space of the local IDC, and then copy this backup data from the local IDC to the cloud.
    2. Use Database Backup Service for direct hot backup of data from the local production database to the cloud storage spaces in both regions.
  3. Data recovery:
    1. If the production database fails but the storage space of the local IDC is operating normally, restore data from the local storage space to the local recovery database.
    2. If both the production database and the storage space in the local IDC fail, or the local storage space is otherwise unavailable, use Database Backup Service to recover data from the cloud storage space to the local recovery database.
  4. Architecture characteristics:
    1. Advantages: It meets demanding technical requirements and offers good consistency and quick recovery.
    2. Disadvantage: The RTO varies depending on the size of the database.
    3. Application scenario: The real-time backup solution is a sophisticated solution applicable to most relational databases.
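
The data recovery rules above, and the point that RTO grows with database size, can be sketched as follows. This is a rough model under stated assumptions: the function names are hypothetical, the restore throughput of 0.5 GB/s is an invented figure, and restore time is assumed to scale linearly with size, which real restores only approximate.

```python
def pick_restore_source(production_ok: bool, local_storage_ok: bool) -> str:
    """Return where to restore the local recovery database from."""
    if production_ok:
        return "none"  # production is healthy, nothing to recover
    if local_storage_ok:
        return "local-storage"  # local IDC backup copy is still usable
    return "cloud-storage"  # fall back to the copy held in the cloud


def estimate_rto_seconds(db_size_gb: float, restore_gb_per_s: float,
                         fixed_overhead_s: float = 0.0) -> float:
    """Estimate restore time, assuming it is linear in database size."""
    if restore_gb_per_s <= 0:
        raise ValueError("restore throughput must be positive")
    return fixed_overhead_s + db_size_gb / restore_gb_per_s


print(pick_restore_source(production_ok=False, local_storage_ok=True))   # local-storage
print(pick_restore_source(production_ok=False, local_storage_ok=False))  # cloud-storage
print(estimate_rto_seconds(500, 0.5))  # 1000.0
print(estimate_rto_seconds(50, 0.5))   # 100.0
```

Under this model a tenfold larger database takes ten times as long to restore, which is why the RTO of a backup-based solution has to be checked against the actual data size.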

Multiple Remote Active Backups

The enterprise-class database disaster recovery system covers all of the following solutions: on-cloud elastic disaster tolerance, dual or multiple active backups, and three centers in two locations. The following example describes a solution that uses multiple remote active backups. It supports data-level remote dual active backups and one-click switchover to another data center, with flexible scale-up or scale-down and linear expansion in the future.

  1. Unit-based reconstruction is performed on applications.
  2. Data Transmission Service is deployed to implement bi-directional synchronization between databases in two or more locations, which removes the single point of failure of a same-city deployment.
  3. HDM is deployed to monitor and manage the architecture with dual or multiple active backups. HDM also supports switchover and failover.
  4. The two data centers support read/write splitting, and local users read data from the nearest data center.
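
The routing side of this architecture can be sketched in a few lines. The example below sends reads to the lowest-latency healthy data center and pins each user to one "home" unit for writes, which is one common reading of unit-based design and an assumption here, since the article does not spell it out. The data center names, the latency table, and both function names are hypothetical.

```python
import zlib
from typing import Dict, Sequence, Set

# Hypothetical round-trip latencies (ms) from a user region to each data center.
LATENCY_MS: Dict[str, Dict[str, int]] = {
    "north": {"dc-a": 8, "dc-b": 42},
    "south": {"dc-a": 40, "dc-b": 6},
}


def nearest_dc(user_region: str, healthy: Set[str]) -> str:
    """Pick the lowest-latency healthy data center for reads."""
    candidates = {
        dc: ms for dc, ms in LATENCY_MS[user_region].items() if dc in healthy
    }
    if not candidates:
        raise RuntimeError("no healthy data center available")
    return min(candidates, key=candidates.get)


def home_unit(user_id: str, units: Sequence[str]) -> str:
    """Map a user to one fixed unit so that user's writes land in one place."""
    return units[zlib.crc32(user_id.encode("utf-8")) % len(units)]


print(nearest_dc("south", {"dc-a", "dc-b"}))  # dc-b
print(nearest_dc("south", {"dc-a"}))          # dc-a, after dc-b is switched out
print(home_unit("user-42", ["dc-a", "dc-b"]))  # stable for a given user ID
```

Keeping each user's writes in a single unit limits conflicting updates to the same rows on both sides of a bi-directional synchronization link.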

New Product: Database Backup Service

As an on-cloud database backup agent, Database Backup Service is used with OSS to create a cloud database backup solution. The solution takes only five minutes to set up and delivers real-time backups with an RPO measured in seconds. (The RPO is the maximum window of data loss that is acceptable when the database fails; the smaller the value, the better.)
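
To make the RPO definition concrete, the sketch below measures the data-loss window for two backup styles and checks it against a target. The timestamps, the 5-second target, and the function names are illustrative assumptions, not figures from Database Backup Service.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backed_up_change: datetime, failure_time: datetime) -> timedelta:
    """Return how much recent data is lost if the database fails now."""
    if failure_time < last_backed_up_change:
        raise ValueError("failure time precedes the last backed-up change")
    return failure_time - last_backed_up_change


def meets_rpo(window: timedelta, rpo_target: timedelta) -> bool:
    """A backup scheme meets the RPO if the loss window fits within the target."""
    return window <= rpo_target


failure = datetime(2024, 1, 1, 14, 30, 0)
nightly = data_loss_window(datetime(2024, 1, 1, 2, 0, 0), failure)
realtime = data_loss_window(datetime(2024, 1, 1, 14, 29, 58), failure)
target = timedelta(seconds=5)

print(nightly, meets_rpo(nightly, target))    # 12:30:00 False
print(realtime, meets_rpo(realtime, target))  # 0:00:02 True
```

A nightly full backup can lose most of a day of changes, whereas continuous incremental backup keeps the window down to the delay in shipping the latest changes.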



Alibaba Cloud
