How to Build the Most Effective Backup System — A Conversation with the Expert

Database backup is a hot topic. It may seem simple at first, but in practice, O&M personnel often run into problems when backing up databases. What are the typical challenges, how can we build an effective backup system, and which solutions apply? To answer these questions, we interviewed Heng Tiegang, a database backup expert at Alibaba.

Heng Tiegang (nickname: Pei’en), an Alibaba database backup expert

Why Should Databases Be Backed Up?

I think the answer to this question is already obvious. So, rather than answering it, I would like to answer another question: what risks does database backup protect against? From the moment data is created, it is exposed to the risk of loss caused by natural disasters, power failures, network faults, hardware faults, software faults, and human error.

The point is, even if your database survives a hardware fault today, a lightning strike tomorrow, and a power failure the day after, you may still delete data by mistake with a slip of the hand the day after that.

Which Challenges Are Presented by Database Backups?

The first challenge is taking stock of database assets. For an individual user, the database assets may be just one instance, and the user knows them well even without an inventory. For an enterprise user, however, especially a large enterprise, there can be many instances and several database types because of business diversity. In this case, the O&M personnel need to know exactly how many databases there are, where they are deployed, what tier each belongs to (for example, production or core), and what each one is used for.
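As a rough illustration of what such an inventory might look like, here is a minimal Python sketch. The record fields, instance names, and locations are hypothetical and chosen only for this example; a real inventory would more likely be generated from a CMDB or a cloud provider's API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseAsset:
    name: str             # hypothetical instance name
    engine: str           # database type, e.g. "MySQL"
    tier: str             # "test", "production", or "core"
    location: str         # where the instance is deployed
    backup_enabled: bool  # whether any backup is configured


def summarize(assets):
    """Count assets by engine and tier, and list those without backup."""
    return {
        "total": len(assets),
        "by_engine": Counter(a.engine for a in assets),
        "by_tier": Counter(a.tier for a in assets),
        "unprotected": sorted(a.name for a in assets if not a.backup_enabled),
    }


# Example inventory with made-up entries.
inventory = [
    DatabaseAsset("orders-db", "MySQL", "core", "idc-a", True),
    DatabaseAsset("reports-db", "PostgreSQL", "production", "idc-a", True),
    DatabaseAsset("dev-sandbox", "MySQL", "test", "office", False),
]
summary = summarize(inventory)
print(summary["unprotected"])
```

Even a list this simple answers the first questions an O&M team needs answered: how many databases exist, of which types, and which of them have no backup at all.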

The second challenge is the evaluation of the backup system. While backup is a basic and daily practice, people often find that it does not help during crunch times. The reason is that backup, as a basic task, does not promote the business, and as long as no problems occur, few people remember it. However, once a problem occurs, backup immediately becomes the target of public attention. Backups often do not help during emergencies mainly because people do not take backups seriously enough, so investment in backup is insufficient. Many enterprises claim that backups are a top priority, but never implement them properly.

I recommend that you ask your technical team right away: Is your backup system really effective?

What Is an Effective Backup System?

Different databases can be used for different purposes, and the effectiveness of a backup system varies accordingly. According to their functions, databases can be classified as test databases, production databases, and core databases.

For test databases, you must assess the importance of the database based on its intended use. If the test database is used for personal tests, in most cases data is imported and cleared without being backed up. If the test database is used for R&D, we recommend that you enable backup and do not underestimate its importance. All development and testing personnel in the enterprise work on the test database, so a single data problem can immediately cause trouble for the entire team. In addition, a test database is likely to encounter more problems than a production database.

For a production database, first make sure that backup is enabled. Then evaluate whether the backup cycle meets your requirements. With a daily full backup, for example, a failure costs you at most one day of new data. You also need to check whether the most recent backup has actually been restored in a test and whether its data is valid.
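The arithmetic behind "at most one day" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BackupRecord` type and its fields are hypothetical, and `snapshot_time` stands for the point in time that a backup captures. The sketch shows that the data-loss window is measured from the last backup that is known to restore, not simply from the last backup taken.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass(frozen=True)
class BackupRecord:
    snapshot_time: datetime  # point in time the backup captures
    restore_verified: bool   # True if a test restore succeeded


def latest_usable_backup(
    backups: Sequence[BackupRecord], failure_time: datetime
) -> Optional[BackupRecord]:
    """Return the newest verified backup taken at or before the failure."""
    usable = [
        b for b in backups
        if b.restore_verified and b.snapshot_time <= failure_time
    ]
    return max(usable, key=lambda b: b.snapshot_time, default=None)


def data_loss_window(
    backups: Sequence[BackupRecord], failure_time: datetime
) -> Optional[timedelta]:
    """Time span of data lost if we must fall back to the latest usable backup."""
    backup = latest_usable_backup(backups, failure_time)
    return None if backup is None else failure_time - backup.snapshot_time


# Daily backups at 02:00; the most recent one was never restore-tested.
backups = [
    BackupRecord(datetime(2024, 1, 1, 2, 0), restore_verified=True),
    BackupRecord(datetime(2024, 1, 2, 2, 0), restore_verified=False),
]
failure = datetime(2024, 1, 2, 20, 0)
print(data_loss_window(backups, failure))
```

In this example the cycle is daily, yet the loss window is 42 hours, because the newest backup was never verified and cannot be counted on.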

A core database is more important than a test or production database, so it needs additional measures on top of those above. Real-time backup has become a must-have when an enterprise selects a backup solution for a core database, because it minimizes the amount of data lost when a fault occurs. Fast recovery also matters more and more for core databases. Based on the risks of potential faults, select the most suitable recovery solution, run regular drills on the entire backup and recovery system, and sample the backup data to test the recovery function. I recommend that you define a policy that automatically and regularly runs the entire recovery process and produces drill reports.
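One way such a drill policy could be automated is sketched below in Python. It is not how any particular product implements drills. The `restore_and_check` callback is a hypothetical placeholder for whatever actually restores a backup into a scratch environment and validates the data, and the backup IDs are made up.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class DrillResult:
    backup_id: str
    succeeded: bool
    detail: str


def run_recovery_drill(
    backup_ids: Sequence[str],
    restore_and_check: Callable[[str], bool],  # hypothetical restore hook
    sample_size: int,
    seed: Optional[int] = None,
) -> List[DrillResult]:
    """Restore a random sample of backups and record each outcome."""
    rng = random.Random(seed)
    sample = rng.sample(list(backup_ids), min(sample_size, len(backup_ids)))
    results = []
    for backup_id in sample:
        try:
            ok = restore_and_check(backup_id)
            detail = "restored and validated" if ok else "validation failed"
            results.append(DrillResult(backup_id, ok, detail))
        except Exception as exc:  # a failed restore is a drill finding
            results.append(DrillResult(backup_id, False, f"restore error: {exc}"))
    return results


def format_report(results: Sequence[DrillResult]) -> str:
    """Render a short plain-text drill report."""
    passed = sum(r.succeeded for r in results)
    lines = [f"Recovery drill: {passed}/{len(results)} backups restored successfully"]
    lines += [
        f"  {r.backup_id}: {'OK' if r.succeeded else 'FAILED'} ({r.detail})"
        for r in results
    ]
    return "\n".join(lines)
```

Run from a scheduler (for example, a weekly job), `run_recovery_drill` followed by `format_report` gives the team a recurring, written answer to the question of whether the backups can really be restored.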


Which Solutions Are Applicable?

Again, be prepared before the data is lost. Act now to protect your database. Here are some of the solutions that are deployed based on Alibaba Cloud products:

Could You Give Us a Brief Introduction to Your Work?

I am currently in charge of an Alibaba Cloud product called DBS. Have you ever heard of it? DBS is a database backup channel that is already in commercial use, and it works together with OSS to form a cloud database backup solution. Such a solution takes only about five minutes to set up and provides real-time backup with a Recovery Point Objective (RPO) measured in seconds. The RPO is the maximum period of data, measured in time, that may be lost when the database fails. Naturally, the smaller the RPO, the better.

In addition to providing continuous data protection and a low-cost backup service for databases, DBS also protects data in different environments, including public clouds, self-built enterprise data centers (IDCs), and other cloud vendors' platforms. DBS features low cost, high performance, and zero risks. It provides users with an ideal cloud database backup solution.

The backup system built on DBS has now been tested in practice by a large number of users. DBS not only supports real-time backup and a second-level RPO but also offers table-level recovery. This lets users recover only the data they need and reduces the Recovery Time Objective (RTO), the time allowed to restore service after a failure, to several minutes.

About the Author

Heng Tiegang (nickname: Pei’en) joined Alibaba in 2011 and previously worked as a MySQL DBA at Alibaba Group. He is currently a database product manager responsible for designing database backup products.

