A Quick Guide to DevOps Capability System on the Cloud


Cloud computing has been developing for more than ten years, and its focus has shifted from simple “cloud migration” to a more sophisticated “cloud optimization” mindset. Migration is essentially a decision-making problem: once an enterprise decides to move to the cloud, the question is settled. Optimization, by contrast, is an ongoing topic. There is no silver bullet for resource optimization, and working toward a fully optimized system is a never-ending process.

1. Automated O&M Pyramid

In terms of O&M automation level, DevOps is an advanced and clearly defined stage. The hierarchy of O&M automation levels, and where DevOps sits in it, can be understood by analogy with the autonomous driving standards. The comparison is shown in Figure 1.

Figure 1: Automated O&M Pyramid
Table 1: A Reference for Comparison of Autonomous Driving Level and Automated O&M Level
  • Level 2 (Semi-Automated O&M): At this level, no more than 30% of O&M is automated. Some operations can be performed through the command-line interface (CLI).
  • Level 3 (Highly-Automated O&M): The degree of automation is about 50%. Enterprise O&M personnel use the Software Development Kit (SDK) of the cloud platform to call the Application Programming Interface (API) for routine O&M and develop their own O&M systems. However, these O&M systems do not support customization or offer platform capabilities, and they are tightly coupled with internal systems.
  • Level 4 (Standardized Automated O&M): With more than 90% of O&M automated, the O&M systems offer platform, template, and code capabilities. Systems can be customized to the enterprise’s own requirements, and O&M personnel can use templated products to standardize and automate their O&M.
  • Level 5 (AIOps): This is intelligent O&M with 100% automation and no on-call personnel, which frees up productivity entirely. It is currently the O&M goal of many large Internet enterprises and a focus of leading enterprises.

2. DevOps: The Advanced Mode of Automated O&M

2.1 Templating: The Most Compatible Way for DevOps

Here, I would like to focus on the differences between highly-automated O&M (Level 3) and standardized automated O&M (Level 4). Most in-house O&M systems are highly automated. For example, a function is developed on the internal O&M system that can create 10 cloud servers at once. The next time other resources are needed, for example three message queues, an additional function has to be developed for them. Highly-automated O&M can therefore be regarded as a single “system” focused on specific scenarios.
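To make the contrast concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `ResourceSpec`, `deploy`, and the resource names are hypothetical and do not correspond to any real cloud SDK, and the "creation" step just returns names instead of calling an API.

```python
from dataclasses import dataclass


# Level 3 style: one hand-written function per scenario.
def create_ten_cloud_servers() -> list[str]:
    """Hard-coded for one scenario; message queues would need another function."""
    return [f"server-{i}" for i in range(10)]


# Level 4 style: a template describes what to create, and one generic
# routine handles any resource kind. All names here are hypothetical.
@dataclass(frozen=True)
class ResourceSpec:
    kind: str   # e.g. "server" or "message_queue"
    count: int  # how many resources of this kind to create


def deploy(template: list[ResourceSpec]) -> list[str]:
    """Expand a template into resource names (a stand-in for real API calls)."""
    created: list[str] = []
    for spec in template:
        created.extend(f"{spec.kind}-{i}" for i in range(spec.count))
    return created


# The same routine covers both the 10 servers and the 3 message queues.
template = [ResourceSpec("server", 10), ResourceSpec("message_queue", 3)]
resources = deploy(template)
```

With the template approach, a new requirement changes the data passed to `deploy`, not the O&M code itself.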

2.2 Templating or Coding: The Foundation of AIOps

Although AIOps may be everyone’s common goal, there is still a long way to go. At this stage, AIOps is only available in a few specific scenarios. For example, in elastic scaling, a system can learn from historical data and scale out in advance, or AI can be used to detect whether a certain metric is abnormal. AIOps is therefore still far from full automation. It is recommended to conduct AIOps research in specific scenarios so that the work stays focused.
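The two scenarios mentioned above can be sketched with plain statistics. This is a deliberately simple stand-in, not how any particular AIOps product works: the z-score rule, the threshold of 3.0, the "size for the historical peak" rule, and all function names are assumptions chosen for illustration.

```python
import math
from statistics import mean, stdev


def is_abnormal(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Flag `latest` if it is more than `threshold` standard deviations from the historical mean."""
    if len(history) < 2:
        return False  # not enough data to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # flat history: any deviation is unusual
    return abs(latest - mu) / sigma > threshold


def prescale_target(same_hour_loads: list[float], capacity_per_instance: float) -> int:
    """Instances to provision in advance, sized for the highest load seen at this hour."""
    if not same_hour_loads:
        return 1  # no history: keep a single instance
    return max(1, math.ceil(max(same_hour_loads) / capacity_per_instance))


cpu_history = [40.0, 42.0, 41.0, 39.0, 43.0, 40.0]
normal = is_abnormal(cpu_history, 41.0)      # close to the historical mean
spike = is_abnormal(cpu_history, 95.0)       # far outside the usual range
instances = prescale_target([800.0, 950.0, 900.0], capacity_per_instance=100.0)
```

A real system would replace both rules with learned models, but the inputs and outputs stay the same: history in, a scaling decision or an alert out.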

3. Foundation Core of DevOps: CI/CD and Infrastructure as Code (IaC)

Usually, the first step of an automated O&M system on the cloud is environment deployment, which is a fundamental and important step. It is costly to make changes after the deployment is completed, and once the application has gone online, a change amounts to an environment migration project. That’s why environment deployment needs to be designed at the very beginning.

Figure 2: System Operating Environment for CI/CD Pipelines

3.1 System Operation Environment of CI/CD Pipeline

Let’s take the IaC of an application delivered through a CI/CD pipeline as an example to illustrate the entire process of automated deployment.

Figure 3: CI/CD Pipeline
Figure 4: System Environment Directory
  • Environment regions such as Hangzhou, Beijing, or Shanghai.
  • Other parameters in the deployment environment such as account, AccessKey/Token, and role.
  • Resource configurations such as the number of servers and domain names.
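The three groups of parameters listed above can be captured as one typed configuration object per environment. The sketch below is an assumption about how such a file might be modeled in Python; the class name, field names, and default values are hypothetical. It stores only the name of the environment variable that holds the AccessKey, so the secret itself stays out of the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentConfig:
    # Environment region, e.g. "hangzhou", "beijing", or "shanghai".
    region: str
    # Other deployment parameters: account, role, and where to find the credential.
    account: str
    role: str
    access_key_env: str = "ACCESS_KEY"  # env var name only, never the key itself
    # Resource configuration: number of servers and domain names.
    server_count: int = 1
    domain_names: tuple[str, ...] = ()

    def validate(self) -> None:
        """Reject obviously incomplete configurations before any deployment runs."""
        if not self.region:
            raise ValueError("region must not be empty")
        if self.server_count < 1:
            raise ValueError("server_count must be at least 1")


hangzhou = EnvironmentConfig(
    region="hangzhou",
    account="demo-account",
    role="deployer",
    server_count=3,
    domain_names=("example.com",),
)
hangzhou.validate()
```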

3.2 Three Capabilities of Standardized Deployment on the Cloud

The biggest difference between a cloud environment and a traditional data center is that all cloud services are API-centric. Users can create, modify, and delete resources through APIs. Deployment on the cloud is therefore naturally standardized, which improves deployment efficiency and makes efficient, unified, standardized deployment achievable. Repeated deployment is required in typical scenarios such as the following.

  • Multi-region deployment. The same deployment can be repeated quickly across regions. For example, a cluster is deployed in Hangzhou first and then in other regions such as Beijing and Shanghai. Generally, it is only necessary to add a new region stage to the pipeline and set its configuration parameters to implement one-click deployment.
  • Cluster deployment. Clusters can also be deployed quickly. For example, a few clusters are deployed first, and more clusters are then deployed in the same way. Users can add a new cluster stage to the pipeline and then configure its parameters to implement one-click deployment.
  • Disaster recovery environment deployment. If a production environment is deployed first and a disaster recovery environment needs to be deployed afterwards, cluster deployment must be used.
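The idea of repeating a deployment by adding a stage and a few parameters can be sketched as follows. This is an illustration only: `build_pipeline`, `run`, and the parameter names are hypothetical, and `run` returns descriptive strings where a real pipeline would call the cloud platform's APIs.

```python
def build_pipeline(base_params: dict, regions: list[str]) -> list[dict]:
    """Create one deployment stage per region, reusing the same base parameters."""
    stages = []
    for region in regions:
        stage = dict(base_params)  # copy so stages do not share state
        stage["region"] = region
        stage["stage"] = f"deploy-{region}"
        stages.append(stage)
    return stages


def run(stages: list[dict]) -> list[str]:
    """Stand-in for execution: describe what each stage would deploy."""
    return [
        f"{s['stage']}: {s['server_count']} servers in {s['region']}"
        for s in stages
    ]


base = {"cluster_name": "web", "server_count": 3}

# First rollout: Hangzhou only.
pipeline = build_pipeline(base, ["hangzhou"])

# Later rollout: adding regions is a change to the parameter list, not to the code.
pipeline = build_pipeline(base, ["hangzhou", "beijing", "shanghai"])
report = run(pipeline)
```

The same pattern would apply to an extra cluster or a disaster recovery environment: a new stage with its own parameters, executed by unchanged deployment logic.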

4. Capabilities a Complete DevOps System Should Have

If 100% automation is the ideal form of DevOps, a gap in any process may become an obstacle to practicing DevOps. Generally, DevOps involves eight stages: planning, coding, building, testing, publishing, deploying, O&M, and monitoring. After monitoring, the cycle returns to planning and starts a new round of iteration.

Figure 5: DevOps Flowchart
Figure 6: Four-Ring Diagram of the Lifecycle of Cloud Resources
Figure 7: Different Phases of the ECS Automated O&M Kit
Table 2: Comparison between Container Control (Kubernetes) and Cloud Server DevOps

5. Obstacles to DevOps Implementation: Achieve Balance with Financial Process

In fact, DevOps is not a new concept, but few enterprises have implemented it so far. There are many reasons for this. In my years of experience, the two biggest issues are finance and the habits of O&M developers.

About the Author

By Wu Junyin, a senior technical expert on intelligent Elastic Compute Service (ECS) at Alibaba Cloud. He is responsible for the architectures of some new ECS products and trusted computing instances, and for the technology development and O&M architecture of intelligent OnECS and OnAliyun of Alibaba Cloud. Wu Junyin has rich experience in cloud computing and is committed to creating a unified management and O&M experience based on ECS-centric automation and DevOps.
