Experiences and Lessons from the Management Practices of Alibaba Kubernetes

Complicated Kubernetes

“If I had to name one problem with Kubernetes, it would of course be that it is too complicated,” Sun Jianbo said in an interview. “However, this complexity comes from the positioning of Kubernetes itself.”

Stateful Application Support

In addition to its inherent complexity, Kubernetes’ support for stateful applications has long been a pain point for many developers, and there is no optimal solution. Currently, the mainstream approach to stateful applications in the industry is the Operator, which is very difficult to write.

Large-Scale Kubernetes Practices by Alibaba

Today, Kubernetes is used across all areas of business in the Alibaba economy, including e-commerce, logistics, and offline computing. Kubernetes is also one of the main forces supporting Alibaba 618, Double 11, and other Internet-level promotions. Alibaba Group and Ant Financial run dozens of ultra-large Kubernetes clusters. The largest cluster has about 10,000 machine nodes, and this is not the upper limit of its capacity. Each cluster serves tens of thousands of applications. In addition, we maintain Kubernetes clusters for tens of thousands of users on Alibaba Cloud Container Service for Kubernetes (ACK). The scale and technical challenges are second to none.

  • Second: You need to make sensible use of OAM, Helm, and other application definition tools and models to solve the problem of defining and describing applications, and to connect them with your existing application management capabilities.
  • Third: Consider integrating various continuous delivery capabilities to build a complete application delivery chain.

Decoupling O&M and R&D

Decoupling allows Kubernetes projects and cloud service providers to expose different dimensions to different roles and to implement declarative APIs that match each user’s needs more closely. For example, application developers only need to declare in a YAML file that “application A needs to use a 5 GB readable and writable space.” The application O&M personnel only need to declare in the corresponding YAML file that “Pod A needs to be mounted with a 5 GB readable and writable data volume.” Letting each role focus only on its own concerns is the key to lowering the learning threshold and difficulty for Kubernetes users.
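To make the two views concrete, here is a minimal sketch in Python. It assumes a hypothetical developer-facing declaration (the `developer_request` shape and the helper names `to_volume_claim` and `to_pod` are illustrative, not part of Kubernetes, OAM, or anything Alibaba describes) and translates it into the kind of PersistentVolumeClaim and Pod manifests that O&M personnel would otherwise write by hand. The manifests are printed as JSON, which Kubernetes accepts in the same way as YAML.

```python
import json

# Hypothetical developer-facing declaration: "application A needs a 5 GB
# readable and writable space". The field names are illustrative only.
developer_request = {
    "application": "app-a",
    "storage": {"size": "5G", "access": "read-write"},
}

# Assumed mapping from the developer's wording to Kubernetes access modes.
ACCESS_MODES = {"read-write": "ReadWriteOnce", "read-only": "ReadOnlyMany"}


def claim_name(request):
    """Derive a volume claim name from the application name."""
    return f"{request['application']}-data"


def to_volume_claim(request):
    """O&M view, part 1: a PersistentVolumeClaim for the requested space."""
    storage = request["storage"]
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name(request)},
        "spec": {
            "accessModes": [ACCESS_MODES[storage["access"]]],
            "resources": {"requests": {"storage": storage["size"]}},
        },
    }


def to_pod(request, image, mount_path="/data"):
    """O&M view, part 2: a Pod that mounts the claim as a data volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": request["application"]},
        "spec": {
            "containers": [{
                "name": request["application"],
                "image": image,
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name(request)},
            }],
        },
    }


if __name__ == "__main__":
    # The image name is a placeholder chosen for this example.
    manifests = [
        to_volume_claim(developer_request),
        to_pod(developer_request, image="example/app-a:latest"),
    ]
    print(json.dumps(manifests, indent=2))
```

The point of the split is that the developer only touches `developer_request`, while details such as access modes, claim names, and mount paths stay on the O&M side of the boundary.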

Speaker Background

Sun Jianbo is a Technical Expert at Alibaba and a member of the Kubernetes project community. Currently, he is working on the delivery and management of large-scale cloud-native applications at Alibaba. In 2015, he worked on the technical book, Docker Containers and Container Cloud Technology. He previously worked for Qiniu Cloud, where he took part in moving time series database, stream computing, log platform, and other projects to the cloud.
