A Proof of the Auto Scaling Capabilities of Function Compute
Since the launch of the very first mini programs in 2017, more and more mobile applications have been made available as mini programs. Mini programs can be easily accessed and require no installation, eliminating much of the burden of traditional applications. Mini programs have become a huge trend in China. Alibaba uses mini programs in several of its mobile apps, including Taobao Mobile, Alipay, DingTalk, and AMAP. During this year’s Double 11 Shopping Festival, mini programs played an important role, with most major shopping activities and promotions held on Taobao and Tmall being backed by mini programs.
In terms of logical architecture, a mini program consists of a client and a server. The client provides the user interface and interaction logic, whereas the server processes and analyzes data. When supporting a large number of mini programs, a platform application may encounter several challenges on the server end. Two of these challenges are the following:
- Many mini programs are inactive most of the time. Therefore, the traditional mode in which at least one server is deployed for each mini program would waste resources.
- During the peak hours of a promotional activity, such as Double 11, the number of mini program calls will increase dramatically. In such cases, the server must be able to quickly and automatically scale out to meet the demand.
In China, Alibaba Cloud currently provides a complete mini program solution: Mini Program Cloud. Its core capability is to utilize resources effectively and scale them in and out automatically. This capability relies heavily on Function Compute, Alibaba Cloud’s fully hosted serverless computing service. Function Compute enables developers to build reliable, elastic, and secure services simply by compiling and uploading code, without managing any of the underlying infrastructure, such as servers in the cloud.
In this article, we’re going to take a look at the core auto scaling technologies of Alibaba Cloud Function Compute, which are used to scale mini programs during the major Double 11 global shopping festival.
The Technical Architecture of Mini Programs
The following figure details the overall technical architecture of a mini program that runs on Taobao.
Mini programs involve the following logical architecture and user interactions:
- A user can enter a mini program on Taobao by tapping a store activity in the Taobao Mobile app. The user interface and interaction are provided by the client end of the mini program.
- When the user participates in activities provided by the mini program, the client calls a function either to request data from the server or to send data to the server.
- During the function call, the client must first access the Taobao gateway for required authentication and then call the function in Mini Program Cloud.
- Then, the code of the function is executed in Mini Program Cloud, where the user can define the business logic. With the following scalable capabilities of Mini Program Cloud, the user can easily build a complete e-commerce application:
- Data storage: Stores structured data.
- File storage: Stores files such as text, image, and video files.
- E-commerce services: Retrieves user information and creates payment transactions.
- Statistical analysis: Automatically collects statistics on the usage of the mini program and analyzes users to support business decisions.
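To make the call flow above concrete, here is a minimal sketch of what a cloud function behind a mini program might look like. The handler signature, event fields, and in-memory stores are illustrative assumptions, not the actual Mini Program Cloud API.

```python
# Hypothetical mini program cloud function. The client calls handler()
# through the gateway; the platform's data storage is modeled here as a
# simple in-memory dictionary for illustration only.
DATA_STORE = {}   # stand-in for the structured data storage capability

def handler(event, context):
    """Entry point invoked by the mini program client."""
    action = event.get("action")
    if action == "save_order":
        order = event["order"]
        DATA_STORE.setdefault("orders", {})[order["id"]] = order
        return {"status": "ok", "order_id": order["id"]}
    if action == "get_order":
        order = DATA_STORE.get("orders", {}).get(event["order_id"])
        return {"status": "ok", "order": order}
    return {"status": "error", "message": "unknown action: %s" % action}
```

A real function would call the platform’s data storage, file storage, and e-commerce services instead of a local dictionary; the point is that the business logic lives entirely in the function body.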
As you can see from the preceding walkthrough, functions comprise the core business logic of the entire mini program. Functions also connect the basic cloud capabilities and serve the client. Insufficient function capacity can affect the running of the entire mini program. In this architecture, functions need to support a large number of mini programs. Therefore, they must meet two baseline requirements: they must always be online so that mini programs can be used out of the box, and they must support auto scaling to cope with sudden surges in mini program visits. The technical architecture of Alibaba Cloud Function Compute meets both of these requirements.
The Technical Architecture of Function Compute
The following figure provides a general overview of Function Compute’s technical architecture:
The features of Function Compute’s core components are as follows:
- API service: Serves as the gateway of Function Compute and provides features such as authentication and throttling.
- Resource scheduling: Allocates and manages computing resources for calling functions, and is responsible for ensuring high scheduling efficiency and performance.
- Function execution engine: Is the environment where function code is executed with sufficient security and isolation.
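The path a request takes through these three components can be sketched as follows. The component names follow the article, but the internals (token-based auth, an in-flight counter for throttling, a simple instance pool) are simplified assumptions, not Function Compute’s actual implementation.

```python
class APIService:
    """Gateway: authenticates requests and applies throttling."""
    def __init__(self, valid_tokens, max_inflight=2):
        self.valid_tokens = valid_tokens
        self.inflight = 0
        self.max_inflight = max_inflight

    def admit(self, token):
        if token not in self.valid_tokens:
            return False, "auth failed"
        if self.inflight >= self.max_inflight:
            return False, "throttled"
        self.inflight += 1
        return True, "ok"

class ExecutionEngine:
    """Isolated environment where the function code runs."""
    def run(self, fn, event):
        return fn(event)

class ResourceScheduler:
    """Allocates an execution instance for each admitted request."""
    def __init__(self):
        self.instances = []

    def acquire(self):
        # Reuse an idle instance if possible, else provision a new one.
        return self.instances.pop() if self.instances else ExecutionEngine()

    def release(self, inst):
        self.instances.append(inst)

def invoke(api, scheduler, token, fn, event):
    admitted, reason = api.admit(token)
    if not admitted:
        return {"error": reason}
    inst = scheduler.acquire()
    try:
        return {"result": inst.run(fn, event)}
    finally:
        scheduler.release(inst)
        api.inflight -= 1
```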
One question remains: how does Function Compute address the challenges that mini program platforms face in this architecture? The following sections provide insight into how this solution works.
When you create a function and upload its code, Function Compute only stores the code package in Object Storage Service (OSS) and does not allocate any computing resources. This is why Function Compute can support massive numbers of mini programs. When a function is called for the first time, Function Compute allocates the relevant computing resources, and then downloads, loads, and executes the function code. This entire process is called a cold start. Through a series of optimizations, Function Compute keeps the system cold-start time within 200 ms. Therefore, even a mini program that has not been used for a certain amount of time responds rapidly when it is called again.
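The cold-start path described above can be sketched in a few lines. OSS is modeled as a dictionary and the code package as a string; these stand-ins, and the function names, are assumptions for illustration, not Function Compute’s real internals.

```python
OSS = {}          # stand-in for Object Storage Service: name -> code package
WARM_CACHE = {}   # functions whose instances have already cold-started

def create_function(name, code):
    """Creating a function only stores the code package; no compute yet."""
    OSS[name] = code

def call_function(name, event):
    if name not in WARM_CACHE:        # first call: cold start
        code = OSS[name]              # 1. download code package from OSS
        fn = eval(code)               # 2. load the code (toy stand-in)
        WARM_CACHE[name] = fn         # 3. instance stays warm afterwards
    return WARM_CACHE[name](event)    # warm calls skip steps 1-3
```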
The next question is how Function Compute deals with a gradual increase or sudden surge in continuous mini program calls. The answer is that the resource scheduling module of Function Compute accurately tracks the status of each instance. When a request is received, this module quickly checks whether an idle instance is available. If no instance is available, the request is placed into a pending queue; once an instance is released, the request can be quickly processed. In addition, the resource scheduling module creates new instances in the background, and when a new instance is ready, it also serves requests. With this policy, the P95 latency of requests remains stable even when the load doubles.
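The scheduling policy just described can be sketched as a simple, synchronous model: serve from an idle instance when one exists, otherwise queue the request and start a new instance in the background. The class and method names are illustrative assumptions.

```python
from collections import deque

class Scheduler:
    def __init__(self):
        self.idle = []            # instances ready to serve
        self.pending = deque()    # requests waiting for an instance

    def on_request(self, request):
        if self.idle:
            return self._serve(self.idle.pop(), request)
        self.pending.append(request)        # no idle instance: queue it
        self.start_instance_in_background()
        return None                         # caller waits in the queue

    def on_instance_ready(self, inst):
        """A new or released instance first drains the pending queue."""
        if self.pending:
            return self._serve(inst, self.pending.popleft())
        self.idle.append(inst)
        return None

    def start_instance_in_background(self):
        pass  # placeholder: real systems provision asynchronously

    def _serve(self, inst, request):
        return ("served", inst, request)
```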
The above figure shows the call volume and latency data of a mini program on Taobao. As shown in the figure, the transactions per second (TPS) experienced momentary peaks during the Double 11 promotion, but the P95 latency remained stable. This can be attributed to the fact that Function Compute rapidly creates new instances when peaks occur and uses existing resources for caching, so the entire process remains relatively smooth.
In certain flash sale scenarios, massive computing resources must be in place almost instantly. In such cases, real-time auto scaling alone isn’t sufficient to meet platform demands: the cold-start time of 200 ms is too long for flash sales, and the underlying computing resources are subject to throttling during expansion. To support these scenarios, Function Compute provides the reserved instance feature. With reserved instances, you can reserve resources for predictable activities and eliminate cold starts.
In this mode, you do not need to reserve resources based on peak load, which differs greatly from the traditional server-based approach. Instead, you can use a hybrid mode that combines reserved instances and pay-as-you-go instances. Requests are first processed by reserved instances; once all reserved instances are in use, additional pay-as-you-go instances are automatically created to process requests. Because reserved instances are scheduled first, the cold start of pay-as-you-go instances has less impact. In this way, the auto scaling of Function Compute achieves a good balance between performance and cost.
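The hybrid dispatch rule is straightforward to sketch: drain the reserved pool first, then fall back to pay-as-you-go instances. The pool model below is a simplified assumption for illustration, not Function Compute’s real scheduler.

```python
class HybridPool:
    def __init__(self, reserved_count):
        self.reserved_free = reserved_count  # pre-warmed, no cold start
        self.on_demand = 0                   # pay-as-you-go instances created

    def dispatch(self):
        """Return which kind of instance serves the next request."""
        if self.reserved_free > 0:
            self.reserved_free -= 1
            return "reserved"        # warm: no cold start
        self.on_demand += 1
        return "pay-as-you-go"       # incurs a cold start

    def release_reserved(self):
        """A reserved instance finished its request and is free again."""
        self.reserved_free += 1
```

Sizing `reserved_count` to the predictable baseline load means cold starts only occur on the excess traffic above that baseline, which is the performance/cost balance described above.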
Mini programs are lightweight, rapidly iterated mobile applications that place demanding requirements on development efficiency. After a mini program goes online, visits can gradually increase or suddenly surge during major promotions such as Double 11. In both cases, the stability and elasticity of back-end services face a big challenge. Function Compute allows mini programs to run immediately after function code is uploaded, which greatly improves the development efficiency of back-end services, and its auto scaling in hybrid mode handles load changes. With these features, Function Compute has become an optimal choice for mini program platforms.