The Concept and Challenges of Serverless

By Xu Xiaobin, a Senior Middleware Expert at Alibaba and author of Maven in Practice. He is now responsible for the serverless R&D and O&M platform in the Alibaba Group. He previously maintained the Maven central repository.


Although it’s been half a year since I wrote that article, I don’t think the situation has changed much. Many developers and managers have a one-sided or even erroneous understanding of serverless technology. If new technologies are launched without sufficient knowledge of application architecture evolution, cloud infrastructure capabilities, and risks, their business value cannot be realized, efforts are wasted, and technical risks are introduced.

In this article, I will attempt to analyze the charm and core concepts of serverless systems from the perspective of application architecture and summarize the challenges that must be faced to implement serverless systems.

Evolution of Application Architectures

As the business continues to grow, more R&D personnel must be hired to develop features for monolith applications. At this time, the code in monolith applications does not have clear physical boundaries, and soon, different parts of the code conflict with each other. This requires manual coordination and a large number of conflict merge operations, sharply decreasing R&D efficiency. In this case, monolith applications are split into microservice applications that are independently developed, tested, and deployed. Services communicate with each other through APIs over protocols such as HTTP, GRPC, and DUBBO. Microservices split based on Bounded Context in the domain-driven design mode greatly improve medium and large teams’ R&D efficiency. To learn more about Bounded Context, consult books about domain-driven design.

In the evolution of monolithic applications to microservices, the distributed architecture is the default option from the physical perspective. In this case, architects have to meet new challenges produced by distributed architectures. In this process, distributed services, frameworks, and distributed tracking systems are generally used first, for example, the cache service Redis, the configuration service Application Configuration Management (ACM), the state coordination service ZooKeeper, the message service Kafka, and communication frameworks such as GRPC or DUBBO. In addition to the challenges of distributed environments, the microservice model gives rise to new O&M challenges. Previously, developers only needed to maintain one application, but now they need to maintain ten or more applications. Therefore, the workloads involved in security patch upgrades, capacity assessments, and troubleshooting has increased exponentially. As a result, application distribution standards, lifecycle standards, observation standards, and auto scaling are increasingly important.

Now let’s talk about the term “cloud-native.” In simple words, whether architecture is cloud-native depends on whether the architecture evolved in the cloud. “Evolving in the cloud” is not simply about using services at the infrastructure as a service (IaaS) layer of the cloud, such as Elastic Compute Service (ECS), Object Storage Service (OSS), and other basic computing and storage services. Rather, it means using distributed services, such as Redis and Kafka, in the cloud. These services directly affect the business architecture. As mentioned earlier, distributed services are necessary for a microservice architecture. Originally, we developed such services ourselves or maintained them based on open-source versions. In the cloud-native era, businesses directly use cloud services.

Another two technologies that need to be mentioned are Docker and Kubernetes. Docker defines application distribution standards. Applications written in Spring Boot and Node.js are all distributed by images. Based on Docker technology, Kubernetes defines a unified standard for applications throughout their life cycles, covering startup, launch, health checks, and deprecation. With application distribution standards and lifecycle standards, the cloud provides standard web app services, including application version management, release, post-release observation, and self-recovery. For example, for stateless applications, an underlying physical node’s failure does not affect R&D at all. This is because the web app service automatically switches the application containers from the faulty physical node to a new physical node based on the application lifecycle. Cloud-native provides even greater advantages.

On this basis, the web app service detects runtime data for applications, such as business traffic concurrency, CPU load, and memory usage. Therefore, auto scaling rules are configured for businesses based on these metrics. The platform executes these rules to increase or decrease the number of containers based on business traffic. This is the most basic implementation of auto scaling. This helps you avoid resource constraints during your business’s off-peak hours, reduce costs, and improve O&M efficiency.

As the architecture evolves, R&D and O&M personnel gradually shift their focus from physical machines and want the machines to be managed by the platform system without human intervention. This is a simple understanding of serverless.

Core Concepts of Serverless

In today’s cloud era, a narrow understanding of serverless as simply not caring about servers is not enough. In addition to the basic computing, network, and storage resources contained in the servers, cloud resources also include various types of higher-level resources, such as databases, caches, and messages.

In February 2019, the University of California, Berkeley published a paper titled Cloud Programming Simplified: A Berkeley View on Serverless Computing. This paper provides a very clear and vivid metaphor:

In the context of the cloud, serverful computing is like programming in a low-level assembly language, while serverless computing is like programming in a high-level language such as Python. Take the simple expression c = a + b as an example. If you describe it in an assembly language, you must first select several registers, load the values into registers, perform mathematical calculations, and then store the results. This is like today’s serverful computing in the cloud environment. Developers first need to allocate or find available resources, then load code and data, perform calculations, store the calculation results, and finally release the resources.

The paper describes that serverful computing is still the most common way of using the cloud today, but it should not be how we use the cloud in the future. In my opinion, the serverless vision should be stated as “Write locally, compile to the cloud”. This means that code only cares about the business logic, and resources are managed by tools and the cloud. Now, we have a general but abstract knowledge of serverless. Let’s understand the features of serverless platforms.

1) Developers Do Not Need to Care About Servers

2) The Platform Supports Auto Scaling

3) The Platform is Billed by Actual Resource Usage

4) The Platform Features Less Code and Faster Delivery

Challenges in Implementing Serverless Platforms

1) It Is Difficult to Make Businesses Lightweight

2) The Response Capabilities of the Infrastructure Is Insufficient

3) The Business Process Lifecycle Is Different Than Container Lifecycle

4) Observability Must Be Improved

5) R&D and O&M Personnel Must Adopt New Habits


As mentioned in this article, the serverless infrastructure development imposes new requirements on the application architecture, continuous delivery, service governance, O&M, and monitoring. Serverless infrastructure also places higher responsiveness requirements on lower-level infrastructures such as computing and storage networks. Therefore, serverless represents a comprehensive technological evolution that involves the application, platform, and infrastructure levels.

Original Source:

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.