Knative on Alibaba Cloud: The Ultimate Serverless Experience

Development History of Cloud Hosts

Before cloud computing emerged, an enterprise that wanted to provide services on the Internet had to lease a physical machine from an Internet data center (IDC) and deploy its applications on that machine. For more than a decade, the performance of physical machines has increased steadily, in line with Moore’s Law. As a result, a single application can no longer fully use the resources of an entire physical machine, so technology is needed to improve resource utilization. The obvious approach is to deploy several applications on the same machine. However, co-locating multiple applications on one physical machine leads to problems such as the following:

  • Resource isolation
  • System dependency and high O&M difficulty

Knative Serving

Before introducing Knative, let us use a web application as an example to see how traffic is distributed to an application and how the application is released in Kubernetes mode. In the following figure, the Kubernetes mode is shown on the left side and the Knative mode is shown on the right side. In Kubernetes mode:

  • You must maintain the relationship between the ingress and services when exposing services externally.
  • If you need to implement canary release, you must perform rolling updates by using multiple deployments.

Application Hosting

Kubernetes is an abstraction oriented to IaaS management. If you directly deploy applications through Kubernetes, you must maintain a large number of resources.

Traffic Management

Knative routes traffic through gateways and can then split the traffic by percentage. In this way, a solid foundation is laid for basic capabilities such as auto scaling and canary release.
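
As a rough illustration of percentage-based splitting, the following Python sketch maps a point in [0, 100) to a revision by using cumulative percentages. It is not the Knative gateway implementation. The function name build_router, the revision names coffee-v1 and coffee-v2, and the 80/20 split are assumptions made for this example.

```python
from bisect import bisect_right
from itertools import accumulate


def build_router(split):
    """Return a function that maps a point in [0, 100) to a revision name.

    `split` is a list of (revision_name, percent) pairs whose percents sum to 100.
    """
    names = [name for name, _ in split]
    percents = [pct for _, pct in split]
    if sum(percents) != 100:
        raise ValueError("traffic percentages must sum to 100")
    bounds = list(accumulate(percents))  # cumulative upper bounds, e.g. [80, 100]

    def route(point):
        if not 0 <= point < 100:
            raise ValueError("point must be in [0, 100)")
        # The first bound strictly greater than `point` selects the revision.
        return names[bisect_right(bounds, point)]

    return route


# Hypothetical split: 80% to coffee-v1, 20% to coffee-v2.
route = build_router([("coffee-v1", 80), ("coffee-v2", 20)])
print(route(0), route(79.9), route(80), route(99.9))
# prints: coffee-v1 coffee-v1 coffee-v2 coffee-v2
```

In a simulation, the point could be drawn per request with random.random() * 100, so that each revision receives roughly its configured share over many requests.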

Canary Release

Serving allows you to manage multiple versions of applications. In this way, multiple versions of an application can provide online services at the same time.
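
One way to picture a canary release is as a sequence of traffic tables that gradually shift requests from the old version to the new one. The Python sketch below generates such a sequence. The helper canary_plan, the stage percentages, and the revision names are hypothetical and only illustrate the idea, not how Knative Serving stores or applies traffic settings.

```python
def canary_plan(old, new, steps):
    """Return one traffic table per rollout stage.

    `steps` lists the percentage of traffic sent to `new` at each stage.
    """
    plan = []
    for pct in steps:
        if not 0 <= pct <= 100:
            raise ValueError("percent must be between 0 and 100")
        # Whatever is not sent to the new version stays on the old one.
        plan.append({old: 100 - pct, new: pct})
    return plan


# Hypothetical three-stage rollout from coffee-v1 to coffee-v2.
plan = canary_plan("coffee-v1", "coffee-v2", [10, 50, 100])
for stage in plan:
    print(stage)
```

Because both versions stay online during the rollout, moving back to an earlier stage is just a matter of applying an earlier table again.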

Scaling

Scaling is the core capability of Knative and helps reduce application costs. This capability enables Knative to automatically scale out an application when traffic increases and scale in an application when traffic decreases.
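
The scaling decision can be approximated with a little arithmetic. The following Python sketch is a simplified model, not the actual autoscaler: it assumes that the desired pod count is the observed concurrency divided by a per-pod target, rounded up and clamped to optional bounds. The function desired_pods and its parameters are names chosen for this example, and real autoscalers also average metrics over time windows.

```python
import math


def desired_pods(concurrent_requests, target, min_pods=0, max_pods=None):
    """Simplified concurrency-based scaling decision (illustrative only)."""
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(concurrent_requests / target)
    wanted = max(wanted, min_pods)       # never go below the lower bound
    if max_pods is not None:
        wanted = min(wanted, max_pods)   # never exceed the upper bound
    return wanted


print(desired_pods(0, 10))                 # 0: no traffic, scale in to zero
print(desired_pods(25, 10))                # 3: 25 requests at 10 per pod
print(desired_pods(500, 10, max_pods=20))  # 20: capped by the upper bound
```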

Why Did We Develop Serverless Kubernetes?

The Kubernetes community version requires that you purchase hosts in advance and register them as Kubernetes nodes before pods can be scheduled. However, purchasing hosts in advance makes little sense to application developers. They only want IaaS resources to be allocated to the application instances that need to run, and do not want to maintain complex IaaS resources. In other words, the ideal is a Kubernetes that is fully compatible with the community Kubernetes API, allocates resources automatically when required, and needs no manual maintenance or management of IaaS resources. This matches the resource requirements of application developers. Alibaba Cloud Serverless Kubernetes (ASK) was developed to achieve this objective.

Highlights

SLB-based Gateways

The Knative community supports various gateway implementations, such as Istio, Gloo, Contour, Kourier, and Ambassador. Istio is the most popular of these because it can serve as both a gateway and a service mesh. Although these gateways provide a full range of features, they were not originally designed as gateways for serverless services. First, you must run a resident gateway instance, and to ensure high availability you must deploy at least two instances that back each other up. Second, the control planes of these gateways must also be resident. The IaaS fees and O&M effort for these resident components are costs that you cannot avoid.

Cost-effective Reserved Instances

Reserved instances are an exclusive feature of ASK Knative. By default, the Knative community version can scale an application in to zero when there is no traffic. However, this causes cold start problems when Knative scales the application out from zero. A cold start includes IaaS resource allocation, Kubernetes scheduling, and image pulling, followed by the startup time of the application itself. The application startup time may range from milliseconds to minutes and is almost uncontrollable on a general-purpose platform. Of course, these problems exist in all serverless products. Most conventional FaaS products maintain a public IaaS pool to run different functions. To prevent the pool from being used up and to minimize the cold start time, most FaaS products set limits on function execution, such as the following:

  • Burst concurrency: A maximum number of concurrent executions is specified for each function by default. If the number of concurrent requests exceeds this upper limit, the system throttles the excess requests.
  • CPU and memory: A function cannot exceed its maximum CPU and memory limits.
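
The burst concurrency limit can be pictured as a counter of in-flight executions. The following Python sketch is a hypothetical illustration, not the mechanism of any particular FaaS product. The class ConcurrencyLimiter and the limit of 2 are assumptions made for this example.

```python
import threading


class ConcurrencyLimiter:
    """Reject new executions once `limit` of them are already in flight."""

    def __init__(self, limit):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        """Return True if the execution may start, False if it is throttled."""
        with self._lock:
            if self.in_flight >= self.limit:
                return False
            self.in_flight += 1
            return True

    def release(self):
        """Mark one execution as finished."""
        with self._lock:
            if self.in_flight > 0:
                self.in_flight -= 1


limiter = ConcurrencyLimiter(limit=2)
results = [limiter.try_acquire() for _ in range(3)]
print(results)                # [True, True, False]: the third call is throttled
limiter.release()
print(limiter.try_acquire())  # True: a slot was freed
```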

Demos

After you create an ASK cluster, you can apply for the Knative feature by joining the provided DingTalk group. Then, you can directly use the capabilities provided by Knative in the ASK cluster.

# kubectl -n knative-serving get svc
NAME              TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)        AGE
ingress-gateway   LoadBalancer   172.19.8.35   47.102.220.35   80:30695/TCP   26h

Deploy the Coffee Service

Save the following content to the coffee.yaml file and then deploy the service to the cluster by using kubectl.

# cat coffee.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: coffee
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:160e4dc8
          env:
            - name: TARGET
              value: "coffee"

# kubectl get ksvc
NAME     URL                                 LATESTCREATED   LATESTREADY    READY   REASON
coffee   http://coffee.default.example.com   coffee-fh85b    coffee-fh85b   True

# curl -H "Host: coffee.default.example.com" http://47.102.220.35
Hello coffee!

Auto Scaling

The autoscaler is a first-class object in Knative and the core capability that helps users reduce costs. The default autoscaling policy, Knative Pod Autoscaler (KPA), automatically adjusts the number of pods based on real-time request traffic. Let us try out the scaling capability of Knative. First, check the current pod information.

# kubectl get pod
NAME                                       READY   STATUS    RESTARTS   AGE
coffee-bwl9z-deployment-765d98766b-nvwmw   2/2     Running   0          42s

# cat coffee-v2.yaml
... ...
      name: coffee-v2
      annotations:
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:160e4dc8
... ...
# hey -z 30s -c 90 --host "coffee.default.example.com" "http://47.100.6.39/?sleep=100"
  • -z 30s specifies that the stress test runs for 30 seconds.
  • -c 90 specifies that 90 concurrent requests are initiated in the stress test.
  • --host "coffee.default.example.com" sets the HTTP Host header of each request.
  • "http://47.100.6.39/?sleep=100" specifies the request URL. In particular, sleep=100 makes the test image sleep for 100 milliseconds before it responds, just like a real online service.
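
To get a feel for what this stress test should produce, the parameters above can be plugged into a back-of-the-envelope estimate. The Python sketch below assumes that each pod is meant to handle the target of 10 concurrent requests and that every request takes exactly 100 milliseconds. The helper names are hypothetical, and the real pod count may differ because the autoscaler averages traffic over time and network overhead is ignored here.

```python
import math


def expected_pods(concurrency, target_per_pod):
    """Pods needed if each pod handles `target_per_pod` concurrent requests."""
    return math.ceil(concurrency / target_per_pod)


def max_requests_per_second(concurrency, latency_ms):
    """Upper bound on throughput when every request takes `latency_ms`."""
    return concurrency * 1000 / latency_ms


# Values taken from the hey command and the coffee.yaml annotation.
pods = expected_pods(concurrency=90, target_per_pod=10)
rps = max_requests_per_second(concurrency=90, latency_ms=100)
print(pods)  # 9
print(rps)   # 900.0
```

If the number of pods observed with kubectl get pod during the test is far from this estimate, the target annotation and the Host header are the first things to check.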

Reserved Instances

The preceding Highlights section describes how ASK Knative uses reserved instances to resolve the cold start problems and reduce costs. This section describes how to switch between reserved instances and standard instances.

# kubectl get pod
NAME                                               READY   STATUS    RESTARTS   AGE
coffee-bwl9z-deployment-reserve-85fd89b567-vpwqc   2/2     Running   0          5m24s

# cat coffee-set-reserve.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: coffee
spec:
  template:
    metadata:
      annotations:
        knative.aliyun.com/reserve-instance-eci-use-specs: "ecs.t5-c1m2.large"
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:160e4dc8
          env:
            - name: TARGET
              value: "coffee-set-reserve"

# kubectl get pod
NAME                                               READY   STATUS    RESTARTS   AGE
coffee-vvfd8-deployment-reserve-845f79b494-lkmh9   2/2     Running   0          2m37s

# kubectl get pod coffee-vvfd8-deployment-reserve-845f79b494-lkmh9 -o yaml | head -20
apiVersion: v1
kind: Pod
metadata:
  annotations:
    ... ...
    k8s.aliyun.com/eci-instance-cpu: "2.000"
    k8s.aliyun.com/eci-instance-mem: "4.00"
    k8s.aliyun.com/eci-instance-spec: ecs.t5-c1m2.large
    ... ...

Upgrade the Coffee Service

Before the upgrade, let’s check the current pod information:

# kubectl get pod
NAME                                       READY   STATUS    RESTARTS   AGE
coffee-fh85b-deployment-8589564f7b-4lsnf   1/2     Running   0          26s

# cat coffee-v1.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: coffee
spec:
  template:
    metadata:
      name: coffee-v1
      annotations:
        autoscaling.knative.dev/target: "10"
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/knative-sample/helloworld-go:160e4dc8
          env:
            - name: TARGET
              value: "coffee-v1"
  • The TARGET environment variable is set to "coffee-v1". Therefore, if the HTTP response body contains "v1", you can determine that version v1 is serving the current traffic.
# curl -H "Host: coffee.default.example.com" http://47.102.220.35
Hello coffee-v1!
# kubectl get pod
NAME                                    READY   STATUS    RESTARTS   AGE
coffee-v1-deployment-5c5b59b484-z48gm   2/2     Running   0          54s

Summary

Knative is the most popular serverless orchestration framework in the Kubernetes ecosystem. The Knative community version can provide services only when a resident controller and a resident gateway are running. These resident instances not only incur IaaS costs but also complicate O&M, which makes serverless applications harder to develop. Therefore, ASK fully manages Knative Serving, so that it works out of the box and you save the cost of these resident instances. ASK provides gateway capabilities through SLB and offers various types of reserved instances based on burstable instances. Burstable instances accumulate CPU credits during off-peak hours and consume them during peak hours, which allows you to greatly reduce IaaS costs and improve investment efficiency.
