Getting Started with Kubernetes | Kubernetes Container Runtime Interface
By Zhijin, Engineer at Alibaba Cloud
The Container Runtime Interface (CRI) is the interface through which Kubernetes communicates with container runtimes. This article is divided into three parts: (1) origin and design of the CRI; (2) CRI implementation methods; (3) CRI-related tools.
Before the CRI was introduced in Kubernetes 1.5, Docker was the first container runtime that Kubernetes supported. Kubelet operated on containers by calling Docker APIs through the built-in dockershim, driving them toward the desired (final) state. rkt was another container runtime, developed after Docker, and it was also incorporated into the Kubelet code so that Kubernetes could support it. With both Docker and rkt code embedded, the Kubelet code became increasingly complex and difficult to maintain. Then, hyper.sh joined the community and wanted to become the third container runtime.
At this point, it was proposed to abstract container runtime operations into an interface and decouple the Kubelet code from the implementation code of any specific container runtime. This interface is the CRI, which allows container runtimes to plug into Kubernetes.
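The decoupling idea can be sketched in a few lines. This is only an illustration, not the real CRI: the `ContainerRuntime` protocol, the two fake back ends, and `run_once` are hypothetical names invented for this example, and the actual interface is a gRPC service with many more methods.

```python
from typing import Protocol


class ContainerRuntime(Protocol):
    """Hypothetical, heavily reduced stand-in for a runtime-facing interface."""

    def create_container(self, name: str, image: str) -> str: ...

    def remove_container(self, container_id: str) -> None: ...


class FakeDockerRuntime:
    """Illustrative back end; a real one would talk to an actual runtime."""

    def __init__(self) -> None:
        self.containers = {}  # container id -> image

    def create_container(self, name: str, image: str) -> str:
        container_id = f"docker-{name}"
        self.containers[container_id] = image
        return container_id

    def remove_container(self, container_id: str) -> None:
        self.containers.pop(container_id, None)


class FakeRktRuntime:
    """A second illustrative back end with the same method signatures."""

    def __init__(self) -> None:
        self.containers = {}  # container id -> image

    def create_container(self, name: str, image: str) -> str:
        container_id = f"rkt-{name}"
        self.containers[container_id] = image
        return container_id

    def remove_container(self, container_id: str) -> None:
        self.containers.pop(container_id, None)


def run_once(runtime: ContainerRuntime, name: str, image: str) -> str:
    """Caller-side code: it depends only on the interface, not on a back end."""
    return runtime.create_container(name, image)
```

Because `run_once` only knows the interface, adding a third runtime means writing one more class and leaving the caller untouched, which is the maintenance benefit the CRI was after.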
As the saying goes, any problem in software can be solved by adding another layer, and the CRI is the layer that was added to solve the container runtime problem. The CRI uses gRPC as its communication protocol. gRPC had been open-sourced not long before the CRI was designed. It generally performs better than REST-style HTTP APIs, and it can automatically generate the communication code from the interface definition, saving you the trouble of manually writing the client code and server code.
Now, let’s look at the design of the CRI.
The preceding figure shows the Kubelet architecture after the introduction of the CRI. The generic runtime manager is the Kubelet component that handles container operations. Currently, dockershim still exists in the Kubelet code and provides the most stable container runtime implementation. “Remote” refers to the CRI. The CRI consists of two parts:
- The CRI server provides generic interfaces used to create and delete containers.
- The streaming server provides streaming data interfaces such as exec and port-forward.
The Container Network Interface (CNI) is also invoked at the CRI layer because network resources are created when a pod is created and then injected into the pod. Containers and images, in turn, are generally created and managed by a container engine.
The CRI design is described as follows. Kubernetes is oriented to the final (desired) state. In each reconciliation loop, Kubelet retrieves from the API server the pods that are scheduled to the local node and processes them based on their desired state.
In the loop’s first phase, Kubelet obtains the current container status through the List interface. It then creates containers through the Sandbox and Container interfaces and pulls container images through the Image interface. In other words, the CRI describes the container runtime behavior that Kubelet expects.
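The loop described above can be sketched as follows. This is a toy model under stated assumptions: `FakeRuntime`, its method names, and the shape of `desired` are invented for illustration and only loosely mirror the List, Sandbox, Container, and Image interfaces. It is not Kubelet's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRuntime:
    """Toy in-memory runtime (hypothetical; not a real CRI client)."""

    images: set = field(default_factory=set)
    sandboxes: set = field(default_factory=set)
    containers: set = field(default_factory=set)  # (pod, container) pairs

    def list_containers(self) -> set:
        return set(self.containers)

    def run_pod_sandbox(self, pod: str) -> None:
        self.sandboxes.add(pod)

    def pull_image(self, image: str) -> None:
        self.images.add(image)

    def create_and_start(self, pod: str, name: str, image: str) -> None:
        if pod not in self.sandboxes or image not in self.images:
            raise RuntimeError("sandbox and image must exist first")
        self.containers.add((pod, name))


def sync_once(desired: dict, runtime: FakeRuntime) -> list:
    """One pass of the loop: observe the actual state, then act on the gap.

    `desired` maps a pod name to a {container name: image} dict.
    Returns the (pod, container) pairs created during this pass.
    """
    actual = runtime.list_containers()
    created = []
    for pod, containers in desired.items():
        for name, image in containers.items():
            if (pod, name) in actual:
                continue  # already in the desired state
            if pod not in runtime.sandboxes:
                runtime.run_pod_sandbox(pod)
            if image not in runtime.images:
                runtime.pull_image(image)
            runtime.create_and_start(pod, name, image)
            created.append((pod, name))
    return created
```

Running `sync_once` twice with the same `desired` value creates everything on the first pass and nothing on the second, which is what being oriented to the final state means in practice.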
Container Lifecycle Management Through the CRI
When you run a pod through the kubectl command, Kubelet performs the following operations through the CRI:
- Kubelet calls the RunPodSandbox interface to create a pod container to store container-related resources, such as the network space, PID space, and process space.
- Kubelet calls the CreateContainer interface to create a business container in the pod container space.
- Kubelet calls the StartContainer interface to start the container. The interfaces used to destroy containers are StopContainer and RemoveContainer.
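The call order listed above can be made concrete with a small sketch. `RecordingRuntime`, `run_pod`, and `destroy_containers` are hypothetical helpers written for this article. Only the recorded call names (RunPodSandbox, CreateContainer, StartContainer, StopContainer, RemoveContainer) come from the CRI, and the ordering checks are simplifying assumptions of this toy model.

```python
class RecordingRuntime:
    """Toy in-memory runtime that records calls and checks their order."""

    def __init__(self):
        self.calls = []         # names of the CRI-style calls, in order
        self.sandboxes = set()  # known sandbox ids
        self.state = {}         # container id -> "created" | "running" | "exited"

    def run_pod_sandbox(self, pod):
        self.calls.append("RunPodSandbox")
        sandbox_id = f"sandbox-{pod}"
        self.sandboxes.add(sandbox_id)
        return sandbox_id

    def create_container(self, sandbox_id, name):
        if sandbox_id not in self.sandboxes:
            raise RuntimeError("the sandbox must exist before its containers")
        self.calls.append("CreateContainer")
        container_id = f"{sandbox_id}/{name}"
        self.state[container_id] = "created"
        return container_id

    def start_container(self, container_id):
        if self.state.get(container_id) != "created":
            raise RuntimeError("only a created container can be started")
        self.calls.append("StartContainer")
        self.state[container_id] = "running"

    def stop_container(self, container_id):
        if container_id not in self.state:
            raise RuntimeError("unknown container")
        self.calls.append("StopContainer")
        self.state[container_id] = "exited"

    def remove_container(self, container_id):
        self.calls.append("RemoveContainer")
        self.state.pop(container_id, None)


def run_pod(runtime, pod, container_names):
    """Create the sandbox first, then create and start each container in it."""
    sandbox_id = runtime.run_pod_sandbox(pod)
    container_ids = []
    for name in container_names:
        container_id = runtime.create_container(sandbox_id, name)
        runtime.start_container(container_id)
        container_ids.append(container_id)
    return container_ids


def destroy_containers(runtime, container_ids):
    """Stop and then remove each container."""
    for container_id in container_ids:
        runtime.stop_container(container_id)
        runtime.remove_container(container_id)
```

After `run_pod` followed by `destroy_containers`, `runtime.calls` holds the five call names in the order given in the list above.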
CRI Streaming Interface
This section describes the exec streaming interface of the CRI. The exec interface is used to run a command in a container and can be attached to the container’s I/O stream to run interactive commands. Serving exec through a separate streaming interface reduces resource use on the CRI server and keeps the connection stable.
The exec operation is sent to the API server. After authentication, the API server initiates an exec request to the Kubelet server. Then, Kubelet calls the CRI’s exec interface to send a specific request to the container runtime. The container runtime does not directly serve this request on the exec interface. Instead, it replies with the address of the streaming server, and the result of each execution is returned asynchronously through the streaming server. The API server interacts with the streaming server to obtain streaming data. This makes the CRI server more lightweight and reliable.
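The division of labor in this flow can be sketched as follows. Everything here is a simplified assumption made for illustration: `ToyStreamingServer`, `ToyRuntime`, the URL layout, and the single-use token are invented names and choices, and no real networking or command execution takes place.

```python
import secrets


class ToyStreamingServer:
    """Hypothetical streaming server that serves a prepared exec session."""

    def __init__(self, base_url):
        self.base_url = base_url
        self._pending = {}  # token -> (container id, command)

    def register(self, container_id, cmd):
        """Remember the request and hand back a URL that identifies it."""
        token = secrets.token_hex(8)
        self._pending[token] = (container_id, list(cmd))
        return f"{self.base_url}/exec/{token}"

    def serve(self, url):
        """Serve the session behind the URL; each token works only once."""
        token = url.rsplit("/", 1)[-1]
        container_id, cmd = self._pending.pop(token)
        # A real server would stream stdin/stdout here; we return a string.
        return f"[{container_id}] ran: {' '.join(cmd)}"


class ToyRuntime:
    """Hypothetical runtime whose exec call returns only a URL."""

    def __init__(self, streaming):
        self.streaming = streaming

    def exec_request(self, container_id, cmd):
        # No command output passes through this call, which keeps it cheap.
        return self.streaming.register(container_id, cmd)


def exec_in_container(runtime, streaming, container_id, cmd):
    """Caller side: ask the runtime for a URL, then fetch data from that URL."""
    url = runtime.exec_request(container_id, cmd)
    return streaming.serve(url)
```

The point of the design is visible in `exec_request`: the request path returns immediately with a small reply, while the long-lived data transfer happens on a separate server.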
The current CRI implementations include:
- CRI-containerd
- CRI-O
- PouchContainer @alibaba
CRI-containerd is a mainstream next-generation CRI implementation. CRI-O was developed by Red Hat. PouchContainer is Alibaba’s CRI implementation. There are also other CRI implementations, but we will not describe them here.
The following figure shows the CRI-containerd architecture.
CRI-containerd is implemented based on containerd. In its early implementations, CRI-containerd ran as an independent process that interacted with containerd. This resulted in inter-process overhead, so in later versions, the CRI was integrated into containerd as a plug-in.
The entire architecture looks intuitive. Meta Services, Runtime Service, and Storage Service are interfaces provided by containerd. These are generic container-related interfaces and are used to manage images and container runtimes. The CRI plug-in exposes a gRPC service on top of these interfaces. The right part of the figure shows a specific container implementation. For example, when creating a container, you need to create a specific runtime and shim, which are combined with the container to form a pod sandbox.
Because CRI-containerd is built on containerd, it offers a wider range of container runtime interfaces than the CRI alone. In addition to the CRI, you can call these interfaces by using the ctr tool provided by containerd.
The following figure shows the implementation of CRI-O.
CRI-O is a CRI service that is implemented by directly wrapping the container interfaces defined by the Open Container Initiative (OCI). CRI-O only exposes the CRI externally. It does not have a wide range of interfaces like containerd. CRI-O can be used to manage container runtimes and images.
This section describes CRI-related tools. These tools can all be found in the cri-tools project.
- crictl is a Docker-like command line tool used to operate the CRI. crictl helps users and developers debug container issues. With crictl, they do not need to apply a YAML file to the API server and then debug through Kubelet, because crictl operates on the CRI directly.
- critest is used to verify whether the CRI behavior is as expected.
- Performance tools: you can test interface performance by using other performance tools.
Question 1: Is it possible to improve the v1alpha2 version of the CRI specification?
Answer: The CRI specification is developed in a top-down manner. It is improved by extending the CRI whenever Kubernetes features need support from the container runtime.
Question 2: How do I create a custom runtime behavior through annotation?
Answer: Currently, the CRI cannot meet the requirements of all users. Many companies, such as Alibaba, have enhanced or customized the CRI in different ways. The simplest way is to customize the runtime behavior through annotations. The CRI interfaces carry an annotation field, and a container runtime that understands a given annotation can adjust its behavior accordingly.
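As a minimal sketch of the annotation approach: the annotation key, the handler names, and `pick_handler` below are hypothetical and invented for this example. The only assumption taken from the CRI is that the request configuration carries a map of annotations.

```python
# Hypothetical annotation key, invented for this example.
ISOLATION_KEY = "example.com/isolation"

# Hypothetical mapping from annotation value to a runtime handler name.
HANDLERS = {"default": "standard-handler", "strong": "vm-handler"}


def pick_handler(sandbox_config):
    """Choose a handler based on the annotations in a sandbox config dict."""
    annotations = sandbox_config.get("annotations") or {}
    value = annotations.get(ISOLATION_KEY, "default")
    # Unknown values fall back to the default handler instead of failing.
    return HANDLERS.get(value, HANDLERS["default"])
```

For example, `pick_handler({"annotations": {"example.com/isolation": "strong"}})` selects the hypothetical `vm-handler`, while a config without the annotation keeps the default behavior. A runtime that does not recognize the key simply ignores it, which is why this is the least intrusive way to customize behavior.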
Let’s summarize what we have learned in this article.
- The CRI is intended to decouple the container runtime from Kubernetes.
- CRI implementations include CRI-O and CRI-containerd.
- The cri-tools project provides crictl for CRI debugging and critest for CRI testing.