By Xiheng, Senior Technical Expert at Alibaba Cloud
The network architecture is a complex component of Kubernetes. The Kubernetes network model poses requirements for specific network functions. The industry has developed many network solutions for specific environments and requirements. A container network interface (CNI) allows you to easily configure a container network when creating or destroying a container. This article describes how classic network plug-ins work and how to use CNI plug-ins.
What Is a CNI?
A CNI is a standard network implementation interface in Kubernetes. Kubelet calls different network plug-ins through the CNI to implement different network configuration methods. The interface is implemented by plug-ins such as Calico, Flannel, Terway, Weave Net, and Contiv.
How to Use CNIs in Kubernetes
Kubernetes determines which CNI to use based on the CNI configuration file.
The CNI usage instructions are as follows:
(1) Configure the CNI configuration file (/etc/cni/net.d/xxnet.conf) on each node, where xxnet.conf indicates the name of a network configuration file.
(2) Install the binary plug-in that the CNI configuration file refers to.
(3) After a pod is created on a node, Kubelet runs the CNI plug-in installed in the previous two steps based on the CNI configuration file.
(4) This completes the pod network configuration.
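As a rough illustration of step (1), the following sketch builds a minimal network configuration and prints the JSON that would be saved as /etc/cni/net.d/xxnet.conf. The cniVersion, name, type, and ipam fields come from the CNI specification. Everything else is an assumption made for illustration: the reference bridge and host-local plug-ins stand in for whichever plug-in you install, and the version string, the cni0 bridge name, and the subnet are example values.

```python
import json


def build_bridge_conf(name: str, subnet: str) -> dict:
    """Return a minimal CNI network configuration for the reference bridge plug-in."""
    return {
        "cniVersion": "0.4.0",      # assumed spec version, adjust to your runtime
        "name": name,                # network name, here taken from the file name
        "type": "bridge",           # name of the plug-in binary to execute
        "bridge": "cni0",           # assumed Linux bridge name (bridge plug-in option)
        "ipam": {
            "type": "host-local",   # reference IPAM plug-in, allocates from a local range
            "subnet": subnet,
        },
    }


conf = build_bridge_conf("xxnet", "172.16.0.0/24")
conf_json = json.dumps(conf, indent=2)
print(conf_json)
```

The value of type is what ties step (1) to step (2): a binary with that name must be installed on the node.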
The following figure shows the detailed process.
When a pod is created in a cluster, the API server writes the pod configuration to the cluster. Control components such as the scheduler then assign the pod to a specific node. After detecting that the pod has been created on its node, Kubelet performs the creation actions locally. When it reaches the network setup step, Kubelet reads the configuration file in the configuration directory. The configuration file declares the plug-in to use. Kubelet then executes the binary file of the CNI plug-in. The CNI plug-in enters the pod’s network namespace to configure the pod network. After the pod network is configured, Kubelet finishes creating the pod and the pod goes online.
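The call to the plug-in binary can be sketched as follows. According to the CNI specification, the caller passes parameters through environment variables such as CNI_COMMAND, CNI_CONTAINERID, CNI_NETNS, CNI_IFNAME, and CNI_PATH, and sends the network configuration on standard input. In recent Kubernetes versions, the container runtime makes this call on Kubelet’s behalf. The sketch only prepares the call and does not execute anything. The container ID, the namespace path, and the /opt/cni/bin directory are assumed example values.

```python
import json


def build_cni_invocation(conf: dict, container_id: str, netns_path: str,
                         command: str = "ADD", bin_dir: str = "/opt/cni/bin"):
    """Return (argv, env, stdin_bytes) for one plug-in call. Nothing is executed."""
    # The plug-in binary is looked up by the "type" field of the configuration.
    argv = [f"{bin_dir}/{conf['type']}"]
    env = {
        "CNI_COMMAND": command,           # ADD when a pod is created, DEL when it is removed
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns_path,          # path to the pod's network namespace
        "CNI_IFNAME": "eth0",             # interface name to create inside the pod
        "CNI_PATH": bin_dir,              # where plug-in binaries are searched
    }
    # The network configuration itself is passed on standard input as JSON.
    stdin_bytes = json.dumps(conf).encode("utf-8")
    return argv, env, stdin_bytes


conf = {"cniVersion": "0.4.0", "name": "xxnet", "type": "bridge"}
argv, env, stdin_bytes = build_cni_invocation(conf, "abc123", "/var/run/netns/pod1")
print(argv[0], env["CNI_COMMAND"])
```

A real caller would hand these values to something like subprocess.run(argv, env=env, input=stdin_bytes) and parse the JSON result that the plug-in prints on standard output.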
The preceding process seems complicated and involves multiple steps, such as configuring the CNI configuration file and installing the binary plug-in.
However, many CNI plug-ins can be installed in one click and are easy to use. The following figure shows how to use the Flannel plug-in. You apply Flannel’s deployment template with kubectl apply, and Flannel automatically installs the configuration and binary file on each node.
Then, all CNI plug-ins required by the cluster are installed.
Many CNI plug-ins provide a one-click installation script. You do not need to concern yourself with the internal configuration of Kubernetes or how the APIs are called.
Select an Appropriate CNI Plug-in
The community provides many CNI plug-ins, such as Calico, Flannel, and Terway. Before selecting an appropriate CNI plug-in in a production environment, let’s have a look at the different CNI implementation modes.
Select an implementation mode based on a specific scenario, and then select an appropriate CNI plug-in.
CNI plug-ins are divided into three implementation modes: Overlay, Routing, and Underlay.
- In Overlay mode, a container is independent of the host’s IP address range. During cross-host communication, tunnels are established between hosts and all packets in the container CIDR block are encapsulated as packets exchanged between hosts in the underlying physical network. This mode removes the dependency on the underlying network.
- In Routing mode, hosts and containers belong to different CIDR blocks. Cross-host communication is implemented through routing. No tunnels are established between different hosts for packet encapsulation. However, route interconnection partially depends on the underlying network. For example, the hosts must be reachable from one another at Layer 2 in the underlying network.
- In Underlay mode, containers and hosts are located at the same network layer and have equal standing in the network. Network interconnection between containers depends on the underlying network. Therefore, this mode is highly dependent on the underlying capabilities.
Select an implementation mode based on your environment and needs. Then, select an appropriate CNI plug-in in this mode. How do we determine the implementation mode of each CNI plug-in available in the community? How do we select an appropriate CNI plug-in? These questions can be answered by considering the following aspects:
Different environments provide different underlying capabilities.
- A virtual environment, such as OpenStack, imposes many network restrictions. For example, machines cannot directly access each other through a Layer 2 protocol, only packets with Layer 3 features such as IP addresses are forwarded, and a host can only use specified IP addresses. For a restricted underlying network, you can only select plug-ins in Overlay mode, such as Flannel-VXLAN, Calico-IPIP, and Weave.
- A physical machine environment imposes few restrictions on the underlying network. For example, Layer 2 communication can be implemented within a switch. In such a cluster environment, you can select plug-ins in Underlay or Routing mode. In Underlay mode, you can insert multiple network interface controllers (NICs) directly into a physical machine, or virtualize hardware on NICs. In Routing mode, routes are established through a Linux routing protocol. This avoids the performance degradation caused by VXLAN encapsulation. In this environment, you can select plug-ins such as Calico-BGP, Flannel-HostGW, and Sriov.
- A public cloud environment is a type of virtual environment and places many restrictions on the underlying capabilities. However, each public cloud adapts containers for improved performance and may provide APIs to configure additional NICs or routing capabilities. If you run businesses on the public cloud, we recommend that you select the CNI plug-ins provided by a public cloud vendor for compatibility and optimal performance. For example, Alibaba Cloud provides the high-performance Terway plug-in.
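The environment-based guidance above can be written down as a small lookup table. This is only a sketch of the reasoning in this article, not a complete catalog. The environment keys and the function name are hypothetical, and the grouping of Calico-BGP and Flannel-HostGW under Routing and of Sriov under Underlay is our reading of the text.

```python
from typing import Dict, List

# Candidate plug-ins per environment, grouped by implementation mode.
CANDIDATES: Dict[str, Dict[str, List[str]]] = {
    "virtual": {
        "Overlay": ["Flannel-VXLAN", "Calico-IPIP", "Weave"],
    },
    "physical": {
        "Routing": ["Calico-BGP", "Flannel-HostGW"],
        "Underlay": ["Sriov"],
    },
    "public_cloud": {
        # Prefer the vendor's own plug-in; Terway is the Alibaba Cloud example.
        "Vendor": ["Terway"],
    },
}


def candidates_for(environment: str) -> List[str]:
    """Return all plug-ins that the environment restrictions still allow."""
    modes = CANDIDATES[environment]
    return [plugin for plugins in modes.values() for plugin in plugins]


print(candidates_for("virtual"))
print(candidates_for("physical"))
```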
After considering the environmental restrictions, you may have an idea of which plug-ins can be used and which ones cannot. Then, consider your functional requirements.
- First, consider security requirements.
Kubernetes supports NetworkPolicy, allowing you to configure rules to support policies such as whether to allow inter-pod access. Not every CNI plug-in supports NetworkPolicy declaration. If you require NetworkPolicy support, you can select Calico or Weave.
- Second, consider the need to interconnect resources within and outside a cluster.
Applications deployed on virtual machines (VMs) or physical machines cannot be migrated all at once to a containerized environment. Therefore, it is necessary to configure IP address interconnection between VMs or physical machines and containers by interconnecting them or deploying them at the same layer. You can select a plug-in in Underlay mode. For example, the Sriov plug-in allows pods and legacy VMs or physical machines to run at the same layer. You can also use the Calico-BGP plug-in. Though containers are in a different CIDR block from the legacy VMs or physical machines, you can use Calico-BGP to advertise BGP routes to original routers, allowing the interconnection of VMs and containers.
- Lastly, consider Kubernetes’ service discovery and load balancing capabilities.
Service discovery and load balancing are provided by Kubernetes Services. Not all CNI plug-ins support these two capabilities. For many plug-ins in Underlay mode, the NIC of a pod is the Underlay hardware itself or is virtualized in hardware and inserted into the container. Therefore, the NIC traffic does not pass through the network namespace of the host. As a result, the rules that kube-proxy configures on the host are not applied to this traffic.
In this case, pods that use the plug-in cannot use the service discovery capabilities of Kubernetes. If you require service discovery and load balancing, select a plug-in in Underlay mode that supports these two capabilities.
Consideration of functional requirements will narrow your plug-in choices. If you still have three or four plug-ins to choose from, you can consider performance requirements.
Pod performance can be measured in terms of pod creation speed and pod network performance.
- Pod Creation Speed
For example, when you need to scale out 1,000 pods during a business peak, the CNI plug-in must create and configure 1,000 network resources. You can select a plug-in in Overlay or Routing mode to create pods quickly. Such a plug-in implements virtualization on the machines, so it only needs to call kernel interfaces to set up each pod network. If you select a plug-in in Underlay mode, underlying network resources must be created, which slows down the pod creation process. Therefore, we recommend that you select a plug-in in Overlay or Routing mode when you need to quickly scale out pods or create many pods.
- Pod Network Performance
The network performance of pods is measured by metrics such as inter-pod network forwarding, network bandwidth, packets per second (PPS), and latency. A plug-in in Overlay mode provides lower performance than plug-ins in Underlay and Routing modes because it implements virtualization on nodes and encapsulates packets. This encapsulation adds packet header overhead and consumes CPU. Therefore, do not select a plug-in in Overlay mode if you require high network performance in scenarios such as machine learning and big data. We recommend that you select a CNI plug-in in Underlay or Routing mode.
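To give a feel for the header overhead of encapsulation, the following sketch computes the MTU left for pod traffic when VXLAN is used. It assumes an IPv4 underlay without IP options or VLAN tags. The header sizes are the standard ones, and the 1500-byte underlay MTU is an example value.

```python
# Bytes that VXLAN encapsulation places inside each underlay IP packet.
OUTER_IPV4 = 20      # outer IPv4 header without options
OUTER_UDP = 8        # outer UDP header
VXLAN_HEADER = 8     # VXLAN header
INNER_ETHERNET = 14  # Ethernet header of the encapsulated frame

OVERHEAD = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET


def vxlan_inner_mtu(underlay_mtu: int) -> int:
    """Largest pod IP packet that fits in one underlay packet."""
    return underlay_mtu - OVERHEAD


def payload_fraction(underlay_mtu: int) -> float:
    """Share of each full-size underlay packet that carries the pod's IP packet."""
    return vxlan_inner_mtu(underlay_mtu) / underlay_mtu


print(vxlan_inner_mtu(1500))   # 1450
print(round(payload_fraction(1500), 4))
```

The byte overhead is only part of the cost. The CPU time spent on encapsulating and decapsulating each packet is not modeled here.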
You can select an appropriate network plug-in by considering the preceding three requirements.
How to Develop Your Own CNI Plug-in
The plug-ins provided by the community may not meet your specific requirements. For example, only the VXLAN plug-in in Overlay mode can be used in Alibaba Cloud. This plug-in provides relatively poor performance and cannot meet some business requirements of Alibaba Cloud. In response, Alibaba Cloud developed the Terway plug-in.
You can develop a CNI plug-in if none of the plug-ins in the community are suitable for your environment.
A CNI plug-in is implemented as follows:
(1) A binary CNI plug-in is used to configure the NIC and IP address of a pod. This is equivalent to connecting a network cable to the pod, which has an IP address and NIC.
(2) A daemon process is used to manage the network connections between pods. This step connects pods to the network and enables them to communicate with each other.
Connect a Network Cable to a Pod
A network cable can be connected to a pod as follows:
Prepare an NIC for the Pod
You can place one end of a veth pair (two connected virtual NICs) in the pod’s network namespace and the other end in the host’s network namespace. In this way, the namespaces of the pod and host are connected.
Allocate an IP Address to the Pod
Ensure that the IP address allocated to the pod is unique in the cluster.
When creating a cluster, we specify a CIDR block for pods and then allocate a sub-block to each node. As shown on the right in the preceding figure, the 172.16 CIDR block is created, and each node is allocated a block with a /24 prefix. This avoids conflicts between the IP addresses on different nodes. Each pod is allocated an IP address from the CIDR block of its node in sequence. For example, pod 1 is allocated 172.16.0.1, and pod 2 is allocated 172.16.0.2. Because each node uses a different CIDR block and addresses within a node are allocated in sequence, no two pods receive the same IP address.
In this way, each pod has a unique IP address in the cluster.
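The allocation scheme can be sketched with Python’s standard ipaddress module. The class and function names are hypothetical, and the sketch assumes that the cluster CIDR block is 172.16.0.0/16. It leaves out address release and persistence, which a real IPAM component needs.

```python
import ipaddress


def node_cidrs(cluster_cidr: str, node_prefix: int = 24):
    """Yield one sub-block per node from the cluster CIDR block."""
    return ipaddress.ip_network(cluster_cidr).subnets(new_prefix=node_prefix)


class NodeIPAM:
    """Hands out addresses from one node's block in sequence."""

    def __init__(self, cidr: ipaddress.IPv4Network):
        self.cidr = cidr
        self._hosts = iter(cidr.hosts())  # usable addresses, lowest first

    def allocate(self) -> str:
        # Raises StopIteration once the node's block is exhausted.
        return str(next(self._hosts))


blocks = node_cidrs("172.16.0.0/16")
node1 = NodeIPAM(next(blocks))  # first node block
node2 = NodeIPAM(next(blocks))  # second node block

pod1 = node1.allocate()
pod2 = node1.allocate()
pod3 = node2.allocate()
print(node1.cidr, pod1, pod2)
print(node2.cidr, pod3)
```

Because the node blocks never overlap, each node can allocate addresses locally without coordinating with other nodes.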
Configure the IP Address and Route of the Pod
- Step 1: Configure the allocated IP address to the pod’s NIC.
- Step 2: Configure the route of the cluster’s CIDR block on the pod’s NIC so that traffic from this pod to the cluster leaves through its NIC. Also, configure the default route on this NIC so that traffic to the Internet leaves through this NIC.
- Step 3: On the host, configure a route destined for the pod’s IP address and point it at veth1, the host-side end of the pod’s veth pair. In this way, traffic can be routed from the pod to the host, and traffic from the host to the pod’s IP address is delivered through veth1 to the pod’s NIC.
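The NIC preparation and the three steps above can be sketched as a list of iproute2 commands. The function only builds the command strings and does not run them, because running them requires root privileges. The namespace name pod1, the interface name veth0, and the addresses are assumed example values, and the namespace is assumed to be a named one that ip -n can find. ARP handling between the two ends, for example proxy ARP on veth1, is left out.

```python
from typing import List


def pod_network_commands(netns: str, pod_ip: str, cluster_cidr: str,
                         pod_if: str = "veth0", host_if: str = "veth1") -> List[str]:
    """Return the iproute2 commands that wire one pod to its host."""
    in_pod = f"ip -n {netns}"  # run the ip command inside the pod's namespace
    return [
        # Prepare the NIC: create the veth pair and move one end into the pod.
        f"ip link add {pod_if} type veth peer name {host_if}",
        f"ip link set {pod_if} netns {netns}",
        # Step 1: configure the allocated IP address on the pod's NIC.
        f"{in_pod} addr add {pod_ip}/32 dev {pod_if}",
        f"{in_pod} link set {pod_if} up",
        f"ip link set {host_if} up",
        # Step 2: cluster route and default route inside the pod.
        f"{in_pod} route add {cluster_cidr} dev {pod_if}",
        f"{in_pod} route add default dev {pod_if}",
        # Step 3: host route to the pod's IP address through the host-side end.
        f"ip route add {pod_ip}/32 dev {host_if}",
    ]


cmds = pod_network_commands("pod1", "172.16.0.2", "172.16.0.0/16")
for cmd in cmds:
    print(cmd)
```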
Connect the Pod to the Network
After a network cable is connected to the pod, the pod has an IP address and a route table. Next, you can enable communication between pods by making each pod’s IP address reachable in the cluster.
The CNI daemon process interconnects pods as follows:
- The CNI daemon process that runs on each node learns the IP address of each pod in the cluster and information about the node where this pod is located. The daemon process watches the Kubernetes API server to obtain each pod’s IP address and node information. Each daemon is notified when nodes and pods are created.
- Then, the daemon process configures network connection in two steps:
- (1) The daemon process creates a channel to each node of the cluster. This channel is an abstract concept and implemented by the Overlay tunnel, the Virtual Private Cloud (VPC) route table in Alibaba Cloud, or the BGP route in your own data center.
- (2) The IP addresses of all pods are associated with the created channel. Association here is also an abstract concept and implemented through Linux routing, a forwarding database (FDB) table, or an Open vSwitch (OVS) flow table. Through Linux routing, you can configure a route from an IP address to a specific node. An FDB table is used to configure a route from a pod’s IP address to the tunnel endpoint of a node through an Overlay network. A flow table provided by OVS is used to configure a route from a pod’s IP address to a node.
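As a minimal sketch of the Linux routing variant, the following class keeps track of which node owns which pod CIDR block and returns the route command to apply for each node event. The class name, the event methods, and the node addresses are hypothetical. A real daemon would receive these events from a watch on the Kubernetes API server and apply the routes itself. A plain via route like this assumes that the nodes can reach each other directly at Layer 2.

```python
from typing import Dict, List


class RouteSync:
    """Tracks the pod CIDR block of every remote node and emits host routes."""

    def __init__(self, local_node: str):
        self.local_node = local_node
        self.routes: Dict[str, str] = {}  # pod CIDR block -> node IP address

    def on_node_added(self, node: str, node_ip: str, pod_cidr: str) -> List[str]:
        if node == self.local_node:
            return []  # local pods are reached through their veth routes
        self.routes[pod_cidr] = node_ip
        # Associate all pod IP addresses of that node with the channel to it.
        return [f"ip route replace {pod_cidr} via {node_ip}"]

    def on_node_removed(self, pod_cidr: str) -> List[str]:
        if self.routes.pop(pod_cidr, None) is None:
            return []  # nothing was installed for this block
        return [f"ip route del {pod_cidr}"]


sync = RouteSync(local_node="node1")
added_local = sync.on_node_added("node1", "10.0.0.1", "172.16.0.0/24")
added_remote = sync.on_node_added("node2", "10.0.0.2", "172.16.1.0/24")
removed = sync.on_node_removed("172.16.1.0/24")
print(added_local, added_remote, removed)
```

With an Overlay tunnel, the same bookkeeping would produce FDB entries instead of routes. With OVS, it would produce flow table entries.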
Let’s summarize what we have learned in this article.
(1) How to select an appropriate CNI plug-in when building a Kubernetes cluster in your environment.
(2) How to develop a CNI plug-in when the CNI plug-ins available in the community cannot meet your requirements.