Getting Started with Kubernetes | Kubernetes Network and Policy Control
By Ye Lei (Daonong), Senior Technical Expert of Alibaba
1) Basic Network Model of Kubernetes
This article introduces the fundamentals underlying the Kubernetes network model. Kubernetes does not mandate a particular network implementation, so there is no single ideal reference implementation. Instead, Kubernetes defines a container network model that constrains what any container network must provide. This model is generally summarized as three requirements and four goals.
The three requirements must be met by any qualified container network solution. The four goals describe what the design of the network topology and the implementation of network functions must achieve, such as connectivity.
The three requirements are as follows:
- 1) Any two pods can communicate with each other directly, without explicitly using network address translation (NAT) to receive data or translate addresses.
- 2) Nodes and pods can communicate with each other, likewise without explicit address translation.
- 3) The IP address a pod sees for itself is the same as the address others see for it, with no translation in between.
The reasons why Kubernetes imposes these seemingly opinionated requirements on container networks are discussed later.
The four goals specify how to connect an external system to applications in a container in a Kubernetes system.
- How does an external system communicate with a service? In other words, how do Internet or other external users access a service? Here, service refers to a Kubernetes Service.
- How do these services communicate with backend pods?
- How do pods invoke each other?
- How do containers in a pod communicate with each other?
The ultimate goal is to enable the connected external system to provide services for containers.
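As a concrete illustration of the second goal, how a Service reaches its backend pods, here is a minimal sketch (all names, labels, and images are hypothetical, chosen only for this example): the Service selects pods by label, and service traffic is forwarded to the matching pod IPs.

```yaml
# Hypothetical Service fronting a set of backend pods.
apiVersion: v1
kind: Service
metadata:
  name: web            # illustrative name
spec:
  selector:
    app: web           # selects pods labeled app=web
  ports:
  - port: 80           # port exposed by the Service
    targetPort: 8080   # port the backend pods listen on
---
# One of the backend pods selected by the Service above.
apiVersion: v1
kind: Pod
metadata:
  name: web-backend
  labels:
    app: web           # this label makes the pod a Service backend
spec:
  containers:
  - name: server
    image: nginx       # example image; any server listening on 8080 works
    ports:
    - containerPort: 8080
```

Traffic sent to the Service's cluster IP on port 80 is load-balanced across the pod IPs that match the selector, which is exactly the service-to-backend path described above.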
The development of container networks is complex because it depends on the host network. From this perspective, container networks are divided into two categories: underlay and overlay.
- An underlay network sits at the same layer as the host network. Its visible distinguishing marks are whether it shares the host network's CIDR block and basic input/output devices, and whether container IP addresses are distributed or assigned by the same authority that manages host addresses.
- By contrast, an overlay network does not need to request IP addresses from the IPAM (IP address management) component of the host network. Generally, any address range that does not conflict with the host network can be assigned to the overlay.
It is worth understanding why the community put forward the simple, opinionated IP-per-pod model. In my opinion, it is because this model brings many benefits for subsequent performance tracking and service monitoring: since a pod always appears under the same IP address, observations made at different points can be correlated directly.
2) Network Namespace
This section describes the kernel foundation on which container networking is implemented: the network namespace. Strictly speaking, the runC container technology does not depend on any hardware; its execution foundation is the kernel. The kernel's representation of a process is a task. If no isolation is required, a task uses the host's namespaces and needs no special namespace isolation data structure (nsproxy, the namespace proxy).
However, if a task needs an independent network or mount namespace, private data must be attached to it through nsproxy. The data structure is shown in the preceding figure.
A network namespace looks like an isolated network space that has its own network interface controllers (NICs) or network devices. A NIC may be virtual or physical. The namespace has its own IP addresses, routing table, and protocol stack state. The protocol stack here is the TCP/IP stack, with its own state and its own iptables and IPVS configuration.
In other words, it is a completely independent network, isolated from the host network. The protocol stack code is shared, but the data structures are per-namespace.
Relationships Between Pods and Network Namespaces
The preceding figure shows the relationship between pods and network namespaces. Each pod has an independent network namespace that is shared by all containers in the pod. Containers within a pod communicate with one another over the loopback interface, and all of them provide external services through the pod IP address. The root network namespace on the host, the one used by the init process with PID 1, can be considered a special network namespace.
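The shared network namespace can be demonstrated with a minimal two-container pod (the pod name and images are illustrative): because both containers share one network namespace, the client container can reach the server over the loopback interface.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-netns-demo       # hypothetical name for illustration
spec:
  containers:
  - name: server
    image: nginx                # serves HTTP on port 80 inside the pod
  - name: client
    image: curlimages/curl      # example client image
    # Both containers share the pod's network namespace,
    # so the server is reachable at localhost:80 without a Service.
    command: ["sh", "-c", "sleep 5 && curl -s http://localhost:80 && sleep 3600"]
```

From outside the pod, both containers are reached through the single pod IP, matching the model described above.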
3) Overview of Mainstream Network Solutions
Typical Implementation Solutions for Container Networks
This section briefly introduces typical implementation solutions for container networks. Container networking in Kubernetes is implemented in many different ways: it is complex because it must coordinate with the underlying IaaS network and balance performance against flexibility of IP address allocation, so a variety of solutions have emerged.
This section describes several popular solutions, including Flannel, Calico, Canal, and WeaveNet. Most of these solutions use a policy-based routing method similar to Calico's.
- Flannel is a comprehensive solution that provides several network backends; different backends implement different topologies, so the solution covers a wide range of scenarios.
- Calico uses policy-based routing, with BGP synchronizing routes among nodes. It is feature-rich, and in particular offers good support for network policy. However, its routing mode requires that the underlying network allow direct MAC-level connections between nodes; that is, the nodes must sit on the same L2 network.
- Some community members have combined the advantages of Flannel and Calico into a hybrid project named Canal.
- If data needs to be encrypted, consider WeaveNet, whose dynamic solutions provide good encryption support.
The Flannel solution is the most commonly used, and the preceding figure shows a typical container network design of this kind. Flannel first adds a bridge to carry packets from the container to the host. The backends are independent of this part: each backend decides how the packet leaves the host, which encapsulation method is used, and whether encapsulation is needed at all.
The following section describes three major backends:
- The first is the user-mode user datagram protocol (UDP) backend, which was also the earliest implementation.
- The second is the kernel-mode VXLAN backend. Both of these are overlay-network solutions; VXLAN offers better performance but requires a kernel that supports the VXLAN features.
- If the cluster is not large and all nodes reside on the same L2 network, use the host-gw backend. In this mode, the backend simply distributes routing rules that point each pod subnet at its node, which avoids encapsulation and yields higher performance.
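The backend choice is made in Flannel's net-conf.json, typically delivered through a ConfigMap. The following is a sketch assuming the standard kube-flannel deployment and its conventional 10.244.0.0/16 pod CIDR; adjust both to your cluster.

```yaml
# Sketch of the kube-flannel ConfigMap that selects a backend.
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
data:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
```

Changing `"Type"` to `"host-gw"` switches to the route-distribution mode described above, which is only viable when all nodes share an L2 network; `"udp"` selects the legacy user-mode backend.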
4) Use of the Network Policy
The Concept Behind Network Policy
Let’s understand the concept behind network policy.
As mentioned previously, the basic network model of Kubernetes requires all pods to be interconnected, which causes problems in some scenarios: within a cluster, some pods should not be allowed to call each other directly. For example, to block department A from accessing the services of department B, a network policy is required.
Basically, various selectors (over labels or namespaces) are used to find a set of pods, that is, the two ends of the communication. Then, based on a description of the traffic characteristics, the policy determines whether the two ends may communicate. The rules can be thought of as a whitelist.
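Because the rules act as a whitelist, a common starting point is a policy that selects every pod in a namespace but lists no allowed traffic, denying all ingress by default. A minimal sketch (the policy name is illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # hypothetical name
spec:
  podSelector: {}              # empty selector: applies to every pod in the namespace
  policyTypes:
  - Ingress                    # no ingress rules follow, so all inbound traffic is denied
```

Further policies then whitelist specific flows on top of this baseline.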
Before using network policy, make sure the API server has the relevant API enabled (in early versions, the extensions/v1beta1 networkpolicies runtime configuration, as shown in the preceding figure). More importantly, the chosen network plug-in must support network policy: NetworkPolicy is merely an object stored by Kubernetes, and Kubernetes has no built-in component that enforces it. Enforcement depends entirely on the selected container network solution. For example, declaring a policy under the Flannel solution does not help, because Flannel does not actually implement network policy.
To design a network policy, determine the following parameters:
- The first is the control object, such as the spec part in this example. In spec, a podSelector or namespaceSelector picks out the set of pods to be controlled.
- The second is the traffic direction: ingress, egress, or both.
- Most importantly, describe the allowed flows for the given direction and control object. For the flow characteristics, selectors determine the peer end (which is again a matter of selecting objects), the ipBlock mechanism determines which IP addresses are allowed, and protocols and ports are specified. Combined, these characteristics form a five-tuple that selects the specific flows to accept.
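Putting the three parameters together, here is a sketch of a policy (the name, labels, and CIDR are illustrative) that lets pods labeled app=db accept TCP traffic on port 5432 only from pods labeled app=api and from one external address range:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-ingress-whitelist    # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: db                   # control object: the database pods
  policyTypes:
  - Ingress                     # direction: inbound only
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api              # peer end selected by label
    - ipBlock:
        cidr: 10.0.0.0/24       # illustrative allowed IP range
    ports:
    - protocol: TCP
      port: 5432                # allowed protocol and port
```

The two `from` entries are alternatives (either peer is accepted), and together with the protocol and port they describe the whitelisted flows for the selected pods.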
This section briefly summarizes this article.
- The IP address is the core of the pod container network. It is the foundation for each pod to communicate with external systems, and it must be identical as seen from inside and outside the pod, in keeping with the model requirements of Kubernetes.
- For network solutions, topology is the key factor influencing container network performance. It is important to understand how the source and destination ends of a packet are connected: how the packet is sent from the container to the host, whether it is encapsulated or decapsulated after leaving the host, whether it is forwarded through policy-based routing, and how it is resolved when it reaches the peer end.
- When selecting and designing a container network with no knowledge of the external network, or when a universally adaptable solution is needed (for example, when it is unclear whether nodes can reach each other directly at the MAC layer, or whether the routing tables of the external routers can be controlled), choose the Flannel solution with the VXLAN backend. If the network is known to be an L2, directly connected network, use Calico or Flannel with the host-gw backend for better performance.
- A network policy is a powerful tool for precisely controlling inbound and outbound traffic during operations and maintenance. Select an implementation method based on the objects and flows to be controlled.
Here are some questions for you:
1) Interfaces have been standardized through the Container Network Interface (CNI). However, why is there no standard implementation of the container network built into Kubernetes, the implementations being kept external instead?
2) Why is there no standard controller or implementation for network policy, enforcement being left to the container network provider instead?
3) Is it possible to implement a container network without any network devices, using solutions that differ from TCP/IP, such as RDMA?
4) Many network problems occur during operations and maintenance, and they are difficult to troubleshoot. Would it be worthwhile to develop an open-source tool that checks the network status between containers and hosts, between hosts, and across encapsulation and decapsulation? As far as we know, no such tool is available yet.
This article has covered the basic concepts of Kubernetes container networks and the use of network policies.