By Ren Xijun (Zhejian).
This article explains the forwarding modes of Linux Virtual Server (LVS) and how each one works. It traces the advantages and disadvantages of each mode back to the way the mode forwards network packets, and weighs them with reference to the Alibaba Cloud Server Load Balancer (SLB).
Terms, Abbreviations, and Acronyms
Let’s take a quick look at the various terms and acronyms used in the article.
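RS: Real Server, the back-end machine that actually provides the service
LB: Load Balancer
LVS: Linux Virtual Server
SIP: source IP address
DIP: destination IP address
CIP: client IP address
VIP: virtual IP address, the address that LVS exposes to clients
RIP: IP address of an RS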
Forwarding Modes of LVS
The Linux Virtual Server distributes load across multiple real servers and helps eliminate single points of failure (SPOF). It supports several packet forwarding modes:
- DR: Direct Routing
- NAT: Network Address Translation
- fullNAT: Full NAT
- ENAT: Enhanced NAT, also known as triangle mode or DNAT; defined by Alibaba Cloud
- IP TUN: IP Tunneling
Direct Routing (DR)
The preceding diagram illustrates how the DR mode of LVS works. Consider the following example to understand the process.
Assume that the CIP is 126.96.36.199 and the VIP is 188.8.131.52.
- Step 1 The request packet, whose source IP (SIP) address is the CIP and whose destination IP (DIP) address is the VIP, written as (CIP, VIP), reaches LVS first.
- Step 2 LVS selects one of the RSs according to the load policy and changes the destination MAC address of the packet to that of the selected RS.
- Step 3 LVS sends the packet to the switch, which delivers it to the selected RS.
- Step 4 The selected RS finds that both the destination MAC address and the DIP belong to itself, so it processes the packet directly.
- Step 5 The selected RS replies with a packet (VIP, CIP).
- Step 6 The switch forwards the response packet directly to the client, bypassing LVS.
As this process shows, LVS only changes the destination MAC address of the request packet; the RS sends the response packet directly to the client.
Note that the RSs and LVS share the same IP address (the VIP) but have different MAC addresses. Layer 2 forwarding relies on MAC addresses rather than IP addresses, which is why the RSs and LVS must be on the same VLAN.
Each RS configures the VIP on its loopback (lo) interface and adds the corresponding route, so that the operating system (OS) accepts and processes the packets received at step 4.
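The DR steps above can be sketched in a few lines of Python. This is a minimal simulation, not LVS code: the `Frame` and `DrDirector` names, the round-robin policy, the MAC addresses, and the documentation-range IP addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Frame:
    """An Ethernet frame reduced to the fields that matter in DR mode."""
    dst_mac: str
    src_ip: str
    dst_ip: str


class DrDirector:
    """Hypothetical DR-mode director: picks an RS and rewrites only the MAC."""

    def __init__(self, rs_macs):
        # Round-robin stands in for whatever load policy is configured.
        self._next_mac = cycle(rs_macs)

    def forward(self, frame: Frame) -> Frame:
        # SIP and DIP stay untouched; only the destination MAC changes.
        return replace(frame, dst_mac=next(self._next_mac))


def rs_reply(request: Frame, next_hop_mac: str) -> Frame:
    """The RS holds the VIP on its loopback, so it replies with SIP = VIP."""
    return Frame(dst_mac=next_hop_mac,
                 src_ip=request.dst_ip,
                 dst_ip=request.src_ip)


# Placeholder addresses chosen for this sketch only.
CIP, VIP = "198.51.100.7", "203.0.113.10"
director = DrDirector(["02:00:00:00:00:01", "02:00:00:00:00:02"])

request = Frame(dst_mac="02:00:00:00:00:ff", src_ip=CIP, dst_ip=VIP)
forwarded = director.forward(request)                  # steps 2 and 3
response = rs_reply(forwarded, "02:00:00:00:00:fe")    # steps 4 and 5

print(forwarded)
print(response)
```

The response is built entirely on the RS and never calls back into `DrDirector`, which mirrors the fact that LVS sees only inbound traffic in this mode.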
- The DR mode features the best performance. An inbound request passes through LVS, and the response packet is directly sent to the client by bypassing LVS.
- LVS and the RS must belong to the same VLAN.
- Each RS must be configured with the VIP and must handle the Address Resolution Protocol (ARP) specially, so that only LVS answers ARP requests for the VIP.
- It does not support port mapping.
Why Must LVS and RSs Belong to the Same VLAN or the Same L2 Network?
In DR mode, the RSs and LVS share the same VIP, so a packet cannot be directed to a particular RS by IP address; LVS forwards it by MAC address instead. Forwarding by MAC address works only within a single VLAN or L2 network, so LVS and the RSs must belong to the same one.
Why DR Mode Offers the Best Performance
Response packets do not pass through LVS. In most cases, request packets are small and response packets are large, so routing responses through LVS easily turns it into a traffic bottleneck. In addition, in DR mode LVS only rewrites the destination MAC address of inbound packets, which requires little processing.
Why Response Packets Bypass LVS in DR Mode
The RSs and LVS share the same VIP. An RS therefore sets the SIP of its reply to the VIP by itself, and LVS does not need to change the SIP. In contrast, LVS must change the SIP of the response in NAT and full NAT modes.
Structure Summary in DR Mode
The above diagram shows the overall structure in DR mode. The green arrow indicates the inbound request packet, and the red arrow indicates the request packet with the changed MAC address.
Network Address Translation (NAT)
The following figure shows the structure in NAT mode.
The general process for this mode is as follows.
- Step 1 The client sends a request packet (CIP, VIP).
- Step 2 When the request packet arrives at LVS, LVS changes the DIP and modifies the packet to (CIP, RIP).
- Step 3 When the request packet arrives at the RS, the RS replies with a response packet (RIP, CIP).
- Step 4 This response packet cannot be sent directly to the client: its SIP is the RIP rather than the VIP, so the client would reset the connection.
- Step 5 Because LVS is the gateway of the RS, the response packet is sent to LVS first.
- Step 6 LVS changes the SIP to the VIP and sends the modified response packet (VIP, CIP) to the client.
- The configuration is simple.
- As the name implies, NAT supports port mapping.
- The RIP is a private address and mainly facilitates communication between LVS and RSs.
- LVS and all RSs must belong to the same VLAN.
- LVS must forward all inbound and outbound traffic.
- LVS often becomes a bottleneck.
- On each RS, the gateway address must be set to the VIP.
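The NAT steps listed earlier in this section can be sketched as follows. This is a simplified model under stated assumptions: `NatDirector` is a hypothetical class, sessions are keyed by CIP alone (a real implementation tracks ports as well), RSs are assigned in arrival order, and the addresses are placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str


class NatDirector:
    """Hypothetical NAT-mode LVS that is also the gateway of every RS."""

    def __init__(self, vip, rips):
        self.vip = vip
        self.rips = list(rips)
        self.sessions = {}  # CIP -> RIP chosen for that client

    def inbound(self, pkt: Packet) -> Packet:
        # (CIP, VIP) -> (CIP, RIP): only the DIP is rewritten.
        if pkt.src_ip not in self.sessions:
            index = len(self.sessions) % len(self.rips)
            self.sessions[pkt.src_ip] = self.rips[index]
        return replace(pkt, dst_ip=self.sessions[pkt.src_ip])

    def outbound(self, pkt: Packet) -> Packet:
        # (RIP, CIP) -> (VIP, CIP): only the SIP is rewritten.
        return replace(pkt, src_ip=self.vip)


def rs_reply(request: Packet) -> Packet:
    """The RS simply swaps source and destination."""
    return Packet(src_ip=request.dst_ip, dst_ip=request.src_ip)


# Placeholder addresses chosen for this sketch only.
CIP, VIP = "198.51.100.7", "203.0.113.10"
lvs = NatDirector(VIP, ["10.0.0.11", "10.0.0.12"])

to_rs = lvs.inbound(Packet(CIP, VIP))     # step 2
from_rs = rs_reply(to_rs)                 # step 3
to_client = lvs.outbound(from_rs)         # steps 5 and 6

print(to_rs, from_rs, to_client, sep="\n")
```

Both `inbound` and `outbound` run on LVS for every connection, which is the source of the bottleneck noted in the list above.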
Why Must LVS and RSs Belong to the Same VLAN in NAT Mode?
The client accepts the response packet only if LVS has changed its SIP to the VIP. If the SIP of the response packet is not the DIP (that is, the VIP) of the request packet that the client sent, the client resets the connection. LVS can change the SIP only if the response passes through it, which requires LVS to be the gateway of the RSs. If LVS is not the gateway, the response packet, whose DIP is the CIP, is forwarded along other routes, and LVS never gets the chance to change its SIP.
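The reasoning above can be checked with a toy routing model. The sketch below assumes a host that knows only its connected subnet and one default gateway; the function names, the subnet, and every address are hypothetical, and `LVS_GATEWAY` stands for whichever LVS address the RS uses as its gateway.

```python
import ipaddress


def next_hop(dst_ip: str, local_subnet: str, default_gateway: str) -> str:
    """Return where a host sends a packet: on-link, or via its gateway."""
    if ipaddress.ip_address(dst_ip) in ipaddress.ip_network(local_subnet):
        return dst_ip
    return default_gateway


def client_accepts(connected_to: str, response_sip: str) -> bool:
    """A client only accepts replies from the address it connected to."""
    return response_sip == connected_to


# Placeholder addresses chosen for this sketch only.
CIP, VIP, RIP = "198.51.100.7", "203.0.113.10", "10.0.0.11"
LVS_GATEWAY, OTHER_ROUTER = "10.0.0.1", "10.0.0.254"
RS_SUBNET = "10.0.0.0/24"


def sip_seen_by_client(rs_gateway: str) -> str:
    """SIP of the response when the RS uses the given default gateway."""
    hop = next_hop(CIP, RS_SUBNET, rs_gateway)
    # Only LVS rewrites the SIP; any other router forwards it unchanged.
    return VIP if hop == LVS_GATEWAY else RIP


print(client_accepts(VIP, sip_seen_by_client(LVS_GATEWAY)))
print(client_accepts(VIP, sip_seen_by_client(OTHER_ROUTER)))
```

With LVS as the gateway the client sees the VIP and accepts the reply; with any other router the reply arrives with the RIP as its SIP and is rejected.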
Structure Summary in NAT Mode
Because LVS changes only one address in each direction in NAT mode (the DIP of inbound packets and the SIP of outbound packets), the full NAT mode emerged as a supplement. The greatest disadvantage of NAT mode is that LVS and the RSs must belong to the same VLAN, which limits the flexibility of deploying the LVS cluster and the RS cluster. NAT mode is basically impractical in commercial public cloud environments such as Alibaba Cloud.
Full NAT
Full NAT mode is similar to NAT mode. The general process is as follows.
- Step 1 The client sends a request packet (CIP, VIP).
- Step 2 When the request packet reaches LVS, LVS modifies it to (VIP, RIP). Note that both the SIP and the DIP change here.
- Step 3 When the request packet arrives at an RS, the RS replies with a packet (RIP, VIP).
- Step 4 Unlike in NAT mode, where the DIP of the response packet is the CIP, here the DIP is the VIP. Therefore, even when LVS and the RS do not belong to the same VLAN, the response packet still reaches LVS through IP routing.
- Step 5 LVS changes the SIP to the VIP and the DIP to the CIP, and sends the modified response packet (VIP, CIP) to the client.
- This mode overcomes the main limitation of NAT mode: LVS and the RSs no longer need to belong to the same VLAN. It is therefore applicable to more complex deployment scenarios.
- Unlike in NAT mode, the CIP is invisible to the RS.
- As in NAT mode, all inbound and outbound traffic still passes through LVS, which remains a bottleneck.
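The full NAT steps can be sketched as below. Because the response arrives as (RIP, VIP) with no trace of the client, LVS needs a session table to recover the CIP. This sketch assumes, for illustration only, that LVS allocates a distinct local port per client connection to key that table; `FullNatDirector`, the starting port, the single RS, and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class FullNatDirector:
    """Hypothetical full NAT LVS with a single RS, for brevity."""

    def __init__(self, vip, rip, first_local_port=20000):
        self.vip, self.rip = vip, rip
        self._ports = count(first_local_port)
        self._by_client = {}  # (CIP, client port) -> local port
        self._by_port = {}    # local port -> (CIP, client port)

    def inbound(self, pkt: Packet) -> Packet:
        # (CIP, VIP) -> (VIP, RIP): both SIP and DIP are rewritten.
        client = (pkt.src_ip, pkt.src_port)
        if client not in self._by_client:
            port = next(self._ports)
            self._by_client[client] = port
            self._by_port[port] = client
        return Packet(self.vip, self._by_client[client],
                      self.rip, pkt.dst_port)

    def outbound(self, pkt: Packet) -> Packet:
        # (RIP, VIP) -> (VIP, CIP): the session table restores the CIP.
        cip, cport = self._by_port[pkt.dst_port]
        return Packet(self.vip, pkt.src_port, cip, cport)


def rs_reply(req: Packet) -> Packet:
    """The RS swaps source and destination; it never sees the CIP."""
    return Packet(req.dst_ip, req.dst_port, req.src_ip, req.src_port)


# Placeholder addresses chosen for this sketch only.
CIP, VIP, RIP = "198.51.100.7", "203.0.113.10", "192.0.2.20"
lvs = FullNatDirector(VIP, RIP)

to_rs = lvs.inbound(Packet(CIP, 51000, VIP, 80))   # step 2
from_rs = rs_reply(to_rs)                          # step 3
to_client = lvs.outbound(from_rs)                  # step 5

print(to_rs, from_rs, to_client, sep="\n")
```

The RS addresses its reply to the VIP, so ordinary IP routing brings it back to LVS without LVS being a gateway.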
How Does Full NAT Remove the Same-VLAN Requirement of NAT Mode?
As its name suggests, full NAT changes both the SIP and the DIP of the inbound packet. As a result, the DIP of the response packet from the RS is the VIP (in NAT mode it is the CIP). LVS and the RSs can therefore belong to different VLANs, provided that the VIP and the RSs can reach each other over the L3 network. In other words, LVS no longer needs to be a gateway, and LVS and the RSs can be deployed in a more complex network environment.
Why Is the CIP Invisible to the RS in Full NAT Mode?
Because LVS replaces the CIP in the SIP field in full NAT mode, the RS sees only the VIP of LVS. At Alibaba, LVS carries the CIP in the Option field of the TCP packet, and the RS usually deploys a custom TOA module that reads the CIP from that field on receiving the packet. In this case, the RS can see the CIP. However, this is not a universal open-source solution.
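To make the idea of carrying the CIP in a TCP option concrete, here is a minimal sketch of encoding and parsing such an option. It follows the generic kind/length/value layout of TCP options; the option kind 254 and the 8-byte layout (port followed by IPv4 address) are assumptions of this sketch, and the function names are hypothetical rather than part of any real TOA module.

```python
import socket
import struct

CIP_OPTION_KIND = 254  # assumed option kind for this sketch
CIP_OPTION_LEN = 8     # kind (1) + length (1) + port (2) + IPv4 address (4)


def build_cip_option(cip: str, cport: int) -> bytes:
    """Encode the client address as one TCP option (what LVS would add)."""
    return struct.pack("!BBH4s", CIP_OPTION_KIND, CIP_OPTION_LEN,
                       cport, socket.inet_aton(cip))


def read_cip_option(options: bytes):
    """Walk the TCP options and return (cip, port), or None if absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation is a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                # malformed option
        if kind == CIP_OPTION_KIND and length == CIP_OPTION_LEN:
            port, raw_ip = struct.unpack("!H4s", options[i + 2:i + 8])
            return socket.inet_ntoa(raw_ip), port
        i += length
    return None


# An MSS option (kind 2, length 4) and two NOPs precede the CIP option.
mss = struct.pack("!BBH", 2, 4, 1460)
options = mss + b"\x01\x01" + build_cip_option("198.51.100.7", 51000)

print(read_cip_option(options))
print(read_cip_option(mss))
```

The parser skips options it does not recognize, so a packet without the extra option simply yields `None` and the RS falls back to the SIP it sees.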
Summary of the Structure in Full NAT Mode
Note how the IP addresses change for the green inbound packet and the red outbound packet in the preceding figure.
So far, full NAT removes the same-VLAN requirement of NAT mode and is basically ready for the public cloud. However, it does not resolve the problem that all inbound and outbound traffic passes through LVS, which must modify packets in both directions.
The question, then, is whether a solution exists that places no restriction on the network relationship between LVS and the RSs, as in full NAT mode, while letting outbound traffic bypass LVS, as in DR mode.
Enhanced NAT (ENAT) Mode of Alibaba Cloud
The Enhanced NAT (ENAT) mode is also known as triangle mode or DNAT mode. The general process for this mode is as follows:
- Step 1 The client sends a request packet (CIP, VIP).
- Step 2 On receiving the request packet, LVS modifies the request packet to (VIP, RIP), and adds the CIP to the Option field of a TCP packet.
- Step 3 The request packet is routed to an RS by its DIP, and the CTK module on the RS reads the CIP from the Option field of the TCP packet.
- Step 4 The CTK module intercepts the response packet (RIP, VIP) and modifies it to (VIP, CIP).
- Step 5 Because the DIP of the response packet is the CIP, the packet does not pass through LVS and is sent directly to the client.
- LVS and RSs may belong to different VLANs.
- Outbound traffic bypasses LVS, which ensures good performance.
- Alibaba Group’s custom solution requires all RSs to install the CTK component (similar to the TOA module in full NAT mode).
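The ENAT steps can be sketched as follows. The internals of the CTK component are not described in this article, so `CtkModule` below is a hypothetical stand-in that only mirrors the address changes listed in the steps; the `cip_option` field stands in for the TCP Option field, flows are keyed by VIP alone for brevity, and the addresses are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    cip_option: Optional[str] = None  # stands in for the TCP Option field


def lvs_inbound(pkt: Packet, vip: str, rip: str) -> Packet:
    # (CIP, VIP) -> (VIP, RIP), with the CIP copied into the option field.
    return Packet(src_ip=vip, dst_ip=rip, cip_option=pkt.src_ip)


class CtkModule:
    """Hypothetical stand-in for the CTK component on an RS."""

    def __init__(self):
        # Keyed by VIP only; a real module would track full connections.
        self._cip_for_vip = {}

    def on_request(self, pkt: Packet) -> None:
        # Step 3: remember the client behind this flow.
        self._cip_for_vip[pkt.src_ip] = pkt.cip_option

    def on_response(self, pkt: Packet) -> Packet:
        # Step 4: (RIP, VIP) -> (VIP, CIP), done on the RS itself.
        return Packet(src_ip=pkt.dst_ip, dst_ip=self._cip_for_vip[pkt.dst_ip])


# Placeholder addresses chosen for this sketch only.
CIP, VIP, RIP = "198.51.100.7", "203.0.113.10", "192.0.2.20"
ctk = CtkModule()

to_rs = lvs_inbound(Packet(CIP, VIP), VIP, RIP)   # step 2
ctk.on_request(to_rs)                             # step 3
raw_reply = Packet(src_ip=RIP, dst_ip=VIP)        # what the application sends
to_client = ctk.on_response(raw_reply)            # step 4

print(to_rs)
print(to_client)
```

No LVS function is called on the return path: the packet that leaves the RS is already (VIP, CIP).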
Why Doesn't ENAT Mode Require Response Packets to Route Back to LVS?
In full NAT mode, LVS has to change the IP addresses in the response packet, so the response packet must route back to LVS. In ENAT mode, the CTK module on the RS changes the IP addresses in the response packet before it leaves the RS, so the packet can go directly to the client.
Why Can LVS and RSs Belong to Different VLANs in ENAT Mode?
The reason is the same as discussed earlier for the full NAT mode.
Summary of the Structure in ENAT Mode
IP Tunneling (IP TUN)
Finally, let’s take a look at the less-used IP TUN mode. The general process for this mode is as follows:
- Step 1 When the request packet arrives at LVS, LVS encapsulates it into a new IP packet.
- Step 2 LVS sets the DIP of the new IP packet to the IP address of an RS, and then forwards the IP packet to the RS.
- Step 3 After the RS receives the packet, the IPIP kernel module decapsulates it and extracts the user’s request message.
- Step 4 Finding that the DIP is a VIP and this IP address is configured for the tunl0 NIC of the RS, the RS directly handles the request and sends the results to the client.
- Cluster nodes may belong to different VLANs.
- Similar to the DR mode, the client directly receives the response packets.
- This mode requires RS to install the IPIP module.
- Each tunneled packet carries an additional IP header.
- As in DR mode, LVS and the RSs must be configured with the same VIP, here on their tunl0 virtual NICs.
Note: In DR mode, LVS changes the destination MAC address.
Why Can Cluster Nodes Belong to Different VLANs in IP TUN Mode?
LVS does not change the MAC address in IP TUN mode. Therefore, cluster nodes may belong to different VLANs, provided that LVS and the RSs can reach each other by IP address. By contrast, DR mode requires LVS and the RSs to be in the same broadcast domain.
IP TUN Performance
The response packet bypasses LVS. However, compared with DR mode, this mode adds the cost of encapsulating the request packet on LVS and decapsulating it on the RS.
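The encapsulation and its cost can be sketched as below. This models IP-in-IP as one packet nested inside another and counts a 20-byte IPv4 header (no IP options) per layer; the `IpPacket` class, the LVS address, and the other addresses are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Union

IPV4_HEADER_BYTES = 20  # IPv4 header without options


@dataclass(frozen=True)
class IpPacket:
    src_ip: str
    dst_ip: str
    payload: Union["IpPacket", bytes]

    def size(self) -> int:
        """Total bytes: this header plus whatever it carries."""
        if isinstance(self.payload, IpPacket):
            return IPV4_HEADER_BYTES + self.payload.size()
        return IPV4_HEADER_BYTES + len(self.payload)


def lvs_encapsulate(request: IpPacket, lvs_ip: str, rip: str) -> IpPacket:
    # Steps 1 and 2: wrap the untouched request in a new packet to the RS.
    return IpPacket(src_ip=lvs_ip, dst_ip=rip, payload=request)


def rs_decapsulate(outer: IpPacket) -> IpPacket:
    # Step 3: the IPIP module strips the outer header.
    return outer.payload


def rs_reply(request: IpPacket, body: bytes) -> IpPacket:
    # Step 4: the RS owns the VIP on tunl0, so it replies as (VIP, CIP).
    return IpPacket(src_ip=request.dst_ip, dst_ip=request.src_ip, payload=body)


# Placeholder addresses chosen for this sketch only.
CIP, VIP = "198.51.100.7", "203.0.113.10"
LVS_IP, RIP = "192.0.2.1", "192.0.2.20"

request = IpPacket(CIP, VIP, b"GET / HTTP/1.1\r\n\r\n")
tunneled = lvs_encapsulate(request, LVS_IP, RIP)
inner = rs_decapsulate(tunneled)
response = rs_reply(inner, b"HTTP/1.1 200 OK\r\n\r\n")

print(tunneled.size() - request.size())
print(response.src_ip, response.dst_ip)
```

The inner packet reaches the RS unchanged, so the RS still sees the CIP, and only the inbound direction pays for the extra header.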
Summary of the Structure in IP TUN Mode
In the preceding figure, the red line indicates the re-encapsulated packet, and IPIP refers to a kernel module of the OS.
This article has walked through the forwarding modes of Linux Virtual Server (LVS): the fairly popular DR mode, the less-used IP TUN mode, and the three NAT-based modes (NAT, full NAT, and ENAT), which resemble one another and are more widely used. We hope the working process of each mode, as explained here, proves useful in practice.