Analysis of UDP Packet Loss on Linux Systems

  1. A network packet arrives at the NIC over the physical cable.
  2. The network driver moves the packet from the NIC into the ring buffer using DMA (Direct Memory Access), which does not require CPU involvement.
  3. The kernel reads the packet from the ring buffer, runs the IP and TCP/UDP layer logic, and finally places the packet in the application's socket buffer.
  4. The application reads packets from the socket buffer and processes them.
~# ifconfig eth0
...
RX packets 3553389376 bytes 2599862532475 (2.3 TiB)
RX errors 0 dropped 1353 overruns 0 frame 0
TX packets 3479495131 bytes 3205366800850 (2.9 TiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
...
# netstat -s -u
IcmpMsg:
InType0: 3
InType3: 1719356
InType8: 13
InType11: 59
OutType0: 13
OutType3: 1737641
OutType8: 10
OutType11: 263
Udp:
517488890 packets received
2487375 packets to unknown port received.
47533568 packet receive errors
147264581 packets sent
12851135 receive buffer errors
0 send buffer errors
UdpLite:
IpExt:
OutMcastPkts: 696
InBcastPkts: 2373968
InOctets: 4954097451540
OutOctets: 5538322535160
OutMcastOctets: 79632
InBcastOctets: 934783053
InNoECTPkts: 5584838675
  • packet receive errors: a nonzero and steadily growing value indicates that the system is dropping UDP packets
  • packets to unknown port received: the number of UDP packets that arrived for a destination port that no application is listening on; usually the service is simply not started, and this does not cause serious problems
  • receive buffer errors: the number of packets dropped because the UDP receive buffer was too small
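
The counters above can also be read programmatically. netstat -s takes its UDP numbers from the "Udp:" rows of /proc/net/snmp, so a monitoring program can parse that file directly. The following is a minimal sketch, not a definitive implementation: the struct and function names are hypothetical, and it assumes the first six columns appear in the order InDatagrams, NoPorts, InErrors, OutDatagrams, RcvbufErrors, SndbufErrors.

```c
#include <stdio.h>
#include <string.h>

/* UDP counters as reported by netstat -s -u (names here are illustrative). */
struct udp_stats {
    unsigned long long in_datagrams;   /* "packets received" */
    unsigned long long no_ports;       /* "packets to unknown port received" */
    unsigned long long in_errors;      /* "packet receive errors" */
    unsigned long long out_datagrams;  /* "packets sent" */
    unsigned long long rcvbuf_errors;  /* "receive buffer errors" */
    unsigned long long sndbuf_errors;  /* "send buffer errors" */
};

/* Scan a stream in /proc/net/snmp format for the numeric "Udp:" row.
 * Returns 0 on success, -1 if no such row was found.
 * Assumption: the first six columns are in the order of the struct above. */
int read_udp_stats(FILE *fp, struct udp_stats *st)
{
    char line[512];

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "Udp: ", 5) != 0)
            continue;                  /* other protocols, or "UdpLite:" */
        /* The header row holds column names, so sscanf matches 0 fields
         * there; only the value row yields all six numbers. */
        if (sscanf(line + 5, "%llu %llu %llu %llu %llu %llu",
                   &st->in_datagrams, &st->no_ports, &st->in_errors,
                   &st->out_datagrams, &st->rcvbuf_errors,
                   &st->sndbuf_errors) == 6)
            return 0;
    }
    return -1;
}
```

A caller would open /proc/net/snmp with fopen, call read_udp_stats twice a few seconds apart, and compare in_errors and rcvbuf_errors between the two samples: it is the growth rate, not the absolute value, that shows ongoing loss.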
# ethtool -S eth0 | grep rx_ | grep errors
rx_crc_errors: 0
rx_missed_errors: 0
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
rx_errors: 0
rx_length_errors: 0
rx_over_errors: 0
rx_frame_errors: 0
rx_fifo_errors: 0
# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 256
RX Mini: 0
RX Jumbo: 0
TX: 256
int disable = 1;
// SO_NO_CHECK disables UDP checksum generation on packets sent from this socket
setsockopt(sock_fd, SOL_SOCKET, SO_NO_CHECK, (void*)&disable, sizeof(disable));
  • /proc/sys/net/core/rmem_max: the maximum receive buffer size that a socket is allowed to set
  • /proc/sys/net/core/rmem_default: the default receive buffer size
  • /proc/sys/net/core/wmem_max: the maximum send buffer size that a socket is allowed to set
  • /proc/sys/net/core/wmem_default: the default send buffer size
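
Each of these files contains a single decimal number, so a program can check the limits at startup before it asks for a larger buffer. The sketch below shows one way to do that; the helper name read_proc_long is hypothetical, and it assumes the file holds exactly one integer, which is the case for the four files listed above.

```c
#include <stdio.h>

/* Read one integer from a procfs file such as /proc/sys/net/core/rmem_max.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_proc_long(const char *path)
{
    long value = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &value) != 1)
        value = -1;                    /* empty or non-numeric content */
    fclose(fp);
    return value;
}
```

For example, read_proc_long("/proc/sys/net/core/rmem_max") returns the current ceiling in bytes, which the program can compare against the buffer size it intends to request.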
sysctl -w net.core.rmem_max=26214400 # Set to 25M
sudo sysctl -w net.core.netdev_max_backlog=2000
int receive_buf_size = 20*1024*1024; // 20 MB; SO_RCVBUF expects an int, and the kernel caps the value at net.core.rmem_max
setsockopt(socket_fd, SOL_SOCKET, SO_RCVBUF, &receive_buf_size, sizeof(receive_buf_size));
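
Because the kernel silently limits the request to net.core.rmem_max, a successful setsockopt call does not prove that the socket got the size it asked for. A hedged sketch of how to verify the result with getsockopt follows; the function name set_rcvbuf is hypothetical. On Linux the value reported back is normally twice the applied size, because the kernel doubles it to leave room for bookkeeping overhead.

```c
#include <sys/socket.h>

/* Request a receive buffer of `requested` bytes on socket fd and return the
 * size the kernel reports afterwards, or -1 on error. */
int set_rcvbuf(int fd, int requested)
{
    int actual = 0;
    socklen_t len = sizeof actual;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof requested) < 0)
        return -1;
    /* Read the value back: it may be smaller than requested if rmem_max
     * is lower, so the caller should compare the two. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```

If the returned value is well below the requested one, raise net.core.rmem_max first and then set the socket option again.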
# dropwatch -l kas
Initalizing kallsyms db
dropwatch> start
Enabling monitoring...
Kernel monitoring activated.
Issue Ctrl-C to stop monitoring
1 drops at tcp_v4_do_rcv+cd (0xffffffff81799bad)
10 drops at tcp_v4_rcv+80 (0xffffffff8179a620)
1 drops at sk_stream_kill_queues+57 (0xffffffff81729ca7)
4 drops at unix_release_sock+20e (0xffffffff817dc94e)
1 drops at igmp_rcv+e1 (0xffffffff817b4c41)
1 drops at igmp_rcv+e1 (0xffffffff817b4c41)
sudo perf record -g -a -e skb:kfree_skb
sudo perf script
  • UDP is a connectionless, unreliable protocol. It suits applications in which an occasional lost packet does not affect program state, such as video, audio, games, and monitoring. Applications that need higher reliability should not use UDP; using TCP directly is recommended. Alternatively, the application layer can implement its own retries to ensure reliability.
  • If you find that a server is dropping packets, first check monitoring to see whether the system load is too high. Try to reduce the load, then see whether the packet loss disappears.
  • If the system load is too high, there is no effective fix for UDP packet loss. If an abnormal application is driving CPU, memory, or I/O usage too high, locate and repair it promptly; if resources are simply insufficient, monitoring should detect this so that capacity can be expanded quickly.
  • For systems that receive or send large volumes of UDP packets, you can reduce the probability of packet loss by adjusting the socket buffer sizes of the system and the program.
  • When processing UDP packets, the application should work asynchronously and avoid heavy processing logic between two consecutive receives.
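
One way to keep the time between two receives short is to empty the socket buffer in a tight loop and only afterwards process what was collected, ideally on another thread. The following is a minimal sketch of the draining step only, under stated assumptions: the names drain_udp, struct dgram, and MAX_DGRAM are hypothetical, datagrams longer than MAX_DGRAM bytes are truncated, and handing the batch to a worker is left to the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_DGRAM 2048   /* assumed upper bound on datagram size */

struct dgram {
    size_t len;
    char data[MAX_DGRAM];
};

/* Copy every datagram currently queued on fd into out[] without blocking.
 * Stops when the socket buffer is empty or `capacity` entries are filled.
 * Returns the number of datagrams stored, or -1 on a receive error. */
int drain_udp(int fd, struct dgram *out, int capacity)
{
    int n = 0;

    while (n < capacity) {
        ssize_t got = recv(fd, out[n].data, sizeof out[n].data, MSG_DONTWAIT);
        if (got < 0) {
            if (errno == EINTR)
                continue;              /* interrupted by a signal, retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;                 /* nothing left in the socket buffer */
            return -1;
        }
        out[n].len = (size_t)got;
        n++;
    }
    return n;
}
```

The receive thread would call drain_udp whenever poll or epoll reports the socket readable, pass the filled batch to a worker, and return to waiting, so that parsing and business logic never delay the next read from the socket buffer.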

Alibaba Cloud
