Exploring Alibaba Group’s PouchContainer Resource Management APIs — Part 1
PouchContainer is Alibaba Group’s efficient, open-source, enterprise-class container engine, featuring strong isolation, high portability, and low resource consumption. It helps enterprises containerize their existing workloads and improve the utilization of physical resources in ultra-large-scale data centers.
Resource management is an important part of a container runtime. This article introduces the common PouchContainer resource management APIs and the underlying kernel APIs they map to, and provides test cases for some of the APIs to make them easier to understand.
Common APIs for PouchContainer Resource Management
| API | Description |
| --- | --- |
| --blkio-weight | Relative weight of block device IO, an integer between 10 and 1,000 (inclusive). |
| --blkio-weight-device | Relative IO weight for the specified block device. |
| --cpu-period | Period value used by the Completely Fair Scheduler (CFS). |
| --cpu-quota | Quota value used by the Completely Fair Scheduler (CFS). |
| --cpu-share | CPU share (relative weight). |
| --cpuset-cpus | Limits the CPU cores the container may use. |
| --cpuset-mems | Limits the memory nodes the container may use; only applies to NUMA systems. |
| --device-read-bps | Limits the read rate for the device, a positive integer measured in KB, MB, or GB. |
| --device-read-iops | Limits read IO operations per second for the device, a positive integer. |
| --device-write-bps | Limits the write rate for the device, a positive integer measured in KB, MB, or GB. |
| --device-write-iops | Limits write IO operations per second for the device, a positive integer. |
| -m, --memory | Memory limit, a positive integer measured in B, KB, MB, or GB. |
| --memory-swap | Limit on total memory (physical memory + swap partition), a positive integer measured in B, KB, MB, or GB. |
| --memory-swappiness | Adjusts how readily the container's memory is swapped out, an integer between 0 and 100 (inclusive). |
| --memory-wmark-ratio | Used to calculate low_wmark, an integer between 0 and 100 (inclusive). |
| --oom-kill-disable | Whether to disable killing the container when its memory is used up. |
| --oom-score-adj | Adjusts how likely the container process is to be killed on OOM; the larger the value, the more likely the process is to be selected. |
| --pids-limit | Limits the number of PIDs inside the container. |
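To make the flags listed above concrete, here is a minimal sketch that assembles a pouch run command line from a mapping of limits. The helper name build_pouch_run and the image name are hypothetical, chosen only for illustration. The sketch handles value-taking flags only and passes the values through without validating them.

```python
from typing import Dict, List, Union


def build_pouch_run(image: str, limits: Dict[str, Union[str, int]]) -> List[str]:
    """Assemble an argv list for `pouch run` from {flag name: value} pairs.

    Flag names are the long option names from the table, without the
    leading dashes (for example "memory" or "cpu-quota").
    """
    argv = ["pouch", "run"]
    for name, value in limits.items():
        argv.extend([f"--{name}", str(value)])
    argv.append(image)
    return argv


cmd = build_pouch_run(
    "busybox:latest",  # hypothetical image name
    {"memory": "512m", "cpu-period": 100000, "cpu-quota": 50000, "pids-limit": 100},
)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run on a host where PouchContainer is installed.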
Underlying Core Technology of PouchContainer Resource Management
Memory Resource Management
| API | Corresponding Kernel API | Kernel API Description |
| --- | --- | --- |
| -m, --memory | cgroup/memory/memory.limit_in_bytes | Sets the memory limit in bytes; a kb/KB, mb/MB, or gb/GB suffix can also be used. |
| --memory-swap | cgroup/memory/memory.memsw.limit_in_bytes | Sets the limit on combined memory and swap partition usage. Setting this value prevents the process from using up the swap partition. |
| --memory-swappiness | cgroup/memory/memory.swappiness | Controls the tendency of the kernel to use the swap partition. Range of values: an integer between 0 and 100 (inclusive). The smaller the value, the more the kernel prefers to keep pages in physical memory. |
| --memory-wmark-ratio | cgroup/memory/memory.wmark_ratio | Used to calculate low_wmark: low_wmark = memory.limit_in_bytes * MemoryWmarkRatio. When memory.usage_in_bytes is greater than low_wmark, a kernel thread is triggered to perform memory reclamation. When memory.usage_in_bytes is less than high_wmark, the reclamation stops. |
| --oom-kill-disable | cgroup/memory/memory.oom_control | If it is set to 1, then when memory usage exceeds the limit the system does not kill the process but blocks it until memory is freed. At the same time, the system sends an event notification to user mode, and a user-mode monitor can respond to the event, for example by increasing the memory limit. |
| --oom-score-adj | /proc/$pid/oom_score_adj | Adjusts how likely the process is to be killed on OOM. The larger the value, the more likely the process is to be selected. |
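The memory limit and the low_wmark formula above can be worked through with a short sketch. Two things here are assumptions rather than statements from the article: the size suffixes are treated as powers of 1024, and the ratio (0 to 100) is treated as a percentage of the limit. The function names parse_size and low_wmark are hypothetical.

```python
import re

# Assumption: suffixes are binary multiples (1 KB = 1024 bytes).
_UNITS = {
    "": 1, "b": 1,
    "k": 1024, "kb": 1024,
    "m": 1024 ** 2, "mb": 1024 ** 2,
    "g": 1024 ** 3, "gb": 1024 ** 3,
}


def parse_size(text: str) -> int:
    """Convert a size such as '512m' or '1GB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]*)\s*", text)
    if match is None or match.group(2).lower() not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def low_wmark(limit_in_bytes: int, wmark_ratio: int) -> int:
    """low_wmark = limit * ratio, with the ratio assumed to be a percentage."""
    if not 0 <= wmark_ratio <= 100:
        raise ValueError("wmark_ratio must be between 0 and 100 (inclusive)")
    return limit_in_bytes * wmark_ratio // 100


limit = parse_size("512m")
print(limit)                 # bytes allowed by the memory limit
print(low_wmark(limit, 80))  # usage above this value would trigger reclamation
```

Under these assumptions, a 512 MB limit with a ratio of 80 starts background reclamation once usage passes roughly 410 MB.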
CPU Resource Management
| API | Corresponding cgroup API | Cgroup API Description |
| --- | --- | --- |
| --cpu-period | cgroup/cpu/cpu.cfs_period_us | Used to limit CPU bandwidth; must be used together with cpu.cfs_quota_us. For example, with the period set to 1 second and the quota set to 0.5 seconds, processes in the cgroup can run for at most 0.5 seconds in each 1-second period and are then throttled until the next period begins. |
| --cpu-quota | cgroup/cpu/cpu.cfs_quota_us | Used to limit CPU bandwidth; must be used together with cpu.cfs_period_us. |
| --cpu-share | cgroup/cpu/cpu.shares | Allocates CPU time proportionally. Suppose we create two cgroups (C1 and C2) in the root directory of cgroupfs and set cpu.shares to 512 and 1024 respectively. When C1 and C2 contend for the CPU, C2 gets twice as much CPU time as C1. Note that CPU shares only take effect under contention: if C2 is idle, C1 can use the entire CPU. |
| --cpuset-cpus | cgroup/cpuset/cpuset.cpus | A list of CPUs that the process is allowed to use (for example: 0-4,9). |
| --cpuset-mems | cgroup/cpuset/cpuset.mems | A list of memory nodes that the process is allowed to use (for example: 0-1). |
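The arithmetic behind the CPU settings above can be sketched in a few lines: the quota-to-period ratio, the proportional split implied by cpu.shares under contention, and the expansion of a cpuset list. The function names are hypothetical. The sketch only models the numbers and does not touch any cgroup files.

```python
from typing import Dict, List


def cfs_cpu_limit(quota_us: int, period_us: int) -> float:
    """Return the CPU time allowed per period, as a fraction of one CPU."""
    if period_us <= 0:
        raise ValueError("period must be positive")
    if quota_us < 0:
        return float("inf")  # a negative quota (-1) means no bandwidth limit
    return quota_us / period_us


def share_fractions(shares: Dict[str, int]) -> Dict[str, float]:
    """Split the CPU among cgroups that are all contending, by cpu.shares."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a cpuset list such as '0-4,9' into individual CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(cfs_cpu_limit(500_000, 1_000_000))            # 0.5 s of CPU per 1 s period
print(share_fractions({"C1": 512, "C2": 1024}))     # C2 gets twice C1's fraction
print(parse_cpu_list("0-4,9"))
```

The split computed by share_fractions holds only while every listed cgroup is busy; an idle cgroup's portion is available to the others, as the description of cpu.shares points out.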
IO Resource Management
| API | Corresponding cgroup API | Cgroup API Description |
| --- | --- | --- |
| --blkio-weight | cgroup/blkio/blkio.weight | Sets the weight value, which is an integer between 10 and 1,000 (inclusive). This is similar to cpu.shares: it allocates a proportion rather than an absolute bandwidth limit, so it only takes effect when different cgroups contend for the bandwidth of the same block device. |
| --blkio-weight-device | cgroup/blkio/blkio.weight_device | Sets the weight value for a specific device, which overrides the blkio.weight value above. |
| --device-read-bps | cgroup/blkio/blkio.throttle.read_bps_device | Sets the bandwidth limit for reading from the block device per second. The device must be specified. |
| --device-write-bps | cgroup/blkio/blkio.throttle.write_bps_device | Sets the bandwidth limit for writing to the block device per second. The device must be specified. |
| --device-read-iops | cgroup/blkio/blkio.throttle.read_iops_device | Sets the IO limit for reading from the block device. The device must be specified. |
| --device-write-iops | cgroup/blkio/blkio.throttle.write_iops_device | Sets the IO limit for writing to the block device. The device must be specified. |
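The per-device files above expect the device to be named by its major and minor numbers, followed by the value. The sketch below formats such an entry and checks a weight against the 10 to 1,000 range. The function names are hypothetical, and the device numbers 8:0 are used purely as an example (they commonly belong to the first SCSI/SATA disk, but that is not guaranteed on any given host).

```python
import os
from typing import Tuple


def device_numbers(path: str) -> Tuple[int, int]:
    """Return (major, minor) for a device node such as /dev/sda (Linux only)."""
    rdev = os.stat(path).st_rdev
    return os.major(rdev), os.minor(rdev)


def device_rule(major: int, minor: int, value: int) -> str:
    """Format 'major:minor value' as written to the per-device blkio files."""
    if value < 0:
        raise ValueError("value must not be negative")
    return f"{major}:{minor} {value}"


def check_weight(weight: int) -> int:
    """Validate a blkio weight against the range given in the table."""
    if not 10 <= weight <= 1000:
        raise ValueError("weight must be between 10 and 1000 (inclusive)")
    return weight


# Example values only: limit device 8:0 to 1 MB/s, then give it weight 500.
print(device_rule(8, 0, 1024 * 1024))        # for blkio.throttle.read_bps_device
print(device_rule(8, 0, check_weight(500)))  # for blkio.weight_device
```

On a real host, device_numbers("/dev/sda") would supply the first two arguments, and the resulting string would be written to the corresponding file in the container's blkio cgroup directory.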
Other Resource Management APIs
| API | Corresponding cgroup API | Cgroup API Description |
| --- | --- | --- |
| --pids-limit | cgroup/pids/pids.max | Limits the number of processes. |
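As a rough illustration of how the process limit can be inspected, the sketch below reads pids.max together with pids.current (the kernel's counter of processes currently in the cgroup) and reports the remaining headroom. The function name pids_headroom is hypothetical, and a temporary directory stands in for a real pids cgroup directory so the example runs anywhere.

```python
import os
import tempfile
from typing import Optional


def pids_headroom(cgroup_dir: str) -> Optional[int]:
    """Return how many more processes the cgroup may create, or None if unlimited."""
    with open(os.path.join(cgroup_dir, "pids.max")) as f:
        limit = f.read().strip()
    with open(os.path.join(cgroup_dir, "pids.current")) as f:
        current = int(f.read().strip())
    if limit == "max":  # the kernel reports "max" when no limit is set
        return None
    return int(limit) - current


# Stand-in for a real cgroup directory, populated with example values.
with tempfile.TemporaryDirectory() as fake_cgroup:
    with open(os.path.join(fake_cgroup, "pids.max"), "w") as f:
        f.write("100\n")
    with open(os.path.join(fake_cgroup, "pids.current"), "w") as f:
        f.write("37\n")
    remaining = pids_headroom(fake_cgroup)
    print(remaining)
```

Pointing pids_headroom at the container's actual directory under cgroup/pids would give the live figure instead of the example values.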