How to Set Constraints on Kubernetes Resources

7 min readJan 2, 2019

As the most popular container cluster management platform, Kubernetes needs to coordinate the overall resource usage of a cluster and allocate appropriate resources to containers in pods. It must ensure that resources are fully utilized to maximize resource utilization, and that important containers in operation are allocated sufficient resources for stable operation.

Configure Constraints on Container Resources

The most basic resource metrics for a pod are CPU and memory.

Kubernetes provides requests and limits to pre-allocate resources and limit resource usage, respectively.

Limits restrict the resource usage of a pod as follows:

If its memory usage exceeds the memory limit, this pod is out of memory (OOM) killed.
If its CPU usage exceeds the CPU limit, this pod is not killed, but its CPU usage is restricted to the limit.

Testing the Memory Limit

Deploy a stress testing container in a pod and allocate a memory of 250 MiB for stress testing. The memory limit of the pod is 100 MiB.

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
  namespace: example
spec:
  containers:
  - name: memory-demo-2-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]

After deployment, check the pod status. We can see that it is OOM killed.

kubectl -n example get po
NAME             READY     STATUS        RESTARTS   AGE
memory-demo      0/1       OOMKilled     1          11s

Testing the CPU Limit

apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
  namespace: example
spec:
  containers:
  - name: cpu-demo-ctr
    image: vish/stress
    resources:
      limits:
        cpu: "1"
      requests:
        cpu: "0.5"
    args:
    - -cpus
    - "2"

Check the container information. Although the pod is not killed, its CPU usage is restricted to 1,000 millicpu.

kubectl -n example top po cpu-demo
NAME       CPU(cores)   MEMORY(bytes)
cpu-demo   1000m        0Mi

Container QoS

Kubernetes manages the quality of service (QoS). Based on resource allocation for containers, pods are divided into three QoS levels: Guaranteed, Burstable, and BestEffort. If resources are insufficient, scheduling and eviction strategies are determined based on the QoS level. The three QoS levels are described as follows:

Guaranteed: Limits and requests are set for all containers in a pod. Each limit is equal to the corresponding request. If a limit is set but the corresponding request is not set, the request is automatically set to the limit value.
Burstable: Limits are not set for certain containers in a pod, or certain limits are not equal to the corresponding requests. During node scheduling, this type of pod may overclock nodes.
BestEffort: Limits and requests are not set for any containers in a pod.

Code for querying QoS: https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/helper/qos/qos.go

Impact of Different QoS Levels on Containers

OOM

Kubernetes sets the oom_score_adj parameter based on the QoS level. The oom_killer mechanism calculates the OOM score of each pod based on its memory usage and comprehensively evaluates each pod in combination with the oom_score_adj parameter. Pods with higher process scores are preferentially killed when OOM occurs.

If the memory resources of a node are insufficient, pods whose QoS level is Guaranteed are killed in the end. Pods whose QoS level is BestEffort are preferentially killed. For pods whose QoS level is Burstable, the oom_score_adj parameter value ranges from 2 to 999. According to the formula, if the memory request is larger, the oom_score_adj parameter value is smaller and such pods are more likely to be protected during OOM.

Source Code

Node information:
# kubectl describe no cn-beijing.i-2zeavb11mttnqnnicwj9 | grep -A 3 Capacity
Capacity:
 cpu:     4
 memory:  8010196Ki
 pods:    110
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-qos-1
  namespace: example
spec:
  containers:
  - name: memory-demo-qos-1
    image: polinux/stress
    resources:
      requests:
        memory: "200Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "50M", "--vm-hang", "1"]
    
---
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-qos-2
  namespace: example
spec:
  containers:
  - name: memory-demo-qos-2
    image: polinux/stress
    resources:
      requests:
        memory: "400Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "50M", "--vm-hang", "1"]---
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-qos-3
  namespace: example
spec:
  containers:
  - name: memory-demo-qos-3
    image: polinux/stress
    resources:
      requests:
        memory: "200Mi"
        cpu: "2"
      limits:
        memory: "200Mi"
        cpu: "2"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "50M", "--vm-hang", "1"]

Each node can be allocated a memory of 8,010,196 KiB, which is about 7,822.45 MiB.

According to the formula for the QoS level at Burstable:

request 200Mi: (1000 ¨C 1000 x 200/7822.45) = About 975request 400Mi: (1000 ¨C 1000 x 400/7822.45) = About 950

The oom_score_adj parameter values of these three pods are as follows:

// request 200Mi
  kubectl -n example exec  memory-demo-qos-1 cat /proc/1/oom_score_adj
975// request 400Mi?
  kubectl -n example exec  memory-demo-qos-2 cat /proc/1/oom_score_adj
949// Guaranteed
  kubectl -n example exec  memory-demo-qos-3 cat /proc/1/oom_score_adj
-998

Code for setting OOM rules: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/qos/policy.go

Pod Eviction

If the memory and CPU resources of a node are insufficient and this node starts to evict its pods, the QoS level also affects the eviction priority as follows:

The kubelet preferentially evicts pods whose QoS level is BestEffort and pods whose QoS level is Burstable with resource usage larger than preset requests.
Then, the kubelet evicts pods whose QoS level is Burstable with resource usage smaller than preset requests.
At last, the kubelet evicts pods whose QoS level is Guaranteed. The kubelet preferentially prevents pods whose QoS level is Guaranteed from being evicted due to resource consumption of other pods.
If pods have the same QoS level, the kubelet determines the eviction priority based on the pod priority.

ResourceQuota

Kubernetes provides the ResourceQuota object to set constraints on the number of Kubernetes objects by type and the amount of resources (CPU and memory) in a namespace.

One or more ResourceQuota objects can be created in a namespace.
If the ResourceQuota object is configured in a namespace, requests and limits must be set during deployment; otherwise, pod creation is rejected.
To avoid this problem, the LimitRange object can be used to set the default requests and limits for each pod.
For more information about extended resources supported in versions later than Kubernetes V1.10, see https://kubernetes.io/docs/tasks/configure-pod-container/extended-resource/

apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-demo
  namespace: example
spec:
  hard:
    requests.cpu: "3"
    requests.memory: 1Gi
    limits.cpu: "5"
    limits.memory: 2Gi
    pods: "5"

LimitRange

The LimitRange object is used to set the default resource requests and limits as well as minimum and maximum constraints for each pod in a namespace.

apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
  namespace: example
spec:
  limits:
  - default:  # default limit
      memory: 512Mi
      cpu: 2
    defaultRequest:  # default request
      memory: 256Mi
      cpu: 0.5
    max:  # max limit
      memory: 800Mi
      cpu: 3
    min:  # min request
      memory: 100Mi
      cpu: 0.3
    maxLimitRequestRatio:  # max value for limit / request
      memory: 2
      cpu: 2
    type: Container # limit type, support: Container / Pod / PersistentVolumeClaim

The LimitRange object supports the following parameters:

default: indicates default limits.
defaultRequest: indicates default requests.
max: indicates maximum limits.
min: indicates minimum requests.
maxLimitRequestRatio: indicates the maximum ratio of a limit to a request. Because a node schedules resources based on pod requests, resources can be oversold. The maxLimitRequestRatio parameter indicates the maximum oversold ratio of pod resources.

Summary

Kubernetes sets requests and limits for container resources. To maximize resource utilization, Kubernetes determines scheduling strategies based on pod requests to oversell node resources.

Kubernetes restricts the resource usage of pods based on preset limits. If the memory usage of a pod exceeds the memory limit, this pod is OOM killed. The CPU usage of a pod cannot exceed the CPU limit.

Based on requests and limits of each pod, Kubernetes determines the QoS level and divides pods into three QoS levels: Guaranteed, Burstable, and BestEffort. If node resources are insufficient and pods are to be evicted or OOM killed, the kubelet preferentially protects pods whose QoS level is Guaranteed, and then pods whose QoS level is Burstable (where pods whose QoS level is Burstable with larger requests but less resource usage are preferentially protected). The kubelet preferentially evicts pods whose QoS level is BestEffort.

Kubernetes provides the RequestQuota and LimitRange objects to set constraints on pod resources and the number of pods in a namespace. The RequestQuota object is used to set the number of various objects by type and the amount of resources (CPU and memory). The LimitRange object is used to set the default requests and limits, minimum and maximum requests and limits, and oversold ratio for each pod or container.

Reasonable and consistent pod limits and requests must be set for some important online applications. Then, if resources are insufficient, Kubernetes can preferentially guarantee the stable operation of such pods. This helps improve resource usage.

Pod requests can be appropriately reduced for some non-core applications that occasionally occupy resources. In this case, such pods can be allocated to nodes with fewer resources during scheduling to maximize resource utilization. However, if node resources become insufficient, such pods are still preferentially evicted or OOM killed.

References

To learn more about Alibaba Cloud Container Service for Kubernetes, visit https://www.alibabacloud.com/product/kubernetes

Reference:https://www.alibabacloud.com/blog/how-to-set-constraints-on-kubernetes-resources_594313?spm=a2c41.12451150.0.0