Kubernetes CPU Management Policies


By Alwyn Botha, Alibaba Cloud Community Blog author.

This tutorial demonstrates how to define Pods that run CPU-intensive workloads which are sensitive to context switches and have the following characteristics:

From https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/

CPU manager might help Kubernetes Pod workloads with the following characteristics:

  • Sensitive to CPU throttling effects.
  • Sensitive to context switches.
  • Sensitive to processor cache misses.
  • Benefit from sharing processor resources (e.g., data and instruction caches).
  • Sensitive to cross-socket memory traffic.
  • Sensitive to, or require, hyperthreads from the same physical CPU core.

When your Kubernetes node is under light CPU load, there is no problem: there are enough CPU cores for all Pods to work as if each were the only Pod using the node's CPUs.

When many CPU-intensive Pods compete for CPU cores, they must share CPU time. As CPU time becomes available, their workloads may get scheduled on other CPU cores.

A significant part of CPU time is spent switching between these workloads. This is called a context switch.

https://en.wikipedia.org/wiki/Context_switch

A context switch is the process of storing the state of a process or of a thread, so that it can be restored and execution resumed from the same point later. This allows multiple processes to share a single CPU.

Context switches are usually computationally intensive.

Kubernetes allows you to configure its CPU Manager policy so that such workloads can run more efficiently.

The kubelet CPU Manager policy is set with the --cpu-manager-policy flag.

Two policies are supported:

  • none (the default): the scheduling behavior you normally get on Linux systems.
  • static: certain Pods are given near exclusivity to CPU cores on a node.

near exclusivity: Kubernetes processes may still use part of the CPU time

near exclusivity: all other Pods are prevented from using the allocated CPU cores

Certain Pods

Only Guaranteed Pods (Pods with matching integer CPU requests and limits) are granted exclusive access to the CPUs they request.

From https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/

This static assignment increases CPU affinity and decreases context switches due to throttling for the CPU-bound workload.

From https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed

For a Pod to be given a Quality of Service (QoS) class of Guaranteed:

  • Every Container in the Pod must have a memory limit and a memory request, and they must be the same.
  • Every Container in the Pod must have a CPU limit and a CPU request, and they must be the same.

Contents

  • shared pool of CPUs
  • minikube configuration
  • ERROR: configured policy "static" differs from state checkpoint policy "none"
  • Demo of CPU Management Policy
  • minikube kubelet cpumanager startup
  • not exclusive but guaranteed Pod
  • kube-reserved="cpu=500m"
  • CPU manager benchmarks

Shared Pool of CPUs

Initially all CPUs are available to all Pods. Guaranteed Pods remove CPUs from this availability pool.

When using this static policy, --kube-reserved or --system-reserved must be specified.

These settings reserve CPU resources for Kubernetes system daemons.

--kube-reserved is allocated first: Kubernetes itself must be able to run.

Guaranteed Pods then remove their requested integer CPU quantities from the shared CPU pool.

BestEffort and Burstable Pods use the remaining CPU pool. Their workloads may context switch from time to time, but this does not seriously affect their performance. (If it did, they would have been defined as Guaranteed.)

Minikube Configuration

This tutorial uses minikube. To use static CPU Management Policies we need to set these kubelet settings:

  • cpu-manager-policy="static"
  • kube-reserved="cpu=500m" … reserves a bit of CPU for Kubernetes daemons
  • feature-gates="CPUManager=true" … switches on the capability to use static CPU allocations
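On minikube these kubelet settings are passed with --extra-config flags. A sketch of the start command, assuming a minikube version that supports these kubelet extra-config keys:

    minikube start \
      --extra-config=kubelet.cpu-manager-policy="static" \
      --extra-config=kubelet.kube-reserved="cpu=500m" \
      --extra-config=kubelet.feature-gates="CPUManager=true"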

Minikube will start up as normal. Only when you make a syntax error will it hang.

The message "Everything looks great. Please enjoy minikube!" means the settings were applied successfully.

ERROR: configured policy "static" differs from state checkpoint policy "none"

Unfortunately this is not mentioned in the documentation.

You cannot just change your Kubernetes cluster from CPU policy “none” to CPU policy “static” using just those flags on minikube start.

You will get this error at the bottom of journalctl -u kubelet:
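The exact wording varies by Kubernetes version, but the log line looks roughly like this:

    start cpu manager error: could not restore state from checkpoint: configured policy "static" differs from state checkpoint policy "none", please drain this node and delete the CPU manager checkpoint file "/var/lib/kubelet/cpu_manager_state" before restarting Kubelet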

To fix this problem, get a list of your nodes:
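    kubectl get nodes

minikube normally runs a single node named minikube; the commands below assume that name.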

Then drain your node by name:
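    kubectl drain minikube --ignore-daemonsets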

Make a backup of /var/lib/kubelet/cpu_manager_state and remove the original, so that the kubelet can recreate the checkpoint file with the new policy.
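One way to do this is from inside the minikube VM (the .backup suffix is just an example name):

    minikube ssh
    sudo mv /var/lib/kubelet/cpu_manager_state /var/lib/kubelet/cpu_manager_state.backup
    exit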

Allow the node to accept work again:
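Again assuming the node name minikube:

    kubectl uncordon minikube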

If you now start minikube with the static CPU Management Policy flags, it should work.

Demo of CPU Management Policy

We need to define a Guaranteed Pod:
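A minimal sketch of such a Pod. The name myguaranteed-pod matches the log extracts below; the container name, nginx image, and memory sizes are arbitrary choices for this demo:

    apiVersion: v1
    kind: Pod
    metadata:
      name: myguaranteed-pod
    spec:
      containers:
      - name: myguaranteed-container
        image: nginx                 # arbitrary image for the demo
        resources:
          requests:
            cpu: 1                   # integer CPU: required for exclusive core assignment
            memory: 100Mi
          limits:
            cpu: 1                   # must equal the request for Guaranteed QoS
            memory: 100Mi

Create it as usual, for example with kubectl create -f myguaranteed-pod.yaml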

Kubernetes Millicores

You can specify CPU requests and limits using whole integers or Millicores.

One CPU core comprises 1000m = 1000 millicores.

A 4 core node has a CPU capacity of 4000m:

4 cores * 1000m = 4000m

CPU Management Policies understand both formats. (Specifying 1000m for both CPU values would also work in our example above.)
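For example, the CPU values in the Pod above could equivalently be written in millicores:

    resources:
      requests:
        cpu: 1000m                   # same as cpu: 1
        memory: 100Mi
      limits:
        cpu: 1000m
        memory: 100Mi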

Here is an extract from the minikube logs moments after the Pod got created, edited for readability:

  • myguaranteed-pod is under management of the static policy
  • it gets 1 CPU allocated: allocateCpus: (numCPUs: 1)
  • updated default cpuset: "0,2-3" … the remaining Pods can use these CPUs

Check the Pod's QoS class:

    kubectl describe pods/myguaranteed-pod | grep QoS

    QoS Class: Guaranteed
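To release the exclusive CPU back to the shared pool, delete the Pod:

    kubectl delete pod/myguaranteed-pod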

The kubelet logs after the delete show the default cpuset updated to include all CPUs again.

Minikube Kubelet cpumanager Startup

Here are some interesting log lines showing the cpumanager startup cpuset assignments, edited for clarity:

  • CPU manager detected 4 CPUs
  • reserved 1 CPUs ("0") not available for exclusive assignment … probably due to --kube-reserved="cpu=500m". Part of CPU 0 is reserved, so that full CPU cannot be available for exclusive assignment.
  • default cpuset: "0-3" … all 4 CPUs are available to all Pods, including the remaining 500 millicores of CPU 0.

Not Exclusive but Guaranteed Pod

If we create a Guaranteed Pod, but with FRACTIONAL CPU requests and limits, no lines are added to the kubelet logs about changes in cpuset.

Such Pods will share usage of the CPUs in the pool: default cpuset: "0-3"
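For example, a sketch of such a Pod (the name is hypothetical; only the CPU values differ from the earlier Pod):

    apiVersion: v1
    kind: Pod
    metadata:
      name: myguaranteed-fractional-pod    # hypothetical name
    spec:
      containers:
      - name: myguaranteed-container
        image: nginx
        resources:
          requests:
            cpu: 500m                # fractional: Guaranteed QoS, but no exclusive core
            memory: 100Mi
          limits:
            cpu: 500m
            memory: 100Mi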

If you now investigate the tail end of the minikube logs, you will not find any mention of cpuset being changed.

kube-reserved="cpu=500m"

You can see how this setting (which we used right at the start of the tutorial) reserved CPU resources for Kubernetes.

Here is an extract from my 4 CPU Kubernetes node. The capacity is cpu: 4 (= 4000m); note that 500m CPU is reserved.
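A sketch of what to look for, again assuming the node name minikube (output trimmed; exact values depend on your configuration):

    kubectl describe node minikube

    Capacity:
     cpu:     4
    Allocatable:
     cpu:     3500m

Allocatable CPU is the 4000m capacity minus the 500m of --kube-reserved, leaving 3500m for Pods.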

CPU Manager Benchmarks

This tutorial gave practical experience with the Kubernetes CPU manager. You can find more theoretical information, including benchmarks, at the official Kubernetes blog: https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/

