Kubernetes Taints and Tolerations

By Alwyn Botha, Alibaba Cloud Community Blog author.

Taints and tolerations are best understood by working through several exercises, which is what this tutorial provides.

Taints spoil a node: they mark it as undesirable for Pods. Pods specify tolerations, meaning they will tolerate a node with certain taints.

You can use taints and tolerations to deliberately keep certain Pods off a node, or to reserve a node for certain Pods (for example, Pods that need SSDs or GPUs).

This tutorial contains several examples of taints and tolerations to help you get practical experience of this abstract concept.

Prerequisites

This tutorial will work best if run on a Kubernetes cluster with only one node.

One node gets tainted and Pods are run to determine if they tolerate the taints on that one node.

If you have a vast cluster of nodes, your Pod will automatically run on any of the untainted nodes.

To learn taints and tolerations fastest it is best to have access to only one node.

If you must run this tutorial on a multi-node cluster you can simulate a single node: in all Pod specs below, also add a nodeSelector that matches the kubernetes.io/hostname label of the one node you have control over. Do not use nodeName for this. A Pod with nodeName set bypasses the scheduler, so NoSchedule taints are never checked for it and the exercises below would not behave as described.

Reference information: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodename

Summary: simply add a nodeSelector with kubernetes.io/hostname: my-node-name to your YAML spec file.

This way the Kubernetes scheduler will only attempt to run your Pod on that ONE node. You can then learn in a tiny environment that you completely control.

Unfortunately this part of the tutorial will make the most sense after you have done all the exercises below.

So for the moment just believe me: run on a single-node cluster or use a nodeSelector.
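Here is a minimal sketch of restricting a Pod to one node. It uses a nodeSelector on the well-known kubernetes.io/hostname label rather than nodeName, because a Pod with nodeName set bypasses the scheduler. The file name pinned-pod.yaml and the hostname value minikube are assumptions; the label value is usually, but not always, the same as the node name, so check it first. The KUBECTL variable is a convenience introduced here, not part of kubectl: it makes the snippet print the command instead of running it.

```shell
# Write a Pod manifest that may only be scheduled on one node.
# pinned-pod.yaml and the hostname value "minikube" are assumptions.
cat > pinned-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mybusypod
spec:
  nodeSelector:
    kubernetes.io/hostname: minikube   # only this node is a candidate
  containers:
  - name: mybusypod
    image: busybox
    command: ['sh', '-c', 'sleep 3600']
EOF

# Prints the command by default; set KUBECTL=kubectl to run it for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# List node labels to find the kubernetes.io/hostname value of your node.
$KUBECTL get nodes --show-labels
```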

After you have followed this tutorial, redo the Pods that failed in a cluster with more than one node. You will see that Pods that are unschedulable on a cluster with one tainted node get scheduled on the other (untainted) nodes. (These previous 2 sentences will also make sense after you have done the complete tutorial.)

Taint Beginner Demo: NoSchedule

We add taints to a node using this syntax:

  • kubectl taint nodes node-name key=value:NoSchedule
  • kubectl taint nodes node-name key=value:NoExecute

You have to supply your node-name, your key and your value.

Our first taint:

We taint our node called minikube with

  • the key : dedicated-app

  • the value : my-dedi-app-a

  • the effect : NoSchedule

so that Pods without a matching toleration cannot be scheduled on this node.
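With those values filled in, the command for this first taint might look as follows. The KUBECTL variable is a convenience introduced here, not part of kubectl: by default the snippet only prints the commands so you can review them first.

```shell
# Prints the commands by default; set KUBECTL=kubectl to run them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# Taint node "minikube": key dedicated-app, value my-dedi-app-a, effect NoSchedule.
$KUBECTL taint nodes minikube dedicated-app=my-dedi-app-a:NoSchedule

# Confirm the taint is now listed on the node.
$KUBECTL describe node minikube
```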

Note the label below: app: my-dedi-app-a

Our Pod uses busybox to sleep for 3600 seconds.

Note there are no tolerations in our Pod spec.

Our node is tainted, but our Pod does not have a toleration for that taint.
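A Pod spec matching this description could look like the sketch below. The file name mybusypod.yaml and the container name are assumptions; the Pod name mybusypod, the label and the busybox sleep come from the text. Keep in mind that the label is only there for our own bookkeeping: taints are matched against tolerations, never against labels.

```shell
# Write the Pod manifest to a file (the file name is an assumption).
cat > mybusypod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mybusypod
  labels:
    app: my-dedi-app-a
spec:
  containers:
  - name: mybusypod
    image: busybox
    command: ['sh', '-c', 'sleep 3600']
EOF

# Show what was written.
cat mybusypod.yaml
```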

Theory suggests that this Pod will not be able to run on this node. Let’s investigate :

Create Pod

Get list of Pods:

Only relevant output from describe command:
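The output itself is not reproduced here, but the three steps above map onto kubectl commands roughly like this. The file name mybusypod.yaml is an assumption, and KUBECTL is a convenience variable introduced here so that the snippet only prints the commands unless told otherwise.

```shell
# Prints the commands by default; set KUBECTL=kubectl to run them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# Create the Pod from its manifest.
$KUBECTL create -f mybusypod.yaml

# List Pods; a Pod that cannot be scheduled stays in Pending.
$KUBECTL get pods

# Show details, including the scheduling events at the end.
$KUBECTL describe pod/mybusypod
```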

Last line explains what happened: 1 node(s) had taints that the pod didn’t tolerate.

PodScheduled False … Pod cannot be scheduled onto this node.

Tainting works. It prevents Pods that do not have a toleration for that taint from running on that node.

(I only have one Kubernetes node. In a setup with several nodes, Kubernetes automatically checks all the nodes until it finds one where this Pod can run … or it will state … 0/397 nodes are available: 397 node(s) had taints that the pod didn’t tolerate.)

Let’s now add a toleration for that taint to our Pod:

Note that the last 5 lines add a toleration.
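A sketch of such a spec follows, assuming the same file name mybusypod.yaml as a stand-in. The toleration repeats the key, value and effect of the taint exactly; operator Equal means all of them must match.

```shell
# Rewrite the manifest, this time with a toleration for the NoSchedule taint.
cat > mybusypod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mybusypod
  labels:
    app: my-dedi-app-a
spec:
  containers:
  - name: mybusypod
    image: busybox
    command: ['sh', '-c', 'sleep 3600']
  tolerations:
  - key: "dedicated-app"
    operator: "Equal"
    value: "my-dedi-app-a"
    effect: "NoSchedule"
EOF

# Print just the 5 lines that make up the toleration.
tail -n 5 mybusypod.yaml
```

If the earlier Pod is still pending under the same name, delete it before creating this one.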

Create:

List Pods:

Success: the Pod is now running. It tolerates the taint on the node.

Note the important Tolerations: dedicated-app=my-dedi-app-a:NoSchedule line below.

The other 2 tolerations are automatically added by Kubernetes to all Pods.

NoExecute Taint

We keep our existing taint and toleration in place. ( It works )

We add another taint with key dedicated-app-exec, value my-dedi-app-a and effect NoExecute: Pods that do not tolerate it must not be allowed to run on our node.
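The command for this second taint might look like the sketch below. KUBECTL is a convenience variable introduced here so the snippet prints the commands by default. Be aware that NoExecute also evicts Pods that are already running on the node and do not tolerate the taint.

```shell
# Prints the commands by default; set KUBECTL=kubectl to run them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# Second taint on the same node, this time with effect NoExecute.
$KUBECTL taint nodes minikube dedicated-app-exec=my-dedi-app-a:NoExecute

# The Taints field of the node should now list both taints.
$KUBECTL describe node minikube
```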

Investigate first few lines of our node:

Note we now have 2 taints ( at bottom ).

Our Pod spec is as before: no toleration for this second taint.

We expect our Pod to be unable to run on this tainted node.

List Pods:

As expected, Pod is pending.

Investigate why: look at the final line in the output.

Behavior as expected.

It is disappointing to see the warning does not state WHICH taint the Pod did not tolerate. ( In production you will have many nodes each with many taints and lists of tolerations for your Pods. So you have to manually go through those lists to see which taint is not tolerated. )

NoExecute Toleration

Now we specify a toleration for the taint so that our Pod can run on the node. ( Note last 5 lines )

The tolerationSeconds: 60 setting specifies that our Pod tolerates the dedicated-app-exec taint for only 60 seconds. After 60 seconds it is evicted from the node.
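One way such a spec could be written is sketched below; the file name mybusypod.yaml is an assumption. Because the first taint is still on the node, the sketch keeps the NoSchedule toleration and adds the NoExecute one after it, so the last 5 lines are the new toleration.

```shell
# Manifest with both tolerations; the second one is time-limited.
cat > mybusypod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mybusypod
  labels:
    app: my-dedi-app-a
spec:
  containers:
  - name: mybusypod
    image: busybox
    command: ['sh', '-c', 'sleep 3600']
  tolerations:
  - key: "dedicated-app"
    operator: "Equal"
    value: "my-dedi-app-a"
    effect: "NoSchedule"
  - key: "dedicated-app-exec"
    operator: "Equal"
    value: "my-dedi-app-a"
    effect: "NoExecute"
    tolerationSeconds: 60
EOF

# Print the 5 lines of the NoExecute toleration.
tail -n 5 mybusypod.yaml
```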

Create Pod

A minute later …

Pod automatically deleted from our node.

which-end: frontend

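A node may have any number of taints.

Pods may have any number of tolerations.

We add one more taint. To keep track of what it is for, we first add a matching label to our Pod. (The label is only for our own bookkeeping: taints are matched against tolerations, not against labels.)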

Below we add which-end: frontend label.

We taint our node with a new taint using the which-end key.

Our Pod has no toleration for this taint. Running will fail.

All this should be old news to you at this point.

In the next section you will learn an alternative way to tolerate taints.

Wildcard Global Tolerations

Note the last 2 lines of our Pod spec:

This special syntax specifies that our Pod tolerates ALL which-end key values.

Note our Pod spec labels below : which-end: frontend … our Pod provides front-end functionality.

Note the command ‘sleep 10’. We only let the Pod run for 10 seconds. It has tolerationSeconds: 60, so it runs to completion well before those 60 seconds expire.
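Putting these notes together, the spec might look like the sketch below. The file name and restartPolicy: Never (so that the Pod ends as completed instead of restarting) are assumptions, as is carrying over the two earlier tolerations. A toleration with a key, operator Exists and no value or effect matches every taint with that key.

```shell
# Manifest whose last 2 lines tolerate every taint with key which-end.
cat > mybusypod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mybusypod
  labels:
    which-end: frontend
spec:
  restartPolicy: Never   # assumption: lets the Pod run to completion
  containers:
  - name: mybusypod
    image: busybox
    command: ['sh', '-c', 'sleep 10']
  tolerations:
  - key: "dedicated-app"
    operator: "Equal"
    value: "my-dedi-app-a"
    effect: "NoSchedule"
  - key: "dedicated-app-exec"
    operator: "Equal"
    value: "my-dedi-app-a"
    effect: "NoExecute"
    tolerationSeconds: 60
  - key: "which-end"
    operator: "Exists"
EOF

# Print the 2 lines of the wildcard toleration.
tail -n 2 mybusypod.yaml
```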

List of tolerations for our Pod — note which-end at bottom. Pod tolerates ALL which-end taints.

Tolerate All Taints

Note the special syntax below: tolerate ALL taints.
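A sketch of that syntax: a single toleration with operator Exists and no key, value or effect matches every taint. The file name and restartPolicy: Never are assumptions.

```shell
# Manifest with one toleration that matches all taints.
cat > mybusypod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mybusypod
spec:
  restartPolicy: Never   # assumption: lets the Pod run to completion
  containers:
  - name: mybusypod
    image: busybox
    command: ['sh', '-c', 'sleep 10']
  tolerations:
  - operator: "Exists"
EOF

# Print the toleration block.
tail -n 2 mybusypod.yaml
```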

Create Pod

We see below that our Pod runs to completion with no problems.

Investigate kubectl describe pod/mybusypod output

Amazing: nothing there specifies our Pod tolerates all taints.

PreferNoSchedule

A NoSchedule taint prevents Pods that do not tolerate it from being scheduled on a node.

A PreferNoSchedule taint does the same, BUT, if no suitable untainted node can be found then the scheduler WILL place the Pod on that node.

From https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/

This is a “preference” or “soft” version of NoSchedule — the system will try to avoid placing a pod that does not tolerate the taint on the node, but it is not required.

Current taints on our node:

Let’s remove all these taints. (Note the hyphen at the end of the syntax. I understand it as: subtract this taint from the node.)
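Removal for the taints added so far might look like this sketch. The effect of the which-end taint is not stated in the text, so that line uses the key-only form, which removes every taint with that key. KUBECTL is a convenience variable introduced here so the snippet prints the commands by default.

```shell
# Prints the commands by default; set KUBECTL=kubectl to run them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# A trailing hyphen removes the taint with this key and effect.
$KUBECTL taint nodes minikube dedicated-app:NoSchedule-
$KUBECTL taint nodes minikube dedicated-app-exec:NoExecute-

# Key plus hyphen removes all taints with this key, whatever their effect.
$KUBECTL taint nodes minikube which-end-
```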

Our node now totally untainted:

Let’s only add PreferNoSchedule taint — so that we can see how it works.
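The text does not say which key and value were used for this taint, so the sketch below simply reuses dedicated-app=my-dedi-app-a as an assumption. Only the PreferNoSchedule effect matters for the exercise. KUBECTL is a convenience variable introduced here so the snippet prints the command by default.

```shell
# Prints the command by default; set KUBECTL=kubectl to run it for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# Soft taint: the scheduler avoids this node but may still use it.
# Key and value are assumptions; any key=value pair would do.
$KUBECTL taint nodes minikube dedicated-app=my-dedi-app-a:PreferNoSchedule
```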

Our Pod spec specifies NO tolerations.

Create Pod

The scheduler could not find any untainted node, so PreferNoSchedule allows this Pod on this node.

Running Pods on Unschedulable Nodes

Extract from details about our node.

Unschedulable: false means Schedulable: true

( Why the double negative: Unschedulable … I do not know )

We can set the node to Unschedulable = true by cordoning the node.
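Cordoning, and the uncordon step used further down, can be sketched as follows. KUBECTL is a convenience variable introduced here so the snippet prints the commands by default.

```shell
# Prints the commands by default; set KUBECTL=kubectl to run them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# Mark the node unschedulable; Pods already running stay where they are.
$KUBECTL cordon minikube

# The STATUS column should now include SchedulingDisabled.
$KUBECTL get nodes

# Later, to allow new Pods on the node again:
$KUBECTL uncordon minikube
```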

Node will not allow any new Pods to start running :

Create Pod

Pending as expected

We use uncordon to allow Pods to run on node again.

Our Pod starts running automatically.

Note the last 3 lines of our Pod spec: this is how we tolerate the unschedulable condition.
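A sketch of such a spec, with the file name as an assumption. A cordoned node carries the node.kubernetes.io/unschedulable taint with effect NoSchedule, and the last 3 lines tolerate exactly that.

```shell
# Manifest whose last 3 lines tolerate the unschedulable taint.
cat > mybusypod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mybusypod
spec:
  containers:
  - name: mybusypod
    image: busybox
    command: ['sh', '-c', 'sleep 3600']
  tolerations:
  - key: "node.kubernetes.io/unschedulable"
    operator: "Exists"
    effect: "NoSchedule"
EOF

# Print the 3 lines of the toleration.
tail -n 3 mybusypod.yaml
```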

Taint node as unschedulable:

Attempt to run our Pod :

Success below: our Pod tolerates the unschedulable status.

Using the same syntax as above you can (during emergencies) run Pods on nodes with these taints:

  • node.kubernetes.io/unreachable
  • node.kubernetes.io/out-of-disk (older Kubernetes versions only)
  • node.kubernetes.io/memory-pressure
  • node.kubernetes.io/disk-pressure
  • node.kubernetes.io/network-unavailable
  • node.kubernetes.io/unschedulable

These are not tolerations you should add to your Pods in general. This trick is for emergency use only.

Most critical Kubernetes system daemons tolerate those taints. ( These daemons must continue to run during these out of resources conditions to keep the Kubernetes system running. )

Make node available again.

Cleanup

Determine list of taints on this node:

Let’s remove taints.
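Cleanup could be sketched like this. Which taints remain depends on what you added, so the key dedicated-app is an assumption; use the keys that the describe output actually lists. KUBECTL is a convenience variable introduced here so the snippet prints the commands by default.

```shell
# Prints the commands by default; set KUBECTL=kubectl to run them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# See which taints are still on the node (look at the Taints field).
$KUBECTL describe node minikube

# Remove every taint with the given key (the key is an assumption).
$KUBECTL taint nodes minikube dedicated-app-

# Remove the test Pod as well.
$KUBECTL delete pod/mybusypod
```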
