Kubernetes Pod Disruption Budgets (PDB)

By Alwyn Botha, Alibaba Cloud Community Blog author.

Pod Disruption Budgets specify the number of concurrent disruptions that you allow for your Pods. Disruptions may be caused by deliberate or accidental Pod deletion.

When a Kubernetes node runs out of RAM or disk space, it will evict Pods in an attempt to keep the node running properly. These evictions also cause disruptions.

You can specify Pod Disruption Budgets for Pods managed by these built-in Kubernetes controllers:

  • Deployment
  • ReplicationController
  • ReplicaSet
  • StatefulSet

This tutorial demonstrates Pod Disruption Budgets using deployments. Before proceeding, you should work through the Kubernetes Eviction Policies tutorial, since it demonstrates the eviction behavior that Pod Disruption Budgets interact with.

As in the Kubernetes Eviction Policies tutorial, we start with eviction-hard="memory.available<480M".

We will then deploy some Pods that exceed this limit to see how Pod Disruption Budgets work.

As detailed in the Kubernetes Eviction Policies tutorial, you need to run these exercises on a node with 2200 MB RAM.

Pod Disruption Budgets Basics

Start minikube with hard threshold.

Check available RAM:

Check if node has MemoryPressure:

No MemoryPressure: we are 700 MB away from that.

We will now create 3 Pods using 50 MB each.
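
The manifests for these 50 MB Pods are not shown in this tutorial. Here is a minimal sketch of what my50mb-ram-pod-a could look like, following the same pattern as the deployments used later; the file name, container name and stress parameters are assumptions:

```yaml
# Hypothetical manifest for one 50 MB Pod (my50mb-ram-pod-a.yaml);
# Pods b and c would follow the same pattern with different names.
apiVersion: v1
kind: Pod
metadata:
  name: my50mb-ram-pod-a
spec:
  containers:
  - name: myram-container-1
    image: mytutorials/centos:bench
    imagePullPolicy: IfNotPresent
    command: ['sh', '-c', 'stress --vm 1 --vm-bytes 50M --vm-hang 3000 -t 3600']
    resources:
      requests:
        memory: "1k"
  terminationGracePeriodSeconds: 0
```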

Check available RAM:

500 MB away from MemoryPressure.

Now we are going to create 10 more Pods that will surely push the node into a MemoryPressure condition.

To learn more, we will create 2 deployments with 5 replicas each.

The my5mbDeployment Pods use 5 MB each, via the stress benchmark application.

The my10mbDeployment Pods use 10 MB each, via the stress benchmark application.

We also have those three 50 MB Pods already running.

Edit / create YAML files for deployments:

nano my5mbDeployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: 5mb-deployment
  labels:
    app: 5mbpods
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: 5mbpods
  template:
    metadata:
      labels:
        app: 5mbpods
    spec:
      containers:
      - name: myram-container-1
        image: mytutorials/centos:bench
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'stress --vm 1 --vm-bytes 5M --vm-hang 3000 -t 3600']
        resources:
          requests:
            memory: "1k"
      terminationGracePeriodSeconds: 0
nano my10mbDeployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: 10mb-deployment
  labels:
    app: 10mbpods
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: 10mbpods
  template:
    metadata:
      labels:
        app: 10mbpods
    spec:
      containers:
      - name: myram-container-1
        image: mytutorials/centos:bench
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'stress --vm 1 --vm-bytes 10M --vm-hang 3000 -t 3600']
        resources:
          requests:
            memory: "1k"
      terminationGracePeriodSeconds: 0

Create deployments.

kubectl create -f my5mbDeployment.yaml
deployment.apps/5mb-deployment created

kubectl create -f my10mbDeployment.yaml
deployment.apps/10mb-deployment created

Note that in each deployment we specified replicas: 5. We need 5 identical Pods running for each of those 2 deployments.

Now we specify a Pod Disruption Budget:

  • for the 5 MB Pods we need a minimum of 2 Pods always running: minAvailable: 2
  • for the 10 MB Pods we need a minimum of 1 Pod always running: minAvailable: 1
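
As a rough sketch of the arithmetic behind these budgets (a simplification; the real disruption controller also accounts for Pod health and observed state), the ALLOWED DISRUPTIONS column we will see later in kubectl get poddisruptionbudgets is essentially the number of currently healthy Pods minus minAvailable:

```shell
# Simplified view of ALLOWED DISRUPTIONS: healthy replicas minus minAvailable,
# floored at zero. With 5 healthy replicas per deployment:
healthy=5
for minAvailable in 2 1; do
  allowed=$(( healthy - minAvailable ))
  [ "$allowed" -lt 0 ] && allowed=0
  echo "minAvailable=$minAvailable -> allowed disruptions: $allowed"
done
```

With 5 healthy replicas this prints allowed disruptions of 3 and 4, which matches the "4 of the 5 replicas may be disrupted" reasoning below.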

Important: the app: 5mbpods label in the Pod Disruption Budget must match the same label in the deployment.

ONLY this label links the Pod Disruption Budget to an application: the Pods in the deployment.

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: 5mbpods-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: 5mbpods
nano mypdb10.yaml

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: 10mbpods-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: 10mbpods

Create the Pod Disruption Budgets:

Check available RAM:

MemoryPressure? YES

Mere moments later:

NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
10mb-deployment   5         5         5            5           29s
5mb-deployment    5         5         5            5           33s

Our deployments are running perfectly: 5 DESIRED replicas, 5 AVAILABLE.

List all running Pods.

NAME                               READY   STATUS    RESTARTS   AGE
10mb-deployment-558b949886-f6dvj   1/1     Running   0          34s
10mb-deployment-558b949886-lx8kp   1/1     Running   0          34s
10mb-deployment-558b949886-n2bct   1/1     Running   0          34s
10mb-deployment-558b949886-t2652   1/1     Running   0          34s
10mb-deployment-558b949886-w52hz   1/1     Running   0          34s
5mb-deployment-5964c58c48-2qwvf    1/1     Running   0          38s
5mb-deployment-5964c58c48-4b7z6    1/1     Running   0          38s
5mb-deployment-5964c58c48-9shg9    1/1     Running   0          38s
5mb-deployment-5964c58c48-crsp6    1/1     Running   0          38s
5mb-deployment-5964c58c48-x848b    1/1     Running   0          38s
my50mb-ram-pod-a                   1/1     Running   0          84s
my50mb-ram-pod-b                   0/1     Evicted   0          85s
my50mb-ram-pod-c                   0/1     Evicted   0          85s

We see 2 of our 50 MB Pods got evicted.

We specified minAvailable: 1 for our 10mb-deployment, meaning we allow 4 of the 5 replicas to be disrupted.

Kubernetes chose to evict the 50 MB Pods. Why?

As detailed in the Kubernetes Eviction Policies tutorial, Pods are ranked for eviction.

Pods are ranked by Priority, and then by usage above request.

Our 50 MB Pods rank much higher than our 10 MB Pods.
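
A toy illustration of that ranking (the numbers are made up; the real kubelet ranks by actual memory working set versus the request):

```shell
# Rank pods by memory usage above request (MB), highest first: the eviction order.
# Columns: pod-name usage-MB request-MB (request "1k" is ~0.001 MB).
printf '%s\n' \
  'my50mb-ram-pod-a 50 0.001' \
  '10mb-pod 10 0.001' \
  '5mb-pod 5 0.001' |
awk '{printf "%s %.0f\n", $1, $2 - $3}' | sort -k2,2nr
```

The 50 MB Pod sorts to the top, matching the eviction order we observe in the logs below.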

Here is an edited extract of the kubelet eviction manager logs.

The 50 MB Pods are listed first among the evictable Pods, so they got evicted. The system never got close to our 2 deployments.

Feb 05 07:49:03 pods ranked for eviction:
kube-apiserver-minikube_kube-system,
kube-controller-manager-minikube_kube-system,
etcd-minikube_kube-system,
kube-scheduler-minikube_kube-system,
my50mb-ram-pod-b_default, < ----------
my50mb-ram-pod-a_default, < ----------
metrics-server-6486d4db88-rlsdd_kube-system,
kubernetes-dashboard-5bff5f8fb8-27mh9_kube-system,
10mb-deployment-558b949886-n2bct_default,
10mb-deployment-558b949886-w52hz_default,
10mb-deployment-558b949886-lx8kp_default,
10mb-deployment-558b949886-f6dvj_default,
10mb-deployment-558b949886-t2652_default,
5mb-deployment-5964c58c48-crsp6_default,
5mb-deployment-5964c58c48-x848b_default,
5mb-deployment-5964c58c48-9shg9_default,
5mb-deployment-5964c58c48-2qwvf_default,
5mb-deployment-5964c58c48-4b7z6_default,

Feb 05 07:49:03 pod my50mb-ram-pod-b_default is evicted successfully

Kubernetes cannot evict critical static Pods:

  • kube-apiserver-minikube_kube-system,
  • etcd-minikube_kube-system,
  • kube-controller-manager-minikube_kube-system,
  • kube-scheduler-minikube_kube-system,

So those Pods are not evicted. I deleted those lines from subsequent log output since they add no value to this discussion.

Check if node still has MemoryPressure:

No, those evictions fixed the problem.

17 MB above the threshold limit.

Let us use another 20 MB of RAM by scaling our 10 MB deployment to 7 replicas.

A second later.

Too fast: the system did not yet realize that MemoryPressure should be True.

We can see available RAM is below the 480 MB limit.

Let's list the Pods:

NAME                               READY   STATUS    RESTARTS   AGE
10mb-deployment-558b949886-8j2rb   1/1     Running   0          30s
10mb-deployment-558b949886-f6dvj   1/1     Running   0          4m9s
10mb-deployment-558b949886-lx8kp   1/1     Running   0          4m9s
10mb-deployment-558b949886-n2bct   1/1     Running   0          4m9s
10mb-deployment-558b949886-t2652   1/1     Running   0          4m9s
10mb-deployment-558b949886-w52hz   1/1     Running   0          4m9s
10mb-deployment-558b949886-zbkkv   1/1     Running   0          30s
5mb-deployment-5964c58c48-2qwvf    1/1     Running   0          4m13s
5mb-deployment-5964c58c48-4b7z6    1/1     Running   0          4m13s
5mb-deployment-5964c58c48-9shg9    1/1     Running   0          4m13s
5mb-deployment-5964c58c48-crsp6    1/1     Running   0          4m13s
5mb-deployment-5964c58c48-x848b    1/1     Running   0          4m13s
my50mb-ram-pod-a                   0/1     Evicted   0          4m59s
my50mb-ram-pod-b                   0/1     Evicted   0          5m
my50mb-ram-pod-c                   0/1     Evicted   0          5m

Another 50 MB Pod evicted.

Lesson learned: having a Pod Disruption Budget does not mean such Pods will be disrupted first.

Pods are ranked by usage above request, not by whether they have a budget.

Pod Disruption Budgets are considered WHEN a Pod qualifies for eviction.

The eviction manager logs show my50mb-ram-pod-a ranked highest: it got evicted.

Check available RAM:

Check if node has MemoryPressure:

Problem solved.

15 MB above eviction threshold.

Let’s use more RAM to cause Pod Disruption Budgets to get used.

List deployments. We can see on the first line that one Pod got evicted (AVAILABLE = 6 but DESIRED = 7).

NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
10mb-deployment   7         7         7            6           8m53s
5mb-deployment    10        10        10           10          8m57s

MemoryPressure True

Way below threshold.

List all Pods.

NAME                               READY   STATUS    RESTARTS   AGE
10mb-deployment-558b949886-8j2rb   1/1     Running   0          5m25s
10mb-deployment-558b949886-f6dvj   1/1     Running   0          9m4s
10mb-deployment-558b949886-lx8kp   1/1     Running   0          9m4s
10mb-deployment-558b949886-n2bct   1/1     Running   0          9m4s
10mb-deployment-558b949886-r97pz   0/1     Pending   0          15s
10mb-deployment-558b949886-t2652   1/1     Running   0          9m4s
10mb-deployment-558b949886-w52hz   1/1     Running   0          9m4s
10mb-deployment-558b949886-zbkkv   0/1     Evicted   0          5m25s
5mb-deployment-5964c58c48-2qwvf    1/1     Running   0          9m8s
5mb-deployment-5964c58c48-4b7z6    1/1     Running   0          9m8s
5mb-deployment-5964c58c48-8jf4r    1/1     Running   0          25s
5mb-deployment-5964c58c48-9shg9    1/1     Running   0          9m8s
5mb-deployment-5964c58c48-cjh2m    1/1     Running   0          25s
5mb-deployment-5964c58c48-crsp6    1/1     Running   0          9m8s
5mb-deployment-5964c58c48-dzx9x    1/1     Running   0          25s
5mb-deployment-5964c58c48-gwbwn    1/1     Running   0          25s
5mb-deployment-5964c58c48-jvkbk    1/1     Running   0          25s
5mb-deployment-5964c58c48-x848b    1/1     Running   0          9m8s
my50mb-ram-pod-a                   0/1     Evicted   0          9m54s
my50mb-ram-pod-b                   0/1     Evicted   0          9m55s
my50mb-ram-pod-c                   0/1     Evicted   0          9m55s

Based on experience, this node now has a permanent problem.

One 10mb-deployment Pod got evicted; seconds later a replacement 10mb-deployment Pod is Pending.

The node does not allow new Pods to start running while in a MemoryPressure condition, so those Pods stay in the Pending state.

Still not enough RAM available. More Pods will be evicted.

The 10 MB deployment notices its evicted Pods and creates replacement Pods, which stay Pending.

This very second 3 Pods are pending … waiting for RAM to become available.

The moment RAM becomes available some of those Pods will become Running. Even just one 10 MB Pod will push the node back below the eviction threshold … and so the cycle continues.
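
The cycle can be sketched as a toy simulation (made-up numbers, purely illustrative):

```shell
# Toy simulation of the evict/recreate cycle: evicting a 10 MB Pod lifts the
# node just above the threshold, the replacement Pod pushes it back below.
available=475   # MB, below the 480 MB threshold
pod=10          # MB freed per evicted 10 MB Pod
for step in 1 2 3; do
  echo "available=${available}MB -> below threshold: evict one 10MB pod"
  available=$(( available + pod ))    # eviction frees RAM
  echo "available=${available}MB -> replacement pod starts running"
  available=$(( available - pod ))    # replacement consumes it again
done
echo "back at available=${available}MB: no progress was made"
```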

The problem is that the system has no standalone Pods that run for only a few minutes and then complete successfully. All we have are these 2 massive deployments, both running for an hour.

If you have a node running like this, slightly over capacity with NO Pods ever completing successfully, then evictions cannot save you.

Pod Disruption Budgets cannot save a permanently over-capacity server.

Pod Disruption Budgets DO HELP if some Pods complete successfully over time. Then replacement Pods will not push the node into low RAM again.

NAME                               READY   STATUS    RESTARTS   AGE
10mb-deployment-558b949886-66tml   0/1     Pending   0          22s
10mb-deployment-558b949886-8j2rb   0/1     Evicted   0          6m59s
10mb-deployment-558b949886-9mfx4   0/1     Pending   0          24s
10mb-deployment-558b949886-f6dvj   1/1     Running   0          10m
10mb-deployment-558b949886-hrp2s   0/1     Evicted   0          34s
10mb-deployment-558b949886-lx8kp   1/1     Running   0          10m
10mb-deployment-558b949886-n2bct   1/1     Running   0          10m
10mb-deployment-558b949886-npvjf   0/1     Pending   0          29s
10mb-deployment-558b949886-r97pz   0/1     Evicted   0          109s
10mb-deployment-558b949886-t2652   1/1     Running   0          10m
10mb-deployment-558b949886-w52hz   0/1     Evicted   0          10m
10mb-deployment-558b949886-zbkkv   0/1     Evicted   0          6m59s
5mb-deployment-5964c58c48-2qwvf    1/1     Running   0          10m
5mb-deployment-5964c58c48-4b7z6    1/1     Running   0          10m
5mb-deployment-5964c58c48-8jf4r    1/1     Running   0          119s
5mb-deployment-5964c58c48-9shg9    1/1     Running   0          10m
5mb-deployment-5964c58c48-cjh2m    1/1     Running   0          119s
5mb-deployment-5964c58c48-crsp6    1/1     Running   0          10m
5mb-deployment-5964c58c48-dzx9x    1/1     Running   0          119s
5mb-deployment-5964c58c48-gwbwn    1/1     Running   0          119s
5mb-deployment-5964c58c48-jvkbk    1/1     Running   0          119s
5mb-deployment-5964c58c48-x848b    1/1     Running   0          10m
my50mb-ram-pod-a                   0/1     Evicted   0          11m
my50mb-ram-pod-b                   0/1     Evicted   0          11m
my50mb-ram-pod-c                   0/1     Evicted   0          11m

Note that our 5mb-deployment never lost a Pod due to eviction.

Pods are ranked by Priority, and then by usage above request.

10mb-deployment Pods rank higher and get evicted every time.

Kubernetes does not spread evictions evenly across Pod Disruption Budgets. The simple ranking rule above still holds.

We specified minAvailable: 1 for our 10mb-deployment, meaning we allow all but one of its replicas to be disrupted.

The output above shows 4 of its Pods still Running. The kubelet focuses its effort on this deployment since it still has scope to evict 3 more Pods, as allowed by the Pod Disruption Budget.

The obvious solution in this case is to reduce the number of replicas for 10mb-deployment and/or 5mb-deployment. There are no other Pods to consider.

Exercise complete. Delete …

kubectl delete -f my5mbDeployment.yaml
deployment.apps "5mb-deployment" deleted
kubectl delete -f my50mb-ram-pod-a.yaml
pod "my50mb-ram-pod-a" deleted
kubectl delete -f my50mb-ram-pod-b.yaml
pod "my50mb-ram-pod-b" deleted
kubectl delete -f my50mb-ram-pod-c.yaml
pod "my50mb-ram-pod-c" deleted

Delete the Pod Disruption Budgets.

minikube stop

3 Identical Deployments

I want to see how Kubernetes handles 3 identical simultaneous deployments with identical Pod Disruption Budgets.

Same minikube startup configurations as before.

Here are the 3 new Pod Disruption Budgets. Create them on your node.

nano mypdb10-a.yaml

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: 10mbpods-pdb-a
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: 10mbpods-a

nano mypdb10-b.yaml

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: 10mbpods-pdb-b
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: 10mbpods-b

nano mypdb10-c.yaml

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: 10mbpods-pdb-c
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: 10mbpods-c

kubectl create -f mypdb10-a.yaml
kubectl create -f mypdb10-b.yaml
kubectl create -f mypdb10-c.yaml

kubectl get poddisruptionbudgets does not show useful information at this point:

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               0                     4m19s
10mbpods-pdb-b   1               N/A               0                     4m19s
10mbpods-pdb-c   1               N/A               0                     4m19s

source ./mem | tail -n1
memory.available_in_mb 1193
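
The ./mem helper script itself is never shown in this tutorial. A minimal sketch of what such a script could be, inferred from its output format (note the kubelet computes memory.available from the cgroup working set, so MemAvailable from /proc/meminfo is only an approximation):

```shell
# Approximate available memory in MB from /proc/meminfo, printed in the same
# "memory.available_in_mb N" format the tutorial's ./mem helper uses.
awk '/^MemAvailable:/ {printf "memory.available_in_mb %d\n", $2 / 1024}' /proc/meminfo
```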

YAML manifests for the deployments:

nano my10mbDeployment-a.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: 10mb-deployment-a
  labels:
    app: 10mbpods-a
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: 10mbpods-a
  template:
    metadata:
      labels:
        app: 10mbpods-a
    spec:
      containers:
      - name: myram-container-1
        image: mytutorials/centos:bench
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'stress --vm 1 --vm-bytes 10M --vm-hang 3000 -t 3600']
        resources:
          requests:
            memory: "1k"
      terminationGracePeriodSeconds: 0

nano my10mbDeployment-b.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: 10mb-deployment-b
  labels:
    app: 10mbpods-b
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: 10mbpods-b
  template:
    metadata:
      labels:
        app: 10mbpods-b
    spec:
      containers:
      - name: myram-container-1
        image: mytutorials/centos:bench
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'stress --vm 1 --vm-bytes 10M --vm-hang 3000 -t 3600']
        resources:
          requests:
            memory: "1k"
      terminationGracePeriodSeconds: 0

nano my10mbDeployment-c.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: 10mb-deployment-c
  labels:
    app: 10mbpods-c
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: 10mbpods-c
  template:
    metadata:
      labels:
        app: 10mbpods-c
    spec:
      containers:
      - name: myram-container-1
        image: mytutorials/centos:bench
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'stress --vm 1 --vm-bytes 10M --vm-hang 3000 -t 3600']
        resources:
          requests:
            memory: "1k"
      terminationGracePeriodSeconds: 0

Create deployments.

Check RAM

MemoryPressure   False   Tue, 05 Feb 2019 09:01:05 +0200   Tue, 05 Feb 2019 08:50:34 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available

No problem so far.

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               3                     8m41s
10mbpods-pdb-b   1               N/A               3                     8m41s
10mbpods-pdb-c   1               N/A               3                     8m41s

memory.available_in_mb 522

One more replica per deployment should push the node into a RAM problem.

kubectl get poddisruptionbudgets

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               4                     11m
10mbpods-pdb-b   1               N/A               4                     11m
10mbpods-pdb-c   1               N/A               4                     11m

No problem seconds later.

Everything seems OK.

NAME                                 READY   STATUS    RESTARTS   AGE
10mb-deployment-a-86589488d-8kxsk    1/1     Running   0          3m36s
10mb-deployment-a-86589488d-9sxbd    1/1     Running   0          3m36s
10mb-deployment-a-86589488d-gdbx9    1/1     Running   0          3m36s
10mb-deployment-a-86589488d-r8z56    1/1     Running   0          3m36s
10mb-deployment-a-86589488d-wm4zr    1/1     Running   0          24s
10mb-deployment-b-6776ffb68c-2zz84   1/1     Running   0          3m36s
10mb-deployment-b-6776ffb68c-4z5dz   1/1     Running   0          24s
10mb-deployment-b-6776ffb68c-q4mmk   1/1     Running   0          3m36s
10mb-deployment-b-6776ffb68c-qrw5z   1/1     Running   0          3m36s
10mb-deployment-b-6776ffb68c-rxknz   1/1     Running   0          3m36s
10mb-deployment-c-b5b56fd58-2thf2    1/1     Running   0          3m36s
10mb-deployment-c-b5b56fd58-2zd78    1/1     Running   0          3m36s
10mb-deployment-c-b5b56fd58-66fjj    1/1     Running   0          24s
10mb-deployment-c-b5b56fd58-m95mn    1/1     Running   0          3m36s
10mb-deployment-c-b5b56fd58-rmn4t    1/1     Running   0          3m36s

A minute later: Kubernetes + Docker + Linux kernel system overhead uses around 30 MB of RAM to manage the influx of new replicas.

MemoryPressure True

MemoryPressure   True    Tue, 05 Feb 2019 09:05:05 +0200   Tue, 05 Feb 2019 09:05:05 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

List Pods:

NAME                                 READY   STATUS    RESTARTS   AGE
10mb-deployment-a-86589488d-8kxsk    1/1     Running   0          5m3s
10mb-deployment-a-86589488d-9sxbd    1/1     Running   0          5m3s
10mb-deployment-a-86589488d-gdbx9    1/1     Running   0          5m3s
10mb-deployment-a-86589488d-r8z56    1/1     Running   0          5m3s
10mb-deployment-a-86589488d-wm4zr    1/1     Running   0          111s
10mb-deployment-b-6776ffb68c-2zz84   1/1     Running   0          5m3s
10mb-deployment-b-6776ffb68c-4z5dz   1/1     Running   0          111s
10mb-deployment-b-6776ffb68c-q4mmk   1/1     Running   0          5m3s
10mb-deployment-b-6776ffb68c-qrw5z   1/1     Running   0          5m3s
10mb-deployment-b-6776ffb68c-rxknz   1/1     Running   0          5m3s
10mb-deployment-c-b5b56fd58-2r4sp    0/1     Pending   0          26s
10mb-deployment-c-b5b56fd58-2thf2    1/1     Running   0          5m3s
10mb-deployment-c-b5b56fd58-2zd78    0/1     Evicted   0          5m3s
10mb-deployment-c-b5b56fd58-66fjj    0/1     Evicted   0          111s
10mb-deployment-c-b5b56fd58-fff2g    0/1     Pending   0          28s
10mb-deployment-c-b5b56fd58-m95mn    1/1     Running   0          5m3s
10mb-deployment-c-b5b56fd58-rbkbx    0/1     Evicted   0          34s
10mb-deployment-c-b5b56fd58-rmn4t    1/1     Running   0          5m3s

To fix the MemoryPressure, 3 Pods got evicted (all from 10mb-deployment-c).

Perfect fairness would have demanded one evicted Pod per deployment. Somehow three 10mb-deployment-c Pods got ranked first.

We can see in the eviction manager logs below that the remaining Pods are ranked roughly evenly across our a, b and c deployments.

10mb-deployment-c-b5b56fd58-2zd78_default,
metrics-server-6486d4db88-k249j_kube-system,
10mb-deployment-c-b5b56fd58-66fjj_default,
10mb-deployment-b-6776ffb68c-4z5dz_default,
10mb-deployment-a-86589488d-wm4zr_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
kube-proxy-tgcqv_kube-system,
07:05:05 pod 10mb-deployment-c-b5b56fd58-2zd78_default is evicted successfully

07:05:08 pods ranked for eviction: metrics-server-6486d4db88-k249j_kube-system,
10mb-deployment-c-b5b56fd58-66fjj_default,
10mb-deployment-c-b5b56fd58-rbkbx_default,
10mb-deployment-b-6776ffb68c-4z5dz_default,
10mb-deployment-a-86589488d-wm4zr_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
kube-proxy-tgcqv_kube-system,
07:05:09 pod metrics-server-6486d4db88-k249j_kube-system is evicted successfully

07:05:11 pods ranked for eviction: 10mb-deployment-c-b5b56fd58-66fjj_default,
10mb-deployment-c-b5b56fd58-rbkbx_default,
10mb-deployment-b-6776ffb68c-4z5dz_default,
10mb-deployment-a-86589488d-wm4zr_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
kube-proxy-tgcqv_kube-system,
07:05:11 pod 10mb-deployment-c-b5b56fd58-66fjj_default is evicted successfully

07:05:13 pods ranked for eviction: 10mb-deployment-c-b5b56fd58-rbkbx_default,
10mb-deployment-b-6776ffb68c-4z5dz_default,
10mb-deployment-a-86589488d-wm4zr_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
kube-proxy-tgcqv_kube-system,
07:05:13 pod 10mb-deployment-c-b5b56fd58-rbkbx_default is evicted successfully

A minute later we start to see one Pod from each of the 3 deployments get evicted.

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               4                     13m
10mbpods-pdb-b   1               N/A               4                     13m
10mbpods-pdb-c   1               N/A               3                     13m

A minute later.

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               3                     14m
10mbpods-pdb-b   1               N/A               3                     14m
10mbpods-pdb-c   1               N/A               3                     14m

kubectl get pods

NAME                                 READY   STATUS              RESTARTS   AGE
10mb-deployment-a-86589488d-8kxsk    1/1     Running             0          6m12s
10mb-deployment-a-86589488d-8wdk8    0/1     ContainerCreating   0          43s
10mb-deployment-a-86589488d-9sxbd    1/1     Running             0          6m12s
10mb-deployment-a-86589488d-gdbx9    1/1     Running             0          6m12s
10mb-deployment-a-86589488d-r8z56    1/1     Running             0          6m12s
10mb-deployment-a-86589488d-wm4zr    0/1     Evicted             0          3m
10mb-deployment-b-6776ffb68c-2zz84   1/1     Running             0          6m12s
10mb-deployment-b-6776ffb68c-4z5dz   0/1     Evicted             0          3m
10mb-deployment-b-6776ffb68c-9kx67   0/1     ContainerCreating   0          46s
10mb-deployment-b-6776ffb68c-q4mmk   1/1     Running             0          6m12s
10mb-deployment-b-6776ffb68c-qrw5z   1/1     Running             0          6m12s
10mb-deployment-b-6776ffb68c-rxknz   1/1     Running             0          6m12s
10mb-deployment-c-b5b56fd58-2r4sp    0/1     Evicted             0          95s
10mb-deployment-c-b5b56fd58-2thf2    1/1     Running             0          6m12s
10mb-deployment-c-b5b56fd58-2zd78    0/1     Evicted             0          6m12s
10mb-deployment-c-b5b56fd58-66fjj    0/1     Evicted             0          3m
10mb-deployment-c-b5b56fd58-8wzgv    0/1     ContainerCreating   0          49s
10mb-deployment-c-b5b56fd58-fff2g    0/1     Evicted             0          97s
10mb-deployment-c-b5b56fd58-m95mn    1/1     Running             0          6m12s
10mb-deployment-c-b5b56fd58-rbkbx    0/1     Evicted             0          103s
10mb-deployment-c-b5b56fd58-rmn4t    1/1     Running             0          6m12s
10mb-deployment-c-b5b56fd58-xz8nw    1/1     Running             0          53s

At the top of the ranking we can see Pods from the a, b and c deployments.

10mb-deployment-c-b5b56fd58-fff2g_default,
10mb-deployment-b-6776ffb68c-4z5dz_default,
10mb-deployment-a-86589488d-wm4zr_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
metrics-server-6486d4db88-rdqmw_kube-system,
kube-proxy-tgcqv_kube-system,
07:05:55 pod 10mb-deployment-c-b5b56fd58-fff2g_default is evicted successfully

07:05:57 pods ranked for eviction: 10mb-deployment-c-b5b56fd58-2r4sp_default,
10mb-deployment-b-6776ffb68c-4z5dz_default,
10mb-deployment-a-86589488d-wm4zr_default,
10mb-deployment-c-b5b56fd58-xz8nw_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
metrics-server-6486d4db88-rdqmw_kube-system,
kube-proxy-tgcqv_kube-system,
07:05:59 pod 10mb-deployment-c-b5b56fd58-2r4sp_default is evicted successfully

07:06:02 pods ranked for eviction: 10mb-deployment-b-6776ffb68c-4z5dz_default,
10mb-deployment-a-86589488d-wm4zr_default,
10mb-deployment-c-b5b56fd58-xz8nw_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
metrics-server-6486d4db88-rdqmw_kube-system,
kube-proxy-tgcqv_kube-system,
07:06:03 pod 10mb-deployment-b-6776ffb68c-4z5dz_default is evicted successfully

07:06:05 pods ranked for eviction: 10mb-deployment-a-86589488d-wm4zr_default,
10mb-deployment-c-b5b56fd58-xz8nw_default,
kubernetes-dashboard-5bff5f8fb8-rq727_kube-system,
10mb-deployment-a-86589488d-9sxbd_default,
10mb-deployment-a-86589488d-gdbx9_default,
10mb-deployment-b-6776ffb68c-qrw5z_default,
10mb-deployment-c-b5b56fd58-m95mn_default,
10mb-deployment-b-6776ffb68c-2zz84_default,
10mb-deployment-b-6776ffb68c-rxknz_default,
10mb-deployment-a-86589488d-r8z56_default,
10mb-deployment-b-6776ffb68c-q4mmk_default,
10mb-deployment-c-b5b56fd58-rmn4t_default,
10mb-deployment-a-86589488d-8kxsk_default,
10mb-deployment-c-b5b56fd58-2thf2_default,
metrics-server-6486d4db88-rdqmw_kube-system,
kube-proxy-tgcqv_kube-system,
07:06:05 pod 10mb-deployment-a-86589488d-wm4zr_default is evicted successfully

When we started, the node easily handled 4 replicas per deployment.

Now, due to increasing system RAM overhead, it is unable to do so.

Reduce replicas to 4.

RAM is still below the needed minimum of 480 MB.

More Pods get evicted.

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               2                     17m
10mbpods-pdb-b   1               N/A               3                     17m
10mbpods-pdb-c   1               N/A               2                     17m

A minute later — more evictions.

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               1                     18m
10mbpods-pdb-b   1               N/A               3                     18m
10mbpods-pdb-c   1               N/A               2                     18m

This cycle of evict, recreate, evict again, recreate again uses even more RAM.

Scale replicas to 3.

This still does not fix the RAM problem.

MemoryPressure   True    Tue, 05 Feb 2019 09:12:16 +0200   Tue, 05 Feb 2019 09:12:16 +0200   KubeletHasInsufficientMemory   kubelet has insufficient memory available

kubectl get poddisruptionbudgets

NAME             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
10mbpods-pdb-a   1               N/A               1                     20m
10mbpods-pdb-b   1               N/A               0                     20m
10mbpods-pdb-c   1               N/A               1                     20m

Scale replicas to 3.

Still below threshold of 480.

memory.available_in_mb 457

Delete all 3 deployments completely.

MemoryPressure?

No. 5 MB above threshold.

Ever-growing system RAM overhead now makes it impossible to run these 3 deployments with 4 replicas each.

The kubelet eviction manager logs for the last 8 minutes show the same pattern.

There is no one large Pod whose eviction would fix the problem forever.

No large standalone Pod completes to give RAM back to the rest of the node.

Do not try to run Kubernetes nodes permanently at the RAM eviction threshold.

Syntax: Specifying a PodDisruptionBudget

You can find comprehensive detail on the allowed syntax at:

https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget

Pod Disruption Budget Documentation

This tutorial focused on practical exercises. The official Kubernetes documentation has more details on the concepts discussed here.

Run Your Own Experiments

As a follow-up, consider this experiment.

Define 4 Pod Disruption Budgets for 4 different deployments. The Pods should have the same RAM usage as in the previous example.

The Pod Disruption Budgets should only allow 1 or 2 maxUnavailable.

This way you can see Kubernetes act on Pod Disruption Budgets for 4 deployments.
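
A sketch of such a budget using maxUnavailable instead of minAvailable; the name and label below are placeholders for your own experiment:

```yaml
# Hypothetical PDB for the follow-up experiment: cap disruptions instead of
# guaranteeing a minimum number of available Pods.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: 10mbpods-pdb-d          # placeholder name
spec:
  maxUnavailable: 2             # at most 2 Pods of this app may be disrupted
  selector:
    matchLabels:
      app: 10mbpods-d           # must match the deployment's Pod template labels
```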
