Kubernetes: Configure Liveness and Readiness Probes

Alibaba Cloud
May 29, 2019

By Alwyn Botha, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

This tutorial teaches you about two independent types of probes that help ensure your Pods run smoothly:

  • Liveness probes: check whether your containers are alive
  • Readiness probes: check whether your containers are able to do productive work

Kubernetes takes responsibility for keeping the containers in your Pods alive: it restarts any container that fails its liveness probe. Kubernetes does not take responsibility for making your Pods ready. Readiness may depend on a complicated set of interrelated networked components.

Restarting a container with a failing readiness probe will not fix it, so Kubernetes never restarts a container because of a readiness failure. It only marks the container as not ready, and a Pod that is not ready is removed from the endpoints of any Service that selects it. A Pod may have several containers running inside it, and each container may have its own liveness and readiness probes (since different software runs inside each).

This tutorial demonstrates Pods with just one simple container. This way we can focus only on liveness and readiness probes.

This tutorial will cover the following topics:

  • httpGet livenessProbe: restartPolicy: Always
  • httpGet livenessProbe: restartPolicy: Never
  • httpGet livenessProbe: failureThreshold = 1
  • tcpSocket livenessProbe
  • tcpSocket readinessProbe
  • Readiness and liveness commands

1) httpGet livenessProbe: restartPolicy: Always

Create the following YAML file with your favorite Linux editor.

nano myLiveness-Pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 10

Explanation of Pod spec above:

  • we use the httpd:2.4 image
  • imagePullPolicy: IfNotPresent means the image is only downloaded from Docker Hub if it is not already present on the node
  • command keeps the container running with sleep 3600. It also replaces the image's default command, so Apache itself is not started
  • ports: container port containerPort: 80 will be accessible on the node via hostPort: 8080
  • httpGet livenessProbe: requests path / on port 80
  • initialDelaySeconds: 2 waits 2 seconds after the container has started before probing begins
  • periodSeconds: 10 runs the liveness probe every 10 seconds

An httpGet livenessProbe sends an HTTP GET request to the container to check whether it is alive. Any response code from 200 to 399 counts as success; any other code, or no response at all, counts as failure.

Let’s create the Pod to see how this works.

Create the Pod.

kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created

Truncated list of describe output … only relevant fields shown.

kubectl describe pod/myliveness-pod
Name:               myliveness-pod
Status:             Running
Containers:
  myliveness-container:
    Image:          httpd:2.4
    Port:           80/TCP
    Host Port:      8080/TCP
    State:          Running
      Started:      Wed, 16 Jan 2019 07:37:02 +0200
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:80/ delay=2s timeout=1s period=10s #success=1 #failure=3
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 3s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 3s kubelet, minikube Created container
Normal Started 3s kubelet, minikube Started container

This Pod looks identical to any other successfully running Pod, with zero difference even in the events list. The liveness probe waits the defined initialDelaySeconds before probing starts.

It still looks like any other Pod for the first 30 seconds.

kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 12s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 22s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 33s

Let’s investigate what is happening in detail.

kubectl describe pod/myliveness-pod
Name:               myliveness-pod
Start Time:         Wed, 16 Jan 2019 07:37:01 +0200
Status:             Running
Containers:
  myliveness-container:
    State:          Running
      Started:      Wed, 16 Jan 2019 07:37:02 +0200
    Ready:          True
    Restart Count:  0
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 37s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 36s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 36s kubelet, minikube Created container
Normal Started 36s kubelet, minikube Started container
Warning Unhealthy 9s (x3 over 29s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

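We have had 3 liveness probe failures so far. The overall Pod status stays READY and RUNNING. (This looks confusing: the container is failing its liveness probe, yet it is reported as ready. The reason is that readiness is tracked separately, and a container without a readinessProbe counts as ready as soon as it is running.)

Wait around 15 seconds and run describe again.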

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 62s default-scheduler Successfully assigned default/myliveness-pod to minikube
Warning Unhealthy 34s (x3 over 54s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused
Normal Pulled 4s (x2 over 61s) kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 4s (x2 over 61s) kubelet, minikube Created container
Normal Started 4s (x2 over 61s) kubelet, minikube Started container
Normal Killing 4s kubelet, minikube Killing container with id docker://myliveness-container:Container failed liveness probe.. Container will be killed and recreated.

  • 62 seconds ago: the Pod got scheduled
  • 34 seconds ago: the liveness probe failed for the third time
  • 4 seconds ago: the failed container was killed and a new container was created

Apache is not running in the container, because our command replaced the image's default startup command. This causes the liveness probe to fail. Nothing is listening on port 80, hence the error: dial tcp 172.17.0.6:80: connect: connection refused

Let’s fix that. Enter the Pod and start Apache:

kubectl exec myliveness-pod -i -t -- /bin/sh
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
httpd (pid 15) already running
# exit

The AH00558 warning is easy to fix, but irrelevant to liveness probes, so feel free to ignore it.

I entered httpd twice. The second time it shows that it is already running (exactly what I wanted to see).

kubectl get po
NAME             READY   STATUS    RESTARTS   AGE
myliveness-pod 1/1 Running 1 104s

Our Pod is running; it restarted once.

We fixed the problem.

The event Unhealthy 32s (x5 over 102s) will not record any more failures: from now on only its age increases, while the x5 count stays the same.

8 seconds later

Warning Unhealthy 40s (x5 over 110s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

20 seconds later

Warning Unhealthy 57s (x5 over 2m7s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

Restart count does not increase anymore. Liveness probes succeed.

Unfortunately the events do not SHOW a log entry about this success. You have to deduce it and assume it now works. There is no field that displays the liveness status.

kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m19s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m27s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m34s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m45s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 1 2m59s

Delete Pod.

kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted

This demo worked since the default restartPolicy: Always is in effect.
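
For contrast, here is a minimal sketch of the same Pod with the command override removed. It is an illustrative variant, not part of the demo above: the assumption is that without command the httpd:2.4 image runs its default startup command, so Apache listens on port 80 and the httpGet liveness probe should pass from the start. The probe also refers to the container port by its name, liveness-port, instead of its number, which Kubernetes allows.

```yaml
# Hypothetical variant of myLiveness-Pod.yaml: no "command" override,
# so the image's default command starts Apache in the foreground.
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    ports:
    - name: liveness-port
      containerPort: 80
    livenessProbe:
      httpGet:
        path: /
        port: liveness-port   # named port defined under "ports" above
      initialDelaySeconds: 2
      periodSeconds: 10
```

With a spec like this, kubectl get po should keep showing 0 restarts, because every probe gets a successful HTTP response.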

2) httpGet livenessProbe: restartPolicy: Never

Let’s see what happens with a restartPolicy: Never.

nano myLiveness-Pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 2

  restartPolicy: Never

I made periodSeconds 2 seconds. Now we will quickly see what happens.

Create the Pod.

kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created

Investigate just the tail ( events ) part of kubectl describe pod/myliveness-pod

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 8s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 8s kubelet, minikube Created container
Normal Started 8s kubelet, minikube Started container
Warning Unhealthy 2s (x3 over 6s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

6 seconds later:

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 14s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 14s kubelet, minikube Created container
Normal Started 14s kubelet, minikube Started container
Warning Unhealthy 8s (x3 over 12s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

10 seconds later:

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 25s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 25s kubelet, minikube Created container
Normal Started 25s kubelet, minikube Started container
Warning Unhealthy 19s (x3 over 23s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

Another 10 seconds later:

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 34s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 34s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 34s kubelet, minikube Created container
Normal Started 34s kubelet, minikube Started container
Warning Unhealthy 28s (x3 over 32s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

restartPolicy: Never works. No restarts are done.

The default failureThreshold is 3. After 3 consecutive failures the container is classified as failed.

Only 3 failed probes are recorded: the count stays at x3 in every event listing above.

Here we see the Pod status turn to Error.

kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 1/1 Running 0 37s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 0/1 Error 0 55s
kubectl get po
NAME READY STATUS RESTARTS AGE
myliveness-pod 0/1 Error 0 67s

Investigate the overall Pod status (below):

  • Pod Status: Failed
  • Container State: Terminated; Reason: Error; Exit Code: 137 (128 + 9, meaning the process was killed with SIGKILL)
  • Ready False; ContainersReady False
  • Restart Count: 0 … because restartPolicy: Never

The Pod status is Failed: after 3 liveness probe failures the container was killed, and restartPolicy: Never prevents Kubernetes from restarting it in an effort to fix it.

kubectl describe pod/myliveness-pod
Name:               myliveness-pod
Start Time:         Wed, 16 Jan 2019 07:45:47 +0200
Status:             Failed
Containers:
  myliveness-container:
    State:          Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Wed, 16 Jan 2019 07:45:47 +0200
      Finished:     Wed, 16 Jan 2019 07:46:23 +0200
    Ready:          False
    Restart Count:  0
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True

Delete Pod.

kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted

3) httpGet livenessProbe: failureThreshold = 1

By default failureThreshold equals 3: a probe must fail 3 times in a row before the container is declared failed.

Let’s set failureThreshold equal to 1 and experiment. (Note the last line in the spec.)

nano myLiveness-Pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2
      periodSeconds: 2
      failureThreshold: 1

Create the Pod.

kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created

kubectl get po
NAME             READY   STATUS    RESTARTS   AGE
myliveness-pod 1/1 Running 0 6s

After around a minute:

kubectl describe pod/myliveness-pod | tail
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 64s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 31s (x2 over 64s) kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 31s (x2 over 64s) kubelet, minikube Created container
Normal Started 31s (x2 over 64s) kubelet, minikube Started container
Normal Killing 31s kubelet, minikube Killing container with id docker://myliveness-container:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 28s (x2 over 62s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

Determine number of restarts:

kubectl get po
NAME             READY   STATUS    RESTARTS   AGE
myliveness-pod 1/1 Running 2 68s

2 restarts after 2 liveness probe failures.

Another 30 seconds later.

kubectl describe pod/myliveness-pod | tail
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 86s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 20s (x3 over 86s) kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 20s (x3 over 86s) kubelet, minikube Created container
Normal Started 20s (x3 over 86s) kubelet, minikube Started container
Normal Killing 20s (x2 over 53s) kubelet, minikube Killing container with id docker://myliveness-container:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 18s (x3 over 84s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

Determine number of restarts:

kubectl get po
NAME             READY   STATUS    RESTARTS   AGE
myliveness-pod 1/1 Running 3 108s

3 restarts after 3 liveness probe failures.

You have to determine the suitable failureThreshold for your production environment.

Different containers in the same Pod may need different failureThreshold values.

The default timeoutSeconds is one second.

Similarly, you have to determine a suitable timeoutSeconds for your production environment, for each container running different software.
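
To make these tuning fields concrete, the sketch below sets every timing field of the liveness probe explicitly. The values chosen for timeoutSeconds and failureThreshold are illustrative assumptions, not recommendations. successThreshold: 1 matches the #success=1 default visible in the describe output earlier.

```yaml
# Illustrative only: the timing values below are assumptions to be tuned per workload.
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 2   # wait after container start before the first probe
      periodSeconds: 10        # interval between probes
      timeoutSeconds: 3        # default is 1; a probe slower than this counts as failed
      successThreshold: 1      # default is 1
      failureThreshold: 5      # default is 3; consecutive failures before a restart
```

With periodSeconds: 10 and failureThreshold: 5, the container has to keep failing for roughly 50 seconds (failureThreshold × periodSeconds) before Kubernetes restarts it.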

Delete Pod.

kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted

4) tcpSocket livenessProbe

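Until now we used httpGet liveness probes.

For software that does not support HTTP GET requests, you can use tcpSocket liveness probes. The kubelet tries to open a TCP connection to the given port: if the connection is established the probe succeeds, otherwise it fails.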

Create using your editor:

nano myLiveness-Pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 10

Apart from a slightly longer initialDelaySeconds, the only difference from the first example is tcpSocket instead of httpGet.

Create the Pod.

kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created

Based on what you learned so far you can do this exercise on your own.

Container liveness probes will fail.

The following will fix it, just as before.

kubectl exec myliveness-pod -i -t -- /bin/sh
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
# exit

Delete Pod.

kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted

Based on the software running in each of your production containers you have to determine which liveness probe to use:

  • tcpSocket
  • httpGet

5) tcpSocket readinessProbe

We have only used liveness probes thus far.

Readiness probes are independent of liveness probes.

Readiness probes check that your containers are ready to do productive work.

You have to determine exactly what to test so that a readiness probe really tests readiness.

The readinessProbe and livenessProbe syntax is identical.

You can define both of these probes for the same container.

Our Pod spec below demonstrates one readiness probe.

nano myLiveness-Pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    command: ['sh', '-c', 'echo Container 1 is Running ; sleep 3600']
    ports:
    - name: liveness-port
      containerPort: 80
      hostPort: 8080
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 2

Note that we use short initialDelaySeconds and periodSeconds values at the bottom of the spec, so that we can see quickly what happens.

Create the Pod.

kubectl create -f myLiveness-Pod.yaml
pod/myliveness-pod created

The Pod is running on the node, but it is not ready. Kubernetes noticed that the container defines a readiness probe, so it waits for that probe to succeed before it sets the ready state to true.

kubectl get po
NAME             READY   STATUS    RESTARTS   AGE
myliveness-pod 0/1 Running 0 3s

Truncated list of describe output … only relevant EVENT fields shown.

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 8s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 8s kubelet, minikube Created container
Normal Started 8s kubelet, minikube Started container
Warning Unhealthy 1s (x3 over 5s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused

Last line: readiness probe failed 3 times.

6 seconds later …

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 21s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 20s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 20s kubelet, minikube Created container
Normal Started 20s kubelet, minikube Started container
Warning Unhealthy 1s (x9 over 17s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused

6 more failures.

Another 10 seconds later …

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 31s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 30s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 30s kubelet, minikube Created container
Normal Started 30s kubelet, minikube Started container
Warning Unhealthy 1s (x14 over 27s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused

Another 5 failures. Note that there is no mention of restarts. Kubernetes does not restart containers that fail readiness probes.

This is the MAJOR difference between readiness and liveness probes.

kubectl get po
NAME             READY   STATUS    RESTARTS   AGE
myliveness-pod 0/1 Running 0 36s

Detailed Pod status:

  • Pod: Ready False
  • Containers: ContainersReady False

kubectl describe pod/myliveness-pod
Name:               myliveness-pod
Status:             Running
Containers:
  myliveness-container:
    State:          Running
      Started:      Wed, 16 Jan 2019 08:42:09 +0200
    Ready:          False
    Restart Count:  0
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 73s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 72s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 72s kubelet, minikube Created container
Normal Started 72s kubelet, minikube Started container
Warning Unhealthy 27s (x22 over 69s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused

Fix the Pod, start Apache.

kubectl exec myliveness-pod -i -t -- /bin/sh
# httpd
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
# exit

Check Pod status again … now it is ready

kubectl describe pod/myliveness-pod
Name:               myliveness-pod
Status:             Running
Containers:
  myliveness-container:
    State:          Running
      Started:      Wed, 16 Jan 2019 08:42:09 +0200
    Ready:          True
    Restart Count:  0
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 99s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 98s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 98s kubelet, minikube Created container
Normal Started 98s kubelet, minikube Started container
Warning Unhealthy 53s (x22 over 95s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused

A minute later. x22 failed readiness probe count does not increase anymore.

Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m19s default-scheduler Successfully assigned default/myliveness-pod to minikube
Normal Pulled 2m18s kubelet, minikube Container image "httpd:2.4" already present on machine
Normal Created 2m18s kubelet, minikube Created container
Normal Started 2m18s kubelet, minikube Started container
Warning Unhealthy 93s (x22 over 2m15s) kubelet, minikube Readiness probe failed: dial tcp 172.17.0.6:80: connect: connection refused

Delete Pod.

kubectl delete -f myLiveness-Pod.yaml --force --grace-period=0
pod "myliveness-pod" force deleted

This final exercise demonstrated the key difference between readiness and liveness:

  • Liveness failures: Kubernetes restarts the failed container
  • Readiness failures: Kubernetes does not restart the container. It sets Ready and ContainersReady to False, so the Pod receives no traffic through Services until the probe succeeds again
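
Both probe types can be combined on one container. The sketch below is an illustrative assumption rather than one of the specs tested in this tutorial: the command override is left out so that Apache actually runs, the readiness probe uses httpGet, and the liveness probe uses tcpSocket with a longer period so that a short stall does not immediately lead to a restart.

```yaml
# Hypothetical combined spec: timing values are assumptions, not tested settings.
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    ports:
    - name: liveness-port
      containerPort: 80
    readinessProbe:            # failure: container marked not ready, no restart
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 2
    livenessProbe:             # failure: container is restarted (restartPolicy permitting)
      tcpSocket:
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 10
      failureThreshold: 3
```

Each probe keeps its own failure counter, so a container can be alive but not ready, as seen in the readiness exercise above.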

Excellent official reference documentation about liveness and readiness probe settings:

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#configure-probes

Conclusion: Readiness and Liveness Commands

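We only did tcpSocket and httpGet probes in this tutorial. The third way to do probes is via commands (exec probes): Kubernetes runs a command inside the container, and the probe succeeds if the command exits with status 0.

Official Kubernetes demo using commands: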

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-liveness-command

As a final exercise I suggest you follow those instructions.

You will note that the kubectl describe pod output they show uses an older format.
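
A minimal sketch of such a command probe, adapted to the Pod used in this tutorial, could look as follows. The file name /tmp/healthy and the probe timings are assumptions chosen for illustration, and the official demo linked above uses a different image. The probe runs cat /tmp/healthy inside the container and succeeds as long as that command exits with status 0.

```yaml
# Hypothetical exec-probe variant of myLiveness-Pod.yaml.
apiVersion: v1
kind: Pod
metadata:
  name: myliveness-pod
spec:
  containers:
  - image: httpd:2.4
    imagePullPolicy: IfNotPresent
    name: myliveness-container
    # Create the marker file (assumed path), then keep the container running.
    command: ['sh', '-c', 'touch /tmp/healthy ; sleep 3600']
    livenessProbe:
      exec:
        # Exit status 0 means alive; any other exit status is a probe failure.
        command: ['cat', '/tmp/healthy']
      initialDelaySeconds: 2
      periodSeconds: 5
```

Deleting the file, for example with kubectl exec myliveness-pod -- rm /tmp/healthy, should make the probe fail and lead to a restart once failureThreshold is reached. The restarted container recreates the file, so the probe passes again.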

This concludes this tutorial.

Carefully read the text of that last link and determine appropriate settings for each container in each Pod in your production environment.

Reference: https://www.alibabacloud.com/blog/kubernetes-configure-liveness-and-readiness-probes_594833?spm=a2c41.12911419.0.0
