Kubernetes : Configure Liveness and Readiness Probes

By Alwyn Botha, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

This tutorial teaches you about two independent types of probes to help ensure your Pods run smoothly:

  • Liveness Probes: checks your containers are alive
  • Readiness Probes: checks your containers are able to do productive work

Kubernetes assumes responsibility that your containers in your Pods are alive. If not, it restarts the containers that fail liveness probes. Kubernetes do not assume responsibility for your Pods to be ready. Readiness may be a complicated set of interrelated networked components that enables a Pod to be ready.

Restarting a container with a failing readiness probe will not fix it, so readiness failures receive no automatic reaction from Kubernetes. A Pod may have several containers running inside it. All those containers may have different liveness and readiness probes ( since different software runs inside each ).

This tutorial demonstrates Pods with just one simple container. This way we can focus only on liveness and readiness probes.

This tutorial will cover the following topics:

  • httpGet livenessProbe: restartPolicy: Always
  • httpGet livenessProbe: restartPolicy: Never
  • httpGet livenessProbe: failureThreshold = 1
  • tcpSocket livenessProbe
  • tcpSocket readinessProbe
  • Readiness and liveness commands

1) httpGet livenessProbe: restartPolicy: Always

Create the following YAML file with your favorite Linux editor.

Explanation of Pod spec above:

  • we use a httpd:2.4 image
  • image only gets downloaded from Docker hub once imagePullPolicy: IfNotPresent
  • command gives container work otherwise it just exits immediately upon startup
  • ports this container port containerPort: 80 will be accessible via hostPort: 8080
  • httpGet livenessProbe: access port 80 at path /
  • initialDelaySeconds: 2 waits 2 seconds after container got created before probing starts
  • periodSeconds: 10 liveness probe probes every 10 seconds.

A httpGet livenessProbe uses http get command to probe if a container is alive.

Let’s create the Pod to see how this works.

Create the Pod.

Truncated list of describe output … only relevant fields shown.

This Pod looks identical to any other successfully running Pod — zero difference even in events list. Liveness probes waiting those defined seconds before probing starts.

Still looks like any other Pod for first 30 seconds.

Let’s investigate what is happening in detail.

We had 3 liveness probe failures so far. Overall Pod status stays READY and RUNNING. ( This is a confusing fact: the container is not alive, but it is in status: ready )

Wait around 15 seconds and redo describe

  • 62 seconds ago Pod got scheduled
  • 34 seconds ago liveness probe failed 3 times
  • 4 seconds ago: new container created

Apache is not running in the container. This causes liveness probe to fail. There is no working port 80 to connect to : dial tcp 172.17.0.6:80: connect: connection refused

Let’s fix that. Enter the Pod and start apache:

AH00558 warning is easy to fix, but irrelevant to liveness probes, so feel free to ignore it.

I entered httpd twice — second time it shows it is running already ( exactly what I wanted to see ).

Our Pod is running, it restarted once.

We fixed the problem.

Unhealthy 32s (x5 over 102s) will not shown any more failures.

8 seconds later

Warning Unhealthy 40s (x5 over 110s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

20 seconds later

Warning Unhealthy 57s (x5 over 2m7s) kubelet, minikube Liveness probe failed: Get http://172.17.0.6:80/: dial tcp 172.17.0.6:80: connect: connection refused

Restart count does not increase anymore. Liveness probes succeed.

Unfortunately the events do not SHOW a log entry about this success. You have to deduce it and assume it now works. There is no field that displays the liveness status.

Delete Pod.

This demo worked since the default restartPolicy: Always is in effect.

2) httpGet livenessProbe: restartPolicy: Never

Let’s see what happens with a restartPolicy: Never.

I made periodSeconds 2 seconds. Now we will quickly see what happens.

Create the Pod.

Investigate just the tail ( events ) part of kubectl describe pod/myliveness-pod

restartPolicy: Never works. No restarts done.

The default failureThreshold is 3 times. After 3 failures a container is classified as failed.

Only 3 failed probes done.

Here we see the Pod status turns to error.

Investigate overall Pod status (below) :

  • Pod Status: Failed
  • Container State: Terminated; Reason: Error; Exit Code: 137
  • Ready False; ContainersReady False
  • Restart Count: 0 … because restartPolicy: Never

Pod status is failed: 3 liveness probe failures and restartPolicy: Never prevents Kubernetes from restarting it in an effort to fix it.

Delete Pod.

3) httpGet livenessProbe: failureThreshold = 1

By default failureThreshold equals 3. 3 tries before container declared a failure.

Let’s set failureThreshold equal to 1 and experiment. ( note last line in spec )

Create the Pod.

After around a minute:

Determine number of restarts:

2 restarts after 2 liveness probe failures.

Another 30 seconds later.

Determine number of restarts:

3 restarts after 3 liveness probe failures.

You have to determine the suitable failureThreshold for your production environment.

Different containers in the same Pod may have / need different suitable failureThreshold values.

The default timeoutSeconds is one seconds.

Similary, you have to determine the suitable timeoutSeconds for your production environment — for each container with different software.

Delete Pod.

4) tcpSocket livenessProbe

Till now we used httpGet livenessProbes

For software that does not support http gets, you can use tcp Socket liveness probes.

Create using your editor:

Only difference from before is tcpSocket instead of httpGet

Create the Pod.

Based on what you learned so far you can do this exercise on your own.

Container liveness probes will fail.

The following will fix it, just as before.

Delete Pod.

Based on the software running in each of your production containers you have to determine which liveness probe to use:

  • tcpSocket
  • httpGet

5) tcpSocket readinessProbe

We did liveness probes thus far.

Readinessprobes are independent of liveness probes.

Readinessprobes probe to ensure your containers are ready to do productive work.

You have to determine exactly what to test to ensure a readiness probe tests readiness.

readinessProbe and livenessProbe syntax are identical.

You can have both these probes defined for a Pod.

Our Pod spec below demonstrates one readiness probe.

Note we use short delay seconds at bottom of spec: to see what happens quickly.

Create the Pod.

The Pod is running on the node, but it is not ready. Kubernetes noticed the readiness probe that needs to succeed. Then it will convert the ready state to true.

Truncated list of describe output … only relevant EVENT fields shown.

Last line: readiness probe failed 3 times.

6 seconds later …

6 more failures.

Another 10 seconds later …

Another 5 failures. Note no mention of restarts. Kubernetes does not restart failed readiness probes.

This is the MAJOR difference between readiness and liveness probes.

Detailed Pod status:

  • Pod: Ready False
  • Containers: ContainersReady False

Fix the Pod, start Apache.

Check Pod status again … now it is ready

A minute later. x22 failed readiness probe count does not increase anymore.

Delete Pod.

This final exercise demonstrated THE readiness versus liveness difference:

  • Liveness failures — Kubernetes restarts the failed container
  • Readiness failures — Kubernetes does nothing when a container fails. ContainersReady set to False

Excellent official reference documentation about liveness and readyness probe settings:

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#configure-probes

Conclusion: Readiness and Liveness Commands

We only did tcp socket and http get probes in this tutorial. The last way to do probes is via commands.

Official Kubernetes demo using commands

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-liveness-command

As a final exercise I suggest you follow those instructions.

You will note that the kubectl describe pod/myliveness-pod output they show is using a previous format.

This concludes this tutorial.

Carefully read the text of that last link and determine appropriate settings for each container in each Pod in your production environment.

Reference:https://www.alibabacloud.com/blog/kubernetes-configure-liveness-and-readiness-probes_594833?spm=a2c41.12911419.0.0

Written by

Follow me to keep abreast with the latest technology news, industry insights, and developer trends.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store