This section describes the background of application orchestration and management in Kubernetes. Suppose you manage all the pods in a cluster directly, and the pods of applications A, B, and C are scattered across the cluster, as shown in the following figure.
Managing pods this way raises the following questions:
- How do you guarantee the number of available pods in the cluster? For example, how do you keep four pods of application A available when a host fails or a network fault affects some of them?
- How do you update the image version for all pods? Do you have to update the pods one by one?
- How do you ensure service availability during the update?
- How do you roll back the pods to the previous version if a problem occurs during the update?
Deployment Management Controller for Deployment and Release
This section introduces the key topic of this article: Deployments.
As shown in the preceding figure, applications A, B, and C are assigned to different Deployments. Each Deployment manages the pods of one application, and these pods are identical replicas of each other. So what does a Deployment do?
1) A Deployment defines the expected number of pods. For example, if you want four pods for application A, the controller maintains four pods. When a network fault or host fault affects some of the pods, the controller creates new pods so that the number of pods again matches the expected number.
2) A Deployment configures how pods are released. The controller updates pods according to the specified policy, and you can bound the number of pods that are unavailable during the update.
3) If a fault occurs during the update, you can perform a one-click rollback that changes all pods of the Deployment back to an earlier version by running one command or modifying one line.
2) Use Cases
This section uses a simple case to explain how to operate a Deployment.
The preceding figure shows a simple YAML file of a Deployment.
apiVersion: apps/v1 indicates that the Deployment belongs to the apps group and that its version is v1. "metadata" holds the Deployment's own metadata: like any Kubernetes resource, a Deployment has metadata of its own. The Deployment name defined here is nginx-deployment. The rest of the file uses the labels, selectors, and pod images introduced in the previous articles.
Deployment.spec must contain the core field "replicas", which here sets the expected number of pods to three. "selector" is the pod selector. The labels of every pod created from the template must match the labels of the selector, that is, app: nginx.
The pod template contains two parts:
- One is the metadata of the expected pods, including labels that match the selector.
- The other is Pod.spec, which the Deployment uses to create pods. Here, a container named nginx is defined, with the image nginx:1.7.9.
Familiarize yourself with the following new concepts:
- replicas: The expected number of pods, that is, the final number of pods in the Deployment.
- template: The template from which the pods are created.
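To make the structure of the manifest concrete, the following is a minimal sketch that mirrors the fields described above as a Python dictionary. The field values come from the text; the helper `selector_matches_template` is a hypothetical name introduced only for this illustration, and it checks the rule that the template labels must match the selector.

```python
import json

# Python mirror of the Deployment described in the text.
# Only fields mentioned in the article are included.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "nginx-deployment"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "nginx"}},
        "template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {
                "containers": [
                    {"name": "nginx", "image": "nginx:1.7.9"}
                ]
            },
        },
    },
}


def selector_matches_template(dep: dict) -> bool:
    """Return True if every selector label is present on the pod template."""
    wanted = dep["spec"]["selector"]["matchLabels"]
    actual = dep["spec"]["template"]["metadata"]["labels"]
    return all(actual.get(key) == value for key, value in wanted.items())


if __name__ == "__main__":
    print(json.dumps(deployment, indent=2))
    print("selector matches template:", selector_matches_template(deployment))
```

The dictionary is only a reading aid for the three parts of the spec (replicas, selector, template); it is not a replacement for the YAML file shown in the figure.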
View the Deployment Status
After creating the Deployment, run the kubectl get deployment command to view its overall status, as shown in the following figure.
The preceding figure shows the following information:
- DESIRED: The number of expected pods, which is 3.
- CURRENT: The number of the current pods, which is 3.
- UP-TO-DATE: The number of pods with the latest expected version.
- AVAILABLE: The number of pods that are available. A pod counts as available not merely when it is ready, but only after it has been ready for at least the configured period (see MinReadySeconds later in this article).
- AGE: The time elapsed since the Deployment was created. In the preceding figure, the Deployment was created 80 minutes ago.
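The columns above can be read as simple counts over the pods of a Deployment. The following is a simplified sketch of that reading, not the controller's actual code: the `Pod` class, its fields, and `status_columns` are hypothetical names, and the real controller derives these numbers from ReplicaSets rather than from a flat pod list.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    image: str            # image the pod was created with
    ready_seconds: float  # how long the pod has been ready; 0 means not ready


def status_columns(desired: int, target_image: str, pods: list,
                   min_ready_seconds: float = 0.0) -> dict:
    """Compute the DESIRED, CURRENT, UP-TO-DATE, and AVAILABLE counts."""
    up_to_date = sum(1 for p in pods if p.image == target_image)
    available = sum(
        1 for p in pods
        if p.ready_seconds > 0 and p.ready_seconds >= min_ready_seconds
    )
    return {
        "DESIRED": desired,
        "CURRENT": len(pods),
        "UP-TO-DATE": up_to_date,
        "AVAILABLE": available,
    }


if __name__ == "__main__":
    pods = [Pod("nginx:1.7.9", 120.0) for _ in range(3)]
    print(status_columns(3, "nginx:1.7.9", pods))
```

With three pods that have all been ready for a while, every column shows 3, which matches the figure described in the text.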
View the pods, as shown in the following figure.
The preceding figure shows three pods.
A pod name consists of three parts. The first part, nginx-deployment, is the name of the Deployment to which the pod belongs. The second part, the template-hash, is the same for the three pods because they were created from the same template. The last part is a random string.
Inspecting a pod with the kubectl get pod command shows that its ownerReferences, that is, the controller resource that owns the pod, is a ReplicaSet rather than the Deployment. The ReplicaSet name consists of the Deployment name followed by the pod template-hash, which will be described later. All pods are created by a ReplicaSet, and each ReplicaSet corresponds to a specific version of the Deployment template.
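The naming scheme can be illustrated by splitting a pod name from the right. This is a small sketch under the assumption that the name has the three-part form described above; the function name `split_pod_name` and the sample hash and suffix in the example are made up for illustration.

```python
def split_pod_name(pod_name: str) -> dict:
    """Split '<deployment>-<template-hash>-<random>' into its parts.

    The split is done from the right because the Deployment name itself
    may contain hyphens (as nginx-deployment does).
    """
    deployment, template_hash, suffix = pod_name.rsplit("-", 2)
    return {
        "deployment": deployment,
        "replicaset": f"{deployment}-{template_hash}",
        "template_hash": template_hash,
        "suffix": suffix,
    }


if __name__ == "__main__":
    # Hypothetical pod name; the hash and suffix are invented for the example.
    print(split_pod_name("nginx-deployment-5c689d88bb-abcde"))
```

Pods that share the same `replicaset` value in this sketch were created from the same version of the template.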
This section describes how to update the image version of all pods in a given Deployment. Run the kubectl set image deployment.v1.apps/nginx-deployment nginx=nginx:1.9.1 command.
- The first part, kubectl set image, is fixed and indicates that the command sets an image.
- The second part, deployment.v1.apps, is the type of the resource to operate on: "deployment" is the resource type, "v1" is the resource version, and "apps" is the resource group. You can write either "deployment" or deployment.v1.apps. If you write "deployment", version v1 of the apps group is used by default.
- The third part, nginx-deployment, is the name of the Deployment whose image is to be updated. The "nginx" that follows it is the container.name in the pod template. A pod may contain multiple containers, so the name specifies which container's image to update; here it is the container named nginx.
- The last part, nginx:1.9.1, is the expected image of the container. After the command runs, the image in template.spec of the Deployment is updated to nginx:1.9.1.
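The effect of the command on the Deployment can be sketched as a pure function over a manifest dictionary. This is an illustration of the idea only (find the container by name in the pod template, replace its image); `set_image` is a hypothetical helper, not the kubectl implementation.

```python
import copy


def set_image(deployment: dict, container_name: str, image: str) -> dict:
    """Return a copy of the Deployment with one container's image replaced."""
    updated = copy.deepcopy(deployment)
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] == container_name:
            container["image"] = image
            return updated
    raise KeyError(f"no container named {container_name!r} in the pod template")


# Minimal manifest containing only the fields this sketch touches.
deployment = {
    "metadata": {"name": "nginx-deployment"},
    "spec": {
        "template": {
            "spec": {"containers": [{"name": "nginx", "image": "nginx:1.7.9"}]}
        }
    },
}

new_deployment = set_image(deployment, "nginx", "nginx:1.9.1")

if __name__ == "__main__":
    print(new_deployment["spec"]["template"]["spec"]["containers"])
```

Because the pod template changes, the Deployment treats the result as a new version, which is what triggers the release described in the following sections.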
If a problem occurs during the release, you can roll back quickly. Run the kubectl rollout undo command to roll back the Deployment to the previous version, or run it with the --to-revision option to roll back the Deployment to a specific version.
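The two forms of the rollback command differ only in the optional revision. The following sketch builds the argument list for each form; `rollout_undo_args` is a hypothetical helper, and the sketch only constructs the arguments without running kubectl.

```python
from typing import Optional


def rollout_undo_args(deployment: str, to_revision: Optional[int] = None) -> list:
    """Build the argument list for a rollback of the given Deployment."""
    args = ["kubectl", "rollout", "undo", f"deployment/{deployment}"]
    if to_revision is not None:
        # Roll back to a specific revision instead of the previous one.
        args.append(f"--to-revision={to_revision}")
    return args


if __name__ == "__main__":
    print(" ".join(rollout_undo_args("nginx-deployment")))
    print(" ".join(rollout_undo_args("nginx-deployment", to_revision=2)))
```

The revision number 2 in the usage example is arbitrary; the revisions that actually exist depend on the release history of the Deployment.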
This section introduces DeploymentStatus. Each resource has a spec and a status. DeploymentStatus describes three states, Processing, Complete, and Failed, and the transitions between them, as shown in the following figure. (Kubernetes reports the Processing state through a condition of type Progressing.)
Processing indicates that the Deployment is being scaled or released. When all replicas of a Deployment in the Processing state are available and up to date, the Deployment enters the Complete state. A Deployment in the Complete state returns to the Processing state when it is scaled.
If a problem occurs during processing, for example, if image pulling or a readiness probe check fails, the Deployment enters the Failed state. A Deployment in the Complete state also enters the Failed state if the readiness probe check of one of its pods fails. A Deployment in the Failed state enters the Complete state only when all replicas are available and up to date.
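The transitions described above can be summarized as a function of a few counters. This is a deliberately simplified sketch of the three states as the article describes them, not the controller's real logic; `deployment_state` and the `has_error` flag (standing in for failures such as a failed image pull or readiness probe) are assumptions made for the illustration.

```python
def deployment_state(desired: int, updated: int, available: int,
                     has_error: bool = False) -> str:
    """Classify a Deployment as Complete, Failed, or Processing."""
    # Complete: every expected replica is up to date and available.
    if updated == desired and available == desired:
        return "Complete"
    # Failed: not complete, and something went wrong along the way.
    if has_error:
        return "Failed"
    # Otherwise the Deployment is still scaling or releasing.
    return "Processing"


if __name__ == "__main__":
    print(deployment_state(3, 3, 3))                  # all replicas converged
    print(deployment_state(3, 1, 2))                  # release in progress
    print(deployment_state(3, 3, 2, has_error=True))  # a probe check failed
```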
Deployment Creation and Status
The following demonstration uses a cluster on Alibaba Cloud that contains several available nodes.
First, create a Deployment. Note that DESIRED, CURRENT, UP-TO-DATE, and AVAILABLE all show the expected values.
As shown in the figure, "spec" specifies three replicas, the label defined in both "selector" and "template" is app: nginx, the "image" in "spec" is the expected value nginx:1.7.9, and the replica counts in "status", including updatedReplicas, are all 3.
You can view the status of a pod.
In the pod, ownerReferences points to a ReplicaSet, and the image version in pod.spec.containers is 1.7.9. The pod is in the Running state, and the status of its conditions is "true", indicating that its service is available.
Currently, there is only one ReplicaSet, that of the latest version. Now, try to upgrade the Deployment. Run the kubectl set image command, followed by "deployment", the Deployment name, the container name, and the expected image version.
The version of the image in the template of the Deployment has been upgraded to 1.9.1, as shown in the following figure.
Now, run the kubectl get pod command to view the status.
The three pods have been upgraded to the latest version, and the pod-template-hash in each pod name has changed.
For the ReplicaSet of the earlier version, both the expected replica count in spec and the actual pod count are 0, while the ReplicaSet of the new version has 3 pods, as shown in the following figure.
Assume that another update is performed. Run the kubectl get pod command during the update. Two pods of the earlier version are in the Running state and a third is in the Terminating state. Of the two pods of the new version, one is in the Running state and the other is in the ContainerCreating state.
In this case, the number of pods that are not terminating is four, which is greater than the value of three set in the Deployment. The reason is that the Deployment has the MaxUnavailable and MaxSurge settings, which bound the pod counts during the release. They are described in the architecture design section.
As shown in the preceding figure, the ReplicaSet of the latest version contains three pods, and there are two ReplicaSets of earlier versions. You may wonder whether the number of earlier-version ReplicaSets grows without limit as the Deployment is continuously updated. The Deployment provides a mechanism to prevent this: the RevisionHistoryLimit field in its spec specifies how many ReplicaSets of historical versions are retained. The default value is 10. For this demonstration, change it to 1.
As shown in the preceding figure, there are now two ReplicaSets. That is, in addition to the ReplicaSet of the current version, only one earlier-version ReplicaSet is retained.
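The retention rule can be sketched as keeping only the newest old revisions. The following is a simplified illustration that identifies old ReplicaSets by revision number alone; `prune_old_replicasets` is a hypothetical name, and the sketch ignores any other conditions the real controller may check before deleting a ReplicaSet.

```python
def prune_old_replicasets(old_revisions: list, history_limit: int = 10) -> tuple:
    """Return (kept, deleted) old revisions, keeping the newest ones.

    `old_revisions` lists the revision numbers of earlier-version
    ReplicaSets; the current version is not included.
    """
    ordered = sorted(old_revisions)
    if history_limit <= 0:
        return [], ordered
    return ordered[-history_limit:], ordered[:-history_limit]


if __name__ == "__main__":
    # Two earlier versions exist; with a limit of 1 only the newer one is kept.
    kept, deleted = prune_old_replicasets([1, 2], history_limit=1)
    print("kept:", kept, "deleted:", deleted)
```

With the default limit of 10, the same two earlier-version ReplicaSets would both be retained.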
This section describes how to perform a rollback. After starting the rollback, run the kubectl get replicaset command. The number of pods in the earlier-version ReplicaSet increases from 0 to 2 while the number of pods in the new-version ReplicaSet decreases from 3 to 1, indicating that the rollback has started. After a while, the earlier-version ReplicaSet has 3 pods and the new-version ReplicaSet has 0, indicating that the rollback was successful.
Then run the kubectl get pod command.
The template-hash of the three pods has changed to the hash of the earlier version. However, the three pods are new pods, not the ones created by the earlier release. In other words, the rollback created three new pods of the earlier version rather than retrieving the previous three pods.
4) Architecture Design
The Deployment only manages ReplicaSets of different versions, and each ReplicaSet manages the number of its pod replicas. Each ReplicaSet corresponds to one version of the Deployment template. As mentioned above, a new ReplicaSet is generated each time the template is modified. Pods of the same ReplicaSet have the same version.
The Deployment creates ReplicaSets, and ReplicaSets create pods, as shown in the preceding figure. The ownerReferences of each resource points to the controller resource that created it.
This section describes the controller implementation principle.
All controllers register handlers for events from the Informer and watch these events. The Deployment controller adds the Deployment and ReplicaSet events it receives to a queue. After taking an event from the queue, the controller checks "Paused", which indicates whether new releases are suspended.
If "Paused" is set to "true", the Deployment controller only synchronizes replicas. That is, it syncs the replica count to the corresponding ReplicaSet and updates the Deployment status. This round of processing then ends.
If "Paused" is set to "false", the Deployment controller performs a rollout, that is, an update in Recreate or RollingUpdate mode. The update is carried out by creating, updating, or deleting ReplicaSets.
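The queue-and-check flow described above can be sketched with a plain in-memory queue. This is a toy model, not the controller's implementation: the event dictionaries, `handle_deployment`, and `run_worker` are hypothetical, and the two returned strings merely label the two branches that depend on "Paused".

```python
from queue import Queue


def handle_deployment(deployment: dict) -> str:
    """Decide what the controller does for one Deployment taken from the queue."""
    if deployment["paused"]:
        # Paused: only sync the replica count and update the status.
        return "sync-replicas-only"
    # Not paused: create, update, or delete ReplicaSets to roll out.
    return "rollout"


def run_worker(events: Queue) -> list:
    """Drain the queue and record the decision taken for each event."""
    decisions = []
    while not events.empty():
        deployment = events.get()
        decisions.append((deployment["name"], handle_deployment(deployment)))
    return decisions


if __name__ == "__main__":
    q = Queue()
    q.put({"name": "app-a", "paused": False})
    q.put({"name": "app-b", "paused": True})
    print(run_worker(q))
```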
After the Deployment has assigned replica counts to its ReplicaSets, the ReplicaSet controller takes over. It also watches events from the Informer, including ReplicaSet and pod events, and takes them from a queue. The ReplicaSet controller only manages the number of replicas. If the expected number of replicas is greater than the actual number of pods, it creates pods. If the actual number of pods exceeds the expected number, it deletes pods.
As shown in the figure, the Deployment controller handles the more complex tasks, such as managing versions, while the ReplicaSet controller maintains the pod count for each version.
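The ReplicaSet controller's job reduces to comparing two numbers. The sketch below shows that comparison; `plan_replicas` is a hypothetical helper, and it deliberately leaves out how the real controller chooses which pods to delete.

```python
def plan_replicas(expected: int, actual: int) -> tuple:
    """Return the action needed to make the pod count match the expected count."""
    if actual < expected:
        return ("create", expected - actual)
    if actual > expected:
        return ("delete", actual - expected)
    return ("noop", 0)


if __name__ == "__main__":
    print(plan_replicas(expected=3, actual=2))  # one pod is missing
    print(plan_replicas(expected=3, actual=5))  # two pods too many
    print(plan_replicas(expected=3, actual=3))  # nothing to do
```

Because the function depends only on the expected and actual counts, running it repeatedly converges on the expected number no matter why the counts diverged (a host fault, a manual deletion, or a change to replicas).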
Scale-up and Scale-down Simulation
This section walks through some simulated operations, starting with a scale-up. Assume there is a Deployment with two replicas, and its corresponding ReplicaSet has Pod1 and Pod2. If you change the Deployment replicas to three, the controller synchronizes the new value to the ReplicaSet of the current version. The ReplicaSet then finds only two pods instead of the three expected, and creates the third pod.
Simulating a release is a little more complex. Assume the Deployment template of the current version is template1, and the ReplicaSet corresponding to template1 contains three pods: Pod1, Pod2, and Pod3.
When you modify the image of a container in the template, the Deployment controller creates a new ReplicaSet corresponding to template2. After this ReplicaSet is created, the controller adjusts the replica counts of the two ReplicaSets: it gradually increases the expected number of replicas in ReplicaSet2 while gradually decreasing the expected number in ReplicaSet1.
Finally, the pods of the new version are Pod4, Pod5, and Pod6, and the pods of the earlier version are deleted. This completes the release process.
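The gradual hand-over between the two ReplicaSets can be simulated step by step. The following sketch assumes that every pod becomes available immediately after creation, which real pods do not, and it uses absolute surge and unavailability limits; `rolling_update` is a hypothetical name and the stepping order is an illustration, not the controller's exact algorithm.

```python
def rolling_update(replicas: int, max_surge: int, max_unavailable: int) -> list:
    """Simulate a release; return the (old, new) replica counts after each step."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("max_surge and max_unavailable cannot both be 0")
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Scale up the new ReplicaSet without exceeding replicas + max_surge.
        up = max(0, min(replicas - new, replicas + max_surge - (old + new)))
        if up:
            new += up
            steps.append((old, new))
        # Scale down the old ReplicaSet while keeping at least
        # replicas - max_unavailable pods in total.
        down = max(0, min(old, (old + new) - (replicas - max_unavailable)))
        if down:
            old -= down
            steps.append((old, new))
    return steps


if __name__ == "__main__":
    for old, new in rolling_update(replicas=3, max_surge=1, max_unavailable=0):
        print(f"ReplicaSet1={old} ReplicaSet2={new} total={old + new}")
```

With three replicas, a surge of one, and no unavailable pods allowed, the total in this simulation alternates between three and four pods until ReplicaSet1 reaches zero.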
Now simulate a rollback. After the release simulated above, Pod4, Pod5, and Pod6 are running. Assume you discover a problem with the current business version. In this case, run the rollout undo command or otherwise perform a rollback to change the template back to template1.
During the rollback, the Deployment sets the expected number of pods in ReplicaSet1 back to three and gradually reduces the replicas in ReplicaSet2, so that pods of the earlier version are created again.
After the rollback, Pod7, Pod8, and Pod9 exist, taking the place of Pod1, Pod2, and Pod3 of the initial version. This shows that a rollback does not retrieve the previous pods but creates new pods from the template of the earlier version.
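The point that a rollback reuses the old ReplicaSet but not the old pods can be shown with a toy model. In this sketch the class `ToyDeployment` and its attributes are hypothetical, pod names come from a simple counter, and the gradual scaling is collapsed into a single step for brevity.

```python
import itertools


class ToyDeployment:
    """Toy model: one ReplicaSet per template, each holding a list of pod names."""

    def __init__(self, template: str, replicas: int) -> None:
        self._ids = itertools.count(1)
        self.replicas = replicas
        self.replicasets = {}
        self.rollout(template)

    def rollout(self, template: str) -> None:
        """Switch to the given template, for a release or a rollback."""
        # Scale every existing ReplicaSet to zero; its pods are deleted.
        for name in self.replicasets:
            self.replicasets[name] = []
        # The ReplicaSet for this template is reused if it already exists,
        # but its pods are always newly created.
        self.replicasets[template] = [
            f"Pod{next(self._ids)}" for _ in range(self.replicas)
        ]


if __name__ == "__main__":
    d = ToyDeployment("template1", replicas=3)
    d.rollout("template2")   # release
    d.rollout("template1")   # rollback
    print(d.replicasets)
```

After the release and the rollback, the ReplicaSet for template1 holds Pod7, Pod8, and Pod9 in this model, while the ReplicaSet for template2 is kept with zero pods.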
This section describes the following spec fields of a Deployment:
- MinReadySeconds: By default, the Deployment considers a pod available as soon as it is ready. If you set MinReadySeconds to, for example, 30s, the Deployment considers the pod available only after it has been ready for at least 30s. An available pod is always a ready pod, but a ready pod is not necessarily available.
- RevisionHistoryLimit: The number of old ReplicaSets to retain. The default value is 10. You can set it to a small value such as 1 or 2. If rollbacks are likely, consider a value greater than 10.
- Paused: A flag indicating that the Deployment is paused. A paused Deployment only maintains the pod count and performs no new releases. This field may be used in debugging scenarios.
- ProgressDeadlineSeconds: While the Deployment is scaling or releasing, its condition is Processing. This field sets a timeout for the Processing state. If the Deployment is still in the Processing state after the timeout, the controller considers the Deployment to have entered the Failed state.
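The two time-based fields in the list above, MinReadySeconds and ProgressDeadlineSeconds, both compare an elapsed time with a threshold. The following sketch expresses the two checks; the function names and the way elapsed times are passed in are assumptions made for illustration only.

```python
from typing import Optional


def is_available(ready_for_seconds: Optional[float],
                 min_ready_seconds: float = 0.0) -> bool:
    """A pod is available once it has been ready for at least min_ready_seconds.

    Pass None for a pod that is not ready at all.
    """
    if ready_for_seconds is None:
        return False
    return ready_for_seconds >= min_ready_seconds


def progress_timed_out(seconds_in_processing: float,
                       progress_deadline_seconds: float) -> bool:
    """A Deployment still Processing after the deadline is treated as Failed."""
    return seconds_in_processing > progress_deadline_seconds


if __name__ == "__main__":
    print(is_available(10.0, min_ready_seconds=30.0))   # ready, not yet available
    print(is_available(45.0, min_ready_seconds=30.0))   # ready and available
    print(progress_timed_out(700.0, progress_deadline_seconds=600.0))
```

The deadline of 600 seconds in the usage example is an arbitrary illustration value.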
Upgrade Policy Fields
The Deployment provides two policy fields in RollingUpdate: MaxUnavailable and MaxSurge. These two fields are also explained in the following figure.
- MaxUnavailable: The maximum number of unavailable pods during the update.
- MaxSurge: The maximum number of pods that are scheduled above the expected number of replicas during the update.
As mentioned above, a Deployment with three replicas may, during a release, have two pods in the new-version ReplicaSet and two in the earlier-version ReplicaSet, which exceeds the expected number of three. This is because the default values of MaxUnavailable and MaxSurge are both 25%: during the release, up to 25% of the expected replicas may be unavailable, and up to 25% more pods than expected may exist, so the total can reach 125% of the expected number. With three replicas, the 25% surge rounds up to one extra pod, which gives the four pods seen above.
Set the two fields based on your actual requirements. For example, if you have sufficient resources and require higher availability during the release, set MaxUnavailable to a smaller value and MaxSurge to a larger value. If resources are tight, set MaxSurge to a smaller value or even to 0. However, MaxSurge and MaxUnavailable cannot both be 0.
The reason is as follows. When MaxSurge is 0, old pods must be deleted before new pods are added; otherwise, the total number of pods would exceed the expected number. But when MaxUnavailable is also 0, no old pod may be deleted, because every deletion would make a pod unavailable. With both fields at 0, no pod can be created and no pod can be deleted, so the release cannot make progress.
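The pod-count bounds implied by the two fields can be worked out with integer arithmetic. The sketch below assumes the rounding behavior documented by Kubernetes (a percentage for MaxSurge rounds up, a percentage for MaxUnavailable rounds down); `resolve` and `rollout_bounds` are hypothetical helper names, and only non-negative integers or percentage strings such as "25%" are handled.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Convert an absolute number or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        scaled = percent * replicas
        # Integer ceiling or floor division avoids floating-point surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas: int, max_surge="25%", max_unavailable="25%") -> dict:
    """Return the largest total and smallest available pod counts during a release."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        raise ValueError("MaxSurge and MaxUnavailable cannot both be 0")
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


if __name__ == "__main__":
    print(rollout_bounds(3))                                  # defaults, 3 replicas
    print(rollout_bounds(10))                                 # defaults, 10 replicas
    print(rollout_bounds(3, max_surge=0, max_unavailable=1))  # tight resources
```

For three replicas with the defaults, this gives at most four pods in total and at least three available, consistent with the four pods observed during the release earlier in the article.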
This section concludes the article.
- A Deployment is a common workload in Kubernetes, which supports the deployment and management of pods of multiple versions.
- The Deployment creates a ReplicaSet for the template of each version, and then the ReplicaSet maintains a certain number of pod replicas. The Deployment only needs to specify the number of pods in ReplicaSets of each version.
- In short, a Deployment adjusts the final number of replicas in ReplicaSets to upgrade and roll back pods of different versions.