When your application is user-facing, ensuring continuous availability and minimal downtime is a challenge. Hence, monitoring the health of the application is essential to avoid any outages.
HealthChecks in Rancher 1.6
Cattle provided the ability to add HTTP or TCP healthchecks for the deployed services in Rancher 1.6. Healthcheck support is provided by Rancher’s own healthcheck microservice. You can read more about it here.
In brief, a Cattle user can add a TCP healthcheck to a service. Rancher’s healthcheck containers, which are launched on a different host, will test if a TCP connection opens at the specified port for the service containers. Note that with the latest release (v1.6.20), healthcheck containers are also scheduled on the same host as the service containers, along with other hosts.
HTTP healthchecks can also be added while deploying services. You can ask Rancher to make an HTTP request at a specified path and specify what response is expected.
These healthchecks are done periodically at a configurable interval, and retries/timeouts are also configurable. Upon failing a healthcheck, you can also instruct Rancher if and when the container should be recreated.
Consider a service running an Nginx image on Cattle, with an HTTP healthcheck configured as below. The healthcheck parameters appear in the rancher-compose.yml file and not the docker-compose.yml file because healthcheck functionality is implemented by Rancher.
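For reference, such a healthcheck lives under the service's health_check key in rancher-compose.yml. A minimal sketch, with the port, path, and timing values as illustrative assumptions rather than the exact values from the example above:

```yaml
# rancher-compose.yml (Rancher 1.6 / Cattle)
version: '2'
services:
  nginx:
    scale: 1
    health_check:
      port: 80
      # HTTP request to issue; omit request_line for a plain TCP check
      request_line: GET /index.html HTTP/1.0
      interval: 2000              # ms between checks
      response_timeout: 2000      # ms to wait for a response
      healthy_threshold: 2        # consecutive successes to mark healthy
      unhealthy_threshold: 3      # consecutive failures to mark unhealthy
      strategy: recreate          # recreate the container on failure
```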
Let's see if we can configure corresponding healthchecks in Rancher 2.0.
HealthChecks in Rancher 2.0
In 2.0, Rancher uses the native Kubernetes healthcheck mechanisms: liveness and readiness probes.
As documented here, probes are diagnostics performed periodically by the Kubelet on a container. In Rancher 2.0, healthchecks are done by the Kubelet running locally, as compared to the cross-host healthchecks in Rancher 1.6.
A Quick Kubernetes Healthcheck Summary
livenessProbe is an action performed on a container to check if the container is running. If the probe reports failure, Kubernetes kills the pod container, and it is restarted per the restart policy specified in the specs.
readinessProbe is used to check if a container is ready to accept and serve requests. When a readinessProbe fails, the pod container is not exposed via the public endpoints, so no requests are made to the container.
If your workload is busy doing some startup routine before it can serve requests, it is a good idea to configure a readinessProbe for the workload.
The following types of readinessProbe can be configured for Kubernetes workloads:
- tcpSocket - the Kubelet checks if TCP connections can be opened against the container’s IP address on a specified port.
- httpGet - An HTTP/HTTPS GET request is made at the specified path and reported as successful if it returns an HTTP response code between 200 and 399.
- exec - the Kubelet executes a specified command inside the container and checks if the command exits with status 0.
More configuration details for the above probes can be found here.
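As a rough sketch, the three probe types slot into a pod container spec as alternatives; the names, paths, and timing values below are illustrative:

```yaml
containers:
- name: web
  image: nginx
  readinessProbe:
    tcpSocket:            # 1) open a TCP connection to this port
      port: 80
    # httpGet:            # 2) or: HTTP/HTTPS GET against a path
    #   path: /healthz
    #   port: 80
    # exec:               # 3) or: run a command in the container
    #   command: ["cat", "/tmp/ready"]
    initialDelaySeconds: 5
    periodSeconds: 10
    timeoutSeconds: 1
    failureThreshold: 3
```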
Configuring Healthchecks in Rancher 2.0
Via the Rancher UI, users can add TCP or HTTP healthchecks to Kubernetes workloads. By default, Rancher asks you to configure a readinessProbe for the workload and applies a livenessProbe using the same configuration. You can also choose to define a separate livenessProbe.
If the healthchecks fail, the container is restarted per the restartPolicy defined in the workload specs. This is equivalent to the strategy parameter in rancher-compose.yml files for 1.6 services using healthchecks in Cattle.
While deploying a workload in Rancher 2.0, users can configure TCP healthchecks to check if a TCP connection can be opened at a specific port.
Here are the Kubernetes YAML specs showing the TCP readinessProbe configured for the Nginx workload shown above. Rancher also adds a livenessProbe to your workload using the same configuration.
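Since the original specs appeared as a screenshot, here is an approximate reconstruction of what Rancher generates for a TCP check; the exact interval and threshold values depend on what was entered in the UI:

```yaml
# Excerpt from the Deployment's pod spec
containers:
- name: nginx
  image: nginx
  readinessProbe:
    tcpSocket:
      port: 80
    initialDelaySeconds: 10
    periodSeconds: 2
    timeoutSeconds: 2
    successThreshold: 2
    failureThreshold: 3
  livenessProbe:          # added by Rancher with the same config
    tcpSocket:
      port: 80
    initialDelaySeconds: 10
    periodSeconds: 2
    timeoutSeconds: 2
    successThreshold: 1   # Kubernetes requires successThreshold: 1 for liveness
    failureThreshold: 3
```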
Healthcheck parameters from 1.6 to 2.0:
You can also specify an HTTP healthcheck and provide a path in the pod container at which the Kubelet will make HTTP/HTTPS GET requests. However, Kubernetes only supports HTTP/HTTPS GET requests, whereas healthchecks in Rancher 1.6 could use any HTTP method.
Here are the Kubernetes YAML specs showing the HTTP livenessProbe configured for the Nginx workload shown above.
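Again reconstructing from the description, the probe would look roughly like this; the /index.html path and the timing values are assumptions, not taken from the original screenshot:

```yaml
livenessProbe:
  httpGet:
    path: /index.html
    port: 80
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 2
  timeoutSeconds: 2
  failureThreshold: 3
```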
Healthcheck in Action
Now let’s see what happens when a healthcheck fails and how the workload recovers in Kubernetes.
Consider the above HTTP healthcheck on our Nginx workload, doing an HTTP GET on the specified path.
To make the healthcheck fail, I exec'ed into the pod container using the Execute Shell UI option in Rancher and moved the file that the healthcheck does a GET on.
The livenessProbe check failed, and the workload status changed to unhealthy. Kubernetes soon killed and recreated the pod, and the workload came back up since the restartPolicy was set to Always.
Using kubectl, you can see these healthcheck event logs. As a quick tip, the Rancher 2.0 UI provides the helpful Launch Kubectl option in the Kubernetes Cluster view, where you can run native Kubernetes commands against the cluster objects.
Migrate Healthchecks via Docker Compose to Kubernetes Yaml?
Rancher 1.6 provided healthchecks via its own microservice, which is why the healthcheck parameters that a Cattle user added to a service appear in the rancher-compose.yml file and not in the docker-compose.yml config file. The Kompose tool we used earlier in this blog series works on standard docker-compose.yml parameters and therefore cannot parse the Rancher healthcheck constructs. So as of now, we cannot use this tool to convert Rancher healthchecks from compose config to Kubernetes YAML.
As seen in this blog post, the configuration parameters available to add TCP or HTTP healthchecks in Rancher 2.0 are very similar to Rancher 1.6. The healthcheck config used by Cattle services can be transitioned completely to 2.0 without loss of any functionality.
In the upcoming article, I plan to explore how to map scheduling options that Cattle supports to Kubernetes in Rancher 2.0. Stay tuned!