An important and complex aspect of container orchestration is scheduling the application containers. Placing containers appropriately onto the available shared infrastructure is key to achieving maximum performance with optimal use of compute resources.
Cattle, the default orchestration engine in Rancher 1.6, provides various scheduling options to place services effectively, as described in the Rancher 1.6 documentation.
With the release of the 2.0 version based on Kubernetes, Rancher now utilizes native Kubernetes scheduling. In this article we will look at the scheduling methods available in Rancher 2.0 in comparison to Cattle’s scheduling support.
Following native Kubernetes behavior, pods in a Rancher 2.0 workload are spread by default across the nodes (hosts) that are schedulable and have enough free capacity. But just like version 1.6, Rancher 2.0 also lets you run all pods on a specific node and schedule pods using node labels.
In the 1.6 UI, Rancher lets you either run all containers on a specific host, specify hard/soft host labels, or use affinity/anti-affinity rules while deploying services.
The node scheduling UI in Rancher 2.0 provides the same features while deploying workloads.
Rancher uses the underlying native Kubernetes constructs to specify node affinity/anti-affinity. The Kubernetes documentation on assigning pods to nodes covers these constructs in detail.
Let’s run through some examples that schedule workload pods using these node scheduling options, and then check what the Kubernetes YAML specs look like compared with the 1.6 Docker Compose config.
While deploying a workload (navigate to your Cluster > Project > Workloads), it is possible to schedule all pods in your workload to a specific node.
Here I am deploying a workload of scale = 2 using the nginx image on a specific node.
Rancher will choose that node if it has enough available compute resources and, when hostPort is used, no port conflicts. If the workload exposes itself using a nodePort that conflicts with another workload, the deployment is still created successfully, but no nodePort service is created, so the workload is not exposed at all.
On the Workloads tab, you can list workloads grouped by node. I can see that both pods of my Nginx workload are scheduled on the specified node:
Now here is what this scheduling rule looks like in Kubernetes pod specs:
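One native way to express this rule is the nodeName field of the pod spec. The sketch below assumes that field is used; the node name node1 and the app label are illustrative, and the spec that Rancher actually generates may contain additional fields.

```yaml
# Sketch: a workload of scale 2 pinned to a single node.
# "node1" and the "app: nginx" label are assumptions for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeName: node1   # binds every pod of this workload to node1
      containers:
      - name: nginx
        image: nginx
```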
I added a label foo=bar to node1 in my Rancher 2.0 cluster to test the label-based node scheduling rules.
Here is how to specify a host label affinity rule in the Rancher 2.0 UI. A hard affinity rule means that the host chosen must satisfy all the scheduling rules. If no such host can be found, the workload will fail to deploy.
In the PodSpec YAML, this rule translates to the nodeAffinity field. In a Rancher 1.6 docker-compose.yml, the same scheduling behavior is achieved using labels.
If you are a Rancher 1.6 user, you know that a soft rule means that the scheduler should try to deploy the application per the rule, but can deploy even if the rule is not satisfied by any host. Here is how to specify this rule in the Rancher 2.0 UI.
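As a sketch of the two forms, assuming the node label foo=bar from above and an illustrative nginx container, a hard rule in the pod spec could look like this:

```yaml
# Kubernetes pod spec fragment: hard (required) node affinity on foo=bar.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: In
            values:
            - bar
  containers:
  - name: nginx
    image: nginx
```

The Rancher 1.6 counterpart would use the host label affinity label on the service (the service name is an assumption):

```yaml
# Rancher 1.6 docker-compose.yml: hard host label affinity.
version: '2'
services:
  nginx:
    image: nginx
    labels:
      io.rancher.scheduler.affinity:host_label: foo=bar
```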
The corresponding YAML specs for the pod are shown below.
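A minimal sketch of a soft rule, again assuming the label foo=bar; the weight of 100 is an arbitrary illustrative value (Kubernetes accepts 1 to 100):

```yaml
# Kubernetes pod spec fragment: soft (preferred) node affinity on foo=bar.
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100          # relative preference among soft rules
        preference:
          matchExpressions:
          - key: foo
            operator: In
            values:
            - bar
  containers:
  - name: nginx
    image: nginx
```

In a Rancher 1.6 docker-compose.yml, the soft variant of the host label rule would be written as follows (service name assumed):

```yaml
# Rancher 1.6 docker-compose.yml: soft host label affinity.
version: '2'
services:
  nginx:
    image: nginx
    labels:
      io.rancher.scheduler.affinity:host_label_soft: foo=bar
```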
Apart from the key = value host label matching rule, Kubernetes scheduling constructs also support the operators In, NotIn, Exists, DoesNotExist, Gt, and Lt.
So to achieve anti-affinity, you can use the operators NotIn and DoesNotExist for the node label.
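For instance, a hard anti-affinity rule that keeps pods away from nodes labeled foo=bar might be sketched like this (the label is the illustrative one used earlier):

```yaml
# Kubernetes pod spec fragment: avoid nodes whose label foo has the value bar.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: foo
            operator: NotIn
            values:
            - bar
          # To avoid any node that carries the label at all, use instead:
          # - key: foo
          #   operator: DoesNotExist
  containers:
  - name: nginx
    image: nginx
```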
If you are a Cattle user, you will be familiar with a few other scheduling options available in Rancher 1.6, such as scheduling using container labels, scheduling using resource constraints, and scheduling only specific services to a host.
If you are using these options in your Rancher 1.6 setups, it is possible to replicate them in Rancher 2.0 using native Kubernetes scheduling options. As of v2.0.8, there is no UI support for these options while deploying workloads, but you can always use them by importing the Kubernetes YAML specs into a Rancher cluster.
This 1.6 option lets you schedule containers to a host where a container with a specific label is already present. To do this on Rancher 2.0, use the Kubernetes inter-pod affinity and anti-affinity feature.
As noted in the Kubernetes docs, this feature allows you to constrain which nodes your pod can be scheduled to based on the labels of pods already running on a node rather than on node labels.
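As a sketch of inter-pod affinity, the fragment below asks for a pod to be placed on a node that already runs a pod labeled app=cache. The label, the container name, and the choice of kubernetes.io/hostname as the topology key are assumptions for illustration.

```yaml
# Kubernetes pod spec fragment: co-locate with pods labeled app=cache.
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - cache
        topologyKey: kubernetes.io/hostname   # "same node" granularity
  containers:
  - name: web
    image: nginx
```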
One of the most-used scheduling features in 1.6 was anti-affinity to the service itself using labels on containers. To replicate this behavior in Rancher 2.0, we can use pod anti-affinity constructs in Kubernetes YAML specs. For example, consider an Nginx web workload. To ensure that pods in this workload do not land on the same host, you can use the podAntiAffinity construct as shown below. By specifying podAntiAffinity using labels, we ensure that no two Nginx replicas are co-located on a single node.
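A minimal sketch of such a Deployment, assuming the pod label app: nginx and a replica count of three; the names are illustrative:

```yaml
# Sketch: three Nginx replicas that refuse to share a node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
            topologyKey: kubernetes.io/hostname   # at most one matching pod per node
      containers:
      - name: nginx
        image: nginx
```

Because the rule is required rather than preferred, a replica that cannot find a node free of other app: nginx pods stays pending instead of doubling up.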
Using Rancher CLI, this workload can be deployed onto the Kubernetes cluster. Note that the above deployment specifies three replicas, and I have three schedulable nodes in the Kubernetes cluster.
Since podAntiAffinity is specified, the three pods end up on different nodes. To further check how podAntiAffinity applies, I can scale up the deployment to four pods. Notice that the fourth pod cannot get scheduled since the scheduler cannot find another node to satisfy the podAntiAffinity rule.
While you are creating a service in Rancher 1.6, you can specify the memory reservation and mCPU reservation in the Security/Host tab in the UI. Cattle will schedule containers for the service onto hosts that have enough available compute resources.
In Rancher 2.0, you can specify the memory and CPU resources required by your workload pods using resources.requests.memory and resources.requests.cpu under the pod container specs. You can find more detail about these specs here.
When you specify these resource requests, the Kubernetes scheduler will assign the pod only to a node that has enough unreserved capacity to satisfy them.
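For example, a container that reserves 64 MiB of memory and a quarter of a CPU core could be sketched as follows; the values are arbitrary illustrations:

```yaml
# Kubernetes pod spec fragment: resource requests used by the scheduler.
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        memory: "64Mi"   # mebibytes of memory reserved for this container
        cpu: "250m"      # 250 millicores, i.e. 0.25 of one CPU core
```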
Rancher 1.6 has the ability to specify container labels on the host to only allow specific containers to be scheduled to it.
To achieve this in Rancher 2.0, use the equivalent Kubernetes feature of adding node taints (like host tags) and using tolerations in your pod specs.
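As a sketch, assume a node has been tainted with the hypothetical key/value pair dedicated=web and the NoSchedule effect. Only pods that carry a matching toleration can then be scheduled onto it:

```yaml
# Node side, applied with kubectl (node name and taint are assumptions):
#   kubectl taint nodes node1 dedicated=web:NoSchedule
#
# Pod side: the toleration that lets this pod onto the tainted node.
spec:
  tolerations:
  - key: dedicated
    operator: Equal
    value: web
    effect: NoSchedule
  containers:
  - name: nginx
    image: nginx
```

Note that a toleration only permits scheduling onto the tainted node; it does not force it. To also keep these pods off other nodes, combine the toleration with a node affinity rule.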
In Rancher 1.6, a global service is a service with a container deployed on every host in the environment.
If a service has the label io.rancher.scheduler.global: 'true', then the Rancher 1.6 scheduler will schedule a service container on each host in the environment. As mentioned in the documentation, if a new host is added to the environment, and the host fulfills the global service’s host requirements, the service will automatically be started on it by Rancher.
The sample below is an example of a global service in Rancher 1.6. Note that just placing the required label is sufficient to make a service global.
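A minimal sketch of such a docker-compose.yml; the service name global is an assumption:

```yaml
# Rancher 1.6 docker-compose.yml: a global service.
version: '2'
services:
  global:
    image: nginx
    labels:
      io.rancher.scheduler.global: 'true'   # one container on every host
```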
How can we deploy a global service in Rancher 2.0 using Kubernetes?
For this purpose, Rancher deploys a Kubernetes DaemonSet object for the user’s workload. A DaemonSet functions like the Rancher 1.6 global service: Kubernetes deploys a pod on each node of the cluster, and as new nodes are added, it starts new pods on them, provided they match the scheduling requirements of the workload.
Additionally, in 2.0, you can also limit a DaemonSet to be deployed to nodes that have a specific label, as mentioned here.
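One way to express this restriction is a nodeSelector in the DaemonSet's pod template. The fragment below is a sketch that assumes the node label foo=bar:

```yaml
# DaemonSet spec fragment: run only on nodes labeled foo=bar.
spec:
  template:
    spec:
      nodeSelector:
        foo: bar
```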
If you are a Rancher 1.6 user, to migrate your global service to Rancher 2.0 using the UI, navigate to your Cluster > Project > Workloads view. While deploying a workload, you can choose the workload type; for a global service, pick the DaemonSet type, which runs one pod on each node.
This is what the corresponding Kubernetes YAML specs look like for the above DaemonSet workload:
- image: nginx
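A fuller sketch of what such a DaemonSet spec might contain is given here. The names and labels are assumptions, and the spec generated by Rancher carries additional metadata:

```yaml
# Sketch: a DaemonSet running one Nginx pod per node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
```

Unlike a Deployment, a DaemonSet has no replicas field; the number of pods follows the number of eligible nodes.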
To migrate a Rancher 1.6 global service to Rancher 2.0 using its Compose config, follow these steps.
You can convert the docker-compose.yml file from Rancher 1.6 to Kubernetes YAML using the Kompose tool, and then deploy the application in the Kubernetes cluster using either the kubectl client tool or the Rancher CLI.
Consider the docker-compose.yml specs mentioned above where the Nginx service is a global service. This is how it can be converted to Kubernetes YAML using Kompose:
Now configure the Rancher CLI against your Kubernetes Cluster and deploy the generated *-daemonset.yaml file.
My Kubernetes cluster has two worker nodes where workloads can be scheduled, and deploying global-daemonset.yaml started two pods for the DaemonSet, one on each node.
In this article, we reviewed how the various scheduling functionalities of Rancher 1.6 can be migrated to Rancher 2.0. Most of the scheduling techniques have equivalent options available in Rancher 2.0, or they can be achieved via native Kubernetes constructs.
In the next article, I will explore how service discovery support in Cattle can be replicated in a Rancher 2.0 setup - stay tuned!