
Objective: In this article, we will walk through running a distributed, production-quality database setup managed by Rancher and characterized by stable persistence. We will use Stateful Sets with a Kubernetes cluster in Rancher for the purpose of deploying a stateful distributed Cassandra database.

Pre-requisites: We assume that you have a Kubernetes cluster provisioned with a cloud provider. Consult the Rancher resource if you would like to create a K8s cluster in Amazon EC2 using Rancher 2.0.

Databases are business-critical, and data loss or leakage creates major operational risk for any organization. A single operational or architectural failure can cost significant time and resources, which necessitates failover systems or procedures to mitigate a loss scenario. Before migrating a database architecture to Kubernetes, it is essential to complete a cost-benefit analysis of running a database cluster on a container architecture versus bare metal. That analysis should cover the potential pitfalls, evaluated against the disaster recovery requirements for Recovery Time Objective (RTO) and Recovery Point Objective (RPO). This is especially true for data-sensitive applications that require true high availability, geographic separation for scale and redundancy, and low latency in application recovery. In the following walk-through, we will analyze the options available in Rancher High Availability and Kubernetes in order to design a production-quality database.

A. Drawbacks of Container Architectures for Stateful Systems

Containers deployed in a Kubernetes-like cluster are naturally stateless and ephemeral, meaning they do not maintain a fixed identity and they lose their data on error or restart. In designing a distributed database environment that provides high availability and fault tolerance, the stateless architecture of Kubernetes presents a challenge, as both replication and scale-out require state to be maintained for the following: (1) Storage; (2) Identity; (3) Sessions; and (4) Cluster Role.

If we consider our containerized database application, we can immediately see the challenges of a stateless architecture, because the application must fulfill the following requirements:

  1. Our database is required to store Data and Transactions in files that are persistent and exclusive to each database container;

  2. Each container in the database application is required to maintain a fixed identity as a database node in order that we may route traffic to it by either name, address or index;

  3. Database client sessions are required to maintain state to ensure read-write transactions are terminated prior to state change for consistency and to ensure that state transformations survive failure for durability; and

  4. Each database node requires a persistent role in its database cluster, such as master, replica or shard unless changed by an application-specific event and as necessitated by schema changes.

A partial solution to these challenges is to attach a PersistentVolume to our Kubernetes pods, since a PersistentVolume has a lifecycle independent of any individual pod that uses it. However, a PersistentVolume does not provide a consistent assignment of roles to cluster nodes, i.e. parent, child or seed nodes. Nor does the cluster guarantee that database state is maintained throughout the application lifecycle: new containers are created with nondeterministic random names, and pods can be scheduled to start, terminate or scale at any time and in any order. So our challenge remains.

B. Advantages of Kubernetes for Deploying a Distributed Database

Given the challenges of deploying a distributed database in a Kubernetes cluster, is it even worth the effort? Kubernetes opens up a number of advantages, including managing numerous database services together with common automated operations that support a healthy lifecycle with recoverability, reliability and scalability. Database clusters may be deployed in a fraction of the time and at a fraction of the cost needed to deploy bare metal clusters, even in a virtualized environment.

Stateful Sets provide a way forward from the challenges outlined in the previous section. With Stateful Sets, introduced in the 1.5 release, Kubernetes now implements the Storage and Identity stateful qualities. The following is ensured:

  1. Each pod has a persistent volume attached, with a persistent link from pod to storage, solving the storage state issue from (A);
  2. Pods start in a fixed order and terminate in the reverse order, solving the sessions state issue from (A);
  3. Each pod has a unique and determinable name, address and ordinal index assigned, solving the identity and cluster role issues from (A).

C. Deploying Stateful Set Pod with Headless Service

Note: We will use the kubectl service in this section. Consult the Rancher resource here on deploying the kubectl service using Rancher.

Stateful Set Pods require a headless service to manage the network identity of the Pods. Essentially, a headless service has no cluster IP address: its Cluster IP is set to None. Instead, the service definition has a selector, and when the service is accessed, DNS is configured to return multiple address records. The service FQDN therefore maps to the IPs of all the pods behind that service that match the selector.
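To make the network identity concrete: behind a headless service, each Stateful Set pod is also reachable under its own stable DNS name of the form pod-name.service-name.namespace.svc.cluster-domain. The sketch below prints those names for a three-pod set. It assumes the default cluster domain cluster.local and the default namespace, and pod_dns_names is a hypothetical helper written for illustration, not a kubectl feature.

```shell
# Print the stable DNS name of each pod in a StatefulSet behind a headless service.
# Arguments: StatefulSet name, service name, namespace, replica count.
# Assumption: the cluster uses the default DNS domain "cluster.local".
pod_dns_names() {
  set_name=$1
  svc=$2
  ns=$3
  replicas=$4
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Pod names are <statefulset-name>-<ordinal>, with ordinals starting at 0.
    echo "${set_name}-${i}.${svc}.${ns}.svc.cluster.local"
    i=$((i + 1))
  done
}

pod_dns_names cassandra cassandra default 3
```

The first line printed is cassandra-0.cassandra.default.svc.cluster.local, a name that stays the same when the pod is rescheduled, which is what makes it usable as a seed address.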

Let’s create a Headless Service for Cassandra using this template:
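The template itself is not reproduced in this article. As a minimal sketch, a manifest consistent with the command output shown in this section (name cassandra, label and selector app=cassandra, no cluster IP, unnamed port 9042) could be written to cassandra-service.yaml like this; treat it as an assumption about the file's contents rather than the original template.

```shell
# Write a minimal headless Service manifest (a sketch, not the original template).
cat > cassandra-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  clusterIP: None        # headless: no cluster IP is allocated
  ports:
    - port: 9042         # CQL native transport port
  selector:
    app: cassandra       # must match the labels on the Cassandra pods
EOF
```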

$ kubectl create -f cassandra-service.yaml
service "cassandra" created

Use get svc to list the attributes of the cassandra service.

$ kubectl get svc cassandra
NAME        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
cassandra   None         <none>        9042/TCP   10s

Use describe svc to list the attributes of the cassandra service with verbose output.

$ kubectl describe svc cassandra
Name:     cassandra
Namespace:    default
Labels:   app=cassandra
Annotations:    <none>
Selector:   app=cassandra
Type:     ClusterIP
IP:     None
Port:     <unset> 9042/TCP
TargetPort:   9042/TCP
Endpoints:    <none>
Session Affinity: None
Events:   <none>

D. Creating Storage Classes for Persistent Volumes

In Rancher, we can use a variety of options to manage our persistent storage through the native Kubernetes API resources PersistentVolume and PersistentVolumeClaim. A storage class in Kubernetes describes a class of storage that our cluster can provision. We can use dynamic provisioning for our persistent storage to automatically create and attach volumes to pods. For example, the following storage class specifies AWS EBS as its storage provider, with type gp2 and availability zone us-west-2a.

storage-class.yml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-west-2a

It is also possible to create a new Storage Class if needed, for example:

$ kubectl create -f azure-stgcls.yaml
storageclass "stgcls" created
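The contents of azure-stgcls.yaml are not shown in this article. A minimal sketch of what such a file might contain is given below, assuming the in-tree Azure disk provisioner; the storage account type and disk kind are illustrative assumptions, and only the class name stgcls comes from the output above.

```shell
# Write a hypothetical Azure StorageClass manifest (parameter values are assumptions).
cat > azure-stgcls.yaml <<'EOF'
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: stgcls
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS   # assumption: standard locally redundant storage
  kind: Managed                      # assumption: Azure managed disks
EOF
```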

Upon creation of a StatefulSet, a PersistentVolumeClaim is initiated for the StatefulSet pod based on its Storage Class. With dynamic provisioning, the PersistentVolume is dynamically provisioned for the pod according to the Storage Class that was requested in the PersistentVolumeClaim.

You can manually create the persistent volumes via Static Provisioning. You can read more about Static Provisioning here.

Note: For static provisioning, you must have as many Persistent Volumes as there are Cassandra nodes in the Cassandra cluster.

E. Creating Stateful Sets

We can now create the StatefulSet, which will provide our desired properties of ordered deployment and termination, unique network names and stateful processing. We invoke the following command and start a single Cassandra server:

$ kubectl create -f cassandra-statefulset.yaml
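The manifest passed to this command is not reproduced in the article. The sketch below shows one plausible shape for it, tying together the headless service from (C) and the storage class from (D). The container image, environment variables, port names, volume name and 1Gi storage request are all assumptions chosen to match the output shown later (Cassandra 3.11, datacenter DC1, rack Rack1); the original file may differ.

```shell
# Write a hypothetical single-replica Cassandra StatefulSet manifest.
cat > cassandra-statefulset.yaml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra             # the headless service from section C
  replicas: 1
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:3.11      # assumption: official Docker Hub image
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 9042
              name: cql
          env:
            - name: CASSANDRA_SEEDS  # first pod acts as the seed node
              value: cassandra-0.cassandra.default.svc.cluster.local
            - name: CASSANDRA_ENDPOINT_SNITCH
              value: GossipingPropertyFileSnitch
            - name: CASSANDRA_DC
              value: DC1
            - name: CASSANDRA_RACK
              value: Rack1
          volumeMounts:
            - name: cassandra-data
              mountPath: /var/lib/cassandra
  volumeClaimTemplates:              # one PersistentVolumeClaim per pod
    - metadata:
        name: cassandra-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard   # the storage class from section D
        resources:
          requests:
            storage: 1Gi             # assumption: size chosen for illustration
EOF
```

The volumeClaimTemplates entry is what gives each pod its own claim, and therefore its own dynamically provisioned volume, as described in (D).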

F. Validating Stateful Set

We then invoke the following command to validate that the Stateful Set has been deployed.

$ kubectl get statefulsets
NAME        DESIRED   CURRENT   AGE
cassandra   1         1         2h

The values under DESIRED and CURRENT should be equivalent once the Stateful Set has been created. Invoke get pods to view an ordinal listing of the Pods created by the Stateful Set.

$ kubectl get pods -o wide
NAME         READY  STATUS    RESTARTS  AGE  IP              NODE
cassandra-0  1/1    Running   0         1m   172.xxx.xxx.xxx   169.xxx.xxx.xxx

During node creation, you can perform a nodetool status to check if the Cassandra node is up.

$ kubectl exec -ti cassandra-0 -- nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens       Owns (effective)   Host ID                               Rack
UN  172.xxx.xxx.xxx  109.28 KB  256          100.0%             6402e90e-7996-4ee2-bb8c-37217eb2c9ec  Rack1

G. Scaling Stateful Set

Invoke the scale command to increase or decrease the size of the Stateful Set by setting the number of replicas of the setup from (E) to x. In the example below, we scale to x = 3.

$ kubectl scale --replicas=3 statefulset/cassandra

Invoke get statefulsets to validate that the Stateful Set has been scaled.

$ kubectl get statefulsets
NAME        DESIRED   CURRENT   AGE
cassandra   3         3         2h
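The comparison of DESIRED and CURRENT can also be scripted. The following is a minimal sketch that reads get statefulsets output in the column layout shown above; statefulset_at_desired is a hypothetical helper, and newer kubectl versions may print different columns, in which case the field positions would need adjusting.

```shell
# Succeed when the named StatefulSet's DESIRED column equals its CURRENT column.
# Reads `kubectl get statefulsets` output on stdin.
# Assumption: columns are NAME DESIRED CURRENT AGE, as in the output above.
statefulset_at_desired() {
  awk -v name="$1" '
    $1 == name { found = 1; ok = ($2 == $3) }
    END { exit !(found && ok) }
  '
}

# Sample input copied from the output above. On a live cluster, use instead:
#   kubectl get statefulsets | statefulset_at_desired cassandra
if printf 'NAME DESIRED CURRENT AGE\ncassandra 3 3 2h\n' | statefulset_at_desired cassandra; then
  echo "cassandra: all desired replicas have been created"
fi
```

Note that CURRENT counts created pods; whether each Cassandra node has actually joined the ring is a separate check made with nodetool.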

Invoke get pods again to view an ordinal listing of the Pods created by the Stateful Set. Note that as the Cassandra pods deploy, they are created in a sequential fashion.

$ kubectl get pods -o wide
NAME          READY  STATUS    RESTARTS  AGE   IP               NODE
cassandra-0   1/1    Running   0         13m   172.xxx.xxx.xxx 169.xxx.xxx.xxx
cassandra-1   1/1    Running   0         38m   172.xxx.xxx.xxx 169.xxx.xxx.xxx
cassandra-2   1/1    Running   0         38m   172.xxx.xxx.xxx 169.xxx.xxx.xxx

We can perform a nodetool status check after 5 minutes to verify that the Cassandra nodes have joined and formed a Cassandra cluster.

$ kubectl exec -ti cassandra-0 -- nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load        Tokens  Owns (effective)  Host ID                               Rack
UN  172.xxx.xxx.xxx  103.25 KiB  256     68.7%             633ae787-3080-40e8-83cc-d31b62f53582  Rack1
UN  172.xxx.xxx.xxx  108.62 KiB  256     63.5%             e95fc385-826e-47f5-a46b-f375532607a3  Rack1
UN  172.xxx.xxx.xxx  177.38 KiB  256     67.8%             66bd8253-3c58-4be4-83ad-3e1c3b334dfd  Rack1

We can perform a host of database operations by invoking CQL once the status of our nodes in nodetool changes to Up/Normal.
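Rather than reading the nodetool output by eye, the Up/Normal check can be automated. The sketch below counts the rows whose status column is UN; count_un_nodes is a hypothetical helper, and the sample input is the three-node output from this section with the addresses left masked.

```shell
# Count nodes reported as UN (Up/Normal) in `nodetool status` output read from stdin.
count_un_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Sample input copied from the output above. On a live cluster, use instead:
#   kubectl exec cassandra-0 -- nodetool status | count_un_nodes
count_un_nodes <<'EOF'
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load        Tokens  Owns (effective)  Host ID                               Rack
UN  172.xxx.xxx.xxx  103.25 KiB  256     68.7%             633ae787-3080-40e8-83cc-d31b62f53582  Rack1
UN  172.xxx.xxx.xxx  108.62 KiB  256     63.5%             e95fc385-826e-47f5-a46b-f375532607a3  Rack1
UN  172.xxx.xxx.xxx  177.38 KiB  256     67.8%             66bd8253-3c58-4be4-83ad-3e1c3b334dfd  Rack1
EOF
```

For the sample input this prints 3, matching the replica count set in the scale command.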

H. Invoking CQL for database access and operations

Once we see a status of UN (Up/Normal), we can access the Cassandra container by invoking cqlsh.

$ kubectl exec -it cassandra-0 -- cqlsh
Connected to Cassandra at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> describe tables

Keyspace system_traces
----------------------
events  sessions

Keyspace system_schema
----------------------
tables     triggers    views    keyspaces  dropped_columns
functions  aggregates  indexes  types      columns        

Keyspace system_auth
--------------------
resource_role_permissons_index  role_permissions  role_members  roles

Keyspace system
---------------
available_ranges          peers               batchlog        transferred_ranges
batches                   compaction_history  size_estimates  hints             
prepared_statements       sstable_activity    built_views   
"IndexInfo"               peer_events         range_xfers   
views_builds_in_progress  paxos               local         

Keyspace system_distributed
---------------------------
repair_history  view_build_status  parent_repair_history

I. Moving Forward: Using Cassandra as a Persistence Layer for a High-Availability Stateless Database Service

In the foregoing exercise, we deployed a Cassandra service in a K8s cluster and provisioned persistent storage via PersistentVolume. We then used StatefulSets to endow our Cassandra cluster with stateful processing properties and scaled our cluster to additional nodes. We are now able to use a CQL schema for database access and operations in our Cassandra cluster. The advantage of a CQL schema is the ease with which we can use natural types and fluent APIs that make for seamless data modeling, especially in solutions involving scaling and time series data models, such as fraud detection solutions. In addition, CQL leverages partition and clustering keys, which increase the speed of operation in data modeling scenarios.
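As an illustration of partition and clustering keys in a time series model, the sketch below writes a small CQL script to a file. The keyspace demo, the table sensor_readings and its columns are hypothetical names invented for this example, and the replication factor of 3 assumes the three-node cluster from (G).

```shell
# Write a hypothetical CQL schema that uses a partition key and a clustering key.
cat > demo.cql <<'EOF'
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- sensor_id is the partition key: all readings for one sensor live together.
-- reading_time is the clustering key: rows in a partition are sorted by time.
CREATE TABLE IF NOT EXISTS demo.sensor_readings (
  sensor_id    text,
  reading_time timestamp,
  temperature  double,
  PRIMARY KEY ((sensor_id), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);
EOF

# To run it against the cluster (not executed here):
#   kubectl cp demo.cql cassandra-0:/tmp/demo.cql
#   kubectl exec cassandra-0 -- cqlsh -f /tmp/demo.cql
```

With this layout, a query for the latest readings of one sensor touches a single partition and reads rows that are already in time order.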

In the next sequence in this series, we will explore how we can use Cassandra as our persistence layer in a Database-as-a-Microservice or a stateless database by leveraging the unique architectural properties of Cassandra and using the Rancher toolset as our starting point. We will then analyze the operational performance and latency of our Cassandra-driven stateless database application and evaluate its usefulness in designing high-availability services with low latency between the edge and the cloud.

By combining Cassandra with a microservices architecture, we can explore alternatives to stateful databases: both in-memory SQL databases (such as SAP HANA), which are prone to poor latency for read/write transactions and HTAP workloads, and NoSQL databases, which are slow in performing advanced analytics that require multi-table queries or complex filters. In parallel, a stateless architecture can deliver improvements on issues that stateful databases face arising from memory exceptions, both due to in-memory indexes in SQL databases and high memory usage in multi-model NoSQL databases. Improvements on both these fronts will deliver better operational performance for massively scaled queries and time-series modeling.

Hisham Hasan

Hisham is a consulting Enterprise Solutions Architect with experience in leveraging container technologies to solve infrastructure problems and deploy applications faster and with higher levels of security, performance and reliability. Recently, Hisham has been leveraging containers and cloud-native architecture for a variety of middleware applications to deploy complex and mission-critical services across the enterprise. Prior to entering the consulting world, Hisham worked at Aon Hewitt, Lexmark and ADP in software implementation and technical support.