In this guide, we recommend best practices for cluster-level logging and application logging.
Before Rancher v2.5, logging in Rancher was a fairly static integration. There was a fixed list of aggregators to choose from (Elasticsearch, Splunk, Kafka, Fluentd, and Syslog) and only two levels at which logging could be configured (cluster-level and project-level).
Logging in 2.5 has been completely overhauled to provide a more flexible experience for log aggregation. With the new logging feature, administrators and users alike can deploy logging that meets fine-grained collection criteria while offering a wider array of destinations and configuration options.
Under the hood, Rancher logging uses the Banzai Cloud logging operator. We make this operator (and its resources) manageable from Rancher, and tie that experience in with managing your Rancher clusters.
For some users, it is desirable to scrape logs from every container running in the cluster. This usually coincides with your security team’s request (or requirement) to collect all logs from all points of execution.
In this scenario, we recommend creating at least two ClusterOutput objects: one for your security team (if you have that requirement), and one for yourselves, the cluster administrators. When creating these objects, take care to choose an output endpoint that can handle the significant log traffic coming from the entire cluster. Also make sure to choose an appropriate index to receive all these logs.
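As a rough illustration of the two ClusterOutput objects described above, the following sketch assumes both teams use Elasticsearch and that Rancher logging is installed in the cattle-logging-system namespace. The object names, hosts, and index names are hypothetical placeholders, and the exact fields available depend on the logging operator version you are running.

```yaml
# Sketch only: names, hosts, and indexes below are placeholders, not real endpoints.
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: admin-es                      # hypothetical name for the cluster admins' output
  namespace: cattle-logging-system    # assumed namespace of the Rancher logging install
spec:
  elasticsearch:
    host: elasticsearch.admin.example.com   # placeholder endpoint sized for cluster-wide traffic
    port: 9200
    scheme: https
    index_name: cluster-all-logs            # placeholder index that receives every log
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: security-es                   # hypothetical name for the security team's output
  namespace: cattle-logging-system
spec:
  elasticsearch:
    host: elasticsearch.security.example.com   # placeholder endpoint
    port: 9200
    scheme: https
    index_name: security-all-logs              # placeholder index
```

Keeping the two outputs separate lets each team size, secure, and retain its own copy of the logs independently.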
Once you have created these ClusterOutput objects, create a ClusterFlow to collect all the logs. Do not define any Include or Exclude rules on this flow. This will ensure that all logs from across the cluster are collected. If you have two ClusterOutputs, make sure to send logs to both of them.
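A minimal sketch of such a ClusterFlow might look like the following. It assumes two existing ClusterOutputs with the hypothetical names admin-es and security-es, and that Rancher logging is installed in the cattle-logging-system namespace.

```yaml
# Sketch only: the output names must match ClusterOutputs that already exist.
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: all-logs                      # hypothetical name
  namespace: cattle-logging-system    # assumed namespace of the Rancher logging install
spec:
  # No match section is defined, so nothing is included or excluded
  # and logs from every container are collected.
  globalOutputRefs:
    - admin-es        # hypothetical ClusterOutput for the cluster administrators
    - security-es     # hypothetical ClusterOutput for the security team
```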
ClusterFlows have the ability to collect logs from all containers on all hosts in the Kubernetes cluster. This works well in cases where those containers are part of a Kubernetes pod; however, RKE containers exist outside of the scope of Kubernetes.
Currently (as of v2.5.1), the logs from RKE containers are collected but cannot easily be filtered. This is because those logs do not identify the source container (e.g. etcd or kube-apiserver).
A future release of Rancher will include the source container name, which will enable filtering of these component logs. Once that change is made, you will be able to customize a ClusterFlow to retrieve only the Kubernetes component logs and direct them to an appropriate output.
Best practice not only in Kubernetes but in all container-based applications is to direct application logs to stdout/stderr. The container runtime will then trap these logs and do something with them - typically writing them to a file. Depending on the container runtime (and its configuration), these logs can end up in any number of locations.
When logs are written to a file, Kubernetes helps by creating a /var/log/containers directory on each host. This directory holds symlinks that point to the log files' actual location (which can differ based on configuration or container runtime).
Rancher logging will read all log entries in /var/log/containers, ensuring that all log entries from all containers (assuming a default configuration) will have the opportunity to be collected and processed.
Log collection only retrieves stdout/stderr logs from pods in Kubernetes. But what if we want to collect logs from other files that are generated by applications? Here, a log streaming sidecar (or two) may come in handy.
The goal of setting up a streaming sidecar is to take log files that are written to disk, and have their contents streamed to stdout. This way, the Banzai Logging Operator can pick up those logs and send them to your desired output.
To set this up, edit your workload resource (e.g. Deployment) and add the following sidecar definition:
- mountPath: /path/to/your/log
This will add a container to your workload definition that will now stream the contents of (in this example) /path/to/your/log/file.log to stdout.
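For context, here is one way a complete workload with a streaming sidecar could look. This is a sketch under stated assumptions rather than the definitive definition: the Deployment name, the application image, the sidecar name stream-log-file, the busybox image, and the emptyDir volume are all illustrative choices. The only values taken from the text are the mount path and the log file path.

```yaml
# Sketch only: names, images, and the volume type are assumptions for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                        # hypothetical workload name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        # The application, which writes its log file to the shared volume.
        - name: my-app
          image: registry.example.com/my-app:latest   # placeholder image
          volumeMounts:
            - name: app-logs
              mountPath: /path/to/your/log
        # The sidecar, which follows the log file and echoes it to stdout.
        - name: stream-log-file                        # hypothetical sidecar name
          image: busybox
          command: ["tail"]
          args: ["-F", "/path/to/your/log/file.log"]   # -F keeps following across log rotation
          volumeMounts:
            - name: app-logs
              mountPath: /path/to/your/log
      volumes:
        # Volume shared by both containers so the sidecar can read the file.
        - name: app-logs
          emptyDir: {}
```

Both containers mount the same volume at the same path, which is what allows the sidecar to read a file that the application container writes.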
This log stream is then automatically collected according to any Flows or ClusterFlows you have set up. You may also wish to consider creating a Flow specifically for this log file by targeting the name of the container. See example:
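A Flow that targets the sidecar by container name could be sketched as follows. The Flow name, namespace, container name (stream-log-file), and the referenced Output (app-log-output) are hypothetical. The Output is assumed to already exist in the same namespace as the Flow, and field names may vary between logging operator versions.

```yaml
# Sketch only: all names below are placeholders for illustration.
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: app-file-log                  # hypothetical Flow name
  namespace: my-app-namespace         # namespace where the workload runs (placeholder)
spec:
  match:
    # Select only logs emitted by the streaming sidecar container.
    - select:
        container_names:
          - stream-log-file           # must match the sidecar's container name
  localOutputRefs:
    - app-log-output                  # hypothetical Output in the same namespace
```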