Elasticsearch is an open-source search engine based on Apache Lucene and developed by Elastic. It focuses on scalability, resilience, and performance, and companies around the world, including Mozilla, Facebook, GitHub, Netflix, eBay, and the New York Times, use it every day. Elasticsearch is one of the most popular analytics platforms for large datasets and powers search in a wide range of applications. It takes a document-oriented approach to data and makes new data searchable in near real time. It stores data as JSON documents and organizes them by index and type.

If we draw analogies between the components of a traditional relational database and those of Elasticsearch, they look like this:

  • Database or Table -> Index
  • Row/Column -> Document with properties
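As a rough illustration of this mapping, the sketch below turns one relational-style row into the kind of JSON document that Elasticsearch stores. The table name, column names, and values are hypothetical and chosen only for this example.

```python
import json

# Hypothetical relational row: column names paired with one row of values.
columns = ["id", "first_name", "last_name", "city"]
row = (1, "Ada", "Lovelace", "London")


def row_to_document(columns, row):
    """Map a row to a document: each column becomes a property."""
    return dict(zip(columns, row))


document = row_to_document(columns, row)

# In this analogy, the table name plays the role of the index name.
index_name = "customers"
print(index_name, json.dumps(document))
```

Each column name becomes a property of the document, and the document as a whole takes the place of the row.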

Elasticsearch Advantages

  • It is built on Apache Lucene, which provides robust, mature full-text search capabilities.
  • It uses a document-oriented architecture to store complex real-world entities as structured JSON documents. By default, it indexes all fields, which provides tremendous performance when searching.
  • It doesn’t require a predefined schema for its indices. Documents introduce new fields simply by including them, which gives you the freedom to add, remove, or change relevant fields without the downtime associated with a traditional database schema upgrade.
  • It performs linguistic searches against documents, returning those that match the search condition. It scores results by relevance using a TF-IDF-based algorithm (BM25 is the default similarity since Elasticsearch 5.0), which ranks more relevant documents higher in the list of results.
  • It allows fuzzy searching, which helps find results even with misspelled search terms.
  • It supports real-time search autocompletion, returning results while the user types their search query.
  • It uses a RESTful API, exposing its power via a simple, lightweight interface.
  • Elasticsearch executes complex queries with tremendous speed. It also caches queries, returning cached results for other requests that match a cached filter.
  • It scales horizontally, making it possible to extend resources and balance the load between cluster nodes.
  • It breaks indices into shards, and each shard can have any number of replicas. Every node knows which node holds each shard and routes requests internally as necessary to retrieve the data.
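The relevance scoring mentioned in the list above can be illustrated with a textbook TF-IDF calculation. This is a minimal sketch and not the formula that Lucene actually applies; the sample documents and the function name `tf_idf_scores` are made up for this example.

```python
import math
from collections import Counter


def tf_idf_scores(query, documents):
    """Score each document against the query with a textbook TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized if term in other)
            if df == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)       # term frequency in this document
            idf = math.log(n_docs / df) + 1.0     # rarer terms weigh more
            score += tf * idf
        scores.append(score)
    return scores


# Hypothetical documents, for illustration only.
docs = [
    "the quick brown fox",
    "the lazy dog sleeps",
    "quick quick fox jumps over the dog",
]
scores = tf_idf_scores("quick fox", docs)

# Document positions ordered from most to least relevant.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

Documents that contain the query terms more densely score higher, and a document with none of the terms scores zero.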


Elasticsearch uses specific terms to define its components.

  • Cluster: A collection of nodes that work together.
  • Node: A single server that acts as part of the cluster, stores the data, and participates in the cluster’s indexing and search capabilities.
  • Index: A collection of documents with similar characteristics.
  • Document: The basic unit of information that can be indexed.
  • Shards: Indices are divided into multiple pieces called shards, which allows the index to scale horizontally.
  • Replicas: Copies of index shards.
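To make the idea of shards more concrete, here is a small sketch of how a document could be assigned to a primary shard. Elasticsearch's documentation describes routing as a hash of the routing value (the document ID by default) taken modulo the number of primary shards. The hash function below (`zlib.crc32`) is only a stand-in for illustration, and `pick_shard` is a hypothetical helper, not part of any Elasticsearch client.

```python
import zlib


def pick_shard(doc_id, num_primary_shards):
    """Return the primary shard number for a document ID.

    zlib.crc32 is a stand-in hash used for illustration only.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


num_primary_shards = 5
placement = {
    doc_id: pick_shard(doc_id, num_primary_shards)
    for doc_id in ["1", "2", "3", "42"]
}
print(placement)
```

Because the result depends on the number of primary shards, that number is fixed when an index is created, while the number of replicas can change later.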


To perform this demo, you need one of the following:

  • An existing Rancher deployment and Kubernetes cluster, or
  • Two nodes in which to deploy Rancher and Kubernetes, or
  • A node in which to deploy Rancher and a Kubernetes cluster running in a hosted provider such as GKE.

This article uses the Google Cloud Platform, but you may use any other provider or infrastructure.

Launch Rancher

If you don’t already have a Rancher deployment, begin by launching one. The quick start guide covers the steps for doing so.

Launch a Cluster

Use Rancher to set up and configure your cluster according to the guide most suited to your environment.

Deploy Elasticsearch

If you are already comfortable with kubectl, you can apply the manifests directly. If you prefer to use the Rancher user interface, scroll down for those instructions.

We will deploy Elasticsearch as a StatefulSet with two Services: a headless service for communicating with the pods and another for interacting with Elasticsearch from outside of the Kubernetes cluster.
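A StatefulSet gives each pod a stable name of the form `<statefulset>-<ordinal>`, and a headless Service gives each of those pods a stable DNS record. The sketch below lists the names that Kubernetes would generate for the StatefulSet used in this article. It assumes the `default` namespace and the default `cluster.local` cluster domain, and `statefulset_pod_dns` is a hypothetical helper written for this example.

```python
def statefulset_pod_dns(statefulset, service, replicas,
                        namespace="default", cluster_domain="cluster.local"):
    """List the stable DNS names of the pods in a StatefulSet.

    The namespace and cluster domain defaults are assumptions.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Values taken from the StatefulSet manifest in this article.
names = statefulset_pod_dns("esnode", "elasticsearch", replicas=2)
for name in names:
    print(name)
```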


apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-cluster
spec:
  clusterIP: None
  selector:
    app: es-cluster
  ports:
  - name: transport
    port: 9300

$ kubectl apply -f svc-cluster.yaml
service/elasticsearch-cluster created


apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-loadbalancer
spec:
  selector:
    app: es-cluster
  ports:
  - name: http
    port: 80
    targetPort: 9200
  type: LoadBalancer

$ kubectl apply -f svc-loadbalancer.yaml
service/elasticsearch-loadbalancer created


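apiVersion: v1
kind: ConfigMap
metadata:
  name: es-config
data:
  elasticsearch.yml: |
    cluster.name: my-elastic-cluster
    # <setting name missing>: ""
    bootstrap.memory_lock: false
    discovery.zen.ping.unicast.hosts: elasticsearch-cluster  # setting name assumed
    discovery.zen.minimum_master_nodes: 1
    # <setting name missing>: false
    xpack.monitoring.enabled: false
  ES_JAVA_OPTS: -Xms512m -Xmx512m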
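---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: esnode
spec:
  serviceName: elasticsearch
  replicas: 2
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: es-cluster
    spec:
      securityContext:
        fsGroup: 1000
      initContainers:
      # Raise vm.max_map_count on the host before Elasticsearch starts.
      - name: init-sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: true
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
      containers:
      - name: elasticsearch
        # image: <Elasticsearch image reference missing>
        resources:
            requests:  # key assumed; could also be "limits"
                memory: 1Gi
        securityContext:
          privileged: true
          runAsUser: 1000
          capabilities:
            add:
            - IPC_LOCK
            - SYS_RESOURCE
        env:
        - name: ES_JAVA_OPTS
          valueFrom:
              configMapKeyRef:
                  name: es-config
                  key: ES_JAVA_OPTS
        readinessProbe:  # probe type assumed
          httpGet:
            scheme: HTTP
            path: /_cluster/health?local=true
            port: 9200
          initialDelaySeconds: 5
        ports:
        - containerPort: 9200
          name: es-http
        - containerPort: 9300
          name: es-transport
        volumeMounts:
        - name: es-data
          mountPath: /usr/share/elasticsearch/data
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
      volumes:
        - name: elasticsearch-config
          configMap:
            name: es-config
            items:
              - key: elasticsearch.yml
                path: elasticsearch.yml
  volumeClaimTemplates:
  - metadata:
      name: es-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi

$ kubectl apply -f es-sts-deployment.yaml
configmap/es-config created
statefulset.apps/esnode created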

Deploy Elasticsearch via the Rancher UI

If you prefer, import each of the manifests above into your cluster via the Rancher UI. The screenshots below show the process for each of them.

Import svc-cluster.yaml





Import svc-loadbalancer.yaml



Import es-sts-deployment.yaml





Retrieve the Load Balancer IP

You’ll need the address of the load balancer that we deployed. You can retrieve this via kubectl or the UI.

Use the CLI

$ kubectl get svc elasticsearch-loadbalancer
NAME                         TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE
elasticsearch-loadbalancer   LoadBalancer   80:30604/TCP   33m

Use the UI


Test the Cluster

Use the address we retrieved in the previous step to query the cluster for basic information.

$ curl
{
  "name" : "d7bDQcH",
  "cluster_name" : "my-elastic-cluster",
  "cluster_uuid" : "e3JVAkPQTCWxg2vA3Xywgg",
  "version" : {
    "number" : "6.5.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "816e6f6",
    "build_date" : "2018-11-09T18:58:36.352602Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Query the cluster for information about its nodes. The asterisk in the master column highlights the current master node.

$ curl
ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
                    24          97   5    0.05    0.12     0.13 mdi       -      d7bDQcH
                    28          96   4    0.01    0.05     0.04 mdi       *      WEOeEqC
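If you want to pick out the master node from this kind of output in a script, the following sketch parses the table and returns the name of the row whose master column holds an asterisk. The IP addresses in the sample text are made up for this example, and `find_master` is a hypothetical helper.

```python
def find_master(cat_nodes_text):
    """Return the name of the node marked with '*' in the master column."""
    lines = [line for line in cat_nodes_text.strip().splitlines() if line.strip()]
    header = lines[0].split()
    master_col = header.index("master")
    name_col = header.index("name")
    for line in lines[1:]:
        fields = line.split()
        if fields[master_col] == "*":
            return fields[name_col]
    return None


# Sample output with hypothetical IP addresses.
sample = """\
ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.0.0.11           24          97   5    0.05    0.12     0.13 mdi       -      d7bDQcH
10.0.0.12           28          96   4    0.01    0.05     0.04 mdi       *      WEOeEqC
"""
print(find_master(sample))
```

The parser relies on every column having a value, so it would need adjusting for output in which a column can be empty.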

Check the available indices:

$ curl
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size

Because this is a fresh install, it doesn’t have any indices or data. To continue this tutorial, we’ll inject some sample data that we can use later. The files that we’ll use are available from the Elastic website. Download them and then load them with the following commands:

$ curl -H 'Content-Type: application/x-ndjson' -XPOST \
    '' --data-binary @shakespeare_6.0.json
$ curl -H 'Content-Type: application/x-ndjson' -XPOST \
    '' --data-binary @accounts.json
$ curl -H 'Content-Type: application/x-ndjson' -XPOST \
    '' --data-binary @logs.json
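The files loaded above use the newline-delimited JSON layout that the bulk API expects: an action line followed by the document itself, one pair per document, with a trailing newline at the end. The sketch below builds such a body in memory. The account documents are hypothetical, and `build_bulk_body` is a helper written for this example rather than part of any client library.

```python
import json


def build_bulk_body(docs):
    """Build an NDJSON bulk body from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch to index the document that follows.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical account documents, for illustration only.
accounts = [
    ("1", {"account_number": 1, "balance": 1000}),
    ("2", {"account_number": 2, "balance": 2500}),
]
body = build_bulk_body(accounts)
print(body, end="")
```

In this sketch the action lines carry only a document ID, so the target index and type would have to come from the request path.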

When we recheck the indices, we see that we have five new indices with data.

$ curl
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2015.05.20 MFdWJxnsTISH0Z9Vr0aT3g   5   1       4750            0     49.9mb         25.2mb
green  open   logstash-2015.05.18 lLHV2nzvTOG9mzlpKaG9sg   5   1       4631            0     46.5mb         23.5mb
green  open   logstash-2015.05.19 PqNnVUgXTyaDSfmCQZwbLQ   5   1       4624            0     48.2mb         24.2mb
green  open   shakespeare         rwl3xBgmQtm8B3V7GFeTZQ   5   1     111396            0       46mb         23.1mb
green  open   bank                z0wVGsbrSiG2cQwRXwaCOg   5   1       1000            0    949.2kb        474.6kb

Each of these contains a different type of document. For the shakespeare index, we can search for the name of a play. For the logstash-2015.05.19 index we can query and filter data based on an IP address, and for the bank index we can search for information about a particular account.
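The searches described above could be expressed as request bodies for the `_search` endpoint of each index. The sketch below only builds the JSON bodies and does not send them. The field names (`play_name`, `clientip`, `account_number`) and the sample values are assumptions about the sample datasets, so check them against the mappings in your own cluster.

```python
import json


def match_query(field, text):
    """Full-text query: the text is analyzed before matching."""
    return {"query": {"match": {field: text}}}


def term_query(field, value):
    """Exact-value query: the value is matched as-is."""
    return {"query": {"term": {field: value}}}


# Index name -> request body. Field names and values are assumptions.
searches = {
    "shakespeare": match_query("play_name", "Hamlet"),
    "logstash-2015.05.19": match_query("clientip", "192.0.2.10"),
    "bank": term_query("account_number", 20),
}

for index, body in searches.items():
    # Each body would be sent to /<index>/_search on the cluster.
    print(f"/{index}/_search", json.dumps(body))
```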





Elasticsearch is extremely powerful. It is both simple and complex – simple to deploy and use, and complex in the way that it interacts with its data.

This article has shown you the basics of how to deploy it with Rancher and Kubernetes and how to query it via the RESTful API.

If you wish to explore ways to use Elasticsearch in everyday situations, we encourage you to explore the other parts of the ELK stack: Kibana, Logstash, and Beats. These tools round out an Elasticsearch deployment and make it useful for storing, retrieving, and visualizing a broad range of data from systems and applications.

Calin Rus