GlusterFS is a scalable, highly available, distributed network file system widely used by applications that need shared storage, including cloud computing, media streaming, content delivery networks, and web cluster solutions. High availability is ensured by redundant storage: if one node fails, another takes over without service interruption. In this post I’ll show you how to create a GlusterFS cluster for Docker that you can use to store your containers’ data. The storage volume where data resides is replicated across two nodes, so data remains accessible as long as at least one Gluster container is running. We’ll use Rancher for Docker management and orchestration, and to test storage availability and reliability I’ll deploy an Asteroids game.

Prerequisites

Preparing the AWS environment

Before deploying the GlusterFS cluster you need to satisfy the following requirements in AWS:

  • Create an Access Key to use Rancher's AWS provisioning feature. You can get an Access Key from the IAM section of the AWS console.
  • Create a Security Group named Gluster with the following inbound rules:
    • Allow ports 22/tcp, 2376/tcp, and 8080/tcp from any source, needed by Docker Machine to provision hosts
    • Allow ports 500/udp and 4500/udp from any source, needed by the Rancher network
    • Allow ports 9345/tcp and 9346/tcp from any source, needed for UI features like graphs, viewing logs, and executing a shell
    • Allow ports 80/tcp and 443/tcp from any source, needed to publish the Asteroids game
  • Create a RancherOS instance (look for the RancherOS AMI in Community AMIs). Configure it to run Rancher Server by defining the following user data, and associate it with the Gluster Security Group. Once the instance is running you can browse to the Rancher UI: http://RANCHER_INSTANCE_PUBLIC_IP:8080/

#!/bin/bash
docker run -d -p 8080:8080 rancher/server:v0.17.1
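
If you prefer scripting to clicking through the AWS console, the security group rules above can be created with the AWS CLI. This is only a sketch under assumptions: the AWS CLI is installed and configured, the group is named Gluster as above, and VPC_ID must be replaced with your real VPC ID. With DRY_RUN=1 (the default here) it only prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch: generate the AWS CLI calls for the Gluster security group rules.
# DRY_RUN=1 (default) prints each command; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}
VPC_ID=${VPC_ID:-vpc-xxxxxxxx}   # placeholder: your VPC ID

run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }

run aws ec2 create-security-group --group-name Gluster \
    --description "GlusterFS cluster" --vpc-id "$VPC_ID"

# port/protocol pairs from the list above
for rule in 22/tcp 2376/tcp 8080/tcp 500/udp 4500/udp \
            9345/tcp 9346/tcp 80/tcp 443/tcp; do
    port=${rule%/*}
    proto=${rule#*/}
    run aws ec2 authorize-security-group-ingress --group-name Gluster \
        --protocol "$proto" --port "$port" --cidr 0.0.0.0/0
done
```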

Preparing Docker images

I have prepared two Docker images that we will use later. This is how I built them.

The GlusterFS server image

This is the Dockerfile:

FROM ubuntu:14.04

MAINTAINER Manel Martinez <[email protected]>

RUN apt-get update && \
    apt-get install -y python-software-properties software-properties-common
RUN add-apt-repository -y ppa:gluster/glusterfs-3.5 && \
    apt-get update && \
    apt-get install -y glusterfs-server supervisor

RUN mkdir -p /var/log/supervisor

ENV GLUSTER_VOL ranchervol
ENV GLUSTER_REPLICA 2
ENV GLUSTER_BRICK_PATH /gluster_volume
ENV GLUSTER_PEER **ChangeMe**
ENV DEBUG 0

VOLUME ["/gluster_volume"]

RUN mkdir -p /usr/local/bin
ADD ./bin /usr/local/bin
RUN chmod +x /usr/local/bin/*.sh
ADD ./etc/supervisord.conf /etc/supervisor/conf.d/supervisord.conf

CMD ["/usr/local/bin/run.sh"]

As you can see, we use 2 replicas to distribute the Gluster volume ranchervol. All its data is persisted in the Docker volume /gluster_volume. Note that we do not expose any ports because the GlusterFS containers connect through the Rancher network. The run.sh script is as follows:

#!/bin/bash

[ "$DEBUG" == "1" ] && set -x

prepare-gluster.sh &
/usr/bin/supervisord

It invokes another script to prepare the GlusterFS cluster in the background. This is required because Gluster commands can only be executed while the daemon is running. This is the content of the prepare-gluster.sh script:

#!/bin/bash

set -e

[ "$DEBUG" == "1" ] && set -x

if [ "${GLUSTER_PEER}" == "**ChangeMe**" ]; then
   # This node is not connecting to the cluster yet
   exit 0
fi

echo "=> Waiting for glusterd to start..."
sleep 10

if gluster peer status | grep ${GLUSTER_PEER} >/dev/null; then
   echo "=> This peer is already part of Gluster Cluster, nothing to do..."
   exit 0
fi

echo "=> Probing peer ${GLUSTER_PEER}..."
gluster peer probe ${GLUSTER_PEER}

echo "=> Creating GlusterFS volume ${GLUSTER_VOL}..."
my_rancher_ip=`echo ${RANCHER_IP} | awk -F\/ '{print $1}'`
gluster volume create ${GLUSTER_VOL} replica ${GLUSTER_REPLICA} ${my_rancher_ip}:${GLUSTER_BRICK_PATH} ${GLUSTER_PEER}:${GLUSTER_BRICK_PATH} force

echo "=> Starting GlusterFS volume ${GLUSTER_VOL}..."
gluster volume start ${GLUSTER_VOL}

As we can see, if we do not provide the GLUSTER_PEER environment variable the container only starts the GlusterFS daemon and waits for a second peer container to join the cluster. The second container needs to know the GLUSTER_PEER address in order to contact it (peer probe) and create the shared storage volume. This is the supervisor configuration file, needed to start the GlusterFS daemon:

[supervisord]
nodaemon=true

[program:glusterd]
command=/usr/sbin/glusterd -p /var/run/glusterd.pid
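
One fragile spot in prepare-gluster.sh is the fixed `sleep 10` while glusterd starts, which can race on a slow host. A more robust alternative (my own sketch, not part of the published image; wait_for_cmd is a hypothetical helper) is to poll until the daemon answers:

```shell
#!/bin/bash
# Sketch: poll a command until it succeeds or a timeout (in seconds) expires.
# In prepare-gluster.sh, "sleep 10" could become: wait_for_cmd 30 gluster peer status
# (assumes the gluster CLI is on PATH inside the container)
wait_for_cmd() {
    local timeout=$1; shift
    local waited=0
    until "$@" >/dev/null 2>&1; do
        waited=$((waited + 1))
        if [ "$waited" -ge "$timeout" ]; then
            echo "=> Timed out waiting for: $*" >&2
            return 1
        fi
        sleep 1
    done
    echo "=> Ready after ${waited}s: $*"
}
```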

The following commands are required to publish the Docker image:

docker build -t nixel/rancher-glusterfs-server .
docker push nixel/rancher-glusterfs-server

The Asteroids game image

This is the image we will use to publish the Asteroids HTML5 game for testing Gluster's HA capabilities. The container acts as a GlusterFS client that mounts the shared volume where the following game content is stored:

  • Static files (HTML, JS, CSS) needed to open the client-side game in your browser. An Nginx server publishes them to the Internet.
  • A WebSocket server application used to handle user connections and control the game logic. A Node.js service publishes this application to the Internet.

This is the Dockerfile which defines the image:

FROM ubuntu:14.04

MAINTAINER Manel Martinez <[email protected]>

RUN apt-get update && \
    apt-get install -y python-software-properties software-properties-common
RUN add-apt-repository -y ppa:gluster/glusterfs-3.5 && \
    apt-get update && \
    apt-get install -y git nodejs nginx supervisor glusterfs-client dnsutils

ENV GLUSTER_VOL ranchervol
ENV GLUSTER_VOL_PATH /mnt/${GLUSTER_VOL}
ENV GLUSTER_PEER **ChangeMe**
ENV DEBUG 0

ENV HTTP_CLIENT_PORT 80
ENV GAME_SERVER_PORT 443
ENV HTTP_DOCUMENTROOT ${GLUSTER_VOL_PATH}/asteroids/documentroot

EXPOSE ${HTTP_CLIENT_PORT}
EXPOSE ${GAME_SERVER_PORT}

RUN mkdir -p /var/log/supervisor ${GLUSTER_VOL_PATH}
WORKDIR ${GLUSTER_VOL_PATH}

RUN mkdir -p /usr/local/bin
ADD ./bin /usr/local/bin
RUN chmod +x /usr/local/bin/*.sh
ADD ./etc/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
ADD ./etc/nginx/sites-available/asteroids /etc/nginx/sites-available/asteroids

RUN echo "daemon off;" >> /etc/nginx/nginx.conf
RUN rm -f /etc/nginx/sites-enabled/default
RUN ln -fs /etc/nginx/sites-available/asteroids /etc/nginx/sites-enabled/asteroids
RUN perl -p -i -e "s/HTTP_CLIENT_PORT/${HTTP_CLIENT_PORT}/g" /etc/nginx/sites-enabled/asteroids
RUN HTTP_ESCAPED_DOCROOT=`echo ${HTTP_DOCUMENTROOT} | sed "s/\//\\\\\\\\\//g"` && perl -p -i -e "s/HTTP_DOCUMENTROOT/${HTTP_ESCAPED_DOCROOT}/g" /etc/nginx/sites-enabled/asteroids

RUN perl -p -i -e "s/GAME_SERVER_PORT/${GAME_SERVER_PORT}/g" /etc/supervisor/conf.d/supervisord.conf
RUN HTTP_ESCAPED_DOCROOT=`echo ${HTTP_DOCUMENTROOT} | sed "s/\//\\\\\\\\\//g"` && perl -p -i -e "s/HTTP_DOCUMENTROOT/${HTTP_ESCAPED_DOCROOT}/g" /etc/supervisor/conf.d/supervisord.conf

CMD ["/usr/local/bin/run.sh"]
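
A quick aside on the `sed "s/\//\\\\\\\\\//g"` trick in the Dockerfile: the document root path contains slashes, which would break the perl s/// substitution, so each / must first be escaped to \/. A minimal sketch of the same idea (using sed for the final substitution here; the Dockerfile does the equivalent with perl -p -i -e, which needs the extra layer of backslash escaping inside the double-quoted RUN string):

```shell
#!/bin/bash
# Sketch: escape the slashes in a path so it is safe as an s/// replacement.
HTTP_DOCUMENTROOT=/mnt/ranchervol/asteroids/documentroot
ESCAPED=$(echo "$HTTP_DOCUMENTROOT" | sed 's/\//\\\//g')
echo "$ESCAPED"   # prints \/mnt\/ranchervol\/asteroids\/documentroot

# The escaped value can now be substituted into a config template:
echo "root HTTP_DOCUMENTROOT/client/;" | sed "s/HTTP_DOCUMENTROOT/${ESCAPED}/g"
# prints: root /mnt/ranchervol/asteroids/documentroot/client/;
```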

And this is the run.sh script:

#!/bin/bash

set -e

[ "$DEBUG" == "1" ] && set -x && set +e

if [ "${GLUSTER_PEER}" == "**ChangeMe**" ]; then
   echo "ERROR: You did not specify the GLUSTER_PEER environment variable - Exiting..."
   exit 1
fi

ALIVE=0
for PEER in `echo "${GLUSTER_PEER}" | sed "s/,/ /g"`; do
    echo "=> Checking if I can reach GlusterFS node ${PEER} ..."
    if ping -c 10 ${PEER} >/dev/null 2>&1; then
       echo "=> GlusterFS node ${PEER} is alive"
       ALIVE=1
       break
    else
       echo "*** Could not reach server ${PEER} ..."
    fi
done

if [ "$ALIVE" == 0 ]; then
   echo "ERROR: could not contact any GlusterFS node from this list: ${GLUSTER_PEER} - Exiting..."
   exit 1
fi

echo "=> Mounting GlusterFS volume ${GLUSTER_VOL} from GlusterFS node ${PEER} ..."
mount -t glusterfs ${PEER}:/${GLUSTER_VOL} ${GLUSTER_VOL_PATH}

echo "=> Setting up asteroids game..."
if [ ! -d ${HTTP_DOCUMENTROOT} ]; then
   git clone https://github.com/BonsaiDen/NodeGame-Shooter.git ${HTTP_DOCUMENTROOT}
fi

my_public_ip=`dig -4 @ns1.google.com -t txt o-o.myaddr.l.google.com +short | sed "s/\"//g"`
perl -p -i -e "s/HOST = '.*'/HOST = '${my_public_ip}'/g" ${HTTP_DOCUMENTROOT}/client/config.js
perl -p -i -e "s/PORT = .*;/PORT = ${GAME_SERVER_PORT};/g" ${HTTP_DOCUMENTROOT}/client/config.js

/usr/bin/supervisord
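
The peer-selection loop above is easy to exercise in isolation. The following sketch extracts it into a function (first_alive_peer and reachable are my own names, not part of the image) and stubs the reachability check, which in the real container is the ping against each GlusterFS node:

```shell
#!/bin/bash
# Sketch: pick the first reachable peer from a comma-separated list,
# mirroring the failover loop in run.sh.
first_alive_peer() {
    local peer
    for peer in $(echo "$1" | sed "s/,/ /g"); do
        if reachable "$peer"; then
            echo "$peer"
            return 0
        fi
    done
    return 1
}

# Stub: pretend only 10.42.235.105 answers (i.e. gluster01 is down).
reachable() { [ "$1" = "10.42.235.105" ]; }

first_alive_peer "10.42.46.31,10.42.235.105"   # prints 10.42.235.105
```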

As you can see, we use the GLUSTER_PEER environment variable to tell the container which GlusterFS nodes serve the ranchervol volume. Although a GlusterFS client does not need to know about every cluster node, passing the full list lets the Asteroids container mount the volume as long as at least one GlusterFS container is alive. We will prove this HA feature later. In this case we expose ports 80 (Nginx) and 443 (Node.js WebSocket server) so we can open the game in a browser. This is the Nginx configuration file:

server {
    listen HTTP_CLIENT_PORT;
    location / {
        root HTTP_DOCUMENTROOT/client/;
    }
}

And the following supervisord configuration is required to run Nginx and Node.js:

[supervisord]
nodaemon=true

[program:nginx]
command=/usr/sbin/nginx

[program:nodejs]
command=/usr/bin/nodejs HTTP_DOCUMENTROOT/server/server.js GAME_SERVER_PORT

To summarize, the run.sh script downloads the Asteroids source code, saves it on the GlusterFS shared volume, and replaces the required parameters in the configuration files to run Nginx and the Node.js server application. The following commands publish the Docker image:

docker build -t nixel/rancher-glusterfs-client .
docker push nixel/rancher-glusterfs-client

Creating Docker hosts

Now we need to create three Docker hosts: two to run the GlusterFS server containers, and a third to publish the Asteroids game. In the Rancher UI, click the + Add Host button and choose the Amazon EC2 provider. You need to specify, at least, the following information:

  • The host name
  • The Amazon Access Key and Secret Key that you created before.
  • EC2 Region, Zone, and VPC/Subnet ID. Be sure to choose the same region, zone, and VPC/subnet ID where Rancher Server is deployed.
  • The Security Group name that we created before: Gluster.

Repeat this step three times to create the gluster01, gluster02, and asteroids hosts.

Adding GlusterFS server containers

Now you are ready to deploy your GlusterFS cluster. First, click the + Add Container button on the gluster01 host and enter the following information:

  • Name: gluster01
  • Image: nixel/rancher-glusterfs-server:latest

Expand Advanced Options and follow these steps:

  • Volumes section - Add this volume: /gluster_volume:/gluster_volume
  • Networking section - Choose Managed Network on Docker0
  • Security/Host section - Enable Give the container full access to the host checkbox

Now wait for the gluster01 container to be created and copy its Rancher IP address; you will need it in a moment. Then click the + Add Container button on the gluster02 host to create the second GlusterFS server container with the following configuration:

  • Name: gluster02
  • Image: nixel/rancher-glusterfs-server:latest

Expand Advanced Options and follow these steps:

  • Command section - Add an Environment Variable named GLUSTER_PEER whose value is the gluster01 container IP. In my case it is 10.42.46.31
  • Volumes section - Add this volume: /gluster_volume:/gluster_volume
  • Networking section - Choose Managed Network on Docker0
  • Security/Host section - Enable Give the container full access to the host checkbox

Now wait for the gluster02 container to be created, open its menu, and click the View Logs option. You will see messages at the bottom of the log screen confirming that the shared volume was successfully created.

Adding Asteroids container

Now it is time to create our GlusterFS client container, which publishes the Asteroids game to the Internet. Click + Add Container on the asteroids host and enter the following container information:

  • Name: asteroids
  • Image: nixel/rancher-glusterfs-client:latest
  • Port Map: map 80 (public) port to 80 (container) TCP port
  • Port Map: map 443 (public) port to 443 (container) TCP port

Expand Advanced Options and follow these steps:

  • Command section - Add an Environment Variable named GLUSTER_PEER whose value is a comma-separated list of the gluster01 and gluster02 container IPs. In my case: 10.42.46.31,10.42.235.105
  • Networking section - Choose Managed Network on Docker0
  • Security/Host section - Enable Give the container full access to the host checkbox

Note that we are not configuring any container volume, because all data is stored in the GlusterFS cluster. Wait for the asteroids container to be created and view its logs: the volume mount appears at the top, and you will see Nginx and the Node.js application start at the bottom. At this point your Rancher environment is up and running.

Testing GlusterFS HA capabilities

It is time to play and test GlusterFS HA capabilities. What we will do now is stop one GlusterFS container and check that the game suffers no downtime. Browse to http://ASTEROIDS_HOST_PUBLIC_IP to access the Asteroids game, enter your name, and try to explode some asteroids. Go to the Rancher UI and stop the gluster02 container, then open a new browser tab and navigate to the game again. The game is still accessible. Start gluster02 again, stop gluster01, and retry: you can still play. Finally, keep gluster01 stopped, restart the asteroids container, and wait for it to start. As you can see, as long as at least one GlusterFS server container is running, you can play. If you stop both gluster01 and gluster02 the game becomes unavailable, because its public content is no longer reachable. To recover the service, start gluster01 and/or gluster02 again.
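
While stopping and starting the Gluster containers, you can also watch availability from a terminal. This is a sketch of my own, not from the original post: ASTEROIDS_HOST_PUBLIC_IP is a placeholder, and probe wraps curl so the loop logic can also be exercised with a stub.

```shell
#!/bin/bash
# Sketch: poll the game's public URL and print its status once per second.
probe() { curl -fs -o /dev/null "http://${1}/"; }

watch_game() {
    local host=$1 checks=$2 i
    for i in $(seq 1 "$checks"); do
        if probe "$host"; then
            echo "check $i: game is up"
        else
            echo "check $i: game is DOWN"
        fi
        sleep 1
    done
}

# Usage (placeholder host):
# watch_game "$ASTEROIDS_HOST_PUBLIC_IP" 30
```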

Conclusion

Shared storage is a required feature when you deploy software that needs to share information across all nodes. In this post you have seen how to easily deploy a highly available shared storage solution for Rancher based on GlusterFS Docker images. Using an Asteroids game, you have checked that storage remains available while at least one GlusterFS container is running. In future posts we will combine this shared storage solution with the Rancher load balancing feature, added in version 0.16, to build scalable, distributed, and highly available web server solutions ready for production use. To learn more about Rancher, please join us for our next online meetup, where we’ll be demonstrating some of these features and answering your questions.

Manel Martinez is a Linux systems engineer with experience in the design and management of scalable, distributed, and highly available open source web infrastructures based on products like KVM, Docker, Apache, Nginx, Tomcat, JBoss, RabbitMQ, HAProxy, MySQL, and XtraDB. He lives in Spain, and you can find him on Twitter @manel_martinezg.