Reducing Your AWS Spend with AutoSpotting and Rancher

on Dec 7, 2017

Ye Olde Worlde

Back in older times, B.C. as in Before Cloud, to put a service live you had to:

  1. Spend months figuring out how much hardware you needed
  2. Wait at least eight weeks for your hardware to arrive
  3. Allow another four weeks for installation
  4. Then, configure firewall ports
  5. Finally, add servers to config management and provision them


And all of that was in a well-organised company!

The Now

The new norm is to use hosted instances. You can scale these up and down based on requirements and demand. Servers are available in a matter of seconds.

With containers, you no longer care about actual servers. You only care about compute resource. Once you have an orchestrator like Rancher, you don’t need to worry about maintaining scale or setting where containers run, as Rancher takes care of all of that.

Rancher continuously monitors the requirements that you set and does its best to ensure that everything is running. Obviously, some compute resource is still needed underneath, but whether an individual server runs for days or only hours no longer matters. With containers, you pretty much don't need to worry.

Reducing Cost

So, how can we take advantage of the flexibility of containers to help us reduce costs?

There are a couple of things that you can do. Firstly (and this goes for VMs as well as containers), do you need all your environments running all the time? In a world where you owned the kit and there was no cost advantage to shutting down environments versus keeping them running, this practice was accepted. But in the on-demand world, there is a cost associated with keeping things running. If you only utilise a development or testing environment for eight hours a day, then you are paying three times as much as you need to by keeping it running 24 hours a day! So, shutting down environments when you're not using them is one way to reduce costs.
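
To put numbers on that, here's a quick back-of-the-envelope calculation. (The $0.10/hour rate is a made-up figure for illustration, not a real AWS price.)

```python
def monthly_cost(hourly_rate, hours_per_day, days=30):
    """Cost of keeping an environment running hours_per_day each day."""
    return hourly_rate * hours_per_day * days

rate = 0.10  # hypothetical hourly rate for the whole environment, in dollars

always_on = monthly_cost(rate, 24)    # running around the clock
office_hours = monthly_cost(rate, 8)  # shut down outside an 8-hour day

print(f"always on: ${always_on:.2f}/month")
print(f"office hours only: ${office_hours:.2f}/month")
print(f"ratio: {always_on / office_hours:.1f}x")
```

Same environment, same hourly rate: running it around the clock costs three times as much as running it only when it's used.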

The second thing you can do (and the main reason behind this post) is using Spot Instances. Not heard of them? In a nutshell, they’re a way of getting cheap compute resource in AWS. Interested in saving up to 80% of your AWS EC2 bill? Then keep reading.

The challenge with Spot Instances is that they can be terminated with only two minutes' notice. That causes problems for traditional applications, but containers, by their nature, handle this churn with ease.

Within AWS, you can request Spot Instances directly, individually or in a fleet, and you set the maximum price you're willing to pay. If the Spot price rises above your maximum, AWS gives you two minutes' notice and then terminates the instance.
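
As a sketch of how that two-minute warning surfaces on the instance itself: a Spot Instance can poll the EC2 instance metadata service, which returns HTTP 404 until the instance is marked for termination and then returns the termination timestamp. The metadata URL is the real one; the fetch function is injected here so the decision logic can be exercised off-EC2, and the rest should be read as an illustrative sketch.

```python
# EC2 instance metadata endpoint for the Spot two-minute warning.
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"

def check_termination(fetch):
    """fetch(url) -> (status_code, body). Returns the termination
    timestamp once the two-minute notice is up, or None before then."""
    status, body = fetch(TERMINATION_URL)
    return body if status == 200 else None

# Off-EC2, fake responses stand in for the metadata service:
still_running = check_termination(lambda url: (404, ""))
terminating = check_termination(lambda url: (200, "2017-12-07T12:00:00Z"))
```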

AutoSpotting

What if you could have an Auto Scaling Group (ASG) with an On-Demand Instance type to which you could revert if you breached the Spot Instance price? Step forward an awesome piece of open-source software called AutoSpotting. You can find the source and more information on GitHub.

AutoSpotting works by replacing On-Demand Instances from within an ASG with individual Spot Instances. AutoSpotting takes a copy of the launch config of the ASG and starts a Spot Instance (of equivalent spec or more powerful) with the exact same launch config. Once this new instance is up and running, AutoSpotting swaps out one of the On-Demand Instances in the ASG with this new Spot Instance, in the process terminating the more expensive On-Demand Instance. It will continue this process until it replaces all instances. (There is a configuration option that allows you to specify the percentage of the ASG that you want to replace. By default, it’s 100%.)
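
The replacement loop described above boils down to a small piece of arithmetic. This is an illustrative sketch, not AutoSpotting's actual data model; the field names are made up.

```python
def on_demand_to_replace(instances, replace_pct=100):
    """How many on-demand instances in an ASG would still be swapped for
    Spot Instances, given a target replacement percentage (default 100%)."""
    target = len(instances) * replace_pct // 100
    spot_count = sum(1 for i in instances if i["lifecycle"] == "spot")
    return max(target - spot_count, 0)

# An ASG where one of four instances has already been replaced:
asg = [{"lifecycle": "on-demand"}] * 3 + [{"lifecycle": "spot"}]
print(on_demand_to_replace(asg))                  # all four targeted, one done
print(on_demand_to_replace(asg, replace_pct=50))  # only half the ASG targeted
```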

AutoSpotting isn't application aware. It will only start a machine with an identical configuration. It doesn't perform any graceful halting of applications. It purely replaces a virtual instance with a cheaper virtual instance. For these reasons, it works great for Docker containers that are managed by an orchestrator like Rancher. When a compute instance disappears, Rancher takes care of maintaining the scale.

To facilitate a smoother termination, I’ve created a helper service, AWS-Spot-Instance-Helper, that monitors to see if a host is terminating. If it is, then the helper uses the Rancher evacuate function to more gracefully transition running containers from the terminating host. This helper isn’t tied to AutoSpotting, and anyone who is using Spot Instances or fleets with Rancher can use it to allow for more graceful terminations.
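
The helper's behaviour amounts to a watch loop like the following. Both callables are injected: in the real service, check_termination would poll the instance metadata endpoint shown earlier and evacuate_host would invoke the Rancher host evacuate action over its API. They're left abstract here so this stays a testable sketch rather than a definitive implementation.

```python
import time

def watch_and_evacuate(check_termination, evacuate_host, interval=5.0):
    """Poll until a Spot termination notice appears, then trigger a
    Rancher host evacuation so containers drain before the instance dies."""
    while not check_termination():
        time.sleep(interval)
    evacuate_host()

# Simulated run: the termination notice appears on the third poll.
notices = iter([False, False, True])
events = []
watch_and_evacuate(lambda: next(notices), lambda: events.append("evacuated"),
                   interval=0)
print(events)  # ['evacuated']
```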

Want an example of what it does to the cost of running an environment?

[Chart: daily EC2 cost of an environment, before and after enabling AutoSpotting]

Can you guess which day I implemented it? OK, so I said up to 80% savings but, in this environment, we hadn't yet replaced all instances at the point when I took this measurement.

So, why are we blogging about it now? Simple: We’ve taken it and turned it into a Rancher Catalog application so that all our Rancher AWS users can easily consume it.

3 Simple Steps to Saving Money

Step 1

Go to the Catalog > Community and select AutoSpotting.

Step 2

Fill in the AWS Access Key and Secret Key. (These are the only mandatory fields.)

The user must have the following AWS permissions:

  - autoscaling:DescribeAutoScalingGroups
  - autoscaling:DescribeLaunchConfigurations
  - autoscaling:AttachInstances
  - autoscaling:DetachInstances
  - autoscaling:DescribeTags
  - autoscaling:UpdateAutoScalingGroup
  - ec2:CreateTags
  - ec2:DescribeInstances
  - ec2:DescribeRegions
  - ec2:DescribeSpotInstanceRequests
  - ec2:DescribeSpotPriceHistory
  - ec2:RequestSpotInstances
  - ec2:TerminateInstances

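
If you prefer to grant these via an IAM policy document, the list above translates to something like the following. (The Resource element is left wide open here for brevity; tighten it to suit your account.)

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:AttachInstances",
        "autoscaling:DetachInstances",
        "autoscaling:DescribeTags",
        "autoscaling:UpdateAutoScalingGroup",
        "ec2:CreateTags",
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ec2:DescribeSpotInstanceRequests",
        "ec2:DescribeSpotPriceHistory",
        "ec2:RequestSpotInstances",
        "ec2:TerminateInstances"
      ],
      "Resource": "*"
    }
  ]
}
```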
Optionally, set the Tag Name; by default, AutoSpotting looks for spot-enabled. I've slightly modified the original code to allow running multiple AutoSpotting containers in an environment, which lets you apply multiple policies in the same AWS account.

Then, click Launch.

Step 3

Add the tag (user-specified or spot-enabled, with a value of true) to any AWS ASGs on which you want to save money. Cheap (and often more powerful) Spot Instances will gradually replace your instances.

To deploy the AWS-Spot-Instance-Helper service, simply browse to the Catalog > Community and launch the application.

Thanks go out to Cristian Măgherușan-Stanciu and the other contributors for writing such a great piece of open-source software.


About the Author

Chris Urwin works as a field engineer for Rancher Labs based out of the UK, helping our enterprise clients get the most out of Rancher.

