Container platforms should provide a magical experience for developers. Autoscaling stateless applications is one compelling feature that most developers love.
Kubernetes has been able to autoscale both pods and nodes for a while. However, these built-in autoscaling features are quite basic. For example, the Horizontal Pod Autoscaler will only scale pods based on CPU and memory utilisation. Similarly, the Cluster Autoscaler uses a very rough approximation of resource utilisation as its scaling metric.
A number of projects now exist that improve how autoscaling works. In this blog we’ll compare the 11 that I’ve found by searching GitHub, grouped into two categories: cluster (node) autoscalers and pod autoscalers.
The Cluster Autoscaler is the de-facto method for scaling nodes up and down in a Kubernetes cluster. It currently works on AWS, Azure and GCP, and it scales the node count by modifying the cloud provider’s autoscaling group. In the background it constantly checks whether pods on the cluster are failing to schedule because of insufficient resources, and if they are it triggers a scale-up. Effectively, this autoscaler waits until you have exhausted every resource and only then begins to scale. It will also scale down when a certain threshold of resources is idle.
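To make that behaviour concrete, here’s a rough sketch in Python of one iteration of the decision loop. The function and parameter names are mine, not the project’s, and the real autoscaler simulates scheduling in much more detail — this only captures the reactive shape of it: scale up on unschedulable pods, scale down on sustained low utilisation.

```python
def next_node_count(node_count, unschedulable_pods, lowest_utilisation,
                    min_nodes=1, max_nodes=10, scale_down_threshold=0.5):
    """One tick of a Cluster-Autoscaler-style loop (illustrative only)."""
    # Scale up only once pods are already failing to schedule.
    if unschedulable_pods > 0 and node_count < max_nodes:
        return node_count + 1
    # Scale down when the least-utilised node is below the idle threshold.
    if lowest_utilisation < scale_down_threshold and node_count > min_nodes:
        return node_count - 1
    return node_count
```

Note that nothing happens until pods are actually stuck pending — which is exactly the “too late” behaviour the next project tries to fix.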
The kubernetes-ec2-autoscaler is a node-level autoscaler for Kubernetes on AWS EC2 that is designed for batch jobs. The key metric used for scaling is the number of pending scheduled pods. This is more predictable if you know that spinning up a certain number of pods requires a certain number of nodes, due to consistent resource usage on each. It also has a couple of nice features like AWS multi-region support and the ability to safely drain nodes when scaling down.
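For uniform batch workloads, the pending-pod metric translates into a node count very directly. The sketch below is my own illustration of that idea, not code from the project:

```python
import math

def nodes_for_pending(pending_pods, pods_per_node, current_nodes):
    """With uniform batch jobs, pending pods map straight to nodes needed.

    pods_per_node is an assumption that holds when every job has the
    same resource footprint -- the scenario this autoscaler targets.
    """
    extra_nodes = math.ceil(pending_pods / pods_per_node)
    return current_nodes + extra_nodes

# 25 pending jobs, 4 fitting per node, 2 nodes already running:
# nodes_for_pending(25, 4, 2) -> 9
```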
Out of frustration with the Cluster Autoscaler, somebody decided to build a better one specifically for AWS. This became the kube-aws-autoscaler. One frustration is that the Cluster Autoscaler only scales when it’s already too late, i.e. when resources are fully exhausted. It also doesn’t respect AZ placement, so you could end up with a cluster that is unbalanced in terms of fault tolerance. The kube-aws-autoscaler additionally supports scaling multiple ASGs.
Finally, for cluster autoscaling we have the JupyterHub Autoscaler. JupyterHub notebooks are used frequently by data scientists working with large amounts of data.
It’s good to see Kubernetes being used for a wide range of use cases including big data.
One of the major drawbacks of the default Horizontal Pod Autoscaler (HPA) is that it can only scale on CPU and memory utilisation. Luckily, k8s-prom-hpa solves this problem by allowing you to scale your pods based on any metric collected by Prometheus on your cluster. So you could scale your applications based on a combination of system resources, application performance metrics or even business metrics.
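Whatever metric you feed it, the HPA’s core calculation stays the same. The Kubernetes documentation gives the formula as desired = ceil(currentReplicas × currentMetric / targetMetric), which is simple enough to show directly:

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric):
    # HPA scaling formula from the Kubernetes docs:
    # desired = ceil(currentReplicas * currentMetric / targetMetric)
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 replicas averaging 200 requests/s against a 100 requests/s target:
# desired_replicas(4, 200, 100) -> 8
```

The point of k8s-prom-hpa is that current_metric here can be any Prometheus series — requests per second, queue latency, checkout conversions — rather than just CPU or memory.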
Another interesting choice for pod autoscaling is the cluster-proportional-autoscaler. This lets the number of pods proportionately increase or decrease with cluster size. Whereas the Horizontal Pod Autoscaler requires you to pre-configure CPU and memory on all containers, the cluster-proportional-autoscaler doesn’t need this as an input and will calculate the replica count dynamically.
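As I understand the project, its “linear” mode derives replicas from node and core counts: one replica per N nodes or per M cores, whichever gives more. A minimal sketch of that calculation, with parameter names of my choosing:

```python
import math

def proportional_replicas(nodes, cores, nodes_per_replica, cores_per_replica,
                          min_replicas=1):
    """Linear proportional scaling: replicas track cluster size, not load."""
    by_nodes = math.ceil(nodes / nodes_per_replica)
    by_cores = math.ceil(cores / cores_per_replica)
    return max(min_replicas, by_nodes, by_cores)

# A 20-node / 160-core cluster, one replica per 8 nodes or per 64 cores:
# proportional_replicas(20, 160, 8, 64) -> 3
```

This suits cluster-wide services like DNS, where demand grows with the cluster itself rather than with any one application’s traffic.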
Finally for pod autoscaling we have a selection of autoscalers that will scale your pods based on the number of messages on a queue. The ones I found work with AMQP, RabbitMQ and SQS. These seem quite handy for scaling the asynchronous job queues that a lot of web applications rely on.
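The underlying policy in these queue-driven autoscalers generally reduces to sizing workers from queue depth. Here’s a hedged sketch of that idea — the throughput figure and bounds are hypothetical inputs, not values any of these projects prescribe:

```python
import math

def queue_based_replicas(queue_depth, messages_per_pod, min_r=1, max_r=50):
    """Size the worker pool from queue depth (illustrative policy).

    messages_per_pod: how many queued messages one worker should own,
    an assumption you'd tune from observed consumer throughput.
    """
    desired = math.ceil(queue_depth / messages_per_pod)
    return max(min_r, min(desired, max_r))

# 1200 queued messages, each worker handling ~100:
# queue_based_replicas(1200, 100) -> 12
```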
I also found an autoscaler called the kubernetes-webhook-autoscaler, which looks like a useful option for really complex autoscaling tasks. You could build a simple API service outside of Kubernetes and abstract all of your cloud autoscaling logic behind it.
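To illustrate the shape such a service might take: a tiny handler that accepts cluster state as JSON and returns a desired node count. Every field name and the scaling policy below are made up for illustration — the real project defines its own request/response contract.

```python
import json

def handle_scale_request(body: str) -> str:
    """Hypothetical webhook body handler, as it might sit behind an
    HTTP endpoint outside the cluster. Field names are invented."""
    state = json.loads(body)
    pending = state.get("pending_pods", 0)
    nodes = state.get("nodes", 1)
    # Arbitrary custom policy: add one node per 10 pending pods.
    desired = nodes + (pending + 9) // 10 if pending else nodes
    return json.dumps({"desired_nodes": desired})
```

The appeal is that the policy — the bit you actually want to customise — lives in plain application code you control, not in a forked autoscaler.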
That’s it for my research so far. I’m keen to look at other autoscalers, so if you find any please leave me a message or send me the details via the contact page.