In practice, Kubernetes capacity is rarely limited by hardware. In our experience, the configuration of Kubernetes namespaces is more likely to cause ineffective use of hardware capacity through ‘bloat’. The net result is that cluster nodes cannot be driven to high levels of utilisation, which makes it difficult for the Kubernetes scheduler to assign new pod workloads to nodes in the cluster. Additional costs are then incurred adding node capacity that, in turn, cannot be used effectively.
But, before I dive too deep into the detail, a primer on Kubernetes…
Kubernetes is a distributed system for orchestrating the scheduling of application programs across a cluster of hosting nodes. Workloads take the form of pods, each of which has one or more containers associated with it. Application code runs in the containers.
Here are some compelling facts about Kubernetes:
| 69% | The percentage of organisations adopting Kubernetes for managing container workloads |
| 71% | The percentage of Kubernetes users who cite scalability as an essential requirement when evaluating tool options |
| $2.7B | The expected overall market for containers in 2020. This is 3.5 times higher than in 2017 |
It is clear the future of open source Kubernetes is bright as enterprise organisations continue to invest in containerisation technology. The top IT companies including Google, AWS, Oracle, Microsoft and many more are developing capability to support this platform. AWS are promoting their EKS Managed Service to simplify adoption:
Other dominant flavours of Kubernetes include:
AWS Elastic Container Service for Kubernetes:
Microsoft have announced the Azure Kubernetes Service (AKS) for managing the hosted Kubernetes environment:
Part of the appeal of the Kubernetes architecture is the abstraction of containers which allows application code to run independently of operating system guest virtual machines. This allows containers to be freed from the constraints of traditional application deployment and facilitates the rapid and scalable deployment processes ideally suited for DevOps practices.
Architecturally, the Kubernetes platform is structured to achieve a high degree of concurrent processing across many different application workloads that are co-hosted in the cluster of nodes. A simple representation of the Kubernetes topology has several components. The Master server provides scheduling and control functions for the container workloads, whilst the Nodes provide processing capacity in the managed cluster. In the example the distinction between roles is clearly shown.
The capacity of a Kubernetes namespace can be restricted in one of four ways:
CPU and Memory capacity can be limited at the namespace level by configuring ‘Quota’ limits. In theory, the aggregate capacity of all pods across the namespaces could exceed the capacity of the hardware. The use of pods and quota is intended to prevent this situation from arising and to protect other workloads running in the cluster.
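To make the quota point concrete, the following is a minimal sketch of how the sum of namespace quotas might be compared with the physical capacity of the cluster. The `Quota` class, the `overcommit_ratio` function, the namespace names and all figures are hypothetical and chosen for illustration only; they are not taken from a real cluster or from the Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    """Hypothetical record of one namespace's quota (illustrative only)."""
    namespace: str
    cpu_cores: float
    memory_gb: float


def overcommit_ratio(quotas, cluster_cpu_cores, cluster_memory_gb):
    """Return (cpu_ratio, memory_ratio) of total quota to cluster capacity.

    A ratio above 1.0 means the namespaces are, in aggregate, entitled to
    more of that resource than the hardware can supply.
    """
    total_cpu = sum(q.cpu_cores for q in quotas)
    total_mem = sum(q.memory_gb for q in quotas)
    return total_cpu / cluster_cpu_cores, total_mem / cluster_memory_gb


# Assumed example figures: three namespaces on a 96-core, 384 GB cluster.
quotas = [
    Quota("payments", cpu_cores=40, memory_gb=160),
    Quota("search", cpu_cores=32, memory_gb=96),
    Quota("batch", cpu_cores=48, memory_gb=128),
]
cpu_ratio, mem_ratio = overcommit_ratio(
    quotas, cluster_cpu_cores=96, cluster_memory_gb=384
)
print(f"CPU quota / capacity:    {cpu_ratio:.2f}")
print(f"Memory quota / capacity: {mem_ratio:.2f}")
```

In this made-up case the CPU quotas add up to more than the cluster can supply while memory is exactly matched, which is the kind of imbalance a quota review would aim to surface.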
A major consideration for the cluster is the ratio of CPU Cores to Memory GB. The node hardware specification should provide a ratio that closely matches the overall ratio of CPU to Memory usage across the workloads. A poor match suggests that the cluster is not optimised for its workloads and may result in more node capacity being required.
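The effect of a mismatched ratio can be illustrated with some simple arithmetic. The sketch below assumes a fleet of identical nodes and ignores bin-packing fragmentation, system overhead and headroom, so it gives a lower bound rather than a sizing recommendation. The function name and all figures are hypothetical.

```python
import math


def nodes_required(workload_cores, workload_memory_gb, node_cores, node_memory_gb):
    """Estimate the node count for a workload on identical nodes.

    Returns (nodes, nodes_by_cpu, nodes_by_memory). The node count is driven
    by whichever resource runs out first; the other resource is stranded.
    """
    by_cpu = math.ceil(workload_cores / node_cores)
    by_mem = math.ceil(workload_memory_gb / node_memory_gb)
    return max(by_cpu, by_mem), by_cpu, by_mem


# Assumed workload: 64 cores and 512 GB in aggregate, i.e. 8 GB per core.
workload_cores, workload_memory_gb = 64, 512

# Node shape at 4 GB per core: memory is exhausted first, CPU sits idle.
poor_match = nodes_required(workload_cores, workload_memory_gb, 16, 64)

# Node shape at 8 GB per core: both resources are exhausted together.
good_match = nodes_required(workload_cores, workload_memory_gb, 16, 128)

print("4 GB/core nodes:", poor_match)
print("8 GB/core nodes:", good_match)
```

With these assumed figures the poorly matched node shape needs twice as many nodes to host the same workload, with half of the purchased CPU left unused.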
From a technical point of view, a series of factors contribute to capacity ‘bloat’ in Kubernetes. These factors can result in very expensive hosting costs and therefore need to be considered when undertaking namespace sizing. At a high level the technical factors are:
How to drive down your Kubernetes costs.
We operate a 6-phase cost optimisation process to ensure the benefits of Cloud Cost management are realised. The process is agnostic of technology and has been successfully utilised in the context of AWS Cost control of Kubernetes container applications.
The process resolves 4 key problem areas that contribute to over provisioning and high Opex costs:
For a more detailed description of the process download our Guide to Cloud Cost Optimisation in AWS.
The process draws on 6 data inputs. In each case the data is captured and analysed using a proprietary methodology to identify and prioritise suitable optimisation opportunities. Recommendations for optimisation are then taken to the development teams, and a plan of action is agreed and implemented. Later, the optimisation benefits are measured and reviewed.
Underpinning these data and analysis requirements is a central Capacity Management Information System which captures all data entities used and generated in the process.
In a large cost-saving exercise conducted in 2018, Capacitas achieved an 8:1 ROI by working through the rightsizing process with the Development teams and the DevOps teams performing deployment. The client in question had invested heavily in AWS hosting and had built large, costly Kubernetes clusters.
Working with the client’s development teams we achieved several core Kubernetes savings:
| 59% | Kubernetes production and development cluster CPU core capacity identified for reduction. |
| 38% | Kubernetes production and development Memory GB RAM capacity identified for reduction. |
| 2.7K | Total number of oversized CPU Cores identified |
| 3.7K | Total amount of oversized Memory GB identified |
| 67 | Number of AWS EC2 instances to be saved when the sizing recommendations are applied. |
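As an illustration of what ‘oversized’ can mean in figures such as these, the sketch below compares the CPU a workload has requested with its observed peak usage plus a headroom allowance. This is not the proprietary methodology referred to in this article; the `Workload` class, the 20% headroom and all figures are assumptions made purely for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical summary of one workload (illustrative only)."""
    name: str
    requested_cores: float
    peak_used_cores: float


def oversized_cores(workload, headroom=0.2):
    """Cores requested beyond observed peak usage plus a headroom allowance.

    Returns 0.0 when the request is already at or below the target size.
    """
    target = workload.peak_used_cores * (1 + headroom)
    return max(0.0, workload.requested_cores - target)


# Assumed example data, not measurements from a real cluster.
workloads = [
    Workload("api", requested_cores=8.0, peak_used_cores=2.0),
    Workload("worker", requested_cores=4.0, peak_used_cores=3.0),
    Workload("cron", requested_cores=2.0, peak_used_cores=2.0),
]

total_requested = sum(w.requested_cores for w in workloads)
total_oversized = sum(oversized_cores(w) for w in workloads)
reduction_pct = 100 * total_oversized / total_requested

print(f"Oversized cores: {total_oversized:.1f} of {total_requested:.1f}")
print(f"Capacity identified for reduction: {reduction_pct:.0f}%")
```

The same comparison can be made for Memory GB. In practice the choice of measurement window and headroom matters a great deal, which is why recommendations are agreed with the development teams before being applied.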
Relative cost savings broken down by environment:
Kubernetes optimisation is not a straightforward process, and in our experience many Development teams are not yet equipped to deal with the challenge. The result is a capacity provisioning issue at the cluster layer that causes excessive consumption of hosting capacity.
To help organisations like yours control ‘bloat’ associated with Kubernetes, we’ve devised a methodical approach and process to right-sizing. These methods and processes have achieved an 8:1 ROI for some of our customers.
To learn more about our Cloud Cost optimisation service, follow this link.