Metrics Server is a cluster-wide aggregator of resource usage data. Kubernetes introduced the new “metrics-server” as an alpha feature in Kubernetes 1.7, with beta slated for 1.8. The documentation and code are scattered and hard to digest. This is my attempt to collect and summarize them.
The detailed design of the project can be found in the following docs:
Mostly, when we talk about “Kubernetes Metrics” we are interested in node- and container-level metrics: CPU, memory, disk, and network. These are also referred to as the “Core” metrics. “Custom” metrics refer to application metrics, e.g. HTTP request rate.
This document describes the API part of the MVP version of the Resource Metrics API effort in Kubernetes. Once agreement is reached, the document will be extended to cover implementation details as well. The shape of the effort may also change once we have more well-defined use cases.
The goal of the effort is to provide resource usage metrics for pods and nodes through the API server. This will be a stable, versioned API which core Kubernetes components can rely on. In the first version only the well-defined use cases will be handled, although the API should be easily extensible for potential future use cases.
This section describes the well-defined use cases that should be handled in the first version. Use cases not listed below are out of scope for the MVP version of the Resource Metrics API.
HPA uses the latest value of CPU usage, averaged over 1 minute (the window may change in the future). For performance reasons, the data for a given set of pods (defined either by a pod list or a label selector) should be accessible in one request.
The scheduler, in order to schedule best-effort pods, requires node-level resource usage metrics averaged over 1 minute (the window may change in the future). The metrics should be available for all resources supported in the scheduler. Currently, the scheduler does not need this information, because it schedules best-effort pods without considering node usage. However, having the metrics available in the API server is a prerequisite for adding the ability to take node usage into account when scheduling best-effort pods.
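The HPA access pattern can be sketched as follows. This is only an illustration, not the actual API: the `PodCpuSample` record, the function names, and the equality-only selector are assumptions made for the example. It shows why a selector-based lookup lets the caller fetch the whole pod set in one call instead of one call per pod.

```python
from dataclasses import dataclass


@dataclass
class PodCpuSample:
    """Hypothetical record: a pod's CPU usage already averaged over the window."""
    name: str
    labels: dict
    cpu_millicores: float


def matches(labels: dict, selector: dict) -> bool:
    """Equality-based label selector: every selector pair must be present."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pod_cpu(samples: list, selector: dict) -> dict:
    """Return {pod name: cpu} for all matching pods in a single pass."""
    return {s.name: s.cpu_millicores for s in samples if matches(s.labels, selector)}


def average_cpu(usage_by_pod: dict) -> float:
    """Mean CPU across the selected pods; 0.0 if nothing matched."""
    if not usage_by_pod:
        return 0.0
    return sum(usage_by_pod.values()) / len(usage_by_pod)


samples = [
    PodCpuSample("web-1", {"app": "web"}, 200.0),
    PodCpuSample("web-2", {"app": "web"}, 400.0),
    PodCpuSample("db-1", {"app": "db"}, 900.0),
]
web_usage = select_pod_cpu(samples, {"app": "web"})
print(average_cpu(web_usage))  # 300.0, the mean of web-1 and web-2
```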
Resource Metrics API is an effort to provide a first-class Kubernetes API (stable, versioned, discoverable, available through API server and with client support) that serves resource usage metrics for pods and nodes. The use cases were discussed and the API was proposed a while ago in another proposal. This document describes the architecture and the design of the second part of this effort: making the mentioned API available in the same way as the other Kubernetes APIs.
We want to collect up to 10 metrics from each pod and node running in a cluster. Starting with Kubernetes 1.6 we support 5000-node clusters with 30 pods per node. Assuming we want to collect metrics with 1-minute granularity, this means:
10 x 5000 x 30 / 60 = 25000 metrics per second on average
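The estimate above can be reproduced with a few lines of arithmetic. The function names are made up for the example. Note that the formula counts pod metrics only; the second function adds the node-level series, which does not change the order of magnitude.

```python
def pod_metrics_per_second(metrics_per_object, nodes, pods_per_node, interval_seconds):
    """Average write rate produced by pod-level metrics alone."""
    pods = nodes * pods_per_node
    return metrics_per_object * pods / interval_seconds


def node_metrics_per_second(metrics_per_object, nodes, interval_seconds):
    """Average write rate produced by node-level metrics alone."""
    return metrics_per_object * nodes / interval_seconds


pod_rate = pod_metrics_per_second(10, 5000, 30, 60)
node_rate = node_metrics_per_second(10, 5000, 60)
print(pod_rate)              # 25000.0
print(pod_rate + node_rate)  # roughly 833 more per second once nodes are included
```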
The Kubernetes API server persists all Kubernetes resources in its key-value store, etcd, which is not able to handle such a load. On the other hand, metrics change frequently, are temporary, and if they are lost we can collect them again during the next housekeeping operation. We will therefore store them in memory. This means that we can’t reuse the main API server; instead we will introduce a new one: metrics server.
The API has already been implemented in Heapster, but users and Kubernetes components can only access it through the master proxy mechanism and have to decode it on their own. Heapster serves the API using the Go HTTP library, which doesn’t offer a number of features provided by Kubernetes API servers, such as authorization/authentication or client generation. There is also a prototype of Heapster using the generic API server library.
The API is in alpha and there is a plan to graduate it to beta (and later to GA), but it’s out of the scope of this document.
In order to make metrics server available to users in exactly the same way as the regular Kubernetes API, we need a mechanism that redirects requests for the /apis/metrics endpoint from the API server to metrics server. The solution to this problem is kube-aggregator. The effort is on track to be completed for the Kubernetes 1.7 release. Previously, metrics server was blocked on this dependency.
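The redirection idea can be pictured as a lookup from API group to backend. The sketch below is a toy model and not how kube-aggregator is implemented: the `Aggregator` class, the backend names, and the `v1alpha1` path segment are assumptions made for illustration.

```python
def parse_api_group(path: str):
    """Return the API group from a path like /apis/<group>/..., else None."""
    parts = [part for part in path.split("/") if part]
    if len(parts) >= 2 and parts[0] == "apis":
        return parts[1]
    return None


class Aggregator:
    """Toy router: delegate registered API groups, serve everything else locally."""

    def __init__(self, default_backend: str):
        self.default_backend = default_backend
        self.delegates = {}  # API group -> backend name

    def register(self, group: str, backend: str) -> None:
        self.delegates[group] = backend

    def route(self, path: str) -> str:
        # Unregistered groups and non-/apis paths fall through to the default.
        return self.delegates.get(parse_api_group(path), self.default_backend)


aggregator = Aggregator(default_backend="kube-apiserver")
aggregator.register("metrics", "metrics-server")
print(aggregator.route("/apis/metrics/v1alpha1/nodes"))  # metrics-server
print(aggregator.route("/api/v1/pods"))                  # kube-apiserver
```

From the client's point of view both requests go to the same address; only the component that answers differs.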
Metrics server will be implemented in line with the Kubernetes monitoring architecture and is inspired by Heapster. It will be a cluster-level component that periodically scrapes metrics from all Kubernetes nodes, served by Kubelet through the Summary API. The metrics will be aggregated, stored in memory (see Scalability limitations) and served in the Metrics API format.
Metrics server will use the API server library to implement HTTP server functionality. The library offers common Kubernetes functionality such as authorization/authentication, versioning, and support for an auto-generated client. To store data in memory we will replace the default storage layer (etcd) with an in-memory store that implements the Storage interface.
Only the most recent value of each metric will be remembered. If users need access to historical data, they should either use a third-party monitoring solution or archive the metrics on their own (more details in the mentioned vision).
Since the metrics are stored in memory, all data are lost once the component is restarted. This is acceptable behavior because the newest metrics will be collected shortly after the restart, though we will try to minimize the likelihood of restarts (see also Deployment).
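The aggregation step can be sketched as summing container usage into per-pod totals. This is a minimal illustration that assumes a Summary API response shaped as a list of pods, each with a `podRef` and a list of containers carrying `cpu.usageNanoCores` and `memory.workingSetBytes`. The function name and the output keys are invented for the example.

```python
def pod_usage_from_summary(summary: dict) -> dict:
    """Sum container CPU and memory into one entry per (namespace, pod name)."""
    result = {}
    for pod in summary.get("pods", []):
        ref = pod["podRef"]
        key = (ref["namespace"], ref["name"])
        cpu_nanocores = 0
        memory_bytes = 0
        for container in pod.get("containers", []):
            # Missing fields count as zero rather than failing the whole scrape.
            cpu_nanocores += container.get("cpu", {}).get("usageNanoCores", 0)
            memory_bytes += container.get("memory", {}).get("workingSetBytes", 0)
        result[key] = {"cpu_nanocores": cpu_nanocores, "memory_bytes": memory_bytes}
    return result


summary = {
    "pods": [
        {
            "podRef": {"namespace": "default", "name": "web-1"},
            "containers": [
                {"name": "app", "cpu": {"usageNanoCores": 100}, "memory": {"workingSetBytes": 1000}},
                {"name": "sidecar", "cpu": {"usageNanoCores": 250}, "memory": {"workingSetBytes": 2000}},
            ],
        }
    ]
}
print(pod_usage_from_summary(summary))
# {('default', 'web-1'): {'cpu_nanocores': 350, 'memory_bytes': 3000}}
```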
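A store that remembers only the newest value per metric might look like the sketch below. It is a simplified stand-in and not the Storage interface the design refers to: the `LatestValueStore` class and its `put`/`get` methods are hypothetical.

```python
import threading


class LatestValueStore:
    """Keeps one (timestamp, value) pair per key; older samples are discarded."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = {}

    def put(self, key, timestamp, value):
        """Store the sample unless a newer one is already held for this key."""
        with self._lock:
            current = self._latest.get(key)
            if current is None or timestamp >= current[0]:
                self._latest[key] = (timestamp, value)

    def get(self, key):
        """Return the newest (timestamp, value) pair, or None if unknown."""
        with self._lock:
            return self._latest.get(key)


store = LatestValueStore()
store.put("default/web-1/cpu", 100, 350)
store.put("default/web-1/cpu", 160, 420)
store.put("default/web-1/cpu", 130, 999)  # late, out-of-order sample is ignored
print(store.get("default/web-1/cpu"))     # (160, 420)
```

Memory use stays proportional to the number of metrics rather than to how long the server has been running, and a freshly created store is empty, which is the state the server starts from after a restart.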
Since metrics server is a prerequisite for a number of Kubernetes components (HPA, scheduler, kubectl top), it will run by default in all Kubernetes clusters. Metrics server initiates connections to nodes. For security reasons (our policy allows connections only in the opposite direction), it therefore has to run on a user’s node.
There will be only one instance of metrics server running in each cluster. In order to handle a high volume of metrics, metrics server will be vertically autoscaled by addon-resizer. We will measure its resource usage characteristics. Our experience from profiling Heapster shows that it scales vertically effectively. If we hit performance limits we will consider scaling it horizontally, though that is rather complicated and out of the scope of this doc.
Metrics server will be a Kubernetes addon, created by the kube-up script and managed by addon-manager. Since there are a number of dependent components, it will be marked as a critical addon. In the future, when the priority/preemption feature is introduced, we will migrate to that mechanism for marking it as a high-priority system component.
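Vertical scaling by cluster size can be pictured with a linear rule: a fixed base plus an increment per node. The sketch below assumes that rule, and every number in it is made up for illustration. It does not reflect measured metrics server requirements or the actual addon-resizer configuration.

```python
def scaled_resources(nodes, base_cpu_m, cpu_m_per_node, base_mem_mib, mem_mib_per_node):
    """Linear sizing: base allocation plus a per-node increment (assumed rule)."""
    return {
        "cpu_millicores": base_cpu_m + cpu_m_per_node * nodes,
        "memory_mib": base_mem_mib + mem_mib_per_node * nodes,
    }


# Hypothetical coefficients, chosen only to show the shape of the calculation.
for node_count in (10, 500, 5000):
    print(node_count, scaled_resources(node_count, 40, 0.5, 40, 4))
```

With a single instance, the per-node coefficients decide how large the pod must grow on the biggest supported cluster, which is why measuring the resource usage characteristics matters.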