kube-state-metrics

kube-state-metrics is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects.  It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods.

kube-state-metrics generates metrics from Kubernetes API objects without modification. This ensures that the features it provides have the same grade of stability as the Kubernetes API objects themselves. In turn, this means that in certain situations kube-state-metrics may not show exactly the same values as kubectl, because kubectl applies certain heuristics to display comprehensible messages. kube-state-metrics exposes raw, unmodified data from the Kubernetes API, so users have all the data they require and can apply heuristics as they see fit.

The metrics are exported through the Prometheus golang client on the HTTP endpoint /metrics on the listening port (default 80). They are served either as plaintext or protobuf depending on the Accept header. They are designed to be consumed either by Prometheus itself or by a scraper that is compatible with scraping a Prometheus client endpoint.
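To make the plaintext format concrete, here is a minimal sketch of parsing a scrape result in Python. The sample text, the deployment name "web" and the helper parse_metrics are assumptions made for illustration; the parser is simplified and is not a replacement for a real Prometheus client library.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus plaintext output into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:  # ignore anything this simplified parser cannot read
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical scrape output for a single deployment (assumed for this example).
SAMPLE_TEXT = """\
# HELP kube_deployment_spec_replicas Number of desired pods for a deployment.
# TYPE kube_deployment_spec_replicas gauge
kube_deployment_spec_replicas{namespace="default",deployment="web"} 4
# TYPE kube_deployment_status_replicas_available gauge
kube_deployment_status_replicas_available{namespace="default",deployment="web"} 3
"""

samples = parse_metrics(SAMPLE_TEXT)
for name, labels, value in samples:
    print(name, labels["deployment"], value)
```

In a real setup the text would come from an HTTP GET against the /metrics endpoint, with the Accept header selecting plaintext.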

Resource recommendation

Resource usage changes with the size of the cluster. As a general rule, you should allocate:

  • 200MiB memory
  • 0.1 cores

For clusters of more than 100 nodes, allocate at least:

  • 2MiB memory per node
  • 0.001 cores per node
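
As a rough illustration, the rule of thumb above can be written as a small Python function. The function name recommended_resources is hypothetical, and the sketch assumes that the per-node figures replace the baseline once a cluster exceeds 100 nodes. CPU is expressed in millicores (0.1 cores = 100 millicores) to avoid floating-point rounding.

```python
def recommended_resources(node_count):
    """Return (memory_mib, cpu_millicores) for a cluster of node_count nodes."""
    base_memory_mib = 200      # 200MiB baseline
    base_cpu_millicores = 100  # 0.1 cores baseline

    if node_count <= 100:
        return base_memory_mib, base_cpu_millicores

    # Above 100 nodes: at least 2MiB and 0.001 cores (1 millicore) per node.
    memory_mib = max(base_memory_mib, 2 * node_count)
    cpu_millicores = max(base_cpu_millicores, 1 * node_count)
    return memory_mib, cpu_millicores


for nodes in (50, 100, 500):
    memory, cpu = recommended_resources(nodes)
    print(f"{nodes} nodes: {memory}MiB memory, {cpu}m CPU")
```

Treat the result as a starting point and adjust it to the usage you actually observe in your cluster.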

How does Sysdig Monitor use kube-state-metrics?

Getting started with Kubernetes object state monitoring is easy. The Sysdig agent automatically polls the Kubernetes API for kube-state-metrics and makes them available for analysis, correlation, and alerting in the Sysdig Monitor UI. At the moment, Sysdig Monitor features 25 new metrics on the Kubernetes object state, and this number will grow as more objects and resources become available upstream. Five out-of-the-box groupings help to organize your infrastructure in meaningful hierarchies and provide infrastructure overviews that simplify drill-down to the data you want to see. Plus, we’ve set up 13 default dashboards categorized by Kubernetes object to give you flexibility in what you view from the kube-state-metrics output and how you view it. Getting started takes zero effort: just click on your desired Kubernetes object and we’ll display the relevant choices. In addition, you can build your own views that combine kube-state-metrics with other metrics in the same dashboard.

Answering key Kubernetes questions with kube-state-metrics

So you may be asking, “What do I get with this object state information?” There are a lot of new scenarios you will now be able to explore. When paired with resource consumption metrics, kube-state-metrics help you identify whether the condition of your cluster is having an impact on application behavior. Here are examples of questions you’ll be able to answer:

Pods:

  • Are there enough available pods compared to the desired pods?
  • How many pods are in available status, ready to serve requests?
  • Is there enough capacity to serve pod requests?
  • How many pods are waiting to be scheduled?
  • How many container restarts occurred within a pod?
  • How many CPU cores and how much memory has been requested by the containers in a pod?

Deployments:

  • Does each deployment have sufficient resources?
  • How many pods are running per deployment? How many are desired?
  • Does each deployment have sufficient available pods?
  • How many have been updated?

Namespaces:

  • How many namespaces exist?
  • What is the number of services, deployments, replicaSets, or jobs per namespace?

Nodes:

  • How many nodes are ready?
  • Is there enough capacity to serve the pods running on the nodes?
  • How many nodes are unavailable? How many nodes are out of disk space?
  • What are the pod resources of a node that are available for scheduling?
  • What is the allocatable capacity vs. requested capacity on the node?
  • How many nodes have memory, disk, or network pressure?

ReplicaSets:

  • How many pods are in a replicaSet?
  • How many pods per replicaSet are ready?
  • What is the desired number of pods per replicaSet?

ReplicationControllers:

  • How many pods are in a replication controller?
  • How many pods per ReplicationController are ready?
  • What is the number of desired pods per ReplicationController?

DaemonSets:

  • How many nodes are running a daemonSet?
  • How many nodes should be running a daemonSet and how many should not?

Jobs:

  • What are the running jobs?
  • What is the maximum or desired number of concurrent jobs?
  • How many jobs are actively running?
  • How many have failed?
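
To show how a few of these questions map onto raw metrics, here is a minimal Python sketch. The metric names follow upstream kube-state-metrics naming (kube_deployment_spec_replicas, kube_deployment_status_replicas_available, kube_node_status_condition, kube_pod_status_phase); the sample values, object names and the helper total are assumptions made for this example, and the metrics available to you may differ by version.

```python
# Hypothetical samples as (metric name, labels, value) tuples.
samples = [
    ("kube_deployment_spec_replicas", {"deployment": "web"}, 4.0),
    ("kube_deployment_status_replicas_available", {"deployment": "web"}, 3.0),
    ("kube_node_status_condition",
     {"node": "node-1", "condition": "Ready", "status": "true"}, 1.0),
    ("kube_node_status_condition",
     {"node": "node-2", "condition": "Ready", "status": "true"}, 0.0),
    ("kube_pod_status_phase", {"pod": "web-1", "phase": "Pending"}, 1.0),
    ("kube_pod_status_phase", {"pod": "web-2", "phase": "Pending"}, 0.0),
]


def total(samples, name, **wanted):
    """Sum the values of all samples with this name and these label values."""
    return sum(
        value
        for sample_name, labels, value in samples
        if sample_name == name
        and all(labels.get(key) == val for key, val in wanted.items())
    )


# Are there enough available pods compared to the desired pods?
desired = total(samples, "kube_deployment_spec_replicas", deployment="web")
available = total(samples, "kube_deployment_status_replicas_available", deployment="web")
missing_pods = desired - available

# How many nodes are ready?
ready_nodes = total(samples, "kube_node_status_condition", condition="Ready", status="true")

# How many pods are pending (an approximation of "waiting to be scheduled")?
pending_pods = total(samples, "kube_pod_status_phase", phase="Pending")

print(missing_pods, ready_nodes, pending_pods)
```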

Alerting with kube-state-metrics

Along with the new dashboards and groupings, we also provide 16 new pre-configured alerts. As a result, you can be notified on a wide range of conditions related to the state of Kubernetes objects. Here are a couple of great Kubernetes alerting examples for common conditions you may encounter:

Pods available are less than desired: Let’s say you’ve built a Java app and have specified a policy of pods desired = 4 per deployment. You can set an alert to be notified if, for a period of time (say, 10 minutes), the number of pods in the deployment drops and remains below 4. In this state, the danger is that your app performance is degraded or that the app is not running at the redundancy required. By receiving a notification, you are able to proactively investigate what is happening and resolve the issue before it severely impacts your users’ experience. Sysdig’s adaptive alerting will automatically extend this alert to new deployments as they come online, eliminating the need to manually assign new alerts as your environment changes.

Deployments with no available pods: What if you’ve defined a deployment, but end up with no pods running and available – meaning your app is not serving requests? Getting alerted on this condition means you can spring into action to find and resolve the issue. Is it a cluster issue? Is it a resource issue? Is it a scheduler error? Using the metrics and information available within Sysdig Monitor you can troubleshoot, pinpoint and solve the problem quickly.
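
The logic behind both alerts can be sketched in a few lines of Python. This is only an illustration of a "condition held for the whole window" check, not how Sysdig Monitor evaluates alerts; the function sustained, the reading layout (timestamp in seconds, available pods, desired pods) and the sample data are all assumptions.

```python
def sustained(readings, predicate, window_seconds):
    """Return True if predicate held for every reading in the trailing window.

    readings is a list of (timestamp, available, desired) tuples sorted by
    timestamp. The check fails if the history is shorter than the window.
    """
    if not readings:
        return False
    window_start = readings[-1][0] - window_seconds
    if readings[0][0] > window_start:
        return False  # not enough history to cover the full window
    recent = [r for r in readings if r[0] >= window_start]
    return all(predicate(r) for r in recent)


def below_desired(reading):
    _, available, desired = reading
    return available < desired


def no_available_pods(reading):
    _, available, desired = reading
    return desired > 0 and available == 0


# Hypothetical readings, one per minute for 10 minutes: 3 of 4 pods available.
degraded = [(t, 3, 4) for t in range(0, 601, 60)]
# Same history, but the deployment recovers in the latest reading.
recovered = degraded[:-1] + [(600, 4, 4)]

print(sustained(degraded, below_desired, 600))      # alert should fire
print(sustained(recovered, below_desired, 600))     # alert should not fire
print(sustained(degraded, no_available_pods, 600))  # pods exist, no alert
```

Requiring the condition to hold across the whole window keeps a brief dip during a rolling update from triggering a notification.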
