Hamdi (KHELIL) LION
🧠 Demystifying Metrics in Kubernetes

Kubernetes does not magically know when to scale your workloads 🤖
It relies on metrics exposed through dedicated APIs to make scaling decisions.

There are three types of metrics you need to understand:

⚙️ Resource Metrics
📊 Custom Metrics
🌍 External Metrics

Each one represents a different kind of pressure on your system.

🤔 Why Metrics Matter for Autoscaling

Autoscaling in Kubernetes is mainly handled by the Horizontal Pod Autoscaler (HPA).

The HPA keeps asking:

“Hey… are my Pods struggling?” 😅

The answer comes from metrics. Without them, Kubernetes is basically guessing.

โš™๏ธ 1 Resource Metrics = Pod Health Signals

These are the native Kubernetes metrics.

They come from the Metrics Server and only cover:

🧮 CPU usage
🧠 Memory usage

They are exposed via:

metrics.k8s.io

🧩 What they represent

They describe resource consumption, not business traffic.

Your app might be slow because of a database… but CPU could still be chill 🧊

📄 Example HPA based on CPU

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

If average CPU across Pods goes above 70 percent → more replicas 🔥
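One detail worth remembering: a Utilization target is measured against each container's CPU request, so the target Deployment must declare requests or the HPA has nothing to compare against. A minimal sketch (the image name and resource values are illustrative):

```yaml
# Illustrative Deployment fragment: an HPA with averageUtilization: 70
# measures usage against this CPU request (70% of 250m = 175m per Pod).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: my-api:latest   # placeholder image
        resources:
          requests:
            cpu: 250m          # the baseline the HPA scales against
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```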

✅ Pros

Super simple
Works out of the box

❌ Limits

CPU usage does not always track user traffic
Memory often reacts too late

📊 2. Custom Metrics = Your App Talking to Kubernetes

Custom metrics come from applications inside your cluster.

They describe business or application load 💼

They are exposed through:

custom.metrics.k8s.io

🔁 Typical data flow

App exposes /metrics
Prometheus scrapes
Prometheus Adapter maps metrics → Kubernetes API
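The mapping step is driven by the Adapter's rules config. A hedged sketch of a single rule — the Prometheus series name and labels are assumptions about what your app exposes:

```yaml
# Illustrative Prometheus Adapter rule: derives the per-Pod metric
# http_requests_per_second from the counter http_requests_total.
rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^(.*)_total$"
    as: "${1}_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```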

Now Kubernetes can ask:

“How busy is this Deployment really?” 👀

๐ŸŒ Example 1 HTTP requests per second

Metric in Prometheus:

http_requests_per_second

🧠 What it means
Real traffic handled by each Pod

📦 In Kubernetes
A metric attached to Pods

metrics:
- type: Pods
  pods:
    metric:
      name: http_requests_per_second
    target:
      type: AverageValue
      averageValue: 200

If each Pod handles more than 200 requests per second → scale out 🚀

โฑ Example 2 Request duration

Metric:

request_duration_seconds

🧠 What it means
Application performance and saturation

Used as an Object metric (here the adapter exposes a derived average named avg_request_duration):

- type: Object
  object:
    metric:
      name: avg_request_duration
    describedObject:
      apiVersion: apps/v1
      kind: Deployment
      name: api
    target:
      type: Value
      value: 0.5

If average latency goes above 500 ms → time to add Pods 🏃‍♂️

🧵 Example 3: Active background jobs

Metric:

active_background_jobs

🧠 What it means
Internal workload of a worker

Each Pod reports its own load, and the HPA scales when workers are overloaded 📈
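A hedged sketch of the HPA side, assuming the app exposes active_background_jobs through the custom metrics API (the threshold is illustrative):

```yaml
# Illustrative Pods metric: add workers when the average Pod
# is juggling more than 30 background jobs.
metrics:
- type: Pods
  pods:
    metric:
      name: active_background_jobs
    target:
      type: AverageValue
      averageValue: "30"
```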

✅ Pros

Scaling reflects real app behavior
Way smarter than CPU alone

❌ Limits

Requires Prometheus + Prometheus Adapter
More components to maintain 😬

๐ŸŒ 3 External Metrics = Work Waiting Outside the Cluster

External metrics come from systems outside Kubernetes.

They are exposed via:

external.metrics.k8s.io

These metrics describe work your Pods must process, even if it lives elsewhere 🌎

📬 Example 1: SQS queue length

Metric:

ApproximateNumberOfMessagesVisible

🧠 What it means
Number of messages waiting in the queue

In Kubernetes
A global metric not tied to specific Pods

If the queue grows → spawn more workers 💪
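With an external metrics provider in place (KEDA or a CloudWatch adapter), the HPA side could look like this sketch — the metric name, selector label, and target are all assumptions depending on your provider:

```yaml
# Illustrative External metric: aim for ~10 visible messages per worker Pod.
metrics:
- type: External
  external:
    metric:
      name: sqs_approximate_number_of_messages_visible  # assumed adapter name
      selector:
        matchLabels:
          queue: my-queue   # illustrative label exposed by the provider
    target:
      type: AverageValue
      averageValue: "10"
```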

๐Ÿ˜ Example 2 Kafka consumer lag

Metric:

kafka_consumer_lag

🧠 What it means
Delay between producers and consumers

More lag = your consumers are falling behind 😱
Scale them up!

📦 Example 3: Redis job queue size

Metric:

redis_list_length

🧠 What it means
Number of jobs waiting in Redis queues

Perfect for worker autoscaling 🔄

โฐ Example 4 Time based scaling

Scale more Pods during office hours
Scale down at night 🌙

This is also treated as an external signal, because it’s not tied to Pod resource usage.
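One concrete way to express time-based scaling is KEDA's cron scaler. A hedged sketch — the timezone, schedule, and replica count are illustrative:

```yaml
# Illustrative KEDA cron trigger: hold 10 replicas during office hours,
# fall back to the ScaledObject's minReplicaCount outside the window.
triggers:
- type: cron
  metadata:
    timezone: Europe/Paris     # assumed timezone
    start: 0 8 * * 1-5         # 08:00, Monday to Friday
    end: 0 19 * * 1-5          # 19:00, Monday to Friday
    desiredReplicas: "10"
```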

🧮 How HPA Uses These Metrics

HPA periodically queries:

| Metric Type | API                     | What it measures     |
|-------------|-------------------------|----------------------|
| Resource    | metrics.k8s.io          | CPU and memory       |
| Custom      | custom.metrics.k8s.io   | App-level load       |
| External    | external.metrics.k8s.io | Event or system load |

Then it calculates how many replicas you need using the standard formula: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue) 📏

⚡ KEDA vs Prometheus Adapter

Here comes the game changer 🎮

KEDA (Kubernetes Event-driven Autoscaling) focuses on event-driven autoscaling and makes custom and external metrics far easier to use.

🧩 With Prometheus Adapter

You must:

Run Prometheus
Install and configure the Adapter
Write mapping rules
Manage RBAC and certs

It works, but it’s heavy 🏋️

⚡ With KEDA

You define a ScaledObject and KEDA does the magic ✨

Example SQS scaler:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.eu-west-1.amazonaws.com/123/my-queue
      queueLength: "10"
      awsRegion: eu-west-1

KEDA fetches the metric
Exposes it to Kubernetes
Creates and manages the HPA
Handles provider auth 🔐

All without Prometheus Adapter 🤯
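For the auth part, KEDA typically pairs the ScaledObject with a TriggerAuthentication resource. A sketch assuming AWS credentials live in a Secret named aws-credentials (both names are illustrative):

```yaml
# Illustrative TriggerAuthentication: pulls AWS credentials from a Secret
# so the aws-sqs-queue trigger can poll the queue.
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-auth
spec:
  secretTargetRef:
  - parameter: awsAccessKeyID
    name: aws-credentials       # assumed Secret name
    key: AWS_ACCESS_KEY_ID
  - parameter: awsSecretAccessKey
    name: aws-credentials
    key: AWS_SECRET_ACCESS_KEY
```

The trigger then points at it with an authenticationRef naming this resource.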

🧭 When to Use What

| If you want to scale on… | Use…                    |
|--------------------------|-------------------------|
| CPU or memory            | Resource metrics        |
| HTTP traffic or app load | Custom metrics          |
| Queues, streams, SaaS    | External metrics + KEDA |
| Events or scale to zero  | KEDA                    |

🎯 Final Takeaway

Kubernetes becomes truly powerful when scaling is driven by real workload signals, not just CPU.

โš™๏ธ Resource metrics are the starting point
๐Ÿ“Š Custom metrics bring application awareness
๐ŸŒ External metrics unlock event driven architectures
โšก KEDA makes advanced autoscaling simple and production friendly

Once you understand these three metric types, autoscaling stops being magic and becomes a design tool you control 💡🚀

Happy clustering :)
