How to Monitor Kubernetes Workloads with Prometheus and Grafana

Are you using Kubernetes to manage your containerized workloads? If so, you know the importance of monitoring your applications and infrastructure to ensure uptime, performance, and reliability. But with so many components and metrics to keep track of, how can you possibly stay on top of it all? The answer: Prometheus and Grafana.

In this article, we'll explore how to set up Prometheus and Grafana to monitor your Kubernetes workloads. We'll cover everything from installation and configuration to visualization and alerting. Let's get started!

What is Prometheus?

Prometheus is an open-source monitoring system that scrapes metrics over HTTP from configured targets, including Kubernetes components and workloads, and stores them in a time-series database. Its query language, PromQL, lets you filter, aggregate, and compute rates over those series to gain insight into your systems' behavior. Prometheus also evaluates alert rules and hands firing alerts to Alertmanager, which notifies you of issues or anomalies.
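
To make the query side more concrete, here is a minimal Python sketch of how a client might talk to Prometheus's HTTP API. The /api/v1/query endpoint and the response shape follow the Prometheus API, but the base URL, the helper names, and the sample response are assumptions made up for illustration, so treat this as a sketch rather than a reference client.

```python
import json
from urllib.parse import urlencode


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against /api/v1/query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict:
    """Map each series' sorted label pairs to its value from an instant-vector response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    series = {}
    for item in data["result"]:
        labels = tuple(sorted(item["metric"].items()))
        _timestamp, value = item["value"]  # the API returns the value as a string
        series[labels] = float(value)
    return series


# Hand-written sample in the shape of an API response (not real cluster output).
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "kubelet", "instance": "node-a:10250"},
                "value": [1700000000.0, "1"],
            }
        ],
    },
})

# Assumed base URL: a Prometheus server reachable on localhost:9090.
url = build_instant_query_url("http://localhost:9090", 'up{job="kubelet"}')
series = parse_instant_vector(SAMPLE_RESPONSE)
print(url)
print(series)
```

In a real script you would fetch the URL with an HTTP client and pass the response body to parse_instant_vector.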

What is Grafana?

Grafana is an open-source visualization and dashboarding tool. It can connect to various data sources, including Prometheus, and create beautiful, informative visualizations. Grafana's dashboards are highly customizable and can display metrics in real-time, making it an excellent tool for monitoring.

Installing Prometheus and Grafana

Before we get started with Prometheus and Grafana, we need to have a Kubernetes cluster up and running. If you don't have one yet, check out our article on Kubernetes basics.

Once you have a Kubernetes cluster, you can install Prometheus and Grafana using Helm. Helm is a package manager for Kubernetes that allows you to deploy pre-configured applications with ease.

Here's how to install Prometheus and Grafana using Helm:

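  1. Create a namespace for the Prometheus and Grafana components:
kubectl create namespace monitoring
  2. Add the Prometheus and Grafana Helm chart repositories:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
  3. Install Prometheus using the kube-prometheus-stack chart. This chart bundles its own Grafana by default; because we install Grafana separately in the next step, the bundled copy is disabled here:
helm install prometheus prometheus-community/kube-prometheus-stack --set grafana.enabled=false --namespace monitoring
  4. Install Grafana using the Helm chart:
helm install grafana grafana/grafana --set persistence.enabled=true --namespace monitoring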
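
After the install, it helps to confirm that the pods in the monitoring namespace have come up. The sketch below assumes you have saved the output of "kubectl get pods --namespace monitoring -o json" and want to list pods that are not yet ready. The function name and the sample pod names are hypothetical; the JSON fields (items, status.phase, status.containerStatuses) are the standard Pod fields.

```python
import json
from typing import List


def pods_not_ready(pod_list_json: str) -> List[str]:
    """Return names of pods that are not Running with every container ready."""
    pods = json.loads(pod_list_json)["items"]
    not_ready = []
    for pod in pods:
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        all_ready = bool(containers) and all(c.get("ready", False) for c in containers)
        # Note: completed Job pods (phase "Succeeded") are also reported here.
        if status.get("phase") != "Running" or not all_ready:
            not_ready.append(pod["metadata"]["name"])
    return not_ready


# Hand-written sample with hypothetical pod names (not real cluster output).
SAMPLE_PODS = json.dumps({
    "items": [
        {
            "metadata": {"name": "prometheus-example-0"},
            "status": {"phase": "Running", "containerStatuses": [{"ready": True}]},
        },
        {
            "metadata": {"name": "grafana-example-abc"},
            "status": {"phase": "Pending", "containerStatuses": [{"ready": False}]},
        },
    ]
})

waiting = pods_not_ready(SAMPLE_PODS)
print(waiting)
```

An empty list means every pod is running and ready, which is a reasonable signal that you can move on to configuration.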

Configuring Prometheus and Grafana

Once you have Prometheus and Grafana installed, it's time to configure them to monitor your Kubernetes workloads. Here's how to do it:

Prometheus Configuration

  1. Access the Prometheus dashboard by running the following command:
kubectl --namespace monitoring port-forward svc/prometheus-kube-prometheus-prometheus 9090
  2. Open your web browser and go to http://localhost:9090. You should see the Prometheus dashboard.

  3. Open the "Status" menu in the top navigation bar and select "Targets" (named "Target health" in newer Prometheus versions). You should see a list of Kubernetes-related targets that Prometheus is scraping. If you don't see any targets, you may need to wait a few minutes for service discovery and the first scrapes to complete.

  4. Expand a target group to inspect each endpoint's state, labels, and last scrape error. The endpoint links point at in-cluster addresses, so they usually won't open from your browser; to explore the metrics a target exposes, switch to the query page and filter by its job label, for example up{job="kubelet"}.
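
The same information shown on the Targets page is available from the Prometheus HTTP API at /api/v1/targets, which is handy for scripted health checks. The following Python sketch summarizes unhealthy targets from such a response. The field names (activeTargets, health, scrapeUrl, lastError) follow the API; the function name and the sample data are illustrative assumptions.

```python
import json
from typing import List, Tuple


def unhealthy_targets(body: str) -> List[Tuple[str, str, str]]:
    """Return (job, scrape URL, last error) for every active target that is not up."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("targets request failed")
    problems = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            job = target.get("labels", {}).get("job", "<unknown>")
            problems.append((job, target.get("scrapeUrl", ""), target.get("lastError", "")))
    return problems


# Hand-written sample in the shape of /api/v1/targets (not real cluster output).
SAMPLE_TARGETS = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "kubelet", "instance": "node-a:10250"},
                "scrapeUrl": "https://node-a:10250/metrics",
                "lastError": "",
                "health": "up",
            },
            {
                "labels": {"job": "node-exporter", "instance": "node-b:9100"},
                "scrapeUrl": "http://node-b:9100/metrics",
                "lastError": "connection refused",
                "health": "down",
            },
        ],
        "droppedTargets": [],
    },
})

problems = unhealthy_targets(SAMPLE_TARGETS)
for job, url, error in problems:
    print(f"{job}: {url} ({error})")
```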

Grafana Configuration

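  1. Access the Grafana dashboard by running the following command (the chart's service listens on port 80, which is forwarded to local port 3000):
kubectl --namespace monitoring port-forward svc/grafana 3000:80
  2. Open your web browser and go to http://localhost:3000. You should see the Grafana login page.

  3. Log in to Grafana with the username admin. The Grafana Helm chart generates a random admin password and stores it in a secret named after the release; you can retrieve it with:
kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode

  4. Click on the "Configuration" menu on the left-hand side and select "Data Sources" (in newer Grafana versions, "Connections" and then "Data sources").

  5. Click on the "Add data source" button and select "Prometheus".

  6. Enter the Prometheus server URL. Assuming the service name and namespace used earlier in this article, this would be http://prometheus-kube-prometheus-prometheus.monitoring.svc:9090. The remaining settings can usually stay at their defaults.

  7. Click on the "Save and Test" button to verify that Grafana can connect to Prometheus.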
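
If you prefer to script this step, the same data source settings can be expressed as the JSON body that Grafana's HTTP API accepts on POST /api/datasources. The sketch below only builds that payload. The in-cluster URL is an assumption derived from the service name and namespace used in this article, and the helper name is hypothetical.

```python
import json


def prometheus_datasource_payload(name: str, url: str, is_default: bool = True) -> dict:
    """Build the JSON body for creating a Prometheus data source in Grafana."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",  # Grafana's backend makes the requests, not the browser
        "isDefault": is_default,
    }


# Assumed in-cluster address of the Prometheus service installed earlier.
PROMETHEUS_URL = "http://prometheus-kube-prometheus-prometheus.monitoring.svc:9090"

payload = prometheus_datasource_payload("Prometheus", PROMETHEUS_URL)
print(json.dumps(payload, indent=2))
```

You would send this body with an authenticated request to your Grafana instance; authentication is left out of the sketch on purpose.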

Visualizing Metrics with Grafana

Now that you have Prometheus and Grafana configured, it's time to start visualizing your metrics. Here are a few examples of how you can do that:

Kubernetes Dashboard

The Kubernetes dashboard provides an overview of your Kubernetes cluster and its components. To add the Kubernetes dashboard in Grafana:

  1. Open the "Dashboards" page and choose "Import" (in older Grafana versions, the "+" menu in the left-hand sidebar has an "Import" entry).

  2. Enter the following URL in the grafana.com dashboard field and click "Load":

https://grafana.com/grafana/dashboards/315

  3. Select your Prometheus data source when prompted, then click on the "Import" button to import the dashboard.

Node Metrics Dashboard

The Node Metrics dashboard provides an overview of the resource usage of your Kubernetes nodes. To add the Node Metrics dashboard in Grafana:

  1. Open the "Dashboards" page and choose "Import" (in older Grafana versions, the "+" menu in the left-hand sidebar has an "Import" entry).

  2. Enter the following URL in the grafana.com dashboard field and click "Load":

https://grafana.com/grafana/dashboards/1860

  3. Select your Prometheus data source when prompted, then click on the "Import" button to import the dashboard.

Container Metrics Dashboard

The Container Metrics dashboard provides an overview of the resource usage of your Kubernetes containers. To add the Container Metrics dashboard in Grafana:

  1. Open the "Dashboards" page and choose "Import" (in older Grafana versions, the "+" menu in the left-hand sidebar has an "Import" entry).

  2. Enter the following URL in the grafana.com dashboard field and click "Load":

https://grafana.com/grafana/dashboards/7249

  3. Select your Prometheus data source when prompted, then click on the "Import" button to import the dashboard.
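
Grafana's import dialog also accepts the numeric dashboard ID instead of the full URL. As a small illustration, this Python sketch extracts the IDs from the three URLs used above. The helper name is hypothetical, and the URL layout it expects (/grafana/dashboards/<id>, optionally followed by a slug) is an assumption based on those URLs.

```python
from urllib.parse import urlparse


def dashboard_id(url: str) -> int:
    """Extract the numeric dashboard ID from a grafana.com dashboard URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[:2] != ["grafana", "dashboards"]:
        raise ValueError(f"not a grafana.com dashboard URL: {url}")
    # Some URLs carry a slug after the ID, e.g. "1860-some-name".
    return int(parts[2].split("-")[0])


DASHBOARD_URLS = [
    "https://grafana.com/grafana/dashboards/315",
    "https://grafana.com/grafana/dashboards/1860",
    "https://grafana.com/grafana/dashboards/7249",
]

ids = [dashboard_id(u) for u in DASHBOARD_URLS]
print(ids)
```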

Alerting with Prometheus and Grafana

Now that you have your metrics visualized in Grafana, it's time to set up alerting with Prometheus. Here's how to do it:

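  1. Create an alert rule for Prometheus by saving the following PrometheusRule resource in a file named prometheus-kube-prometheus-prometheus-rules.yaml:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-prometheus-rules
  labels:
    # Assumes the Helm release is named "prometheus". By default, the chart's
    # Prometheus only loads rules that carry its release label.
    release: prometheus
spec:
  groups:
  - name: my-alert-rules
    rules:
    - alert: HighPodCPUUsage
      # container!="" skips the pod-level aggregate series so usage is not counted twice.
      expr: sum(rate(container_cpu_usage_seconds_total{namespace="default", container!=""}[1m])) by (pod) > 1
      for: 5m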

This rule will trigger an alert if any pod in the "default" namespace has a CPU usage of more than 1 core for more than 5 minutes.
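
To see how the expression and the "for" clause interact, here is a small Python simulation. It is a simplified sketch, not Prometheus's actual implementation: simple_rate ignores counter resets and extrapolation, and alert_states only models the inactive, pending, and firing states. All names and sample values are made up for illustration.

```python
from typing import List, Tuple


def simple_rate(samples: List[Tuple[float, float]]) -> float:
    """Per-second increase between the first and last sample in a window.

    Simplification: real rate() also handles counter resets and extrapolation.
    """
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def alert_states(
    evaluations: List[Tuple[float, float]], threshold: float, for_seconds: float
) -> List[Tuple[float, str]]:
    """Walk through (timestamp, value) evaluations and report the alert state at each one."""
    states = []
    active_since = None
    for ts, value in evaluations:
        if value <= threshold:
            active_since = None  # condition broke, so the "for" timer resets
            states.append((ts, "inactive"))
            continue
        if active_since is None:
            active_since = ts
        held_long_enough = ts - active_since >= for_seconds
        states.append((ts, "firing" if held_long_enough else "pending"))
    return states


# A counter that grows by 90 CPU-seconds over 60 seconds means 1.5 cores in use.
cores = simple_rate([(0, 100.0), (60, 190.0)])

# One evaluation per minute; usage climbs above 1 core at t=60 and stays there.
evaluations = [(0, 0.4), (60, 1.5), (120, 1.5), (180, 1.6), (240, 1.4), (300, 1.2), (360, 1.3)]
states = alert_states(evaluations, threshold=1.0, for_seconds=300)

print(cores)
for ts, state in states:
    print(ts, state)
```

The alert stays pending while the condition has held for less than five minutes and only fires once the full duration has passed, which is what keeps short CPU spikes from paging anyone.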

  2. Apply the rule to your Kubernetes cluster:
kubectl apply -f prometheus-kube-prometheus-prometheus-rules.yaml --namespace monitoring
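  3. Optionally, create a notification channel in Grafana for alerts that Grafana itself evaluates by going to the "Alerting" menu on the left-hand side and selecting "Notification Channels" (called "Contact points" in Grafana 9 and later). Note that alerts raised by the Prometheus rule above are routed by Alertmanager, not by Grafana.

  4. Click on the "New channel" button and select the appropriate channel type (e.g., email, Slack, PagerDuty).

  5. Enter the necessary information for the notification channel (e.g., email address, Slack webhook URL).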

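  6. Configure Alertmanager to deliver the Prometheus alerts. The Alertmanager custom resource does not accept route or receivers fields directly; with the kube-prometheus-stack chart, one way to set them is through the chart's alertmanager.config value. Save the following in a file named alertmanager-values.yaml:

alertmanager:
  config:
    route:
      receiver: my-notification-channel
      group_by: ['alertname', 'namespace']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 1h
    receivers:
    # Kept on the assumption that the chart's default route tree still refers
    # to a receiver named 'null' (as it does in recent chart versions).
    - name: 'null'
    - name: my-notification-channel
      slack_configs:
      - api_url: https://hooks.slack.com/services/TOKEN/SECRET
        channel: '#alerts'
        send_resolved: true

This configuration groups alerts by alert name and namespace and sends a notification to the specified Slack channel whenever a group fires, with a follow-up message when the alerts resolve.

  7. Apply the configuration by upgrading the Helm release:
helm upgrade prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --reuse-values -f alertmanager-values.yaml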
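
The group_by setting in the configuration above decides which alerts are bundled into a single notification. The Python sketch below illustrates the idea with made-up alerts. It is a simplified model of the grouping step only (no timing, inhibition, or silences), and the function name and label values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_alerts(
    alerts: List[Dict[str, str]], group_by: List[str]
) -> Dict[Tuple[str, ...], List[Dict[str, str]]]:
    """Bucket alerts by the values of the group_by labels.

    Alerts that lack one of the labels are grouped under an empty value for it.
    """
    groups = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Hypothetical firing alerts, shown as plain label sets.
ALERTS = [
    {"alertname": "HighPodCPUUsage", "namespace": "default", "pod": "web-1"},
    {"alertname": "HighPodCPUUsage", "namespace": "default", "pod": "web-2"},
    {"alertname": "HighPodCPUUsage", "namespace": "staging", "pod": "api-1"},
]

groups = group_alerts(ALERTS, ["alertname", "namespace"])
for key, members in groups.items():
    print(key, [a["pod"] for a in members])
```

Here the two alerts from the default namespace end up in one group, so they would arrive as a single Slack message rather than two.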

Wrapping Up

Congratulations! You've now set up Prometheus and Grafana to monitor your Kubernetes workloads. You can visualize your metrics using Grafana, create alerting rules with Prometheus, and receive notifications through your chosen communication channels. With this setup, you can be confident that your applications and infrastructure are running smoothly and reliably.

If you're looking for more tips and best practices for Kubernetes management, check out our other articles on kctl.dev. We'll help you take your Kubernetes game to the next level!

Written by AI researcher, Haskell Ruska, PhD (haskellr@mit.edu). Scientific Journal of AI 2023, Peer Reviewed