How to Monitor Node Resources with `kubectl top nodes`
Learn how to use `kubectl top nodes` to monitor Kubernetes node CPU and memory usage, troubleshoot performance issues, and optimize cluster resource allocation.
For many engineers, kubectl top nodes fails on first use with the unhelpful error Metrics API not available. The command is not self-contained; it relies on the Kubernetes Metrics Server being installed, reachable, and authorized. This dependency is easy to miss and is a frequent source of confusion when basic observability appears broken. This guide explains how kubectl top nodes works, how to use it effectively for performance and capacity troubleshooting, and how to resolve the installation, configuration, and RBAC issues that commonly prevent it from returning metrics.
Unified Cloud Orchestration for Kubernetes
Manage Kubernetes at scale through a single, enterprise-ready platform.
Key takeaways:
- kubectl top nodes is your first-response tool for real-time resource checks: Use this command for an immediate snapshot of CPU and memory usage to quickly identify overloaded nodes. Remember that it depends on the Metrics Server and displays actual consumption, not the resource requests or limits defined in your manifests.
- Production monitoring requires historical data and alerting: While useful for ad-hoc diagnostics, kubectl top nodes lacks the historical context needed for trend analysis and has no built-in alerting. To maintain reliability, you need a system that can track metrics over time and proactively notify you of issues.
- A unified dashboard is essential for managing Kubernetes at scale: Relying on command-line tools across multiple clusters is inefficient and insecure. Plural provides a centralized, SSO-integrated dashboard that offers fleet-wide visibility and simplifies monitoring without complex network configurations or credential management.
What is kubectl top nodes?
kubectl top nodes is a CLI command that shows the current CPU and memory usage of every node in a Kubernetes cluster. It provides a real-time snapshot of node-level resource consumption and is typically the fastest way to answer questions like which node is under the highest load or whether a capacity issue is emerging. Conceptually, it is similar to top or htop, but aggregated at the Kubernetes node abstraction rather than the operating system process level.
The command is intentionally simple and point-in-time. It does not store history or trends, which makes it unsuitable for long-term analysis but extremely effective for live debugging, quick health checks, and initial performance investigations. In platforms like Plural, it is often the first signal engineers use before pivoting to deeper observability tools.
Node resource monitoring explained
Node resource monitoring is the practice of tracking how much CPU and memory each worker node is actively consuming. When you run kubectl top nodes, you are performing a basic but highly actionable form of this monitoring. The output reflects how much of each node’s available capacity is currently in use, which is essential for diagnosing performance bottlenecks, identifying uneven scheduling, and deciding when to scale your cluster.
Understanding node-level usage helps platform teams reason about pod placement, autoscaling behavior, and failure risk. A single saturated node can degrade multiple workloads, even if the rest of the cluster appears healthy.
Metrics Server dependency
kubectl top nodes does not work on its own. It depends entirely on the Metrics API, which is implemented by the Kubernetes Metrics Server. Metrics Server runs as a cluster add-on, scraping CPU and memory usage from each node’s Kubelet and exposing that data via the Kubernetes API.
When you execute kubectl top nodes, kubectl is simply querying this API. If Metrics Server is missing, misconfigured, or blocked by RBAC, the command fails with Metrics API not available. Installing and correctly configuring Metrics Server is therefore a hard prerequisite for any use of kubectl top, including autoscalers like HPA.
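Before relying on kubectl top, you can confirm that the Metrics API is actually registered and responding. The helper below is a minimal sketch, not a kubectl built-in (the function name is our own); it performs a direct GET against the API group that Metrics Server serves:

```shell
# metrics_api_available: hypothetical helper that succeeds (exit 0) only if
# the metrics.k8s.io API group is registered with the API server and responding.
metrics_api_available() {
  kubectl get --raw /apis/metrics.k8s.io/v1beta1 >/dev/null 2>&1
}

# Example usage (uncomment to run against a live cluster):
# metrics_api_available && echo "Metrics API is up" || echo "Metrics API not available"
```

Because `kubectl get --raw` hits the API server directly, a failure here means every `kubectl top` invocation will fail for the same reason.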
Clearing up common misconceptions
A frequent misunderstanding is confusing the values shown by kubectl top nodes with pod resource requests and limits. The command reports actual, real-time resource usage. Requests and limits, by contrast, are scheduling and enforcement mechanisms that describe what a pod is guaranteed or capped at, not what it is currently consuming.
This distinction matters in practice. A node may appear lightly loaded in kubectl top nodes even if it is “full” from a scheduler perspective, because many pods request more resources than they actively use. kubectl top nodes tells you what is happening now; allocation-focused views tell you what Kubernetes has reserved. Both are necessary, but they answer different questions.
How to use the kubectl top nodes command
The kubectl top node command is a direct way to inspect the current CPU and memory consumption of nodes in your cluster. It pulls real-time resource usage data from the Metrics Server, providing a quick snapshot of which nodes are under the most pressure. This command is essential for initial triage when investigating performance issues or verifying resource distribution. While comprehensive tools like Plural's multi-cluster dashboard provide fleet-wide visibility, kubectl top node offers an immediate, command-line-based view without the overhead of a complex UI.
Understanding how to execute this command and interpret its output is a fundamental skill for anyone managing a Kubernetes environment. It helps you answer critical questions on the fly: Is a specific node overloaded? Is there enough capacity to schedule new pods? Are resources being consumed as expected? The following sections break down the command’s syntax, output, and the important distinction between actual usage and allocatable resources.
Basic syntax and execution
Executing the command is straightforward. To get a resource usage summary for every node in your cluster, run the following:
```shell
kubectl top node
```
This returns a list of all nodes along with their current CPU and memory usage. If you need to investigate a specific node, you can append its name to the command. Replace NODE_NAME with the actual name of the target node:
```shell
kubectl top node NODE_NAME
```
This targeted approach is useful for focusing on a node that has been flagged by an alert or is exhibiting performance problems. It’s a simple yet powerful way to begin any resource-related investigation.
How to read the output
The output of kubectl top node is presented in a simple table with five columns:
- NAME: The name of the Kubernetes node.
- CPU(cores): The absolute amount of CPU being used by the node, measured in millicores (m). For example, 100m represents 0.1 CPU cores.
- CPU%: The percentage of the node's total allocatable CPU that is currently in use.
- MEMORY(bytes): The absolute amount of memory being used, typically measured in mebibytes (Mi).
- MEMORY%: The percentage of the node's total allocatable memory that is currently in use.
Understanding these columns allows you to quickly assess the load on each node. High percentages in the CPU% or MEMORY% columns indicate that a node is approaching its capacity for scheduling new workloads.
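As a sketch of how this output can be consumed in a script, the snippet below runs awk over a captured sample of `kubectl top nodes` output (the node names and figures are invented) and flags any node whose CPU% or MEMORY% exceeds 80:

```shell
# Flag nodes above an 80% CPU or memory threshold. The heredoc stands in for
# real `kubectl top nodes` output; on a live cluster you would pipe the
# command itself into awk instead.
awk 'NR > 1 {
  cpu = $3; mem = $5
  sub(/%/, "", cpu); sub(/%/, "", mem)   # strip the % signs for comparison
  if (cpu + 0 > 80 || mem + 0 > 80)
    print $1 " is hot: CPU " $3 ", memory " $5
}' <<'EOF'
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-a    250m         6%     1939Mi          12%
node-b    3805m        95%    14208Mi         88%
EOF
```

This prints a warning only for node-b, the node above threshold on both axes.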
Decoding CPU and memory metrics vs. allocatable resources
A common point of confusion is the difference between the metrics shown by kubectl top node and the resource requests or limits defined in your pod specifications. The kubectl top node command displays the actual, real-time resource consumption of the node. This includes all running processes, not just your containerized workloads.
The percentages for CPU and memory are calculated based on the node's allocatable resources, not its total capacity. Allocatable resources represent the total capacity of the node minus the resources reserved for the operating system and Kubernetes system daemons like the kubelet. This distinction is critical because it reflects the amount of resources available for Kubernetes to schedule pods. The metrics you see are a direct measurement of current usage against this available pool.
Key flags to enhance kubectl top nodes
The basic kubectl top nodes command provides a valuable snapshot, but its real power for targeted analysis comes from its optional flags. These flags allow you to refine the output to get the exact information you need, turning a general overview into a precise diagnostic tool. By filtering, sorting, and adjusting the data displayed, you can significantly speed up troubleshooting and resource planning tasks directly from your command line.
Filter nodes with selectors and labels
In a large Kubernetes environment, viewing metrics for every single node can be overwhelming. To focus your analysis, you can filter the output using node labels. The -l or --selector flag lets you display only the nodes that match a specific label query. For example, if you want to see resource usage exclusively for your production nodes, you can run kubectl top node -l environment=production. This command is essential for isolating performance issues within specific segments of your infrastructure, such as a particular availability zone or instance type. This level of targeted visibility helps you manage complex deployments more effectively, a core principle we extend across entire fleets in Plural's built-in multi-cluster dashboard.
Display node capacity with --show-capacity
By default, kubectl top nodes shows allocatable resources—the amount of CPU and memory available for your pods to consume. This is useful, but it doesn’t tell the whole story. To see the node's total physical resources, you can add the --show-capacity flag. This reveals the full capacity before the system reserves resources for the OS and Kubernetes components like the kubelet. Comparing allocatable resources to total capacity helps you understand the resource overhead on each node. This information is critical for accurate capacity planning and ensuring your nodes are provisioned correctly to handle both system and application workloads without contention.
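To make the capacity-versus-allocatable distinction concrete, here is a small arithmetic sketch with invented numbers for a 4-core node. The system reserves a slice of CPU, and the percentages kubectl top reports are computed against what remains:

```shell
# Illustrative figures only: a 4-core node (4000m) with 80m reserved for the
# OS, kubelet, and system daemons. --show-capacity would display the 4000m
# figure; the default view computes CPU% against the 3920m allocatable pool.
capacity_m=4000
reserved_m=80
allocatable_m=$((capacity_m - reserved_m))
used_m=1960            # current usage as reported by the Metrics Server

echo "allocatable: ${allocatable_m}m"
echo "CPU% = $((used_m * 100 / allocatable_m))%"   # percentage of allocatable, not capacity
```

With these numbers, 1960m of usage shows as 50% even though it is slightly under half of the node's raw capacity.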
Sort and format output for better analysis
When you need to quickly identify which nodes are under the most strain, sorting the output is key. The --sort-by flag allows you to order the list of nodes by either cpu or memory usage. Running kubectl top nodes --sort-by=cpu immediately brings the nodes with the highest CPU consumption to the top, helping you pinpoint potential bottlenecks in seconds. For scripting or piping the output to other tools, you can use the --no-headers flag to produce a clean, header-less list. While these flags are excellent for quick command-line checks, a persistent and visual interface like the one provided by Plural CD offers a more robust way to track and analyze these metrics over time across all your clusters.
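As a sketch of how these flags combine in scripts, the pipeline below simulates header-less `kubectl top nodes --no-headers` output with a heredoc (node names and figures are invented) and orders it by the CPU column, highest first, much like `--sort-by=cpu` does:

```shell
# Sort simulated header-less node metrics by the CPU(cores) column, highest
# first. `sort -n` reads the numeric prefix of values like "3805m", so the
# millicore suffix does not interfere.
sort -k2,2 -nr <<'EOF'
node-a 250m 6% 1939Mi 12%
node-b 3805m 95% 14208Mi 88%
node-c 1100m 27% 5021Mi 31%
EOF
```

On a live cluster the equivalent is `kubectl top nodes --no-headers --sort-by=cpu`, whose output you can pipe straight into head, awk, or an alerting script.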
kubectl top nodes vs. other monitoring tools
While kubectl top nodes is an essential command for any engineer working with Kubernetes, it's important to understand its role within the broader landscape of monitoring tools. It’s a powerful diagnostic instrument, but it’s not a complete observability solution. Think of it as a stethoscope for a quick check-up, whereas a full monitoring platform is the equivalent of a complete medical imaging lab. Knowing when to use each tool is key to efficiently managing your clusters and ensuring their health and performance. The primary distinction comes down to the difference between a built-in, immediate-feedback tool and a comprehensive, external system designed for deep analysis and long-term data retention.
Built-in vs. external monitoring
The kubectl top command is a built-in utility that gives you immediate insights into the resource usage of your Kubernetes nodes. Its main advantage is its simplicity and accessibility; as long as the Metrics Server is running in your cluster, the command is ready to go. There's no complex setup or additional software to manage.
External monitoring solutions, such as Prometheus, Grafana, or Datadog, offer a much more comprehensive feature set. These platforms are designed to collect, store, and visualize metrics over time. They require a more involved setup process but provide critical capabilities like customizable dashboards, sophisticated alerting mechanisms, and the ability to correlate metrics across different systems. For teams managing production workloads, these external tools are not optional—they are a necessity for maintaining reliability.
Real-time snapshots vs. historical data
The kubectl top command offers a real-time snapshot of resource usage, making it an excellent tool for quick troubleshooting and on-the-spot analysis. If you need to know which node is consuming the most CPU right now, it's the fastest way to get an answer. However, its biggest limitation is that it does not retain historical data. The information is ephemeral, disappearing as soon as you run the command again.
This lack of history makes it unsuitable for long-term performance tracking or capacity planning. To understand resource trends, identify recurring performance issues, or forecast future needs, you need a system that stores metrics over time. This is where external monitoring platforms excel. They use time-series databases to build a detailed history of your cluster's performance, enabling you to analyze trends and make data-driven decisions.
When to use kubectl top nodes vs. a full monitoring platform
The kubectl top command is not intended to replace a comprehensive monitoring solution. It’s best used for tactical, in-the-moment checks. Use it when you need to quickly answer questions like, "Is a specific node under heavy load?" or "Did our recent deployment cause a spike in memory usage?" It’s your go-to for immediate, ad-hoc diagnostics.
For anything more strategic, a dedicated monitoring platform is required. You need a full platform for tasks like setting up alerts for high resource utilization, creating dashboards to visualize fleet-wide health, or analyzing historical data to plan for future growth. For teams managing multiple clusters, a tool like Plural's built-in multi-cluster dashboard becomes essential. It provides a unified, real-time view across your entire fleet from a single interface, giving you the immediate visibility of kubectl top but at the scale required for enterprise operations.
Practical use cases for kubectl top nodes
While kubectl top nodes provides a simple, real-time snapshot, its applications are critical for day-to-day Kubernetes operations. It serves as a first-response tool for identifying resource pressure, validating cluster configuration, and ensuring overall stability. For platform engineers and SREs, mastering its use cases means quicker problem resolution and more efficient cluster management. It's the command you run to get an immediate pulse on your cluster's health before diving into more complex diagnostic tools.
Think of it as the initial check-up. When an application team reports slowness, or an alert fires, running kubectl top nodes gives you an instant, high-level overview of where resource contention might be occurring. This allows you to quickly narrow your focus from the entire cluster down to a specific set of nodes, saving valuable time during an incident. This immediate feedback loop is essential for maintaining responsive and reliable systems, forming the foundation of a solid operational workflow.
Troubleshoot performance bottlenecks
The kubectl top node command is essential for diagnosing performance issues within your Kubernetes cluster. When applications behave sluggishly or pods enter a crash loop, one of the first things to check is whether the underlying nodes have sufficient resources. By providing real-time data on CPU and memory usage for each node, the command allows you to immediately identify nodes that are under heavy load or approaching their capacity limits.
For example, if you see a node consistently running at over 90% CPU or memory, it’s a clear indicator of a performance bottleneck. This immediate visibility helps you pinpoint the source of the problem. Your next step would be to inspect the pods running on that specific node to find the resource-hungry culprit, making your troubleshooting process far more efficient.
Verify capacity planning and autoscaling
Effective capacity planning is crucial for balancing performance and cost. Using kubectl top node is a quick way to assess if your current node provisioning matches your actual workload demands. By monitoring CPU and memory consumption across the cluster, you can make informed decisions about whether to scale your node pools up or down.
This command is also invaluable for validating your autoscaling strategies. If you have a cluster autoscaler configured, you can use kubectl top node to observe its behavior. Are new nodes being added when utilization gets high? Are underutilized nodes being removed to save costs? The command provides an immediate snapshot of resource usage, allowing you to confirm that your scaling policies are working as intended and adjust them proactively.
Perform quick cluster health checks
For cluster administrators, kubectl top node is a fundamental tool for routine health checks. Running this command periodically gives you a quick, high-level assessment of your cluster's operational state. It helps you spot anomalies, like a single node consuming disproportionately high resources or all nodes running hotter than usual, which could indicate a widespread issue.
This simple check helps you ensure the cluster is operating efficiently and can be the first step in identifying nodes that require maintenance or further investigation. While it doesn't replace comprehensive monitoring solutions, it serves as an excellent first-line diagnostic. Catching potential problems early during a routine check contributes to the overall stability and health of your Kubernetes environment long before they escalate into critical incidents.
Troubleshooting common kubectl top nodes issues
When kubectl top nodes fails, the root cause is almost always somewhere in its dependency chain. The command is a thin client that queries the Kubernetes API for metrics exposed by the Metrics API. That data path runs from your kubectl client to the API server, from the API server to the Metrics Server, and from the Metrics Server to each node’s Kubelet. A failure at any point produces opaque errors that look unrelated unless you understand this flow.
In practice, most failures fall into three categories: a broken Metrics Server deployment, missing RBAC permissions, or network and authentication problems. Systematically validating each layer is the fastest way to restore node-level metrics.
Fix Metrics Server installation issues
kubectl top nodes depends entirely on the Kubernetes Metrics Server. If it is not installed, unhealthy, or misconfigured, the Metrics API simply does not exist, and the API server has nothing to return.
Start by checking whether Metrics Server is running:
- List pods in
kube-systemand confirm themetrics-serverpods exist. - Verify they are in a
Runningstate and not crash-looping. - Inspect logs if the pods are restarting, as TLS or Kubelet authentication issues are common causes.
If the deployment is missing or unstable, reinstalling or upgrading Metrics Server using the official manifests is usually sufficient. Until this component is healthy, no kubectl top command will work.
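The checks above can be sketched as a small helper. The function name is our own, and it assumes the standard install location (a `metrics-server` Deployment in the `kube-system` namespace); adjust the namespace if your distribution places it elsewhere:

```shell
# metrics_server_ready: hypothetical helper that succeeds only when at least
# one metrics-server replica reports Ready in kube-system.
metrics_server_ready() {
  ready=$(kubectl get deployment metrics-server -n kube-system \
            -o jsonpath='{.status.readyReplicas}' 2>/dev/null)
  [ -n "$ready" ] && [ "$ready" -ge 1 ]
}

# Related diagnostics (uncomment to run against a live cluster):
# kubectl get pods -n kube-system -l k8s-app=metrics-server
# kubectl logs -n kube-system -l k8s-app=metrics-server --tail=50
```

If the helper fails, the pod list and logs usually reveal whether the problem is a missing deployment, a crash loop, or a Kubelet TLS error.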
Address RBAC and permission errors
If Metrics Server is running but kubectl top nodes returns Error from server (Forbidden), the issue is RBAC. The user or service account invoking the command must be allowed to read node metrics from the Metrics API.
At a minimum, the caller needs get, list, and watch permissions on the relevant metrics resources. Without these, the API server blocks the request before it ever reaches Metrics Server.
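You can check your own access with `kubectl auth can-i` and, if needed, grant it with an imperative ClusterRole. This is a sketch: the wrapper function and the `metrics-reader` role name are our own choices, not Kubernetes defaults:

```shell
# can_read_node_metrics: hypothetical helper wrapping `kubectl auth can-i`;
# it succeeds when the current identity may read node metrics.
can_read_node_metrics() {
  kubectl auth can-i get nodes.metrics.k8s.io >/dev/null 2>&1
}

# To grant access (role and binding names are illustrative):
# kubectl create clusterrole metrics-reader \
#   --verb=get,list,watch --resource=nodes.metrics.k8s.io,pods.metrics.k8s.io
# kubectl create clusterrolebinding metrics-reader-binding \
#   --clusterrole=metrics-reader --user=<your-user>
```

The `resource.group` form (`nodes.metrics.k8s.io`) scopes the check and the grant to the Metrics API rather than to core node objects.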
Plural reduces this friction by relying on Kubernetes impersonation. Permissions are derived from your console identity and mapped consistently across clusters, so platform teams can define RBAC once and avoid managing per-user kubeconfigs or ad hoc role bindings.
Solve network and authentication failures
Network restrictions can break metric collection even when everything else looks correct. Firewalls or NetworkPolicies may block:
- The API server from reaching the Metrics Server
- The Metrics Server from scraping node Kubelets
If metrics intermittently fail or never appear, validate connectivity and ensure required ports are open.
Authentication issues are another common source of errors. An expired token, incorrect cluster context, or stale kubeconfig can prevent kubectl from reaching the API server at all. Plural’s agent-based, egress-only architecture avoids these failure modes by removing direct inbound access and kubeconfig sprawl, providing consistent access to metrics and dashboards across clusters without complex networking setup.
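One quick way to see where the chain is breaking is to inspect the APIService object that Metrics Server registers: its Available condition reports whether the API server can reach the backing service. A minimal sketch (the helper name is our own):

```shell
# metrics_apiservice_available: hypothetical helper; prints "True" when the
# API server reports the v1beta1.metrics.k8s.io APIService as Available.
metrics_apiservice_available() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null
}

# Example usage (live cluster): anything other than "True" usually points to
# a network or TLS problem between the API server and Metrics Server:
# [ "$(metrics_apiservice_available)" = "True" ] || echo "Metrics API unreachable"
```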
Limitations and alternatives
kubectl top nodes is useful for fast, ad-hoc checks, but it is not a production monitoring solution. It optimizes for immediacy, not depth. For teams operating Kubernetes at scale, it should be treated as a debugging entry point rather than a core observability primitive. Understanding its constraints helps avoid over-reliance on a tool that was never designed for long-term operational visibility.
No historical data for trend analysis
kubectl top only returns a point-in-time snapshot. Metrics are not stored, aggregated, or queryable over time. This makes it impossible to answer questions like whether node memory pressure is slowly increasing week over week or if CPU spikes follow a predictable daily pattern.
The Metrics Server itself only exposes ephemeral data, so once the command exits, the information is gone. Capacity planning, forecasting, and regression detection all require time-series storage and visualization, which kubectl top nodes fundamentally cannot provide.
No built-in alerting or automation
The command is entirely manual. There is no way to define thresholds, trigger alerts, or automate responses based on sustained resource pressure. In production systems, waiting for an engineer to notice a hot node by running a CLI command is not an acceptable reliability strategy.
Operational monitoring requires continuous evaluation, alert routing, and often automated remediation. kubectl top nodes was never intended to support those workflows.
Limited correlation with other signals
Modern observability depends on correlating metrics with logs, events, and traces. kubectl top nodes operates in isolation, exposing only coarse CPU and memory usage with no linkage to workloads, deployments, or application behavior.
This makes it poorly suited for diagnosing complex incidents where resource pressure is a symptom rather than the root cause. Purpose-built monitoring stacks based on tools like Prometheus and Grafana address this by combining time-series metrics, alerting, and dashboards.
Plural abstracts this complexity by providing a unified, multi-cluster view of resource usage alongside operational context. Instead of stitching together metrics, access, and visibility across environments, teams get a consolidated control plane that scales beyond what kubectl top nodes can realistically offer.
Go beyond kubectl top nodes with Plural
The kubectl top nodes command is an indispensable tool for quick, real-time checks on a single cluster's resource consumption. However, as organizations scale their Kubernetes footprint, managing resource utilization across a fleet of clusters demands a more integrated and comprehensive approach. Relying on command-line snapshots becomes inefficient and fails to provide the broad visibility needed for effective fleet management. Plural extends beyond these limitations by offering a unified platform for monitoring and managing Kubernetes at scale.
Get enterprise-grade visibility across your fleet
While kubectl top is effective for checking resource usage on a specific cluster, its utility diminishes as your infrastructure grows. Running the same command across dozens or hundreds of clusters is not a scalable monitoring strategy. For organizations managing a fleet, a centralized view is essential for effective operations. Plural provides this with a unified dashboard that aggregates performance metrics from your entire environment. Instead of context-switching between terminals, you can monitor resource consumption and identify potential issues across all nodes and pods from a single interface, giving you true enterprise-grade visibility.
Monitor multiple clusters in real-time with SSO
Managing access credentials for multiple clusters introduces significant operational overhead and security risks. While kubectl top node provides a snapshot of a node's performance, accessing that information often requires juggling different kubeconfigs. Plural’s embedded Kubernetes dashboard simplifies this workflow by integrating with your existing identity provider for Single Sign-On (SSO). This allows your team to securely access real-time resource metrics across all managed clusters with a single login. By using Kubernetes Impersonation, Plural ensures that access is governed by your existing RBAC policies, streamlining authentication without compromising security.
Track resources without complex network setups
kubectl top depends on the Metrics Server, and scaling monitoring across different environments—like private VPCs or on-prem data centers—often involves complex network configurations. Plural’s agent-based architecture eliminates this challenge. A lightweight agent installed in each workload cluster initiates an egress-only connection to the Plural control plane. This design allows you to securely monitor and manage clusters anywhere, without exposing internal cluster endpoints or configuring VPNs. You gain full visibility into resource utilization across your entire fleet, even for clusters behind strict firewalls, simplifying your monitoring stack and improving your security posture.
Frequently Asked Questions
Why does kubectl top nodes sometimes show different resource usage than my Prometheus dashboard? The kubectl top nodes command provides an instantaneous, real-time snapshot of resource consumption directly from the Metrics Server. In contrast, monitoring systems like Prometheus scrape metrics at regular intervals, such as every 30 seconds, and often display data that is averaged or aggregated over a specific time window. This difference in collection frequency and data processing means the numbers may not align perfectly at any given moment, as one shows an immediate point-in-time value while the other reflects a recent trend.
Can I use a similar command to see resource usage for individual pods? Yes, the kubectl top command works for pods as well. By running kubectl top pods, you can view the real-time CPU and memory consumption for all pods within the current namespace. This is extremely useful for drilling down after you've identified a high-load node with kubectl top nodes. You can also use the -A flag to view pods across all namespaces, helping you quickly pinpoint which specific application is driving resource usage.
What's the difference between the CPU usage in millicores and the percentage shown in the output? The CPU(cores) column displays the absolute amount of CPU being used, measured in millicores (where 1000m equals one full CPU core). This is a raw measurement of consumption. The CPU% column provides context by showing that absolute usage as a percentage of the node's allocatable resources—the total CPU available for scheduling pods after the system reserves resources for itself. Both metrics are valuable; the absolute value tells you the raw load, while the percentage indicates how close the node is to its scheduling capacity.
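For instance, with invented numbers, 250m of usage on a node with 4 allocatable cores works out as follows:

```shell
# Convert a millicore reading into a percentage of allocatable CPU.
# Figures are illustrative: 250m used, 4 allocatable cores (4000m).
used_m=250
allocatable_cores=4
allocatable_m=$((allocatable_cores * 1000))
echo "CPU% = $((used_m * 100 / allocatable_m))%"   # integer math: prints "CPU% = 6%"
```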
If kubectl top nodes fails, is it always a problem with the Metrics Server? While a missing or malfunctioning Metrics Server is the most common cause, it's not the only one. The command can also fail due to insufficient RBAC permissions, meaning your user account lacks the rights to access the metrics API. Network policies that block communication between the API server and the Metrics Server can also be the culprit. Lastly, a simple misconfiguration in your kubeconfig file, such as pointing to the wrong cluster or using an expired token, can prevent the command from executing successfully.
How does kubectl top nodes relate to a node's total physical capacity? The metrics displayed by kubectl top nodes are calculated against a node's allocatable resources, not its total physical capacity. Allocatable resources represent the capacity remaining after Kubernetes reserves a portion of the CPU and memory for the operating system and its own components, like the kubelet. This distinction is critical because it reflects the actual capacity available for your workloads. To see the node's total physical capacity for comparison, you can use the --show-capacity flag.