
Deploying Redis Clusters on Kubernetes: A Step-by-Step Guide
Learn how to deploy and manage a Redis cluster on Kubernetes with step-by-step instructions, best practices, and tips for high availability and performance.
Managing stateful applications in Kubernetes requires different considerations than running stateless workloads. Databases and caches need persistent storage, stable network identities, and well-defined lifecycle management. Redis clusters add further complexity with requirements for sharding, replication, and failover.
This guide provides a practical walkthrough of deploying and managing a Redis cluster on Kubernetes. You’ll learn the setup process, configuration best practices, and techniques for maintenance and troubleshooting. By the end, you’ll be equipped to run a production-grade Redis cluster that can handle real-world workloads with resilience and reliability.
Key takeaways:
- Prioritize data persistence with StatefulSets: Use Kubernetes StatefulSets to manage your Redis pods. This provides the stable network identities and dedicated persistent storage necessary to protect your data during pod restarts and ensure the cluster remains consistent.
- Adopt a unified monitoring strategy: Effective troubleshooting requires correlating Redis-specific metrics, like cache hits and latency, with Kubernetes resource data, like pod CPU and memory usage. A single-pane-of-glass platform simplifies this by providing a holistic view of both the application and its underlying infrastructure.
- Manage configuration as code to prevent drift: Define your entire Redis deployment—including resource limits, security policies, and custom settings—in version-controlled manifests. A GitOps workflow automates the application of these configurations, ensuring consistency and eliminating manual errors across your fleet.
What Is a Redis Cluster?
A Redis Cluster is a distributed setup that splits data across multiple Redis nodes. This design improves both performance and fault tolerance, enabling storage to scale beyond the limits of a single instance. By distributing the dataset, clusters support near-linear scalability and high availability.
Redis Cluster Architecture
A cluster is composed of individual Redis nodes that work together to manage a shared dataset. Data is partitioned into shards and distributed across nodes. Each node owns a subset of the data and communicates with peers to maintain cluster state, coordinate client requests, and handle failover. This architecture allows Redis to efficiently process large workloads while providing resilience.
Key Features and Benefits
The core benefit of Redis Cluster is horizontal scalability. On Kubernetes, you can scale Redis simply by adding nodes, with the platform handling scheduling and recovery. If a node fails, Kubernetes restarts it automatically, while Redis ensures continuity through replication and failover. The combination of Redis’s distributed model with Kubernetes orchestration delivers a fault-tolerant, highly available in-memory datastore suited for production workloads.
Data Sharding in Redis
Redis Cluster divides the keyspace into 16,384 hash slots. A CRC16 hash maps each key to a slot (slot = CRC16(key) mod 16384), which is then assigned to a primary node. This generally distributes data evenly and avoids hot spots, though skewed key patterns can still concentrate load on particular slots. The system supports live resharding, allowing hash slots to move between nodes without downtime. As a result, you can scale capacity up or down while keeping applications online.
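You can ask any cluster node which slot a key maps to; a quick sketch from a redis-cli session (key names here are hypothetical):
# CLUSTER KEYSLOT returns the slot for a key, i.e. CRC16(key) mod 16384
CLUSTER KEYSLOT user:1001
# Keys that share a hash tag (the substring in braces) map to the same slot
CLUSTER KEYSLOT {user:1001}:profile
CLUSTER KEYSLOT {user:1001}:cart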
Prerequisites for Deploying Redis on Kubernetes
Running Redis on Kubernetes requires upfront planning to ensure performance, resilience, and security. By addressing system requirements, networking, storage, and security before deployment, you avoid common pitfalls and set up a production-ready Redis cluster.
Review System Requirements
Even though Redis is lightweight, production clusters demand adequate CPU and memory. Define resource requests and limits in your manifests to avoid performance degradation under load. Ensure your Kubernetes version is compatible with the Redis Operator or Helm chart you plan to use. With Plural’s console, you can monitor utilization across clusters and proactively prevent bottlenecks.
Configure Networking
Redis nodes depend on low-latency communication for sharding, replication, and failover. Use a Headless Service to assign each pod a stable network identity. Enforce NetworkPolicies to restrict traffic—only allowing trusted application pods and Redis nodes to connect—reducing the attack surface. Plural standardizes these security practices across your managed environments.
Plan Storage
To safeguard data during pod rescheduling or node failures, pair StatefulSets with PersistentVolumeClaims. Choose a StorageClass that aligns with your workload; high-IOPS SSD-backed storage is typically best for Redis. Infrastructure as Code (IaC) workflows, such as Plural Stacks, can automate provisioning to ensure consistent, reliable storage across environments.
Address Security
Start with a verified Redis image and configure authentication using the requirepass directive. Store secrets in Kubernetes Secrets rather than hardcoding them. Apply RBAC to tightly control who can access Redis resources. With Plural’s identity-aware dashboard, you can enforce granular RBAC tied to your existing SSO provider, ensuring consistent security across your fleet.
How to Deploy Redis on Kubernetes: A Step-by-Step Guide
Running Redis on Kubernetes involves more than spinning up containers. Because Redis is stateful, you need stable pod identities, persistent storage, and a controlled initialization process. Below is a structured approach with manifest snippets and automation options to help you deploy Redis clusters in production.
Step 1: Create Kubernetes Manifests
Start by defining a basic manifest that specifies the Redis image, ports, and resource allocations. This example uses the official Redis image and exposes port 6379.
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  ports:
    - port: 6379
      targetPort: 6379
  clusterIP: None # Headless service for stable DNS
  selector:
    app: redis
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: "redis"
  replicas: 6 # three primaries plus one replica each, required by --cluster-replicas 1 in Step 4
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7.2
          ports:
            - containerPort: 6379
            - containerPort: 16379 # cluster bus
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 5Gi
This manifest:
- Creates a headless service to give each Redis pod a unique DNS identity.
- Uses a StatefulSet with predictable pod names (redis-0 through redis-5).
- Attaches a PersistentVolumeClaim (PVC) to each pod for durable storage.
Step 2: Set Up StatefulSets
Using a StatefulSet ensures Redis pods retain identity across restarts. For example:
- Pod redis-0 always maps to the same PVC, data-redis-0.
- Stable, ordered pod names ensure Redis cluster discovery works reliably.
If you use Helm, you can achieve the same outcome with:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install redis bitnami/redis-cluster \
  --set cluster.nodes=6 \
  --set persistence.storageClass=fast-ssd \
  --set persistence.size=5Gi
Helm abstracts away much of the StatefulSet and PVC boilerplate while giving you tuning options.
Step 3: Configure Persistent Storage
Redis is an in-memory database, but still writes snapshots (RDB) or append-only logs (AOF) to disk for durability. Ensuring persistent storage prevents data loss during pod rescheduling.
For production workloads:
- Use SSD-backed StorageClasses with high IOPS.
- Configure AOF persistence in redis.conf:
appendonly yes
appendfsync everysec
You can inject this configuration into pods using a ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
data:
  redis.conf: |
    appendonly yes
    appendfsync everysec
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  # abridged: serviceName, replicas, selector, and other fields as in Step 1
  template:
    spec:
      containers:
        - name: redis
          image: redis:7.2
          command: ["redis-server", "/usr/local/etc/redis/redis.conf"]
          volumeMounts:
            - name: config
              mountPath: /usr/local/etc/redis
      volumes:
        - name: config
          configMap:
            name: redis-config
Step 4: Initialize the Cluster
Once the pods are running, form the cluster by running redis-cli inside one of the pods:
kubectl exec -it redis-0 -- redis-cli --cluster create \
  redis-0.redis:6379 \
  redis-1.redis:6379 \
  redis-2.redis:6379 \
  redis-3.redis:6379 \
  redis-4.redis:6379 \
  redis-5.redis:6379 \
  --cluster-replicas 1
This command:
- Connects all Redis pods into a cluster.
- Assigns slots across nodes automatically.
- Configures replication (each primary gets one replica, which is why six pods are required).
To automate this step, wrap it in a Kubernetes Job:
apiVersion: batch/v1
kind: Job
metadata:
  name: redis-init
spec:
  template:
    spec:
      containers:
        - name: redis-init
          image: redis:7.2
          command:
            - sh
            - -c
            - |
              redis-cli --cluster create \
                redis-0.redis:6379 \
                redis-1.redis:6379 \
                redis-2.redis:6379 \
                redis-3.redis:6379 \
                redis-4.redis:6379 \
                redis-5.redis:6379 \
                --cluster-replicas 1 --cluster-yes
      restartPolicy: OnFailure
How to Manage Your Redis Cluster
Once Redis is deployed on Kubernetes, the priority shifts to day-to-day operations. Proper management ensures scalability, observability, and resilience, while minimizing manual intervention. The main areas to focus on are scaling, monitoring, backups, and failover handling.
Scale and Load Balance
Redis clusters can be scaled horizontally by increasing the number of pods in the StatefulSet. Kubernetes maintains stable identities for the new pods, but note that a new pod does not join the Redis cluster automatically: you must add it as a cluster member and rebalance hash slots. A headless Service provides stable DNS entries for cluster discovery, while a standard Service can load-balance client traffic across nodes.
Example: scaling from 6 to 9 Redis nodes
kubectl scale statefulset redis --replicas=9
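Joining a new pod to the cluster and rebalancing might look like this (a minimal sketch; pod and service names follow the Step 1 manifests, and rebalancing is best run during off-peak hours):
kubectl exec -it redis-0 -- redis-cli --cluster add-node \
  redis-6.redis:6379 redis-0.redis:6379
kubectl exec -it redis-0 -- redis-cli --cluster rebalance \
  redis-0.redis:6379 --cluster-use-empty-masters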
With Plural CD, this scaling action can be version-controlled and propagated across environments through GitOps, preventing drift and simplifying multi-cluster management.
Monitor with Prometheus and Grafana
Monitoring is essential to detect performance issues early. Use the Redis Exporter to expose metrics such as memory usage, CPU load, command latency, and cache hit ratio.
Example: Redis Exporter deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-exporter
  template:
    metadata:
      labels:
        app: redis-exporter
    spec:
      containers:
        - name: redis-exporter
          image: oliver006/redis_exporter:v1.61.0
          env:
            - name: REDIS_ADDR # point the exporter at the Redis service; defaults to localhost otherwise
              value: "redis://redis:6379"
          ports:
            - containerPort: 9121
- Prometheus scrapes these metrics and stores them as time-series data.
- Grafana visualizes them in dashboards for capacity planning and troubleshooting.
- Plural’s dashboard integrates Kubernetes state directly, consolidating observability in one place.
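A minimal Prometheus scrape configuration for the exporter might look like this (assumes a Service named redis-exporter fronting the Deployment on port 9121):
scrape_configs:
  - job_name: redis
    static_configs:
      - targets: ["redis-exporter:9121"]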
Implement Backup and Recovery
Even with Redis cluster’s built-in resilience, backups are required to protect against accidental data loss or catastrophic failure. Automate backups with Kubernetes CronJobs that trigger Redis snapshotting (RDB or AOF).
Example: CronJob for nightly RDB snapshots
apiVersion: batch/v1
kind: CronJob
metadata:
  name: redis-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: redis:7.2
              command: ["/bin/sh", "-c"]
              args:
                - >
                  redis-cli -h redis-0.redis save &&
                  cp /data/dump.rdb /backup/redis-$(date +%F-%H%M).rdb
              volumeMounts:
                - name: data
                  mountPath: /data
                - name: backup
                  mountPath: /backup
          restartPolicy: OnFailure
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: data-redis-0 # PVC of the node being backed up; with ReadWriteOnce, the backup pod must schedule on the same node
            - name: backup
              persistentVolumeClaim:
                claimName: backup-pvc
Store these backups in durable object storage (S3, GCS, etc.) for long-term retention.
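For example, a follow-up step in the backup job could sync the directory to a bucket (the bucket name is hypothetical, and the image is assumed to include the aws CLI):
aws s3 sync /backup s3://my-redis-backups/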
Handle Failover Scenarios
Redis cluster uses a master–replica model. If a master fails, a replica is automatically promoted. Kubernetes complements this by restarting failed pods and reattaching them to the cluster.
To observe failover:
kubectl logs redis-0
kubectl describe pod redis-0
Plural’s console surfaces these events across your fleet, making it easier to verify automatic promotions and confirm that cluster health is restored after outages.
How to Optimize Redis Performance
Running Redis on Kubernetes gives you scalability and resilience, but performance tuning is what makes it production-ready. Without proper optimization, you risk latency spikes, resource waste, or instability. Key areas to address include resource management, networking, security, and high availability.
Manage Resources Effectively
Redis is memory-bound, so setting the right CPU/memory requests and limits is critical. Under-provisioning can cause evictions or crashes, while over-provisioning wastes cluster capacity. Profile your workload to establish realistic baselines.
Example StatefulSet resource configuration:
resources:
  requests:
    memory: "4Gi"
    cpu: "1"
  limits:
    memory: "6Gi"
    cpu: "2"
Kubernetes enforces these allocations, ensuring Redis pods get guaranteed capacity. With Plural, you can monitor usage across clusters and adjust configurations centrally via GitOps, avoiding both bottlenecks and over-allocation.
Optimize Network Performance
Redis clusters require two ports per node:
- 6379 → Client connections
- 16379 → Cluster bus for gossip, health checks, and failover signals
Any latency in the cluster bus can lead to false node failures and unnecessary failovers.
Best practices:
- Ensure your CNI is tuned for low-latency traffic.
- Allow both ports in your firewall and NetworkPolicies.
- Avoid noisy neighbors by using pod anti-affinity and dedicated node pools when possible.
Example NetworkPolicy for Redis traffic:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: redis-allow
spec:
  podSelector:
    matchLabels:
      app: redis
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: my-app
        - podSelector:
            matchLabels:
              app: redis # peers must reach the cluster bus
      ports:
        - protocol: TCP
          port: 6379
        - protocol: TCP
          port: 16379
Harden Your Security
Redis should never be exposed directly without protections. Secure the cluster using multiple Kubernetes-native controls:
- Pod anti-affinity → Prevents co-location of master and replica pods on the same node.
- NetworkPolicies → Restrict access so only trusted application pods can connect.
- Authentication → Enable Redis requirepass and store the password in a Kubernetes Secret.
Example Secret for Redis password:
apiVersion: v1
kind: Secret
metadata:
  name: redis-auth
type: Opaque
data:
  password: c3VwZXJzZWNyZXRwYXNz # base64 for "supersecretpass"
Mount this Secret into your StatefulSet and pass it to Redis via environment variables or config maps. Plural ensures these security settings are consistently applied across environments through GitOps.
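A minimal sketch of wiring the Secret into the container (the environment variable name is an arbitrary choice):
containers:
  - name: redis
    image: redis:7.2
    env:
      - name: REDIS_PASSWORD
        valueFrom:
          secretKeyRef:
            name: redis-auth
            key: password
    # shell form so the variable is expanded at startup
    command: ["sh", "-c", 'exec redis-server --requirepass "$REDIS_PASSWORD"']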
Configure for High Availability
Redis provides HA through master–replica replication. In Kubernetes, pair this with scheduling strategies to eliminate single points of failure.
Key practices:
- Deploy replicas for each master shard in your StatefulSet.
- Use pod anti-affinity rules to spread replicas across different nodes.
- Monitor failovers with logs and health checks to validate readiness.
Example anti-affinity rule:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app
              operator: In
              values:
                - redis
        topologyKey: "kubernetes.io/hostname"
This rule prevents any two Redis pods from sharing a node, which in turn guarantees that a shard's primary and replica are separated. Combined with Kubernetes self-healing, you get automatic recovery from both pod-level and node-level failures.
How to Troubleshoot Your Redis Cluster
Deploying Redis on Kubernetes provides significant reliability and performance benefits, but like any distributed system, it can present unique troubleshooting challenges. When issues arise, a systematic approach is key to identifying the root cause, whether it lies within the Redis configuration, the Kubernetes environment, or the interaction between the two. Common problems range from initial deployment failures and performance degradation to unexpected behavior caused by Redis Cluster’s specific design trade-offs.
Solve Common Deployment Issues
Deployment failures often stem from misconfigurations in your Kubernetes manifests. Pods stuck in a Pending state may indicate resource shortages or issues with PersistentVolumeClaims (PVCs), while CrashLoopBackOff errors can point to incorrect container images, configuration errors, or failed readiness probes. Start by inspecting pod details with kubectl describe pod <pod-name> to check for events that reveal the underlying problem. Reviewing container logs with kubectl logs <pod-name> is also essential for diagnosing application-level failures. Network policies can sometimes block communication between Redis nodes, preventing the cluster from forming correctly. Plural’s embedded Kubernetes dashboard simplifies this process by providing a unified interface to view logs, events, and resource states across your entire fleet, eliminating the need to juggle multiple kubeconfig files and terminals.
Identify Performance Bottlenecks
Performance issues in a Redis cluster can manifest as high latency or slow command execution. Begin by monitoring key Redis metrics. High memory usage (used_memory) can lead to key eviction, while a low ratio of keyspace_hits to keyspace_misses suggests your cache is ineffective. High CPU utilization might point to inefficient commands or an overloaded node. Running a Redis cluster on Kubernetes allows you to use native tooling like kubectl to check pod resource consumption (kubectl top pod). By correlating Redis metrics with Kubernetes pod and node metrics, you can determine if the bottleneck is application-specific or caused by infrastructure constraints. Plural provides a single pane of glass for observability, helping you connect application performance data with underlying infrastructure health to quickly pinpoint the source of slowdowns.
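A quick look at both layers from the command line (pod and label names follow the earlier manifests):
kubectl exec -it redis-0 -- redis-cli info memory | grep used_memory_human
kubectl exec -it redis-0 -- redis-cli info stats | grep -E "keyspace_(hits|misses)"
kubectl top pod -l app=redis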
Understand Command Limitations
Some issues aren't bugs but are inherent to Redis Cluster’s design. A primary limitation is the handling of multi-key operations. Commands like MSET or transactions involving keys that map to different hash slots will fail. Your application must be designed to handle these constraints, for instance, by using hash tags {...} in keys to ensure related data resides on the same node. Furthermore, Redis Cluster does not promise strong consistency. During a network partition or failover event, the system can lose a small number of writes that were acknowledged by the master but not yet replicated. Understanding this trade-off is critical for applications that require guaranteed data durability.
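A short redis-cli session illustrating the constraint (key names are hypothetical):
# Keys in different slots: the multi-key command is rejected
MSET user:1:name alice user:2:name bob
(error) CROSSSLOT Keys in request don't hash to the same slot
# A shared hash tag forces both keys into one slot, so the command succeeds
MSET {user:1}:name alice {user:1}:email alice@example.com
OK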
Fix Data Distribution Problems
A well-balanced cluster is essential for optimal performance. Redis automatically shards data across 16,384 hash slots, which are distributed among the master nodes. However, an imbalance can occur if some nodes manage significantly more slots or data than others, creating "hot spots." This can happen after scaling the cluster or due to keying patterns that concentrate traffic on specific slots. You can inspect the slot distribution using the redis-cli cluster nodes command. If you find a significant imbalance, you can use redis-cli --cluster rebalance to redistribute slots evenly. This operation should be performed carefully during off-peak hours, as it can temporarily increase cluster load while data is being moved between nodes.
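For example (pod and service names follow the deployment built earlier in this guide):
kubectl exec -it redis-0 -- redis-cli cluster nodes
kubectl exec -it redis-0 -- redis-cli --cluster rebalance redis-0.redis:6379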
Related Articles
- How to Install Kubernetes: A Step-by-Step Guide
- Kubernetes Rancher: Simplify Cluster Management
- How to Monitor a Kubernetes Cluster: The Ultimate Guide
- Run MySQL on Kubernetes: The Complete Guide
Frequently Asked Questions
Why is a StatefulSet recommended over a Deployment for a Redis cluster? A StatefulSet is designed for stateful applications like Redis that require stable and unique network identifiers and persistent storage. Unlike a standard Deployment, a StatefulSet provides each pod with a predictable, persistent name (e.g., redis-0, redis-1). This stability is essential for the Redis cluster's discovery mechanism and for maintaining master-replica relationships, ensuring the cluster can correctly identify its members even after pods are restarted or rescheduled.
My application uses multi-key commands. Will they work with Redis Cluster? Generally, no. Multi-key commands like MSET or transactions will fail if the keys involved are stored on different nodes. This is a fundamental design trade-off of Redis Cluster's sharded architecture. The solution is to use "hash tags" by placing a common string within curly braces in your key names, such as {user123}:profile and {user123}:cart. This forces Redis to map all keys with the same tag to the same hash slot, ensuring they reside on the same node and can be used in atomic multi-key operations.
What's the first thing I should check if my Redis cluster performance is poor? Start by examining resource utilization and network latency. High latency is a common culprit, so ensure your Kubernetes network provides low-latency communication between pods, especially on the cluster bus port. Next, check for resource bottlenecks by monitoring CPU and memory usage for each Redis pod. Since Redis is an in-memory store, insufficient memory can lead to key eviction and performance degradation. Tools like Prometheus can help you track these metrics and identify if a specific node is overloaded.
How do I handle upgrades for my Redis cluster without causing downtime? The safest method is a rolling upgrade that leverages the master-replica architecture. For each master node you need to upgrade, you first perform a manual failover, promoting one of its replicas to become the new master. Once the replica has taken over, you can safely take the old master offline, perform the upgrade, and bring it back online. It will rejoin the cluster as a replica for the new master. This process ensures that a primary node is always available to serve requests for each data shard.
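A minimal sketch of the manual promotion step (assumes redis-3 is currently a replica of the master you want to upgrade; CLUSTER FAILOVER must be issued on the replica):
kubectl exec -it redis-3 -- redis-cli cluster failover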
How does a platform like Plural simplify the Redis deployment and management process described here? Plural automates and unifies the entire lifecycle described in this post through a GitOps-based workflow. Instead of manually creating manifests, configuring storage, and initializing the cluster with kubectl, you can define your Redis setup as code. Plural's continuous deployment ensures this configuration is applied consistently across your entire fleet. For ongoing management, Plural provides a single-pane-of-glass console with an embedded Kubernetes dashboard, allowing you to monitor resources, manage RBAC with SSO, and troubleshoot issues without juggling multiple tools and contexts. This streamlines operations and reduces the risk of manual error.