Understanding Kubernetes Rolling Updates and Rollbacks
You've just deployed your application to Kubernetes. The deployment controller spins up new pods, terminates old ones, and traffic flows to the updated version. But what happens when something goes wrong? How do you safely roll back to a previous version? This is where Kubernetes rolling updates and rollbacks become essential.
What Are Rolling Updates?
Rolling updates are a deployment strategy where Kubernetes gradually replaces old pods with new ones. Instead of a single big bang where all old pods are terminated and all new pods start simultaneously, the process happens incrementally.
Think of it like a shift change: old pods clock out one by one as new pods clock in, so someone is always on duty. This ensures your application remains available during the update process.
How Rolling Updates Work
When you update a Deployment, Kubernetes performs these steps:
- Scale up the new ReplicaSet by creating new pods
- Wait for the new pods to become ready (pass health checks)
- Scale down the old ReplicaSet by terminating old pods
- Repeat until all old pods are replaced
The number of pods replaced at once is controlled by the maxSurge and maxUnavailable parameters.
Rolling Update Configuration
maxSurge Parameter
maxSurge defines how many additional pods can be created beyond the desired replica count during the update. This ensures you have extra capacity during the transition.
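A sketch of a strategy block using maxSurge (the replica count of 3 matches the example below; the surrounding Deployment fields are omitted):

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1   # allow 1 pod above the desired count during the update
```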
With 3 replicas and maxSurge: 1, Kubernetes can run up to 4 pods (3 + 1) during an update, ensuring zero downtime.
maxUnavailable Parameter
maxUnavailable defines how many pods can be unavailable during the update. This is the number of old pods that can be terminated before new pods are created.
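A corresponding sketch using maxUnavailable (again assuming 3 replicas):

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most 1 pod may be unavailable during the update
```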
With 3 replicas and maxUnavailable: 1, Kubernetes can terminate one old pod while creating its replacement, keeping at least 2 pods running throughout the update.
Common Configuration Patterns
| Scenario | maxSurge | maxUnavailable | Behavior |
|---|---|---|---|
| Zero downtime | 1 | 0 | New pods start before old ones terminate |
| Fast update | 25% | 25% | More pods updated simultaneously |
| Conservative | 0 | 1 | Old pods terminate only after new ones are ready |
Rolling Update Example
Let's walk through a concrete example. Suppose you have a Deployment with 3 replicas:
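A Deployment matching this scenario might look like the following sketch (the name web-app and its labels are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app            # illustrative name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # one extra pod allowed during the update
      maxUnavailable: 0    # never drop below 3 ready pods
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web
          image: nginx:1.21
```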
You update the image to nginx:1.22. Here's what happens:
- Initial state: 3 pods running nginx:1.21
- Step 1: Create 1 new pod with nginx:1.22 (maxSurge: 1)
- Step 2: Wait for new pod to be ready
- Step 3: Terminate 1 old pod with nginx:1.21
- Step 4: Repeat until all 3 pods run nginx:1.22
The update completes after three such cycles, with at least 3 ready pods serving traffic at all times.
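A rollout like this is typically triggered by changing the image, for example (Deployment and container names are illustrative):

```shell
# Update the image of the container named "web" in the web-app Deployment
kubectl set image deployment/web-app web=nginx:1.22
```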
Health Checks During Rolling Updates
Health checks are critical for rolling updates. Kubernetes only considers a pod ready when all its containers are ready and pass their health checks.
Readiness Probes
Readiness probes determine when a pod is ready to receive traffic. They're essential for rolling updates because they prevent traffic from being sent to pods that aren't ready.
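A minimal readiness probe sketch for a container spec (the /healthz path and port are illustrative; use your application's real endpoint):

```yaml
readinessProbe:
  httpGet:
    path: /healthz      # illustrative endpoint
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 5      # check every 5 seconds
```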
Liveness Probes
Liveness probes determine if a container is running. If a liveness probe fails, Kubernetes restarts the container. This is different from readiness probes, which only affect traffic routing.
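A liveness probe sketch matching the timing described below (endpoint and port are illustrative):

```yaml
livenessProbe:
  httpGet:
    path: /healthz       # illustrative endpoint
    port: 80
  periodSeconds: 10      # probe every 10 seconds
  failureThreshold: 3    # restart after 3 consecutive failures (~30 seconds)
```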
If the liveness probe fails 3 times in 30 seconds, Kubernetes restarts the container.
Rollback Mechanisms
Despite careful planning, things go wrong. Kubernetes provides built-in rollback capabilities to revert to previous Deployment versions.
Detecting Failed Rollouts
Despite a common assumption, Kubernetes does not roll back automatically. When a rolling update fails, the Deployment's Progressing condition becomes False and the rollout halts where it is; reverting to the previous ReplicaSet is up to you or your CI/CD tooling. A rollout is considered failed when:
- New pods fail to become ready within the timeout period
- The Deployment exceeds its progressDeadlineSeconds limit
- The update is manually cancelled
Manual Rollback Commands
You can manually roll back using kubectl:
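The core rollback commands (the Deployment name web-app is illustrative):

```shell
# Revert to the previous revision
kubectl rollout undo deployment/web-app

# Revert to a specific revision number
kubectl rollout undo deployment/web-app --to-revision=2
```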
Checking Rollback History
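To see which revisions are available to roll back to:

```shell
# List recorded revisions of the Deployment
kubectl rollout history deployment/web-app

# Show the pod template of a single revision
kubectl rollout history deployment/web-app --revision=2
```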
Rollback Example
Suppose you've deployed version 1.22, but it has a critical bug. Here's how to roll back:
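A sketch of the rollback workflow (Deployment name is illustrative):

```shell
# Confirm the current rollout is unhealthy
kubectl rollout status deployment/web-app

# Revert to the previous (working) revision
kubectl rollout undo deployment/web-app

# Verify the rollback completed
kubectl rollout status deployment/web-app
```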
Kubernetes will automatically create a new ReplicaSet with the previous image version and perform a rolling update to restore the previous state.
Deployment Strategies Comparison
Different deployment strategies serve different use cases. Here's how rolling updates compare to other strategies:
| Strategy | Description | Use Case | Downtime |
|---|---|---|---|
| Rolling Update | Gradual pod replacement | General purpose, zero downtime | None |
| Recreate | Stop all pods, start new ones | Simple deployments, no health checks | Temporary |
| Blue-Green | Two identical environments | High-risk changes, instant cutover and rollback | Minimal |
| Canary | Gradual traffic shift | Feature flags, gradual rollout | Minimal |
Rolling Update vs Recreate
Rolling updates maintain service availability, while recreate stops all pods during the update.
Best Practices for Rolling Updates
1. Use Conservative maxSurge and maxUnavailable Values
Start with conservative values to ensure stability:
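A conservative starting point might look like this:

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1          # only one extra pod at a time
    maxUnavailable: 0    # never reduce available capacity
```

Gradually increase these values as you gain confidence in your deployment process.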
Gradually increase these values as you gain confidence in your deployment process.
2. Configure Health Checks Properly
Always define readiness and liveness probes:
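A minimal probe configuration sketch for the container spec (the /healthz path and port are illustrative):

```yaml
containers:
  - name: web
    image: nginx:1.22
    readinessProbe:
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
    livenessProbe:
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 10
      failureThreshold: 3
```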
3. Set Progress Deadline
Allow sufficient time for updates to complete:
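For example, to mark a rollout as failed after 10 minutes without progress (600 is also the Kubernetes default):

```yaml
spec:
  progressDeadlineSeconds: 600   # fail the rollout after 10 minutes of no progress
```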
4. Test Updates in Non-Production Environments
Always test rolling updates in staging before production:
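One common pattern is applying the same manifests to a staging namespace first (file and resource names are illustrative):

```shell
# Roll out to staging and verify before touching production
kubectl apply -f deployment.yaml --namespace=staging
kubectl rollout status deployment/web-app --namespace=staging
```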
5. Monitor Rollout Progress
Watch the rollout status during updates:
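Useful commands for following an in-flight rollout (label selector is illustrative):

```shell
# Block until the rollout succeeds or fails
kubectl rollout status deployment/web-app

# Watch pods being replaced in real time
kubectl get pods -l app=web-app --watch
```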
6. Use Deployment Annotations for Tracking
Add annotations to track deployment metadata:
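For example, recording the reason for a change with the kubernetes.io/change-cause annotation, which then appears in the rollout history:

```shell
kubectl annotate deployment/web-app \
  kubernetes.io/change-cause="upgrade nginx to 1.22" --overwrite
```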
Troubleshooting Rolling Updates
Update Stuck in Progress
If an update appears stuck:
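Start by inspecting the rollout and the competing ReplicaSets (names are illustrative):

```shell
kubectl rollout status deployment/web-app
kubectl describe deployment web-app      # check the Progressing condition and events
kubectl get replicasets -l app=web-app   # compare old and new ReplicaSet pod counts
```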
New Pods Not Starting
Check pod events:
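Pod-level events usually reveal image pull errors, scheduling failures, or crash loops (substitute your actual pod name):

```shell
kubectl describe pod <pod-name>
kubectl get events --sort-by=.metadata.creationTimestamp
```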
Health Check Failures
Verify health check endpoints:
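One way to test a probe endpoint from inside the pod itself (assumes curl is available in the container image; path and port are illustrative):

```shell
kubectl exec -it <pod-name> -- curl -s http://localhost:80/healthz
```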
Advanced Rolling Update Techniques
Pausing and Resuming Rollouts
You can pause a rollout to inspect the current state:
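While paused, the Deployment stops creating or terminating pods until you resume:

```shell
kubectl rollout pause deployment/web-app
# ...inspect pods, logs, and metrics...
kubectl rollout resume deployment/web-app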
Rolling Updates with Multiple Container Images
For multi-container pods, specify which container to update:
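kubectl set image takes container=image pairs, so only the containers you name are changed (container names here are illustrative):

```shell
# Updates only the "web" container; any sidecars keep their current images
kubectl set image deployment/web-app web=nginx:1.22
```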
Rolling Updates with ConfigMaps
Update configuration without changing the image:
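Note that changing a ConfigMap does not by itself restart pods that consume it; a common pattern is to apply the new ConfigMap and then trigger a rolling restart (resource names are illustrative):

```shell
# Regenerate the ConfigMap from a local file and apply it
kubectl create configmap web-config --from-file=nginx.conf \
  --dry-run=client -o yaml | kubectl apply -f -

# Trigger a rolling restart so pods pick up the new configuration
kubectl rollout restart deployment/web-app
```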
Monitoring Rolling Updates
Track Deployment Status
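The quickest view of a Deployment's health (names are illustrative):

```shell
kubectl get deployment web-app        # READY, UP-TO-DATE, and AVAILABLE columns
kubectl describe deployment web-app   # Conditions section shows Available/Progressing
```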
Set Up Alerts
Create alerts for deployment failures:
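A sketch of a Prometheus alerting rule for stalled rollouts, assuming kube-state-metrics is installed to expose the kube_deployment_status_condition metric:

```yaml
groups:
  - name: deployment-alerts
    rules:
      - alert: DeploymentRolloutStuck
        # Fires when a Deployment's Progressing condition has been False for 10 minutes
        expr: kube_deployment_status_condition{condition="Progressing", status="false"} == 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Rollout of {{ $labels.deployment }} is not progressing"
```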
Conclusion
Rolling updates and rollbacks are fundamental to safe Kubernetes deployments. By understanding how rolling updates work, configuring health checks properly, and knowing how to rollback when things go wrong, you can confidently deploy changes to production with minimal risk.
Remember these key points:
- Rolling updates replace pods gradually, maintaining service availability
- Configure maxSurge and maxUnavailable to control the update pace
- Always define readiness and liveness probes
- Use kubectl rollout undo to revert quickly when a rollout fails or misbehaves
- Monitor rollout progress and set up alerts for failures
Platforms like ServerlessBase simplify deployment management by handling reverse proxy configuration and SSL certificate provisioning automatically, so you can focus on implementing robust rolling update strategies for your applications.
The next step is to implement these patterns in your own Kubernetes deployments. Start with conservative configuration values, test thoroughly in staging, and gradually increase your update pace as you gain confidence in your deployment process.