Operations

Deployment Strategies (Canary, Blue-Green)

Intermediate

How you release a change is as important as the change itself. Releasing to everyone at once means a bad change hits everyone at once. Progressive strategies (canary, blue-green, rolling) expose a new version gradually, watch it, and make rollback instant, so a problem is caught on a slice of traffic instead of becoming a full outage.

The goal is to make releases low-risk and reversible. Rolling updates replace instances gradually. Canary sends a small percentage of traffic to the new version, watches the metrics, then ramps up if it is healthy. Blue-green runs two environments and switches traffic over, with an instant switch back. Combined with feature flags, you can fully separate deploying code from releasing a feature.
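As a rough illustration of the two traffic mechanisms described above, the sketch below shows one common way to split canary traffic (hashing a user ID into a stable bucket) and a blue-green switch modeled as a pointer swap. The names `bucket`, `pick_version`, and `BlueGreenRouter` are hypothetical and exist only for this example. Real systems usually do this in a load balancer or service mesh rather than in application code.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def pick_version(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent% of users to the new version.

    The bucket is stable, so a user who is in the canary at 5%
    stays in it when the percentage ramps to 25%, 50%, and 100%.
    """
    return "new" if bucket(user_id) < canary_percent else "old"


class BlueGreenRouter:
    """Two full environments; all traffic goes to whichever is live."""

    def __init__(self) -> None:
        self.live = "blue"   # currently serving users
        self.idle = "green"  # where the new version is deployed and checked

    def switch(self) -> None:
        # Cutting over and rolling back are the same operation:
        # swap which environment receives traffic.
        self.live, self.idle = self.idle, self.live
```

Hashing instead of picking randomly per request keeps each user on one version during the ramp, which makes the canary's metrics easier to interpret. Rolling back a blue-green release is simply a second call to `switch()`.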

These build on CI/CD (the pipeline that delivers them), Feature Flags (release control), DORA (they improve change-failure rate and recovery time), Observability (you must watch to know it is healthy), and Backward Compatibility (old and new run side by side, so changes must be compatible).

Release gradually and reversibly

Do it safely through the pipeline

Big-bang, no watch, slow rollback

// deploy new version to 100% at once; no canary; no health gate
// rollback means a manual redeploy taking 30+ minutes

A bug now hits every user instantly, nobody is watching a slice to catch it early, and recovery is slow and manual. A small regression becomes a long, full outage.

Canary with auto-rollback

// deploy -> 5% canary -> watch error rate, latency, and a key metric for 15 min
// healthy? ramp 25% -> 50% -> 100%. Regression? auto-rollback.
// feature stays behind a flag until fully validated

A bad change is caught on 5% of traffic and rolled back automatically, while a healthy one ramps safely. The result is a small blast radius and fast recovery, which is exactly what improves the DORA stability metrics (change-failure rate and recovery time).
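The ramp-and-gate loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `set_traffic` and `observe` are hypothetical callbacks standing in for your traffic router and metrics system, and the thresholds (0.5 percentage points of extra errors, 20% extra p95 latency) are arbitrary example values. In a real pipeline, `observe` would wait out the watch window (15 minutes in the example above) before returning.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Metrics:
    error_rate: float       # fraction of failed requests, e.g. 0.01 means 1%
    p95_latency_ms: float   # 95th percentile latency in milliseconds


def is_healthy(canary: Metrics, baseline: Metrics,
               max_error_delta: float = 0.005,
               max_latency_ratio: float = 1.2) -> bool:
    """Compare the canary against the old version (illustrative thresholds)."""
    errors_ok = canary.error_rate - baseline.error_rate <= max_error_delta
    latency_ok = canary.p95_latency_ms <= baseline.p95_latency_ms * max_latency_ratio
    return errors_ok and latency_ok


def run_canary(set_traffic: Callable[[int], None],
               observe: Callable[[int], Tuple[Metrics, Metrics]],
               steps: Sequence[int] = (5, 25, 50, 100)) -> Tuple[str, List[int]]:
    """Ramp through the steps; roll back to 0% on the first regression."""
    history: List[int] = []
    for percent in steps:
        set_traffic(percent)
        history.append(percent)
        canary, baseline = observe(percent)  # assumed to block for the watch window
        if not is_healthy(canary, baseline):
            set_traffic(0)                   # automatic rollback
            history.append(0)
            return "rolled_back", history
    return "promoted", history
```

Comparing the canary against the old version running at the same time, rather than against a fixed threshold, helps avoid rolling back because of a traffic spike that affects both versions equally.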

Self-review checklist

Why it matters: Most production incidents are triggered by a change, so how we roll changes out directly controls how much damage a bad one does. Progressive, watched, instantly reversible deployment shrinks the blast radius from everyone to a small slice, and turns recovery from minutes or hours into seconds. That is exactly how the best teams ship often and stay stable.