Foundation
What are DORA Metrics?
DORA metrics come from the DevOps Research and Assessment program — six years of research (published in the State of DevOps reports and the book Accelerate) identifying which software delivery practices predict organisational performance.
The research identifies four metrics that together capture both throughput (how fast you deliver) and stability (how reliable that delivery is). High performers score well on all four — they are not in tension.
The core finding: High performers deploy more frequently AND have lower failure rates AND recover faster than low performers. Speed and stability are not a tradeoff at the elite level — they are complementary.
The 4 Metrics
The Four DORA Metrics
🚀 Deployment Frequency
How often does the organisation successfully release to production?
Elite
Multiple per day
High
Once per day to once per week
Medium
Once per week to once per month
Low
Once per month to once every 6 months
⏱ Lead Time for Changes
How long does it take for a commit to get into production? (from code committed to code running in production)
Elite
Less than 1 hour
High
1 day to 1 week
Medium
1 week to 1 month
Low
1 month to 6 months
🔴 Change Failure Rate
What percentage of changes to production result in a degraded service or require remediation (hotfix, rollback, patch)?
Elite
0–15%
High
0–15%
Medium
0–15%
Low
16–30%
💓 Time to Restore Service (MTTR)
How long does it take to recover from a failure in production? (from incident start to service restored)
Elite
Less than 1 hour
High
Less than 1 day
Medium
Less than 1 day
Low
1 day to 1 week
Benchmarks
Performance Bands
| Band | Deploy Frequency | Lead Time | Change Failure Rate | MTTR |
|---|---|---|---|---|
| 🏆 Elite | Multiple/day | <1 hour | 0–15% | <1 hour |
| 🥇 High | 1/day–1/week | 1 day–1 week | 0–15% | <1 day |
| 🥈 Medium | 1/week–1/month | 1 week–1 month | 0–15% | <1 day |
| 🔴 Low | <1/month | 1–6 months | 16–30% | 1 day–1 week |
The 2021 State of DevOps report added a fifth metric: Reliability (measured as meeting or exceeding reliability targets). Elite performers score well on all five.
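The throughput bands in the table can be read as a simple classifier. A minimal Python sketch, with thresholds taken from the table — note the published bands leave a gap between 1 hour and 1 day for lead time (folded into High here), and the function names and the deploys-per-month encoding are illustrative assumptions:

```python
from datetime import timedelta

def deploy_frequency_band(deploys_per_month: float) -> str:
    """Classify Deployment Frequency (production deploys per month)."""
    if deploys_per_month >= 60:    # multiple per day
        return "Elite"
    if deploys_per_month >= 4:     # once per week or better
        return "High"
    if deploys_per_month >= 1:     # once per month or better
        return "Medium"
    return "Low"

def lead_time_band(lead_time: timedelta) -> str:
    """Classify Lead Time for Changes (commit -> production)."""
    if lead_time < timedelta(hours=1):
        return "Elite"
    if lead_time <= timedelta(weeks=1):
        return "High"              # published bands skip 1h-1d; folded in here
    if lead_time <= timedelta(days=30):
        return "Medium"
    return "Low"
```

Change Failure Rate and MTTR are left out because their published bands overlap (0–15% spans Elite through Medium), so a threshold classifier adds little there.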
Improvement
How to Improve Each Metric
Deployment Frequency
Blockers: long approval chains, manual release processes, infrequent merges

Improvements:
→ Trunk-based development (feature branches <1 day old)
→ Automated CI/CD pipeline (no manual steps to deploy)
→ Feature flags to decouple deploy from release
→ Small, incremental commits (not big-bang merges)
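Feature flags are the mechanism that lets code ship to production dark while the release happens later. A minimal sketch — in practice the flag store would be LaunchDarkly, Unleash, or a config service; here it is an environment variable, and `checkout` and the flag name are hypothetical:

```python
import os

def flag_enabled(name: str, default: bool = False) -> bool:
    """Read a feature flag at runtime (env-var-backed stand-in for a flag service)."""
    value = os.environ.get(f"FLAG_{name.upper()}")
    if value is None:
        return default
    return value.lower() in ("1", "true", "on")

def checkout(cart: list) -> str:
    # The new flow is already deployed but stays dark until the flag flips:
    # deploying and releasing become independent events.
    if flag_enabled("new_checkout"):
        return f"new-flow:{len(cart)} items"
    return f"old-flow:{len(cart)} items"
```

Because releasing is now a config change rather than a deploy, deploys can happen as often as the pipeline allows without exposing unfinished work.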
Lead Time for Changes
Blockers: long code review queues, slow test suites, manual approvals

Improvements:
→ Automated testing (unit + integration + smoke) <10 min
→ Pair programming / async code review SLA (24h max)
→ Eliminate manual gates; automate compliance checks
→ Reduce batch size of changes
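A review SLA only works if breaches surface automatically. A sketch of the check, assuming PR open timestamps pulled from your forge's API (the `(pr_id, opened_at)` tuple shape is illustrative):

```python
from datetime import datetime, timedelta

REVIEW_SLA = timedelta(hours=24)  # assumed SLA; tune per team

def stale_reviews(open_prs, now):
    """Return ids of PRs whose review has been pending longer than the SLA.

    `open_prs` is a list of (pr_id, opened_at) tuples — in practice these
    come from the GitHub/GitLab API.
    """
    return [pr_id for pr_id, opened_at in open_prs
            if now - opened_at > REVIEW_SLA]
```

Running this on a schedule and posting the result to the team channel turns the SLA from a policy into a feedback loop.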
Change Failure Rate
Blockers: insufficient automated testing, lack of staging parity

Improvements:
→ Test coverage and quality gates in CI
→ Staging environment that mirrors production
→ Progressive rollout: canary → 5% → 25% → 100%
→ Automated regression tests before every deploy
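The progressive-rollout step can be sketched as a promotion gate: advance to the next traffic slice only while the observed error rate stays inside budget. The stage fractions mirror the sequence above; the error budget is an assumption to tune per service:

```python
ROLLOUT_STAGES = [0.01, 0.05, 0.25, 1.0]  # canary -> 5% -> 25% -> 100%
ERROR_BUDGET = 0.02                       # assumed max tolerated error rate

def next_stage(current: float, observed_error_rate: float):
    """Return the next traffic fraction, or None to halt and roll back."""
    if observed_error_rate > ERROR_BUDGET:
        return None                       # failed the gate: stop the rollout
    i = ROLLOUT_STAGES.index(current)
    if i + 1 < len(ROLLOUT_STAGES):
        return ROLLOUT_STAGES[i + 1]
    return current                        # already at full traffic
```

Each stage limits the blast radius of a bad change, which is what drives the failure rate down without slowing deploys.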
Time to Restore (MTTR)
Blockers: slow detection, complex rollback, unclear ownership

Improvements:
→ Comprehensive monitoring and alerting (detect within minutes)
→ Automated rollback or blue/green deploy capability
→ On-call runbooks with recovery steps
→ Blameless post-mortems with follow-through on action items
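Automated rollback usually hangs off a post-deploy watchdog. A sketch of the decision logic over a stream of health-check results (the consecutive-failure threshold is an assumption; real systems feed this from their monitoring, not a list):

```python
def should_roll_back(health_checks, failure_threshold: int = 3) -> bool:
    """Trip the rollback after N consecutive failed post-deploy health checks.

    `health_checks` is an iterable of booleans (True = healthy), ordered
    oldest to newest.
    """
    consecutive = 0
    for healthy in health_checks:
        consecutive = 0 if healthy else consecutive + 1
        if consecutive >= failure_threshold:
            return True
    return False
```

Requiring consecutive failures avoids rolling back on a single flaky probe while still keeping detection in the minutes range.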
Measurement
How to Measure DORA
Deployment Frequency
→ Count deployments to production per week/day from your CD pipeline
→ Use: GitHub Actions logs, Spinnaker, Argo CD, CircleCI deployments

Lead Time for Changes
→ Time from first commit on a change to production deploy
→ Use: PR creation timestamp → merge → deploy pipeline completion
→ Tools: LinearB, Sleuth, DORA dashboard in Google Cloud

Change Failure Rate
→ Count deploys that triggered an incident, hotfix, or rollback, divided by total deploys in the period
→ Use: PagerDuty incidents + deployment events; label "rollback" deploys

MTTR
→ PagerDuty / Opsgenie incident opened → incident resolved timestamps
→ Average or 85th percentile per month
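The four measurements above can be sketched as one function over raw events. The event shapes (dict keys, tuple order) are assumptions for illustration — in practice they come from your CD pipeline and incident tracker exports:

```python
from datetime import datetime, timedelta

def dora_metrics(deploys, incidents, period_days: int = 30) -> dict:
    """Compute the four DORA metrics from raw events.

    deploys:   list of dicts {"at": datetime, "commit_at": datetime,
               "caused_incident": bool}
    incidents: list of (opened_at, resolved_at) tuples
    """
    deploy_freq = len(deploys) / period_days                      # per day
    lead_times = sorted(d["at"] - d["commit_at"] for d in deploys)
    p85 = lead_times[int(0.85 * (len(lead_times) - 1))]           # crude percentile
    cfr = sum(d["caused_incident"] for d in deploys) / len(deploys)
    restore_times = [resolved - opened for opened, resolved in incidents]
    mttr = sum(restore_times, timedelta()) / len(restore_times)
    return {"deploys_per_day": deploy_freq, "lead_time_p85": p85,
            "change_failure_rate": cfr, "mttr": mttr}
```

Even a spreadsheet of these events is enough to run this by hand, which fits the advice below about starting with manual measurement.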
Start with manual measurement for 1–2 weeks to understand your baseline before investing in tooling. Accuracy matters more than automation at first.
Reference
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Optimising metrics in isolation | Teams game the metric without improving the underlying capability | All four metrics together tell the story; don't cherry-pick |
| DORA as individual KPI | Developers blamed for team-level metrics | DORA measures the team and the system, never individuals |
| Measuring DORA without acting on it | Dashboards without action plans; "metrics theatre" | Metrics must drive specific improvement experiments |
| Counting deployments to staging | Inflates Deployment Frequency; production is what matters | Only production deployments count |
| Conflating Lead Time with Sprint Cycle Time | Different measurements; sprint metrics ≠ deployment lead time | Lead time = code committed → production; measure end-to-end |
| MTTR without post-mortems | Recover fast but never learn; same incidents recur | Every P1/P2 incident requires a blameless post-mortem + actions |
Reference
DORA Cheat Sheet
4 Metrics
DF → Deployment Frequency (how often to production?)
LT → Lead Time for Changes (commit → production)
CFR → Change Failure Rate (% deploys causing incidents)
MTTR → Time to Restore Service (incident start → resolved)

Elite benchmarks
DF → multiple times per day
LT → less than 1 hour
CFR → 0–15%
MTTR → less than 1 hour

Improving DF & LT (throughput)
→ Trunk-based development
→ Automated CI/CD, no manual gates
→ Feature flags
→ Small batch sizes

Improving CFR & MTTR (stability)
→ Comprehensive automated testing
→ Production-parity staging
→ Progressive rollout (canary)
→ Monitoring + alerting + runbooks
→ Automated rollback capability
→ Blameless post-mortems

Remember
→ High performers excel at ALL four — they are not in tension
→ Measure the system, never individuals