Skip to main content

Postmortem Template

Type: PostmortemCreated: Team: Platform
published

Summary

2–3 sentences in plain language: what happened, the impact, and how it was resolved.

Timeline

Time-ordered events in UTC.

Time (UTC)Event
HH:MMAlert fired
HH:MMOn-call acknowledged
HH:MMRoot cause identified
HH:MMFix deployed
HH:MMService restored

Impact

  • Users affected: …
  • Duration: …
  • Data loss: yes/no, scope
  • Revenue or trust impact: …

Root cause

What actually caused it. Be specific — "X broke because Y", not "human error".

Contributing factors

What conditions made this more likely or harder to detect?

What went well

Things that worked — fast detection, good comms, an effective fix.

What went poorly

What slowed detection or recovery. Focus on systems, not people.

Action items

Track each to completion — a postmortem without follow-through is theater.

ActionOwnerDueStatus
engineeringYYYY-MM-DDopen