sre-incident-response

Official

Streamline incident response, learn from every outage.

Author: TheBushidoCollective
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill guides you through incident response and postmortems so that outages are resolved faster and every production issue feeds back into how the team works. It gives incident management a fixed structure: severity levels, assigned roles, communication templates, and a blameless postmortem format.

Core Features & Use Cases

  • Incident Severity & Process: Standardize incident classification (P0-P3) and follow a clear 5-step response process from detection to follow-up.
  • Role Definition & Communication: Assign clear roles (Incident Commander, Ops Lead, Comms Lead) and use templates for consistent internal and external communication.
  • Blameless Postmortems: Conduct structured postmortems to identify root causes, track action items, and foster a culture of continuous improvement without blame.
  • Use Case: During a critical P0 outage, activate this Skill to assign roles quickly, keep stakeholders informed with the communication templates, and confirm that every mitigation and resolution step is followed, so that recovery is faster and follow-up is not skipped.
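
To make the severity levels and roles listed above concrete, here is a minimal Python sketch of how an incident record might track them. It is illustrative only and not part of the Skill: the `Incident` class, its method names, and the example values are assumptions, and only the P0-P3 labels and the three role names come from the feature list.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    # P0 is the most critical level, P3 the least (labels from the feature list).
    P0 = 0
    P1 = 1
    P2 = 2
    P3 = 3


class Role(Enum):
    INCIDENT_COMMANDER = "Incident Commander"
    OPS_LEAD = "Ops Lead"
    COMMS_LEAD = "Comms Lead"


@dataclass
class Incident:
    # Hypothetical record type; the Skill itself does not define this structure.
    title: str
    severity: Severity
    roles: dict = field(default_factory=dict)

    def assign(self, role: Role, person: str) -> None:
        """Record who holds a given role for this incident."""
        self.roles[role] = person

    def unfilled_roles(self) -> list:
        """Return the roles that nobody has been assigned to yet."""
        return [role for role in Role if role not in self.roles]


incident = Incident("Checkout API outage", Severity.P0)
incident.assign(Role.INCIDENT_COMMANDER, "alice")
print([role.value for role in incident.unfilled_roles()])
# prints ['Ops Lead', 'Comms Lead']
```

Checking for unfilled roles at the start of an incident is one simple way to enforce the "assign clear roles" step before mitigation begins.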

Quick Start

Use the sre-incident-response skill to draft an initial P1 incident notification for a service experiencing elevated error rates, including impact, IC, and next update time.
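
For reference, the notification requested by a prompt like the one above could be assembled along these lines. This is a hedged sketch, not the Skill's actual template: the function name `draft_notification`, the message layout, the 30-minute default update interval, and the example service name are all assumptions. Only the fields themselves (severity, impact, IC, next update time) come from the prompt.

```python
from datetime import datetime, timedelta, timezone


def draft_notification(service, severity, impact, commander, sent_at,
                       update_in_minutes=30):
    """Build a plain-text initial incident notification (hypothetical layout)."""
    # The next update time is computed from the send time plus a fixed interval.
    next_update = sent_at + timedelta(minutes=update_in_minutes)
    return "\n".join([
        f"[{severity}] Incident: {service}",
        f"Impact: {impact}",
        f"Incident Commander: {commander}",
        f"Next update: {next_update:%Y-%m-%d %H:%M} UTC",
    ])


sent = datetime(2024, 1, 15, 14, 0, tzinfo=timezone.utc)
print(draft_notification(
    service="payments-api",  # hypothetical service name
    severity="P1",
    impact="Elevated error rates on checkout requests",
    commander="alice",
    sent_at=sent,
))
```

Stating the next update as an absolute time, rather than "in 30 minutes", keeps the message unambiguous when it is read later or forwarded.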

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install the Skill automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: sre-incident-response
Download link: https://github.com/TheBushidoCollective/han/archive/main.zip#sre-incident-response

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
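
If you prefer to do the extraction step by hand, the following Python sketch shows one way to copy the Skill out of an already downloaded archive. It rests on assumptions that this page does not confirm: that the archive contains a directory named sre-incident-response somewhere in its tree, and that copying that directory into .claude/skills/ is sufficient. The function name `install_skill` is hypothetical.

```python
import zipfile
from pathlib import Path, PurePosixPath


def install_skill(zip_path, skill_name, skills_dir=Path(".claude/skills")):
    """Copy files found under a directory named `skill_name` into skills_dir."""
    target = Path(skills_dir) / skill_name
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            # Skip directory entries, unrelated files, and unsafe ".." paths.
            if info.is_dir() or ".." in parts or skill_name not in parts[:-1]:
                continue
            # Keep only the part of the path below the skill directory.
            relative = Path(*parts[parts.index(skill_name) + 1:])
            destination = target / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
            written.append(destination)
    return written
```

With the archive saved locally as, say, main.zip, calling `install_skill("main.zip", "sre-incident-response")` from the project root would place the files under .claude/skills/sre-incident-response/. An empty return value means no matching directory was found, in which case the layout assumption above does not hold for that archive.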