sre-incident-response
Official
Streamline incident response, learn from every outage.
Author: TheBushidoCollective
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill guides you through incident response and postmortem processes, with the goal of minimizing downtime and learning from every production issue. It automates a structured approach to incident management.
Core Features & Use Cases
- Incident Severity & Process: Standardize incident classification (P0-P3) and follow a clear 5-step response process from detection to follow-up.
- Role Definition & Communication: Assign clear roles (Incident Commander, Ops Lead, Comms Lead) and utilize templates for consistent internal and external communication.
- Blameless Postmortems: Conduct structured postmortems to identify root causes, track action items, and foster a culture of continuous improvement without blame.
- Use Case: During a critical P0 outage, activate this Skill to quickly assign roles, use the communication templates to keep stakeholders informed, and ensure all steps for mitigation and resolution are followed, leading to faster recovery and effective follow-up.
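The severity labels and roles listed above can be modeled as a small data structure. The following is a minimal Python sketch, not the Skill's actual implementation: the names `Severity`, `Role`, `Incident`, and `unassigned_roles` are hypothetical, and the listing does not define what each severity level means beyond the P0-P3 labels.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    """Severity labels P0-P3, as named in the feature list."""
    P0 = 0
    P1 = 1
    P2 = 2
    P3 = 3


class Role(Enum):
    """The three roles named in the feature list."""
    INCIDENT_COMMANDER = "Incident Commander"
    OPS_LEAD = "Ops Lead"
    COMMS_LEAD = "Comms Lead"


@dataclass
class Incident:
    title: str
    severity: Severity
    # Maps each role to the person holding it (hypothetical structure).
    assignments: dict = field(default_factory=dict)

    def assign(self, role: Role, person: str) -> None:
        """Record who holds a given role for this incident."""
        self.assignments[role] = person

    def unassigned_roles(self) -> list:
        """Return roles that still need an owner, in declaration order."""
        return [role for role in Role if role not in self.assignments]


# Hypothetical example values.
incident = Incident(title="Checkout API returning errors", severity=Severity.P0)
incident.assign(Role.INCIDENT_COMMANDER, "alice")
print([role.value for role in incident.unassigned_roles()])
# prints ['Ops Lead', 'Comms Lead']
```

Checking for unassigned roles at the start of an incident is one way to make sure no role is left without an owner before mitigation begins.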
Quick Start
Use the sre-incident-response skill to draft an initial P1 incident notification for a service experiencing elevated error rates, including impact, IC, and next update time.
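To illustrate what such a notification might contain, here is a minimal Python sketch that renders the three fields named in the prompt above (impact, IC, and next update time). The `IncidentNotification` class, the message layout, the example values, and the 30-minute update interval are all assumptions for illustration; the Skill's own templates may look different.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class IncidentNotification:
    severity: str            # e.g. "P1"
    service: str
    impact: str
    incident_commander: str
    next_update: datetime    # expected to be in UTC

    def render(self) -> str:
        """Format the notification as plain text (layout is an assumption)."""
        return "\n".join([
            f"[{self.severity}] Incident: {self.service}",
            f"Impact: {self.impact}",
            f"Incident Commander: {self.incident_commander}",
            f"Next update: {self.next_update:%Y-%m-%d %H:%M} UTC",
        ])


# Hypothetical example values; the 30-minute interval is an assumption.
detected_at = datetime(2024, 1, 15, 14, 0, tzinfo=timezone.utc)
notification = IncidentNotification(
    severity="P1",
    service="payments-api",
    impact="Elevated error rates on checkout requests",
    incident_commander="alice",
    next_update=detected_at + timedelta(minutes=30),
)
print(notification.render())
```

Keeping the fields in a typed structure makes it harder to send a notification that omits the IC or the next update time.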
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: sre-incident-response
Download link: https://github.com/TheBushidoCollective/han/archive/main.zip#sre-incident-response
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.