investigate-incident
Official
Streamline incident response, document RCA.
Software Engineering #troubleshooting #Kubernetes #post-mortem #operations #incident response #runbooks #RCA
Author: redhat-et
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Incident investigation is often a chaotic, manual process that spans multiple tools and carries a high risk of missing critical information. This skill provides a structured, automated workflow for incident response: it guides users through evidence gathering, root cause analysis (RCA), and documentation, with the aim of shortening resolution time and improving post-incident learning.
Core Features & Use Cases
- Guided Investigation: Follow alert-specific runbooks and systematic steps to diagnose issues.
- Automated Evidence Collection: Gather logs, metrics, events, and application status with pre-defined commands.
- Structured RCA Documentation: Generate comprehensive incident reports in TODO_INCIDENTS.md for future learning.
- Use Case: An alert for "Prometheus Down" has fired. Use this skill to automatically follow the Prometheus Down runbook, gather relevant logs and metrics, and start documenting the incident for root cause analysis.
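To make the "Structured RCA Documentation" feature more concrete, here is a minimal sketch of how one incident entry could be rendered and appended to TODO_INCIDENTS.md. The `IncidentRecord` fields, the section layout, and the assumption that timestamps are in UTC are all hypothetical; the skill's actual report format is not shown on this page.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class IncidentRecord:
    """Hypothetical shape of one incident entry; field names are illustrative."""
    alert: str
    started_at: datetime  # assumed to already be in UTC
    summary: str
    evidence: list[str] = field(default_factory=list)
    root_cause: str = "TBD"
    follow_ups: list[str] = field(default_factory=list)


def render_incident(record: IncidentRecord) -> str:
    """Render one incident as a Markdown section suitable for appending."""
    lines = [
        f"## {record.alert} ({record.started_at:%Y-%m-%d %H:%M} UTC)",
        "",
        f"**Summary:** {record.summary}",
        "",
        "**Evidence:**",
    ]
    # Fall back to a placeholder bullet when a list is still empty.
    lines += [f"- {item}" for item in record.evidence] or ["- (none collected yet)"]
    lines += ["", f"**Root cause:** {record.root_cause}", "", "**Follow-ups:**"]
    lines += [f"- [ ] {item}" for item in record.follow_ups] or ["- (none yet)"]
    return "\n".join(lines) + "\n"


def append_incident(path: str, record: IncidentRecord) -> None:
    """Append the rendered entry to the incident log, creating it if needed."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("\n" + render_incident(record))
```

Calling `append_incident("TODO_INCIDENTS.md", record)` at the start of an investigation and again once the root cause is known would keep a running log; appending rather than overwriting is a deliberate choice so earlier incidents stay available for later review.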
Quick Start
Use the investigate-incident skill to start an investigation for the "Prometheus Down" alert, following its runbook and gathering initial evidence.
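As a rough illustration of what "gathering initial evidence" can look like on a Kubernetes cluster, the sketch below builds a handful of read-only `kubectl` commands and captures their output. It is not the skill's own command set: the function names, the choice of commands, and the example namespace and label selector are assumptions made for illustration.

```python
import subprocess


def evidence_commands(namespace: str, selector: str, tail: int = 200) -> dict[str, list[str]]:
    """Build read-only kubectl commands for a first pass at evidence gathering."""
    return {
        "pods": ["kubectl", "get", "pods", "-n", namespace, "-l", selector, "-o", "wide"],
        "events": ["kubectl", "get", "events", "-n", namespace, "--sort-by=.lastTimestamp"],
        "describe": ["kubectl", "describe", "pods", "-n", namespace, "-l", selector],
        "logs": ["kubectl", "logs", "-n", namespace, "-l", selector,
                 f"--tail={tail}", "--all-containers=true"],
    }


def collect(commands: dict[str, list[str]], timeout: int = 30) -> dict[str, str]:
    """Run each command and keep its output (or the error text) keyed by name."""
    results = {}
    for name, argv in commands.items():
        try:
            proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
            # Keep stderr on failure so the reason is recorded with the evidence.
            results[name] = proc.stdout if proc.returncode == 0 else proc.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            results[name] = f"failed to run: {exc}"
    return results
```

For the "Prometheus Down" example, one might call `collect(evidence_commands("monitoring", "app.kubernetes.io/name=prometheus"))`; both the namespace and the label are guesses that depend on how Prometheus was deployed. Errors are recorded rather than raised so that one failing command does not stop the rest of the collection.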
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: investigate-incident
Download link: https://github.com/redhat-et/kagenti-demo-deployment/archive/main.zip#investigate-incident
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
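For readers who prefer to install by hand, the extraction step can be sketched as follows. This is an assumption-laden illustration, not the official installer: the page does not say where the skill folder sits inside the archive, so the hypothetical `extract_skill` helper simply copies every file found under a folder named after the skill.

```python
import io
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(zip_bytes: bytes, skill_name: str, skills_dir: Path) -> Path:
    """Copy every file under a folder named `skill_name` into skills_dir/skill_name."""
    target = skills_dir / skill_name
    copied = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the path below the skill folder.
            relative = parts[parts.index(skill_name) + 1:]
            if not relative or ".." in relative:
                continue  # skip odd entries and anything trying to escape the target
            out_path = target.joinpath(*relative)
            out_path.parent.mkdir(parents=True, exist_ok=True)
            out_path.write_bytes(archive.read(info))
            copied += 1
    if copied == 0:
        raise FileNotFoundError(f"no folder named {skill_name!r} in archive")
    return target
```

The archive bytes could be fetched with `urllib.request.urlopen(...)` using the download link above without its `#investigate-incident` fragment, then passed in as `extract_skill(data, "investigate-incident", Path(".claude/skills"))`. Inspect the extracted folder before relying on it, since the layout inside the repository is assumed rather than confirmed here.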