incident-root-cause-analyzer
Community
Automate root-cause analysis, cut outages fast.
Author: LitheScript
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill provides an end-to-end methodology for analyzing production incidents in distributed systems, helping teams quickly identify root causes driven by cascading failures, resource contention, and backpressure. It aims to reduce mean time to detection (MTTD) and mean time to resolution (MTTR) by automating data collection, anomaly scoring, hypothesis testing, and evidence-based reporting.
Core Features & Use Cases
- Automated root-cause analysis across microservice boundaries using logs, metrics, and traces.
- Anomaly detection with timeline correlation to reveal the sequence of events leading to an incident.
- Evidence-based hypothesis testing and automated visualizations (Mermaid diagrams) to communicate findings.
- Formal Root Cause Analysis reports and structured evidence artifacts to speed incident reviews.
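To make the anomaly scoring and timeline correlation idea concrete, here is a minimal sketch using only the Python standard library. It is not the Skill's actual implementation: the function names, the z-score method, the threshold of 3.0, and the sample metric names are all assumptions chosen for illustration. Each metric is scored against its own pre-incident baseline, and metrics are then ordered by when they first became anomalous.

```python
from statistics import mean, stdev

def zscores(baseline, window):
    """Score each window sample by its distance from the baseline mean,
    measured in baseline standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline) or 1e-9  # guard against a perfectly flat baseline
    return [(x - mu) / sigma for x in window]

def first_anomaly(timestamps, scores, threshold=3.0):
    """Return the first timestamp whose |z-score| reaches the threshold, or None."""
    for ts, score in zip(timestamps, scores):
        if abs(score) >= threshold:
            return ts
    return None

def order_by_onset(series, baseline_len, threshold=3.0):
    """series maps a metric name to [(timestamp, value), ...].
    The first baseline_len points form the baseline; the rest are scored.
    Returns [(name, onset_timestamp)] sorted by onset, earliest first."""
    onsets = []
    for name, points in series.items():
        ts = [t for t, _ in points]
        values = [v for _, v in points]
        scores = zscores(values[:baseline_len], values[baseline_len:])
        onset = first_anomaly(ts[baseline_len:], scores, threshold)
        if onset is not None:
            onsets.append((name, onset))
    return sorted(onsets, key=lambda pair: pair[1])

# Hypothetical sample data: timestamps are seconds since the start of the capture.
SERIES = {
    "db.latency_ms": [(0, 10), (1, 11), (2, 10), (3, 9), (4, 50), (5, 60)],
    "api.error_rate": [(0, 1.0), (1, 1.1), (2, 0.9), (3, 1.0), (4, 1.0), (5, 9.0)],
    "cache.hit_ratio": [(0, 0.90), (1, 0.91), (2, 0.90), (3, 0.89), (4, 0.90), (5, 0.90)],
}

TIMELINE = order_by_onset(SERIES, baseline_len=4)
```

In this sample, the database latency deviates one step before the API error rate, and the cache hit ratio never crosses the threshold. An ordering like this is a starting hypothesis about the failure sequence, not proof of causation.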
Quick Start
Load the incident-root-cause-analyzer skill into Claude, then provide:
- Incident time window (e.g., 07:00:00 ± 30s)
- Metrics data directory (CSV files)
- Optional logs or traces directory
Then run the analysis to produce a Root Cause Analysis report, evidence charts, a fault evolution diagram, and an evidence index.
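As an illustration of how an incident time window such as `07:00:00 ± 30s` might be applied to the metrics CSV files, the sketch below parses the window and keeps only the rows that fall inside it. The window syntax handling, the `timestamp` column name, the ISO 8601 timestamp format, and the sample data are assumptions for this example; the Skill's real input format may differ.

```python
import csv
import io
import re
from datetime import date, datetime, timedelta

def parse_window(spec, day):
    """Turn 'HH:MM:SS ± Ns' into (start, end) datetimes on the given date."""
    match = re.fullmatch(r"\s*(\d{2}:\d{2}:\d{2})\s*±\s*(\d+)s\s*", spec)
    if match is None:
        raise ValueError(f"unrecognized window spec: {spec!r}")
    center_time = datetime.strptime(match.group(1), "%H:%M:%S").time()
    center = datetime.combine(day, center_time)
    delta = timedelta(seconds=int(match.group(2)))
    return center - delta, center + delta

def rows_in_window(csv_file, start, end, ts_column="timestamp"):
    """Return the CSV rows whose timestamp lies within [start, end].
    ts_column is a hypothetical column name holding ISO 8601 timestamps."""
    reader = csv.DictReader(csv_file)
    return [
        row for row in reader
        if start <= datetime.fromisoformat(row[ts_column]) <= end
    ]

# Hypothetical metrics file content, inlined so the example is self-contained.
SAMPLE_CSV = (
    "timestamp,cpu\n"
    "2024-01-01T06:59:00,20\n"
    "2024-01-01T06:59:45,85\n"
    "2024-01-01T07:00:20,97\n"
    "2024-01-01T07:01:00,30\n"
)

START, END = parse_window("07:00:00 ± 30s", date(2024, 1, 1))
ROWS = rows_in_window(io.StringIO(SAMPLE_CSV), START, END)
```

With a real metrics directory, the same filter could be applied to each file opened with `open(path, newline="")` in place of the in-memory `io.StringIO` buffer.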
Dependency Matrix
Required Modules
polars, numpy, matplotlib
Components
scripts, references, assets
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: incident-root-cause-analyzer
Download link: https://github.com/LitheScript/incident-root-cause-analyzer/archive/main.zip#incident-root-cause-analyzer
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.