incident-root-cause-analyzer

Community

Automate root-cause analysis, cut outages fast.

Author: LitheScript
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill provides an end-to-end methodology for analyzing production incidents in distributed systems, so teams can quickly identify root causes driven by cascading failures, resource contention, and backpressure. It aims to reduce mean time to detection (MTTD) and mean time to resolution (MTTR) by automating data collection, anomaly scoring, hypothesis testing, and evidence-based reporting.

Core Features & Use Cases

  • Automated root-cause analysis across microservice boundaries using logs, metrics, and traces.
  • Anomaly detection with timeline correlation to reveal the sequence of events leading to an incident.
  • Evidence-based hypothesis testing and automated visualizations (Mermaid diagrams) to communicate findings.
  • Generates formal Root Cause Analysis reports and structured evidence artifacts to speed incident reviews.
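The listing does not show how the Skill scores anomalies or builds its timeline, so the following is only a minimal sketch of the general idea behind the features above: score each metric against a pre-incident baseline, find when each service first deviates, order those onsets in time, and render the order as a Mermaid flowchart. The function names, the median/MAD scoring rule, the threshold of 5, and the sample data are all assumptions made for illustration. It uses only the Python standard library, not the Skill's actual scripts.

```python
from statistics import median

def robust_scores(values, baseline_len):
    """Score each point by its distance from the baseline median, in scaled MAD units."""
    baseline = values[:baseline_len]
    center = median(baseline)
    mad = median(abs(v - center) for v in baseline)
    # Assumption: if the baseline is perfectly flat (MAD == 0), fall back to raw units.
    scale = 1.4826 * mad if mad > 0 else 1.0
    return [abs(v - center) / scale for v in values]

def first_onset(timestamps, values, baseline_len, threshold=5.0):
    """Return the first post-baseline timestamp whose score exceeds the threshold, or None."""
    scored = list(zip(timestamps, robust_scores(values, baseline_len)))
    for ts, score in scored[baseline_len:]:
        if score > threshold:
            return ts
    return None

def onset_timeline(series, baseline_len, threshold=5.0):
    """Order the services that became anomalous by the time they first deviated."""
    onsets = []
    for name, (timestamps, values) in series.items():
        ts = first_onset(timestamps, values, baseline_len, threshold)
        if ts is not None:
            onsets.append((ts, name))
    return sorted(onsets)

def to_mermaid(timeline):
    """Render the ordered onsets as a left-to-right Mermaid flowchart."""
    lines = ["flowchart LR"]
    for i, (ts, name) in enumerate(timeline):
        lines.append(f'    n{i}["{name} @ t={ts}"]')
    for i in range(len(timeline) - 1):
        lines.append(f"    n{i} --> n{i + 1}")
    return "\n".join(lines)

# Hypothetical per-service metrics sampled once per second (t = 0..9).
t = list(range(10))
series = {
    "api":   (t, [100, 104, 98, 102, 100, 101, 103, 99, 180, 220]),
    "db":    (t, [10, 12, 9, 11, 10, 11, 40, 60, 70, 80]),
    "cache": (t, [5, 7, 4, 6, 5, 5, 6, 4, 7, 5]),
}
timeline = onset_timeline(series, baseline_len=5)
diagram = to_mermaid(timeline)
print(diagram)
```

In this made-up data the database deviates before the API and the cache never crosses the threshold, so the diagram has two nodes. An earlier onset is only a hint about propagation order; it does not prove causation on its own, which is why the listing pairs timeline correlation with hypothesis testing.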

Quick Start

Load the incident-root-cause-analyzer Skill into Claude and provide:

  • Incident time window (e.g., 07:00:00 ± 30s)
  • Metrics data directory (CSV files)
  • Optional logs or traces directory

Then run the analysis to produce a Root Cause Analysis report, evidence charts, a fault evolution diagram, and an evidence index.
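To make the inputs concrete, here is a rough sketch of how a window written as `07:00:00 ± 30s` could be turned into start and end times and used to select rows from a metrics CSV. The listing does not document the Skill's file schema, so the `timestamp` and `value` column names, the ISO 8601 timestamp format, the date, and the helper names are all assumptions for illustration.

```python
import csv
import io
import re
from datetime import datetime, timedelta

# Matches specs such as "07:00:00 ± 30s" (center time, then half-width in seconds).
WINDOW_RE = re.compile(r"^\s*(\d{2}:\d{2}:\d{2})\s*±\s*(\d+)\s*s\s*$")

def parse_window(spec, day):
    """Turn 'HH:MM:SS ± Ns' on the given day (YYYY-MM-DD) into (start, end) datetimes."""
    m = WINDOW_RE.match(spec)
    if m is None:
        raise ValueError(f"unrecognized window: {spec!r}")
    center = datetime.strptime(f"{day} {m.group(1)}", "%Y-%m-%d %H:%M:%S")
    half = timedelta(seconds=int(m.group(2)))
    return center - half, center + half

def rows_in_window(csv_text, start, end):
    """Keep (timestamp, value) pairs whose timestamp falls inside [start, end]."""
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.fromisoformat(row["timestamp"])  # assumed column name
        if start <= ts <= end:
            kept.append((ts, float(row["value"])))     # assumed column name
    return kept

# Hypothetical metrics file contents; a real run would read files from the metrics directory.
sample_csv = """timestamp,value
2024-01-01T06:59:00,0.41
2024-01-01T06:59:45,0.43
2024-01-01T07:00:10,0.97
2024-01-01T07:01:00,0.52
"""

start, end = parse_window("07:00:00 ± 30s", day="2024-01-01")
kept = rows_in_window(sample_csv, start, end)
for ts, value in kept:
    print(ts.isoformat(), value)
```

Only the two samples inside the 60-second window survive the filter. In practice it may be worth loading some data from before the window as well, so that any baseline is computed from pre-incident behavior.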

Dependency Matrix

Required Modules

polars, numpy, matplotlib

Components

scripts, references, assets

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: incident-root-cause-analyzer
Download link: https://github.com/LitheScript/incident-root-cause-analyzer/archive/main.zip#incident-root-cause-analyzer

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.