monitoring-alerting
Community
Keep your systems healthy, get alerted instantly.
Author: ricardoroche
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill simplifies setting up monitoring and alerting for applications. It provides patterns for metric instrumentation, for defining Service Level Objectives (SLOs), and for configuring timely alerts, which helps teams catch problems before they become outages and reduces manual incident response.
Core Features & Use Cases
- Metric Instrumentation: Guidance on using the Prometheus client library for counters, histograms, and gauges, including FastAPI middleware that tracks request metrics automatically.
- SLO/SLI Definition: Provides structured patterns for defining Service Level Indicators (SLIs) and Service Level Objectives (SLOs) with Prometheus queries.
- Alerting & Dashboards: Offers patterns for defining alert rules, sending notifications (Slack, PagerDuty), and configuring Grafana dashboards for visualization.
- Use Case: A DevOps engineer needs to ensure their new microservice meets a 99.9% availability target. This Skill helps them instrument HTTP requests, define an SLO for success rate, create an alert that fires when the error budget is exhausted, and build a dashboard to track performance.
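To make the 99.9% use case concrete, the error-budget arithmetic behind such an alert can be sketched in plain Python. This is an illustration only, not code shipped with the Skill: the `ErrorBudget` class, the `error_budget` function, and the request counts are hypothetical, and a real setup would more likely compute the same ratio with Prometheus queries.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    allowed_failures: float   # failures the SLO tolerates in the window
    consumed_fraction: float  # share of the budget already used (1.0 = all of it)
    exhausted: bool           # True once the budget is fully spent


def error_budget(slo_target: float, total: int, failed: int) -> ErrorBudget:
    """Compute error-budget usage for a success-rate SLO (hypothetical helper)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # A 99.9% target leaves 0.1% of requests as the budget for failures.
    allowed = (1.0 - slo_target) * total
    if allowed > 0:
        consumed = failed / allowed
    else:
        # No traffic means no budget; any failure counts as over budget.
        consumed = 0.0 if failed == 0 else float("inf")
    return ErrorBudget(allowed, consumed, consumed >= 1.0)


# Example: one million requests in the window, 400 of them failed.
budget = error_budget(0.999, 1_000_000, 400)
print(f"allowed failures: {budget.allowed_failures:.0f}")
print(f"budget consumed:  {budget.consumed_fraction:.0%}")
print(f"exhausted:        {budget.exhausted}")
```

With these illustrative numbers, the service may fail roughly 1,000 requests in the window, so 400 failures use about 40% of the budget and the alert would not yet fire.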
Quick Start
Help me instrument my FastAPI application with Prometheus metrics for request rate and latency.
Dependency Matrix
Required Modules
prometheus_client, fastapi, httpx, pydantic
Components
Standard package
Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: monitoring-alerting
Download link: https://github.com/ricardoroche/ricardos-claude-code/archive/main.zip#monitoring-alerting
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.