Site Reliability Engineering
Community
Balance reliability and velocity.
Author: 7a336e6e
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps teams define and maintain service reliability by establishing clear objectives and managing operational risks, ensuring a balance between feature delivery and system stability.
Core Features & Use Cases
- SLO Definition: Quantify reliability targets (e.g., availability, latency) for services.
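- Error Budget Management: Track how much of the unreliability permitted by each SLO has been consumed, and use the remaining budget to inform deployment decisions (e.g., a code freeze once the budget is exhausted).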
- Incident Review: Conduct blameless post-mortems to learn from outages and prevent recurrence.
- Use Case: After a service experiences downtime, use this Skill to define SLIs, set an SLO, manage the resulting error budget, and conduct a blameless post-mortem to identify root causes and implement preventative measures.
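The relationship between an SLI, an SLO, and an error budget can be sketched in a few lines of Python. This is a minimal illustration and not part of the Skill itself: the `SLO` class, the function names, the request-based availability SLI, and the "freeze when the budget is spent" rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """A reliability target for one service (hypothetical structure)."""
    name: str
    target: float  # e.g. 0.999 means 99.9% of requests must succeed


def sli(good_events: int, total_events: int) -> float:
    """Availability SLI: the fraction of events that were good."""
    if total_events == 0:
        return 1.0  # assumption: no traffic counts as fully compliant
    return good_events / total_events


def error_budget_remaining(slo: SLO, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = (1.0 - slo.target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% target (or zero traffic) leaves no budget to spend.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


def should_freeze(slo: SLO, good_events: int, total_events: int) -> bool:
    """Assumed policy: freeze deployments once the budget is used up."""
    return error_budget_remaining(slo, good_events, total_events) <= 0.0


if __name__ == "__main__":
    auth = SLO(name="user-authentication", target=0.999)
    good, total = 999_500, 1_000_000
    print(f"SLI: {sli(good, total):.4%}")
    print(f"Budget remaining: {error_budget_remaining(auth, good, total):.1%}")
    print(f"Freeze deployments: {should_freeze(auth, good, total)}")
```

With a 99.9% target and one million requests, roughly 1,000 failures are allowed; 500 observed failures leave about half of the budget, so the assumed policy would still permit deployments.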
Quick Start
Use the Site Reliability Engineering skill to define SLIs and SLOs for the user authentication service.
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: Site Reliability Engineering
Download link: https://github.com/7a336e6e/skills/archive/main.zip#site-reliability-engineering
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.