Site Reliability Engineering

Community

Balance reliability and velocity.

Author: 7a336e6e
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps teams define and maintain service reliability. It sets measurable reliability objectives, tracks the error budget those objectives imply, and uses that budget to balance feature delivery against system stability.

Core Features & Use Cases

  • SLO Definition: Set quantified reliability targets (service level objectives, SLOs) for a service, such as availability or latency.
  • Error Budget Management: Track how much unreliability the SLO still permits and use the remaining budget to inform deployment decisions (e.g., a code freeze once the budget is spent).
  • Incident Review: Conduct blameless post-mortems to learn from outages and prevent recurrence.
  • Use Case: After a service experiences downtime, use this Skill to define service level indicators (SLIs), set an SLO, manage the resulting error budget, and conduct a blameless post-mortem that identifies root causes and preventative measures.
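
The error budget idea above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the Skill itself: the class name ErrorBudget, its fields, and the "freeze when the budget is spent" rule are illustrative assumptions for a request-based availability SLO.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical helper: error budget for a request-based availability SLO."""

    slo_target: float      # e.g. 0.999 means "99.9% of requests succeed"
    total_requests: int    # requests observed in the SLO window
    failed_requests: int   # requests that violated the SLI in that window

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 = budget untouched, 0.0 = fully spent, negative = overspent.
        allowed = self.allowed_failures
        if allowed == 0:
            return 0.0 if self.failed_requests else 1.0
        return 1.0 - self.failed_requests / allowed

    def should_freeze(self) -> bool:
        # Assumed policy: pause risky deployments once the budget is spent.
        return self.remaining_fraction <= 0.0


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
    print(f"allowed failures: {budget.allowed_failures:.0f}")
    print(f"budget remaining: {budget.remaining_fraction:.0%}")
    print(f"freeze deployments: {budget.should_freeze()}")
```

With a 99.9% target over one million requests, roughly 1,000 failures are allowed, so 400 failures leave about 60% of the budget. Real policies often act earlier than full exhaustion, for example on a burn-rate alert; the threshold here is only a placeholder.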

Quick Start

Use the Site Reliability Engineering skill to define SLIs and SLOs for the user authentication service.
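
To illustrate what such a prompt might yield for an authentication service, here is a hedged sketch of two common SLIs computed from request records. The Request type, the rule that only 5xx responses count as failures, and the 300 ms latency threshold are assumptions chosen for the example, not output of the Skill.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one authentication request."""

    status: int        # HTTP status code returned to the client
    latency_ms: float  # time taken to serve the request


def availability_sli(requests: Sequence[Request]) -> float:
    # Fraction of requests not answered with a server error (5xx).
    # A 401 for bad credentials is a correct answer, so it counts as good.
    if not requests:
        return 1.0  # assumption: no traffic means no bad events
    good = sum(1 for r in requests if r.status < 500)
    return good / len(requests)


def latency_sli(requests: Sequence[Request], threshold_ms: float) -> float:
    # Fraction of requests served within the latency threshold.
    if not requests:
        return 1.0
    fast = sum(1 for r in requests if r.latency_ms <= threshold_ms)
    return fast / len(requests)


def meets_slo(sli: float, target: float) -> bool:
    # An SLO is a target that the measured SLI must reach or exceed.
    return sli >= target


if __name__ == "__main__":
    window = [
        Request(200, 120.0),
        Request(200, 450.0),
        Request(500, 80.0),
        Request(401, 90.0),
    ]
    print("availability SLI:", availability_sli(window))
    print("latency SLI (<= 300 ms):", latency_sli(window, 300.0))
    print("meets 99% availability SLO:", meets_slo(availability_sli(window), 0.99))
```

Treating empty windows as fully compliant is one possible choice; a team may prefer to report "no data" instead so that a silent outage is not mistaken for perfect reliability.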

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: Site Reliability Engineering
Download link: https://github.com/7a336e6e/skills/archive/main.zip#site-reliability-engineering

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.