ops-resilience-chaos-engineer

Community

Harden reliability with chaos testing.

Author: ThiagoGuislotti
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Platform reliability suffers when timeouts, transient failures, and maintenance windows turn into outages. This skill provides a structured approach to hardening reliability through timeout, retry, and circuit-breaker strategies, capacity controls, chaos testing, and disaster-recovery readiness.
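As a rough illustration of the retry and circuit-breaker strategies mentioned above, here is a minimal Python sketch. It is not taken from the skill itself: the names `CircuitBreaker`, `CircuitOpenError`, and `retry`, along with the thresholds and delays, are assumptions chosen for the example, and per-call timeouts are left out for brevity.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected unattempted."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, then cools down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock              # injectable so tests need not wait
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying against an open circuit only adds load
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

A caller might combine the two as `retry(lambda: breaker.call(fetch))`, where `fetch` is whatever remote call needs protecting. Retries absorb transient failures, while the breaker stops sending traffic to a dependency that keeps failing.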

Core Features & Use Cases

  • Resilience strategy templates for service-level objectives and incident response.
  • Chaos testing orchestration and safe failure-mode experiments with runbooks.
  • Disaster-recovery planning and recovery validation integrated into CI/CD pipelines.
  • Use Case: When deploying a critical microservice, simulate failures, observe degradation, and verify recovery within defined RTO/RPO targets.
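The use case above can be sketched as a small experiment loop: inject a fault, probe until the service is healthy again, and compare the measured recovery time with the RTO target. This is only an illustration on a simulated clock. `SimulatedService`, `run_experiment`, and `ExperimentResult` are hypothetical names, not part of the skill, and RPO (data-loss) checks are out of scope here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExperimentResult:
    failed_probes: int
    recovery_seconds: Optional[float]  # None if the service never recovered
    within_rto: bool


class SimulatedService:
    """Hypothetical service that needs `restart_seconds` to recover from a fault."""

    def __init__(self, restart_seconds):
        self.restart_seconds = restart_seconds
        self.down_until = None  # None means healthy

    def inject_fault(self, now):
        self.down_until = now + self.restart_seconds

    def healthy(self, now):
        return self.down_until is None or now >= self.down_until


def run_experiment(service, rto_seconds, probe_interval=1.0, max_seconds=60.0):
    """Inject one fault at t=0, then probe on a simulated clock until recovery."""
    now = 0.0
    service.inject_fault(now)
    failed = 0
    while now <= max_seconds:
        if service.healthy(now):
            return ExperimentResult(failed, now, now <= rto_seconds)
        failed += 1
        now += probe_interval
    # Gave up before the service recovered: treat as an RTO breach.
    return ExperimentResult(failed, None, False)


if __name__ == "__main__":
    result = run_experiment(SimulatedService(restart_seconds=5.0), rto_seconds=10.0)
    print(result)
```

In a real experiment the simulated clock and service would be replaced by an actual fault-injection mechanism and health probe. The pass/fail comparison against the RTO stays the same.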

Quick Start

To start a resilience assessment, load the minimal context files and run the chaos-engineer agent.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install the skill automatically by pasting the text below into Claude Code.

Please help me install this Skill:
Name: ops-resilience-chaos-engineer
Download link: https://github.com/ThiagoGuislotti/copilot-instructions/archive/main.zip#ops-resilience-chaos-engineer

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.