ops-resilience-chaos-engineer
Community
Harden reliability with chaos testing.
Software Engineering
#resilience #reliability #runbook #incident-response #chaos-engineering #fault-injection #chaos-testing
Author: ThiagoGuislotti
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Timeouts, transient failures, and maintenance windows can escalate into outages when a platform has no deliberate way to absorb them. This skill provides a structured approach to hardening reliability through timeout, retry, and circuit-breaker strategies, capacity controls, chaos testing, and disaster-recovery readiness.
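As an illustration of the retry and circuit-breaker strategies named above, here is a minimal Python sketch. It is not taken from the skill itself: the names `CircuitBreaker`, `CircuitOpenError`, and `retry`, and the default thresholds and delays, are assumptions chosen for the example. Timeouts are left to whatever client library makes the actual call.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds to stay open before a trial call
        self.clock = clock              # injectable so tests can control time
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single failure during the trial re-opens the circuit.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the failure count
        return result


def retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry func with exponential backoff; re-raise the last error."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not keep hammering an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

A caller would typically combine the two, for example `retry(lambda: breaker.call(fetch_orders))`, where `fetch_orders` is a placeholder for a real dependency call. Retries absorb transient failures, while the breaker stops traffic to a dependency that keeps failing.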
Core Features & Use Cases
- Resilience strategy templates for service-level objectives and incident response.
- Chaos testing orchestration and safe failure-mode experiments with runbooks.
- Disaster-recovery planning and recovery validation integrated into CI/CD pipelines.
- Use Case: When deploying a critical microservice, simulate failures, observe degradation, and verify recovery within defined RTO/RPO targets.
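The use case above (inject a failure, observe degradation, verify recovery against an RTO) can be sketched as a small experiment loop. This is an illustrative Python sketch under stated assumptions, not the skill's own tooling: `FaultInjector`, `is_healthy`, and `run_experiment` are hypothetical names, the fault is a simulated `ConnectionError`, and RPO (data-loss) checks are out of scope here.

```python
import time


class FaultInjector:
    """Hypothetical wrapper that makes a dependency fail while a fault is active."""

    def __init__(self, target):
        self.target = target
        self.fault_active = False

    def __call__(self, *args, **kwargs):
        if self.fault_active:
            raise ConnectionError("injected fault")
        return self.target(*args, **kwargs)


def is_healthy(service):
    """Treat any exception from the service as an unhealthy probe."""
    try:
        service()
        return True
    except Exception:
        return False


def run_experiment(service, injector, fault_duration, rto_seconds,
                   clock=time.monotonic, sleep=time.sleep, poll_interval=0.5):
    """Inject a fault, clear it, and time how long the service takes to recover."""
    # Safety check: never start an experiment on an already unhealthy system.
    if not is_healthy(service):
        raise RuntimeError("steady state not met; aborting experiment")

    injector.fault_active = True
    degraded = not is_healthy(service)  # did the fault actually have an effect?
    sleep(fault_duration)
    injector.fault_active = False

    # Poll until the service is healthy again or the RTO budget is spent.
    cleared_at = clock()
    recovered = is_healthy(service)
    while not recovered and clock() - cleared_at <= rto_seconds:
        sleep(poll_interval)
        recovered = is_healthy(service)

    recovery_seconds = clock() - cleared_at
    return {
        "degraded": degraded,
        "recovery_seconds": recovery_seconds,
        "within_rto": recovered and recovery_seconds <= rto_seconds,
    }
```

The `clock` and `sleep` parameters are injectable so the experiment can run against a fake clock in CI, which is one way recovery validation could be wired into a pipeline without real waiting.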
Quick Start
Initiate a resilience assessment by loading the minimal context files and running the chaos-engineer agent.
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: ops-resilience-chaos-engineer
Download link: https://github.com/ThiagoGuislotti/copilot-instructions/archive/main.zip#ops-resilience-chaos-engineer
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.