chaos-engineering

Solid

Failure injection patterns, blast radius control, steady state hypothesis, and gameday planning for resilience testing.

AI & Automation · 501 stars · 42 forks · Updated yesterday · MIT

Quality Score: 91/100

Stars (20%): 90
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Chaos Engineering

Systematic resilience testing to discover weaknesses before they cause outages.

## Steady State Hypothesis

```yaml
# Define BEFORE injecting chaos - what "normal" looks like
steady_state_hypothesis:
  title: "API serves traffic within SLO"
  probes:
    - name: "API response time p95 < 500ms"
      type: http
      url: "https://api.example.com/health"
      threshold: 500
    - name: "Error rate < 1%"
      type: prometheus
      # sum() aggregates across label sets so the ratio is service-wide,
      # not a per-series division that only matches 5xx series
      query: "sum(rate(http_requests_total{status=~'5..'}[5m])) / sum(rate(http_requests_total[5m]))"
      threshold: 0.01
    - name: "Order processing queue depth < 100"
      type: cloudwatch
      metric: "ApproximateNumberOfMessagesVisible"
      threshold: 100
    - name: "Database connections < 80% capacity"
      type: prometheus
      query: "pg_stat_activity_count / pg_settings_max_connections"
      threshold: 0.8
```

## Failure Injection Patterns

Using Chaos Toolkit (chaostoolkit.org), `experiment.json`:

```json
{
  "title": "Database failover resilience",
  "description": "Verify app handles primary DB failover gracefully",
  "steady-state-hypothesis": {
    "title": "API responds normally",
    "probes": [
      {
        "name": "api-health",
        "type": "probe",
        "provider": {
          "type": "http",
          "url": "https://api.example.com/health",
          "timeout": 5
        },
        "tolerance": {"status": 200}
      }
    ]
  },
  "method": [
    {
      "name": "failover-primary-db",
      "t...
```
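The steady state hypothesis above amounts to a list of named probes, each with an upper threshold. As a rough illustration only, the check could be sketched in Python like this. The `Probe` class, `evaluate_steady_state`, and the sample readings are hypothetical names and values introduced here, not part of the skill or of Chaos Toolkit, and the sketch assumes every probe passes only when its reading is strictly below the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Probe:
    """Hypothetical probe definition: a name plus an upper threshold."""
    name: str
    threshold: float


def evaluate_steady_state(probes: List[Probe], readings: Dict[str, float]) -> List[str]:
    """Return the names of probes that violate the hypothesis.

    A probe is violated when its reading is missing or is not strictly
    below its threshold (assumption: all thresholds are upper bounds).
    """
    violations = []
    for probe in probes:
        value = readings.get(probe.name)
        if value is None or value >= probe.threshold:
            violations.append(probe.name)
    return violations


# Probe names and thresholds mirror the YAML in the skill content.
PROBES = [
    Probe("API response time p95 < 500ms", 500),
    Probe("Error rate < 1%", 0.01),
    Probe("Order processing queue depth < 100", 100),
    Probe("Database connections < 80% capacity", 0.8),
]

# Made-up readings for illustration; real values would come from the
# HTTP, Prometheus, and CloudWatch probes.
HEALTHY = {
    "API response time p95 < 500ms": 220.0,
    "Error rate < 1%": 0.002,
    "Order processing queue depth < 100": 12,
    "Database connections < 80% capacity": 0.45,
}

if __name__ == "__main__":
    failed = evaluate_steady_state(PROBES, HEALTHY)
    print("steady state holds" if not failed else f"violated: {failed}")
```

Running a check like this both before and after injecting a failure is what makes an experiment meaningful: if the hypothesis already fails beforehand, the experiment should not start.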

Details

Author: vibeeval
Repository: vibeeval/vibecosystem
Created: 2 months ago
Last Updated: yesterday
Language: C#
License: MIT