chaoslab.eknathalabs.com · built in public
Kubernetes Chaos Simulator
Inject real failures into your cluster. Learn resilience engineering by intentionally breaking things in a safe, controlled environment — powered entirely by GitHub Actions.
4 Fault types
100% GitHub native
Auto Rollback
Free Always
First-time setup required.
Add KUBECONFIG_DATA and CLUSTER_CONTEXT as GitHub Secrets, then set CONFIG.REPO and CONFIG.TOKEN inside chaos.js.
README → Setup ↗
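Because experiments run through GitHub Actions and chaos.js needs a repository and a token, one plausible reading is that the page triggers a workflow through GitHub's REST `workflow_dispatch` endpoint. The sketch below only builds such a request and does not send it. The workflow filename `chaos.yml`, the input names `fault` and `dry_run`, the `main` ref, and the `owner/repo` shape of `CONFIG.REPO` are assumptions for illustration, not details taken from the project.

```javascript
// Hypothetical values: replace with your own repository and token.
const CONFIG = {
  REPO: "your-user/your-repo", // assumed "owner/repo" format
  TOKEN: "<personal-access-token>",
};

// Builds (but does not send) a workflow_dispatch request for GitHub's REST API.
// `workflowFile` and `inputs` are illustrative; the real names live in the repo's workflows.
function buildDispatchRequest(config, workflowFile, inputs, ref = "main") {
  if (!/^[^/\s]+\/[^/\s]+$/.test(config.REPO)) {
    throw new Error('CONFIG.REPO must look like "owner/repo"');
  }
  if (!config.TOKEN) {
    throw new Error("CONFIG.TOKEN is empty");
  }
  return {
    url: `https://api.github.com/repos/${config.REPO}/actions/workflows/${workflowFile}/dispatches`,
    options: {
      method: "POST",
      headers: {
        Accept: "application/vnd.github+json",
        Authorization: `Bearer ${config.TOKEN}`,
        "X-GitHub-Api-Version": "2022-11-28",
      },
      // Input values are sent as strings in this sketch.
      body: JSON.stringify({ ref, inputs }),
    },
  };
}

const request = buildDispatchRequest(CONFIG, "chaos.yml", {
  fault: "pod-kill",
  dry_run: "true",
});
// To send it: fetch(request.url, request.options)
```

A token placed in client-side JavaScript is readable by anyone who can load the page, so a fine-grained token limited to this one repository is the safer choice.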
01 — choose fault
live
Pod Kill
Randomly deletes matching pods. Tests ReplicaSet self-healing and restart policies.
live
CPU Stress
Saturates CPU with stress pods. Tests HPA autoscaling and resource limit enforcement.
live
Network Delay
Injects latency via tc netem. Tests timeouts and circuit breakers.
live
Node Drain
Cordons and drains a worker node. Tests PodDisruptionBudgets and autoscaler response.
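To make the fault descriptions above concrete, here is a minimal sketch of the kind of command plan a dry run might print for three of the four faults. It only builds strings and runs nothing. The function name `planCommands`, the option names, the `eth0` interface, and the 200 ms default delay are assumptions; the commands ChaosLab's workflows actually run may differ. CPU Stress is left out because the page does not say which stress image is used.

```javascript
// Illustrative command plan for three fault types (hypothetical helper, not from chaos.js).
function planCommands(fault, opts) {
  switch (fault) {
    case "pod-kill":
      // One delete per selected pod; the ReplicaSet should recreate them.
      return opts.pods.map((pod) => `kubectl delete pod ${pod} -n ${opts.namespace}`);
    case "network-delay":
      // Assumed to run inside the target pod's network namespace with NET_ADMIN.
      return [
        `tc qdisc add dev ${opts.iface ?? "eth0"} root netem delay ${opts.delayMs ?? 200}ms`,
      ];
    case "node-drain":
      return [
        `kubectl cordon ${opts.node}`,
        `kubectl drain ${opts.node} --ignore-daemonsets --delete-emptydir-data`,
      ];
    default:
      throw new Error(`unsupported fault: ${fault}`);
  }
}

const drainPlan = planCommands("node-drain", { node: "worker-1" });
const killPlan = planCommands("pod-kill", {
  namespace: "default",
  pods: ["web-abc", "web-def"],
});
console.log([...killPlan, ...drainPlan].join("\n"));
```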
02 — configure experiment
$ chaos run --fault pod-kill --dry-run
🔵 Dry run ON — safe
kube-system is always blocked
Kubernetes label selector syntax
Time before auto-rollback check
Max pods affected
03 — experiment log
--:--:-- ChaosLab ready. Select a fault, configure your experiment, then click Run.
--:--:-- Dry run is ON by default: safe to click without affecting your cluster.
04 — recent runs
No experiments yet. Run your first chaos experiment above.
05 — safety guardrails
kube-system blocked
Hardcoded protection — no experiment can target the kube-system namespace under any condition.
Blast radius cap
Maximum 30% of matching pods by default. Configurable, but never more than you explicitly allow.
Auto rollback
Every workflow has an always: cleanup block. Chaos artefacts are removed on success or failure.
Dry run default
Dry run is ON by default. Every experiment shows exactly what it will do before doing anything.
Full audit trail
Every run is logged in GitHub Actions history. JSON reports saved as artifacts for 30 days.
Emergency rollback
One-click rollback cleans all chaos artefacts, uncordons nodes, and restarts deployments instantly.
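The first two guardrails above, the kube-system block and the blast radius cap, can be sketched as a small pre-flight check. This is a minimal illustration, not ChaosLab's implementation: the function names are hypothetical, and rounding the cap down (so a small set of pods may yield zero targets) is an assumption chosen so the cap is never exceeded.

```javascript
// Hypothetical guardrail helpers; names and rounding rule are assumptions.
const BLOCKED_NAMESPACES = new Set(["kube-system"]);
const DEFAULT_CAP_PERCENT = 30;

// Rejects any experiment aimed at a blocked namespace.
function assertNamespaceAllowed(namespace) {
  if (BLOCKED_NAMESPACES.has(namespace)) {
    throw new Error(`namespace "${namespace}" is blocked`);
  }
}

// Largest number of pods that stays within the cap (rounded down).
function maxAffectedPods(matchingCount, capPercent = DEFAULT_CAP_PERCENT) {
  if (!Number.isInteger(matchingCount) || matchingCount < 0) {
    throw new RangeError("matchingCount must be a non-negative integer");
  }
  if (!(capPercent >= 0 && capPercent <= 100)) {
    throw new RangeError("capPercent must be between 0 and 100");
  }
  return Math.floor((matchingCount * capPercent) / 100);
}

// Picks a random subset of pods no larger than the cap (Fisher-Yates shuffle).
function pickTargets(pods, capPercent = DEFAULT_CAP_PERCENT, random = Math.random) {
  const shuffled = [...pods];
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  return shuffled.slice(0, maxAffectedPods(pods.length, capPercent));
}

assertNamespaceAllowed("default"); // passes silently
console.log(maxAffectedPods(10)); // 3
```

Under this rounding rule, three matching pods at the default 30% cap give zero targets; rounding up to a minimum of one pod is an equally defensible choice if an experiment should always affect something.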