agent-opslisted
Install: claude install-skill stevengonsalvez/agents-in-a-box
# Agent Ops — SRE for AI Agents
## Quick Reference
| Pattern | Purpose |
|---------|---------|
| Cost Cap | Hard-stop agent when spend exceeds budget |
| Circuit Breaker | Halt after N consecutive failures |
| Stall Detector | Kill agent stuck in loops |
| Health Dashboard | Real-time agent metrics |
| Runbook | Automated incident response |
## Cost Cap Pattern
Set maximum spend per session or per agent:
```bash
# Check current session cost
METRICS_FILE="$HOME/{{TOOL_DIR}}/metrics/costs.jsonl"
if [[ -f "$METRICS_FILE" ]]; then
SESSION_ID="${CLAUDE_SESSION_ID:-default}"
TOTAL=$(grep "$SESSION_ID" "$METRICS_FILE" | \
python3 -c "import sys,json; print(sum(json.loads(l)['estimated_cost_usd'] for l in sys.stdin))")
echo "Session cost: \$$TOTAL"
fi
```
**Implementation in agent workflows:**
```python
# Cost cap check — add to any long-running agent loop
MAX_COST_USD = float(os.environ.get("AGENT_COST_CAP", "5.00"))
def check_cost_cap(session_id: str) -> bool:
"""Return True if within budget, False if exceeded."""
metrics_file = Path.home() / ".claude" / "metrics" / "costs.jsonl"
if not metrics_file.exists():
return True
total = 0.0
for line in metrics_file.read_text().splitlines():
try:
row = json.loads(line)
if row.get("session_id") == session_id:
total += row.get("estimated_cost_usd", 0)
except json.JSONDecodeError:
continue
return total < MAX_COST_US