The Guardian is Ryvos's runtime watchdog. It monitors every agent run for doom loops, stalls, and budget overruns, and intervenes with corrective hints before things go wrong. It does not block individual tool calls; it guides the agent back on track and stops a run only when a hard budget limit is reached.
What the Guardian Monitors
┌─────────────────────────────────────────────┐
│              Guardian Watchdog              │
│                                             │
│  ┌─────────────────┐  ┌──────────────────┐  │
│  │ Doom Loop       │  │ Stall            │  │
│  │ Detection       │  │ Detection        │  │
│  │                 │  │                  │  │
│  │ Identical tool  │  │ No progress      │  │
│  │ calls repeated  │  │ for N seconds    │  │
│  └─────────────────┘  └──────────────────┘  │
│                                             │
│  ┌─────────────────┐  ┌──────────────────┐  │
│  │ Token Budget    │  │ Dollar Budget    │  │
│  │ Monitoring      │  │ Monitoring       │  │
│  │                 │  │                  │  │
│  │ Soft warn at    │  │ Monthly and      │  │
│  │ 80%, hard stop  │  │ per-run cost     │  │
│  │ at 95%          │  │ limits           │  │
│  └─────────────────┘  └──────────────────┘  │
└─────────────────────────────────────────────┘
Doom Loop Detection
A doom loop occurs when the agent calls the same tool with identical (or near-identical) arguments multiple times in a row. This usually means the agent is stuck trying the same failing approach repeatedly.
How It Works
The Guardian fingerprints each tool call using:
- Tool name
- Hash of the input JSON
If the last N calls match the same fingerprint (where N = doom_loop_threshold), the Guardian intervenes:
Turn 5: bash("npm test") → failed
Turn 6: bash("npm test") → failed
Turn 7: bash("npm test") → failed ← doom loop detected!
Guardian hint injected:
"You've called 'bash' with the same command 3 consecutive times.
Each attempt failed. Try a different approach: check the error
output, fix the underlying issue, or use a different tool."
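The fingerprinting idea described above can be sketched in a few lines of Python. This is only an illustration, not Ryvos's actual implementation: the `fingerprint` function, the `DoomLoopDetector` class, the choice of SHA-256, and the key-sorted JSON canonicalization are all assumptions made for the example.

```python
import hashlib
import json
from collections import deque


def fingerprint(tool_name: str, tool_input: dict) -> str:
    """Combine the tool name with a hash of the input JSON (keys sorted)."""
    canonical = json.dumps(tool_input, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tool_name}:{digest}"


class DoomLoopDetector:
    """Hypothetical helper: flags N consecutive calls with one fingerprint."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        # Only the last `threshold` fingerprints are ever needed.
        self.recent = deque(maxlen=threshold)

    def record(self, tool_name: str, tool_input: dict) -> bool:
        """Record a call; return True if the last `threshold` calls all match."""
        self.recent.append(fingerprint(tool_name, tool_input))
        return len(self.recent) == self.threshold and len(set(self.recent)) == 1


loop_detector = DoomLoopDetector(threshold=3)
loop_results = [
    loop_detector.record("bash", {"command": "npm test"}) for _ in range(3)
]
print(loop_results)  # [False, False, True]
```

Because the sketch hashes the exact input, it only catches identical calls; matching near-identical arguments would need some normalization step before hashing.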
Configuration
[agent.guardian]
doom_loop_threshold = 3 # Consecutive identical calls before intervention
Events
| Event | Meaning |
|---|---|
| GuardianDoomLoop | Doom loop detected, hint injected |
| GuardianHint | Corrective hint injected into conversation |
Stall Detection
A stall occurs when the agent makes no meaningful progress for an extended period. This can happen when:
- The LLM is generating very long responses without tool calls
- A tool call takes unexpectedly long
- The agent is deliberating without acting
How It Works
The Guardian tracks the timestamp of the last meaningful action (tool call, message completion). If the interval exceeds stall_timeout_secs, it injects a hint:
Guardian hint:
"No progress detected for 120 seconds. Consider taking action
or asking the user for clarification if you're stuck."
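The timestamp check described above might look roughly like the following. The `StallDetector` class and its method names are hypothetical, and the injectable `clock` parameter exists only so the example can run without waiting; the real Guardian may track progress differently.

```python
import time
from typing import Callable


class StallDetector:
    """Hypothetical sketch: tracks time since the last meaningful action."""

    def __init__(
        self,
        stall_timeout_secs: float = 120.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.stall_timeout_secs = stall_timeout_secs
        self.clock = clock
        self.last_action = clock()

    def mark_progress(self) -> None:
        """Call on every tool call or message completion."""
        self.last_action = self.clock()

    def is_stalled(self) -> bool:
        """True once the idle interval exceeds the configured timeout."""
        return self.clock() - self.last_action > self.stall_timeout_secs


# A fake clock makes the behavior easy to follow.
fake_now = [0.0]
stall_detector = StallDetector(stall_timeout_secs=120, clock=lambda: fake_now[0])

fake_now[0] = 90.0
print(stall_detector.is_stalled())  # False

fake_now[0] = 121.0
print(stall_detector.is_stalled())  # True

stall_detector.mark_progress()
print(stall_detector.is_stalled())  # False
```

A monotonic clock is used as the default so that wall-clock adjustments cannot produce a false stall.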
Configuration
[agent.guardian]
stall_timeout_secs = 120 # Seconds of inactivity before stall alert
Events
| Event | Meaning |
|---|---|
| GuardianStall | No progress detected, hint injected |
Token Budget Monitoring
The Guardian tracks context token usage and warns before the limit is reached.
Soft Warning
At 80% of max_context_tokens, the Guardian emits a warning:
[BudgetWarning] Token usage at 82% (26,240/32,000).
Context compaction will begin soon.
This triggers the memory flush process — the agent extracts important facts to persistent memory before old messages are pruned.
Hard Stop
At 95% of max_context_tokens, the Guardian forces a stop:
[BudgetExceeded] Token budget exhausted (31,200/32,000).
Completing current turn and stopping.
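As a rough illustration of the two thresholds, the check can be written as a small classification function. The function name, parameter names, and the choice to treat the thresholds as inclusive are assumptions for this sketch; only the 80% and 95% figures and the event names come from the text above.

```python
def token_budget_status(
    used_tokens: int,
    max_context_tokens: int,
    soft_pct: int = 80,
    hard_pct: int = 95,
) -> str:
    """Classify context usage against the soft and hard percentages."""
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    if used_tokens * 100 >= hard_pct * max_context_tokens:
        return "BudgetExceeded"
    if used_tokens * 100 >= soft_pct * max_context_tokens:
        return "BudgetWarning"
    return "ok"


print(token_budget_status(16_000, 32_000))  # ok
print(token_budget_status(26_240, 32_000))  # BudgetWarning
print(token_budget_status(31_200, 32_000))  # BudgetExceeded
```

The second and third calls use the same figures as the sample log lines above: 26,240 of 32,000 tokens is 82%, and 31,200 of 32,000 is 97.5%.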
Configuration
[agent.guardian]
token_budget_soft = 80 # Percentage: emit warning
token_budget_hard = 95 # Percentage: force stop
[agent]
max_context_tokens = 32000 # The total budget
Dollar Budget Monitoring
The Guardian tracks spending against configured budget limits.
Per-Run Budget
[budget]
per_run_limit_cents = 100 # $1.00 per run maximum
When the per-run cost approaches the limit:
- At 80%: BudgetWarning event
- At 100%: BudgetExceeded event, run stopped
Monthly Budget
[budget]
monthly_limit_cents = 5000 # $50.00 per month
warning_threshold_cents = 4000 # Warn at $40.00
Monthly costs are tracked in the CostStore (SQLite). When the monthly total approaches the limit:
- At warning_threshold_cents: BudgetWarning event
- At monthly_limit_cents: BudgetExceeded event, all runs blocked until next month
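The per-run and monthly rules can be sketched as two small functions. The function names are hypothetical, the defaults mirror the sample configuration values shown above, and treating each threshold as inclusive is an assumption; how Ryvos actually reads totals from the CostStore is not shown here.

```python
def per_run_status(run_cost_cents: int, per_run_limit_cents: int = 100) -> str:
    """Warn at 80% of the per-run limit, stop the run at 100%."""
    if run_cost_cents >= per_run_limit_cents:
        return "BudgetExceeded"
    if run_cost_cents * 100 >= per_run_limit_cents * 80:
        return "BudgetWarning"
    return "ok"


def monthly_status(
    month_total_cents: int,
    monthly_limit_cents: int = 5000,
    warning_threshold_cents: int = 4000,
) -> str:
    """Warn at the configured threshold, block runs at the monthly limit."""
    if month_total_cents >= monthly_limit_cents:
        return "BudgetExceeded"
    if month_total_cents >= warning_threshold_cents:
        return "BudgetWarning"
    return "ok"


print(per_run_status(50))    # ok
print(per_run_status(85))    # BudgetWarning
print(per_run_status(100))   # BudgetExceeded
print(monthly_status(3999))  # ok
print(monthly_status(4000))  # BudgetWarning
print(monthly_status(5000))  # BudgetExceeded
```

Keeping all amounts in integer cents, as the configuration keys suggest, avoids rounding drift when many small charges are summed.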
Events
| Event | Meaning |
|---|---|
| BudgetWarning | Approaching a budget limit |
| BudgetExceeded | Budget limit reached, execution stopped |
Intervention Strategy
The Guardian follows a non-blocking intervention strategy:
- Detect — Identify the problem (doom loop, stall, budget)
- Hint — Inject a corrective message into the conversation
- Observe — Check if the agent adjusts its behavior
- Escalate — If the problem persists after the hint, apply harder interventions:
  - Additional hints with stronger guidance
  - Force-completing the current run with a summary
  - Reporting the issue to the user via channels
The Guardian never silently blocks a tool call. It provides guidance and lets the agent make decisions.
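One way to picture the detect, hint, observe, escalate cycle is as a ladder that advances one step each time the same problem is detected again. The sketch below is purely illustrative: the `EscalationTracker` class, the step names, the ordering of the harder interventions, and the one-step-per-recurrence rule are all assumptions, since the text does not specify them.

```python
# Assumed ordering, from mildest to strongest intervention.
LADDER = [
    "hint",
    "stronger_hint",
    "report_to_user",
    "force_complete_with_summary",
]


class EscalationTracker:
    """Hypothetical sketch: picks the next intervention for a recurring problem."""

    def __init__(self) -> None:
        self.detections = 0

    def on_detect(self) -> str:
        """Return the intervention for this detection, then move up one step."""
        step = LADDER[min(self.detections, len(LADDER) - 1)]
        self.detections += 1
        return step

    def on_recovered(self) -> None:
        """The agent changed its behavior, so start from a plain hint again."""
        self.detections = 0


tracker = EscalationTracker()
print(tracker.on_detect())  # hint
print(tracker.on_detect())  # stronger_hint
tracker.on_recovered()
print(tracker.on_detect())  # hint
```

Resetting on recovery reflects the "Observe" step: the harder interventions apply only when the problem persists after a hint.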
Guardian Events Summary
| Event | Severity | Trigger |
|---|---|---|
| GuardianHint | Info | Any corrective hint injected |
| GuardianDoomLoop | Warning | Consecutive identical tool calls |
| GuardianStall | Warning | No progress for stall_timeout_secs |
| GuardianBudgetAlert | Warning | Approaching budget limit |
| BudgetWarning | Warning | Budget soft threshold reached |
| BudgetExceeded | Error | Budget hard limit reached |
All events are published on the EventBus and visible in:
- The Web UI activity feed
- Run logs (L2 and L3)
- Channel notifications (if configured)
Full Configuration
[agent.guardian]
enabled = true # Enable the guardian watchdog
doom_loop_threshold = 3 # Identical calls before intervention
stall_timeout_secs = 120 # Seconds before stall detection
token_budget_soft = 80 # Percentage: warning threshold
token_budget_hard = 95 # Percentage: hard stop
[budget]
monthly_limit_cents = 5000 # Monthly spending cap
warning_threshold_cents = 4000 # Monthly warning threshold
per_run_limit_cents = 100 # Per-run spending cap
[budget.pricing_overrides] # Custom pricing per model
"claude-sonnet-4-20250514" = { input_per_1m = 300, output_per_1m = 1500 }
"gpt-4o" = { input_per_1m = 250, output_per_1m = 1000 }
Next Steps
- Failure Journal — Pattern tracking and reflexion hints
- Agent Loop — Where the guardian operates in the loop
- Self-Learning Safety — The broader safety system