Ryvos

The Failure Journal is Ryvos's self-healing memory. It records every tool failure, identifies patterns, and generates contextual hints that help the agent avoid repeating mistakes. This implements the reflexion pattern — verbal reinforcement learning without model fine-tuning.

How It Works

Tool Execution Failed
        │
        ▼
┌─────────────────────────────┐
│   FailureJournal.record()    │
│                              │
│   1. Record failure details  │
│   2. Check for patterns      │
│   3. Generate reflexion hint │
│   4. Store for future recall │
└─────────────────────────────┘
        │
        ▼
Next time this tool is called:
  → Relevant hints injected into context
  → Agent avoids the same mistake

What Gets Recorded

Every tool failure is stored in SQLite with:

Field	Description
`tool_name`	Which tool failed (e.g., `bash`, `web_fetch`, `browser_navigate`)
`input_summary`	Summary of the tool input (truncated for storage)
`error_message`	The error that occurred
`session_id`	Which session this happened in
`timestamp`	When the failure occurred
`failure_count`	How many times this tool has failed in this session
`pattern_hash`	Hash of the tool + input pattern for deduplication

Pattern Detection

The journal detects recurring failure patterns:

Same-Tool Patterns

If a tool fails repeatedly with similar inputs, the journal recognizes the pattern:

bash("npm install") → EACCES: permission denied → 3 times
  Pattern: "bash with npm commands frequently fails with permission errors"
  Hint: "Previous npm commands failed with permission errors. Try using
         'sudo' or check if the directory is writable."

Cross-Session Patterns

Failures are tracked across sessions. If web_fetch fails on a specific domain every time, the journal remembers:

Session 1: web_fetch("https://api.example.com") → timeout
Session 2: web_fetch("https://api.example.com") → timeout
Session 3: web_fetch("https://api.example.com") → ?
  Hint: "https://api.example.com has timed out in 2 previous sessions.
         Consider checking if the service is down or using a different endpoint."

Tool-Specific Patterns

Common patterns the journal recognizes:

Tool	Pattern	Hint
`bash`	Command not found	"The command may not be installed. Try checking with `which` first."
`bash`	Permission denied	"Try running with appropriate permissions or checking file ownership."
`web_fetch`	Repeated timeouts on same domain	"This URL has timed out before. Check if the service is available."
`read`	File not found	"Verify the file path exists. Use `glob` or `dir_list` to find the correct path."
`write`	Permission denied	"The target directory may not be writable. Check permissions with `file_info`."
`browser_navigate`	Connection refused	"The target URL may not be running. Check if the server is started."

Reflexion Hints

When the agent is about to use a tool that has failed before, the reflexion_hint() function generates a contextual hint:

reflexion_hint("bash", failure_count) -> Option<String>

The hint is injected into the conversation context as a system message:

[System] Tool guidance: The `bash` tool has failed 3 times in this session.
Recent failures were caused by: permission denied on /var/log/. Consider
checking permissions before attempting file operations in system directories.

This gives the agent specific, experience-based guidance without blocking the tool call. The agent can still use the tool — it just has better context about what might go wrong.

Viewing Tool Health

The ryvos health command displays aggregated statistics from the failure journal:

ryvos health

Tool              Success   Failures   Health   Common Errors
bash              234       3          98.7%    permission denied (2), command not found (1)
read              1,024     0          100.0%   —
write             156       2          98.7%    permission denied (2)
web_fetch         89        12         88.1%    timeout (8), 404 (3), SSL error (1)
browser_navigate  45        8          84.9%    connection refused (5), timeout (3)
edit              312       1          99.7%    pattern not found (1)
memory_search     567       0          100.0%   —
grep              890       0          100.0%   —

Research Background

The reflexion pattern is based on published research:

Reflexion: Language Agents with Verbal Reinforcement Learning — GPT-4 with reflexion: 91% task completion vs 80% without. No model fine-tuning needed.
The key insight: storing failure experiences as natural language and loading them into context is sufficient for learning. The model's weights do not change; only its context does.
Memory quality matters — Studies show that strict curation (removing low-quality lessons) yields 10% improvement. Bad failure records can create error loops where the agent avoids correct approaches.

Integration with Safety Memory

The failure journal feeds into the broader safety memory system:

Failure Journal (tool-level)
    │
    ├── Tool health statistics
    ├── Reflexion hints for context
    │
    ▼
Safety Memory (agent-level)
    │
    ├── Safety lessons (corrective rules)
    ├── Outcome assessment
    └── Constitutional AI principles

Tool-level failures contribute to agent-level safety lessons. If a bash command causes data loss, that becomes both a tool-level failure record and an agent-level safety lesson.

Configuration

The failure journal is always active. There is no configuration needed — it starts recording from the first tool failure.

Related settings:

[agent]
log = "l3"                          # L3 logging captures per-tool execution details
 
[agent.guardian]
enabled = true
doom_loop_threshold = 3             # Consecutive identical failures trigger guardian

Storage

Failure data is stored in the main SQLite database (~/.ryvos/ryvos.db):

-- Query failure patterns
SELECT tool_name, error_message, COUNT(*) as occurrences
FROM failure_journal
GROUP BY tool_name, error_message
ORDER BY occurrences DESC;
 
-- Recent failures
SELECT tool_name, input_summary, error_message, timestamp
FROM failure_journal
ORDER BY timestamp DESC
LIMIT 20;

Next Steps

Guardian Watchdog — Runtime monitoring and intervention
Self-Learning Safety — How failures feed into safety lessons
Audit Trail — Complete action logging