The Failure Journal is Ryvos's self-healing memory. It records every tool failure, identifies patterns, and generates contextual hints that help the agent avoid repeating mistakes. This implements the reflexion pattern — verbal reinforcement learning without model fine-tuning.
How It Works
Tool Execution Failed
│
▼
┌─────────────────────────────┐
│ FailureJournal.record()     │
│                             │
│ 1. Record failure details   │
│ 2. Check for patterns       │
│ 3. Generate reflexion hint  │
│ 4. Store for future recall  │
└─────────────────────────────┘
│
▼
Next time this tool is called:
→ Relevant hints injected into context
→ Agent avoids the same mistake
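The four steps in the diagram can be sketched as a small in-memory journal. This is only an illustration under stated assumptions: the `Failure` struct, the `hint_for` method, the repeat threshold of 2, and the hint wording are hypothetical, and an in-memory `Vec` stands in for the SQLite storage described later.

```rust
use std::collections::HashMap;

/// One recorded tool failure (a trimmed-down, hypothetical shape).
#[derive(Debug, Clone)]
struct Failure {
    tool_name: String,
    error_message: String,
}

/// In-memory stand-in for the journal; per the docs, the real one uses SQLite.
#[derive(Default)]
struct FailureJournal {
    failures: Vec<Failure>,
    /// Latest hint per tool name.
    hints: HashMap<String, String>,
}

impl FailureJournal {
    /// Mirrors the four steps in the diagram.
    fn record(&mut self, tool_name: &str, error_message: &str) {
        // 1. Record failure details.
        self.failures.push(Failure {
            tool_name: tool_name.to_string(),
            error_message: error_message.to_string(),
        });

        // 2. Check for patterns: here, count repeats of the same tool + error.
        let repeats = self
            .failures
            .iter()
            .filter(|f| f.tool_name == tool_name && f.error_message == error_message)
            .count();

        // 3. Generate a hint once the failure has repeated (assumed threshold: 2).
        if repeats >= 2 {
            let hint = format!(
                "`{}` has failed {} times with: {}",
                tool_name, repeats, error_message
            );
            // 4. Store for future recall.
            self.hints.insert(tool_name.to_string(), hint);
        }
    }

    /// Hint to inject the next time `tool_name` is called, if one exists.
    fn hint_for(&self, tool_name: &str) -> Option<&str> {
        self.hints.get(tool_name).map(String::as_str)
    }
}
```

In this sketch a single failure produces no hint; the second identical failure does, and `hint_for` returns it on the next call to the same tool.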
What Gets Recorded
Every tool failure is stored in SQLite with:
| Field | Description |
|---|---|
| `tool_name` | Which tool failed (e.g., `bash`, `web_fetch`, `browser_navigate`) |
| `input_summary` | Summary of the tool input (truncated for storage) |
| `error_message` | The error that occurred |
| `session_id` | Which session this happened in |
| `timestamp` | When the failure occurred |
| `failure_count` | How many times this tool has failed in this session |
| `pattern_hash` | Hash of the tool + input pattern for deduplication |
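As a rough illustration, the fields above could map onto a record type like the one below. The field types, the 80-character summary cap, and the helper names `summarize` and `pattern_hash` are assumptions made for this sketch; the source names only the fields themselves.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One journal row, mirroring the fields listed above.
/// Field types are assumptions; the source only names the fields.
#[derive(Debug, Clone, PartialEq)]
struct FailureRecord {
    tool_name: String,
    input_summary: String,
    error_message: String,
    session_id: String,
    timestamp: u64, // Unix seconds (assumed representation)
    failure_count: u32,
    pattern_hash: u64,
}

/// Hypothetical cap on the stored input summary, in characters.
const MAX_SUMMARY_CHARS: usize = 80;

/// Truncate on a character boundary so multi-byte input cannot panic.
fn summarize(input: &str) -> String {
    input.chars().take(MAX_SUMMARY_CHARS).collect()
}

/// Hash tool + summarized input so repeated failures share one key.
/// DefaultHasher output is not guaranteed stable across Rust releases,
/// so a persisted journal would likely want a fixed hash function instead.
fn pattern_hash(tool_name: &str, input_summary: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    tool_name.hash(&mut hasher);
    input_summary.hash(&mut hasher);
    hasher.finish()
}

impl FailureRecord {
    fn new(
        tool_name: &str,
        input: &str,
        error_message: &str,
        session_id: &str,
        timestamp: u64,
        failure_count: u32,
    ) -> Self {
        let input_summary = summarize(input);
        let hash = pattern_hash(tool_name, &input_summary);
        FailureRecord {
            tool_name: tool_name.to_string(),
            input_summary,
            error_message: error_message.to_string(),
            session_id: session_id.to_string(),
            timestamp,
            failure_count,
            pattern_hash: hash,
        }
    }
}
```

Two failures of the same tool with the same summarized input get the same `pattern_hash` even when they occur in different sessions, which is what makes the hash usable as a deduplication key.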
Pattern Detection
The journal detects recurring failure patterns:
Same-Tool Patterns
If a tool fails repeatedly with similar inputs, the journal recognizes the pattern:
bash("npm install") → EACCES: permission denied → 3 times
Pattern: "bash with npm commands frequently fails with permission errors"
Hint: "Previous npm commands failed with permission errors. Try using
'sudo' or check if the directory is writable."
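One simple way to detect a same-tool pattern like the one above is to count failures per (tool, error) pair and flag any pair that reaches a threshold. The sketch below does only that; the function name `recurring_patterns` and the threshold of 3 are assumptions, and grouping by exact error text is a simplification of the "similar inputs" matching the journal is described as doing.

```rust
use std::collections::HashMap;

/// Hypothetical threshold: how many matching failures count as a pattern.
const PATTERN_THRESHOLD: usize = 3;

/// Return (tool, error, count) for every pair seen at least
/// PATTERN_THRESHOLD times, sorted for deterministic output.
fn recurring_patterns<'a>(failures: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str, usize)> {
    // Count occurrences of each (tool, error) pair.
    let mut counts: HashMap<(&str, &str), usize> = HashMap::new();
    for &(tool, error) in failures {
        *counts.entry((tool, error)).or_insert(0) += 1;
    }

    // Keep only the pairs that reached the threshold.
    let mut patterns: Vec<_> = counts
        .into_iter()
        .filter(|&(_, n)| n >= PATTERN_THRESHOLD)
        .map(|((tool, error), n)| (tool, error, n))
        .collect();
    patterns.sort();
    patterns
}
```

With three `("bash", "permission denied")` entries and one unrelated `web_fetch` timeout, only the `bash` pair is reported, with a count of 3.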
Cross-Session Patterns
Failures are tracked across sessions. If web_fetch fails on a specific domain every time, the journal remembers:
Session 1: web_fetch("https://api.example.com") → timeout
Session 2: web_fetch("https://api.example.com") → timeout
Session 3: web_fetch("https://api.example.com") → ?
Hint: "https://api.example.com has timed out in 2 previous sessions.
Consider checking if the service is down or using a different endpoint."
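Cross-session detection can be sketched as counting the distinct sessions in which a given tool failed on a given target. The `SessionFailure` struct, both function names, and the two-session threshold are assumptions for illustration; the hint wording is adapted from the example above.

```rust
use std::collections::HashSet;

/// Minimal view of a stored failure (hypothetical shape).
struct SessionFailure<'a> {
    session_id: &'a str,
    tool_name: &'a str,
    target: &'a str,
}

/// Number of distinct sessions in which `tool_name` failed on `target`.
fn sessions_with_failure(history: &[SessionFailure], tool_name: &str, target: &str) -> usize {
    let sessions: HashSet<&str> = history
        .iter()
        .filter(|f| f.tool_name == tool_name && f.target == target)
        .map(|f| f.session_id)
        .collect();
    sessions.len()
}

/// Produce a hint once the same target has failed in two or more sessions
/// (the threshold is an assumption).
fn cross_session_hint(history: &[SessionFailure], tool_name: &str, target: &str) -> Option<String> {
    let n = sessions_with_failure(history, tool_name, target);
    if n >= 2 {
        Some(format!(
            "{} has failed in {} previous sessions. Consider checking if the \
             service is down or using a different endpoint.",
            target, n
        ))
    } else {
        None
    }
}
```

Counting distinct session IDs rather than raw rows means several retries within one session still count as a single session.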
Tool-Specific Patterns
Common patterns the journal recognizes:
| Tool | Pattern | Hint |
|---|---|---|
| `bash` | Command not found | "The command may not be installed. Try checking with `which` first." |
| `bash` | Permission denied | "Try running with appropriate permissions or checking file ownership." |
| `web_fetch` | Repeated timeouts on same domain | "This URL has timed out before. Check if the service is available." |
| `read` | File not found | "Verify the file path exists. Use `glob` or `dir_list` to find the correct path." |
| `write` | Permission denied | "The target directory may not be writable. Check permissions with `file_info`." |
| `browser_navigate` | Connection refused | "The target URL may not be running. Check if the server is started." |
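The table above could be expressed as a lookup from tool name and error text to a canned hint. The sketch below assumes case-insensitive substring matching, which is a guess at how such patterns might be recognized; the function name `tool_specific_hint` is hypothetical, and the hint strings are taken from the table.

```rust
/// Map a (tool, error message) pair to a canned hint.
/// Substring matching is an assumed strategy, not a confirmed one.
fn tool_specific_hint(tool_name: &str, error_message: &str) -> Option<&'static str> {
    let error = error_message.to_lowercase();
    match tool_name {
        "bash" if error.contains("command not found") => {
            Some("The command may not be installed. Try checking with `which` first.")
        }
        "bash" if error.contains("permission denied") => {
            Some("Try running with appropriate permissions or checking file ownership.")
        }
        // Repetition on the same domain is assumed to be checked by the caller.
        "web_fetch" if error.contains("timeout") || error.contains("timed out") => {
            Some("This URL has timed out before. Check if the service is available.")
        }
        "read" if error.contains("file not found") || error.contains("no such file") => {
            Some("Verify the file path exists. Use `glob` or `dir_list` to find the correct path.")
        }
        "write" if error.contains("permission denied") => {
            Some("The target directory may not be writable. Check permissions with `file_info`.")
        }
        "browser_navigate" if error.contains("connection refused") => {
            Some("The target URL may not be running. Check if the server is started.")
        }
        _ => None,
    }
}
```

Tools or errors that are not in the table fall through to `None`, so no hint is produced for them.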
Reflexion Hints
When the agent is about to use a tool that has failed before, the reflexion_hint() function generates a contextual hint:
reflexion_hint("bash", failure_count) -> Option<String>
The hint is injected into the conversation context as a system message:
[System] Tool guidance: The `bash` tool has failed 3 times in this session.
Recent failures were caused by: permission denied on /var/log/. Consider
checking permissions before attempting file operations in system directories.
This gives the agent specific, experience-based guidance without blocking the tool call. The agent can still use the tool — it just has better context about what might go wrong.
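A minimal sketch of this behavior follows, keeping the two-argument `reflexion_hint` signature shown above. The function body, the `Message` struct, and the `inject_hint` helper are assumptions; the real function presumably also draws on recorded error details to fill in the cause, which this sketch omits.

```rust
/// Returns None when the tool has no recorded failures in this session.
/// The body is an illustrative guess; only the signature comes from the docs.
fn reflexion_hint(tool_name: &str, failure_count: u32) -> Option<String> {
    if failure_count == 0 {
        return None;
    }
    let times = if failure_count == 1 { "time" } else { "times" };
    Some(format!(
        "Tool guidance: The `{}` tool has failed {} {} in this session.",
        tool_name, failure_count, times
    ))
}

/// Hypothetical conversation message type.
struct Message {
    role: &'static str,
    content: String,
}

/// Append the hint, if any, as a system message.
/// The tool call itself is never blocked; only the context changes.
fn inject_hint(context: &mut Vec<Message>, tool_name: &str, failure_count: u32) {
    if let Some(hint) = reflexion_hint(tool_name, failure_count) {
        context.push(Message {
            role: "system",
            content: hint,
        });
    }
}
```

Returning `Option<String>` keeps the no-failure case explicit: a tool with a clean record adds nothing to the context.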
Viewing Tool Health
The ryvos health command displays aggregated statistics from the failure journal:
ryvos health
Tool Success Failures Health Common Errors
bash 234 3 98.7% permission denied (2), command not found (1)
read 1,024 0 100.0% —
write 156 2 98.7% permission denied (2)
web_fetch 89 12 88.1% timeout (8), 404 (3), SSL error (1)
browser_navigate 45 8 84.9% connection refused (5), timeout (3)
edit 312 1 99.7% pattern not found (1)
memory_search 567 0 100.0% —
grep 890 0 100.0% —
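The Health column in the sample output is consistent with successes divided by total calls. The sketch below computes it that way; the formula is inferred from the sample figures rather than confirmed by the source, and the function names and the `n/a` fallback are hypothetical.

```rust
/// Health as a percentage: successes / (successes + failures) * 100.
/// Inferred from the sample output; returns None when there were no calls.
fn health_percent(successes: u64, failures: u64) -> Option<f64> {
    let total = successes + failures;
    if total == 0 {
        return None;
    }
    Some(successes as f64 / total as f64 * 100.0)
}

/// Format with one decimal place, matching the sample output.
fn format_health(successes: u64, failures: u64) -> String {
    match health_percent(successes, failures) {
        Some(percent) => format!("{:.1}%", percent),
        None => "n/a".to_string(), // assumed placeholder for unused tools
    }
}
```

For example, `bash` with 234 successes and 3 failures gives 234 / 237, which formats as `98.7%`, and `web_fetch` with 89 and 12 gives `88.1%`.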
Research Background
The reflexion pattern is based on published research:
- Reflexion: Language Agents with Verbal Reinforcement Learning (Shinn et al., 2023): GPT-4 with Reflexion reached 91% pass@1 on the HumanEval coding benchmark, versus 80% for GPT-4 alone. No model fine-tuning is needed.
- The key insight: storing failure experiences as natural language and loading them into context is sufficient for learning. The model's weights do not change; only its context does.
- Memory quality matters: studies show that strict curation (removing low-quality lessons) yields a 10% improvement. Low-quality failure records can create error loops in which the agent avoids correct approaches.
Integration with Safety Memory
The failure journal feeds into the broader safety memory system:
Failure Journal (tool-level)
│
├── Tool health statistics
├── Reflexion hints for context
│
▼
Safety Memory (agent-level)
│
├── Safety lessons (corrective rules)
├── Outcome assessment
└── Constitutional AI principles
Tool-level failures contribute to agent-level safety lessons. If a bash command causes data loss, that becomes both a tool-level failure record and an agent-level safety lesson.
Configuration
The failure journal is always active. There is no configuration needed — it starts recording from the first tool failure.
Related settings:
[agent]
log = "l3" # L3 logging captures per-tool execution details
[agent.guardian]
enabled = true
doom_loop_threshold = 3 # Consecutive identical failures trigger the guardian
Storage
Failure data is stored in the main SQLite database (~/.ryvos/ryvos.db):
-- Query failure patterns
SELECT tool_name, error_message, COUNT(*) as occurrences
FROM failure_journal
GROUP BY tool_name, error_message
ORDER BY occurrences DESC;
-- Recent failures
SELECT tool_name, input_summary, error_message, timestamp
FROM failure_journal
ORDER BY timestamp DESC
LIMIT 20;
Next Steps
- Guardian Watchdog — Runtime monitoring and intervention
- Self-Learning Safety — How failures feed into safety lessons
- Audit Trail — Complete action logging