Ryvos

The agent loop is the core execution engine of Ryvos. It implements the ReAct (Reason + Act) pattern, in which the LLM alternates between reasoning about the task and executing tools to make progress.

Entry Points

The AgentRuntime provides three ways to start an agent run:

// Simple run — just a prompt
runtime.run(&session_id, "fix the bug in auth.rs").await?;
 
// Goal-driven run — with success criteria and constraints
runtime.run_with_goal(&session_id, "fix the bug", goal).await?;
 
// Director run — LLM plans and executes a multi-step graph
runtime.run_with_director(&session_id, "refactor the module", &goal).await?;

The ReAct Loop

Every run follows this loop, executing up to max_turns iterations (default: 25) within max_duration_secs (default: 600):

┌─────────────────────────────────────────────────────┐
│                    ReAct Loop                        │
│                                                      │
│  1. Load session history from SQLite                 │
│  2. Append user message                              │
│  3. Build context (3 layers)                         │
│                                                      │
│  ┌────── Loop (max_turns) ─────────────────────┐    │
│  │                                              │    │
│  │  4. Estimate token count                     │    │
│  │  5. Prune if over budget                     │    │
│  │  6. Stream LLM response                      │    │
│  │  7. Parse tool calls from response            │    │
│  │  8. If no tool calls → final answer, break    │    │
│  │  9. Security gate each tool call              │    │
│  │ 10. Execute tools (parallel if independent)   │    │
│  │ 11. Append tool results to history            │    │
│  │ 12. Guardian watchdog check                   │    │
│  │ 13. Judge evaluation (if goal defined)        │    │
│  │ 14. Memory flush check (at 85% budget)        │    │
│  │                                              │    │
│  └──────────────────────────────────────────────┘    │
│                                                      │
│ 15. Save checkpoint                                  │
│ 16. Record costs                                     │
│ 17. Emit RunComplete event                           │
└─────────────────────────────────────────────────────┘
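The loop's stopping rules can be sketched as follows. This is a minimal, synchronous illustration, not the Ryvos implementation: TurnOutcome, RunEnd, and run_loop are hypothetical names, and checking the time limit before each turn is an assumption.

```rust
use std::time::{Duration, Instant};

/// What a single turn produced (hypothetical type for this sketch).
pub enum TurnOutcome {
    /// The model requested tools, so the loop should continue.
    ToolCalls,
    /// The model answered with plain text, so the loop should stop.
    FinalAnswer(String),
}

/// Why the loop stopped (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum RunEnd {
    Answer(String),
    MaxTurns,
    Timeout,
}

/// Drives `turn` until it yields a final answer, `max_turns` is reached,
/// or `max_duration` has elapsed. The time limit is checked before each turn.
pub fn run_loop<F>(max_turns: usize, max_duration: Duration, mut turn: F) -> RunEnd
where
    F: FnMut(usize) -> TurnOutcome,
{
    let started = Instant::now();
    for turn_number in 0..max_turns {
        if started.elapsed() >= max_duration {
            return RunEnd::Timeout;
        }
        if let TurnOutcome::FinalAnswer(text) = turn(turn_number) {
            return RunEnd::Answer(text);
        }
    }
    RunEnd::MaxTurns
}
```

With the defaults from the text, a caller would pass 25 and Duration::from_secs(600); the closure stands in for steps 4 through 14.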

Step-by-Step

1. Load History

Session history is loaded from the SqliteStore. If --resume <session_id> is used, the previous conversation is restored, including all tool calls and results.

2-3. Message and Context Building

The user message is appended, then the context stack is built. See Context Management for the full 3-layer model.

4-5. Token Estimation and Pruning

Ryvos estimates the token count of the full message history using tiktoken:

estimate_message_tokens(messages) → total tokens

If total > max_context_tokens, the prune_to_budget() function removes older messages while keeping the most recent min_tail messages. If enable_summarization is true, Ryvos first summarizes the removed messages into a compact summary message.
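The pruning rule can be illustrated with a small sketch. The function names mirror those in the text, but the signatures, the Message type, and the four-characters-per-token estimate (a stand-in for tiktoken) are assumptions made for this example.

```rust
/// A chat message reduced to the fields this sketch needs (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub text: String,
}

/// Rough stand-in for a tokenizer: about four characters per token.
/// The text says Ryvos uses tiktoken; this heuristic is for illustration only.
pub fn estimate_message_tokens(messages: &[Message]) -> usize {
    messages.iter().map(|m| m.text.chars().count() / 4 + 1).sum()
}

/// Drops the oldest messages until the estimate fits `max_context_tokens`,
/// but never drops any of the last `min_tail` messages.
/// Returns the removed messages so a caller could summarize them.
pub fn prune_to_budget(
    messages: &mut Vec<Message>,
    max_context_tokens: usize,
    min_tail: usize,
) -> Vec<Message> {
    let mut cut = 0;
    while messages.len() - cut > min_tail
        && estimate_message_tokens(&messages[cut..]) > max_context_tokens
    {
        cut += 1;
    }
    messages.drain(..cut).collect()
}
```

Returning the removed messages keeps summarization separate from pruning: a caller with enable_summarization set could condense them into one summary message and insert it at the front of the history.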

6. LLM Streaming

The message history plus tool definitions are sent to the LLM via chat_stream(). The response arrives as a stream of StreamDelta events:

TextDelta("I'll check")  →  TextDelta(" the file")  →  ToolUseStart{name: "read"}
  →  ToolInputDelta('{"path":')  →  ToolInputDelta('"src/auth.rs"}')  →  Stop

Each delta is emitted on the EventBus for real-time display in the TUI, Web UI, and channels.

7-8. Tool Call Parsing

The streaming parser accumulates ToolUseStart and ToolInputDelta events into complete tool calls. Each tool call has:

id — Unique identifier (from the LLM)
name — Tool name (e.g., read, bash, memory_search)
input — JSON parameters

If the LLM response contains no tool calls (just text), the loop ends and the text is returned as the final answer.
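A sketch of how such a parser might fold deltas into complete tool calls. The variant names come from the text above; the field layout (including the id on ToolUseStart), the ToolCall struct, and collect_response are assumptions for illustration, and the input is kept as raw JSON text rather than parsed.

```rust
/// Stream events as named in the text; the field layout is an assumption.
#[derive(Debug, Clone)]
pub enum StreamDelta {
    TextDelta(String),
    ToolUseStart { id: String, name: String },
    ToolInputDelta(String),
    Stop,
}

/// A completed tool call; `input` stays as raw JSON text in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    pub input: String,
}

/// Folds a delta stream into the assistant text plus any tool calls.
/// Each `ToolInputDelta` is appended to the most recently started call.
pub fn collect_response<I>(deltas: I) -> (String, Vec<ToolCall>)
where
    I: IntoIterator<Item = StreamDelta>,
{
    let mut text = String::new();
    let mut calls: Vec<ToolCall> = Vec::new();
    for delta in deltas {
        match delta {
            StreamDelta::TextDelta(chunk) => text.push_str(&chunk),
            StreamDelta::ToolUseStart { id, name } => calls.push(ToolCall {
                id,
                name,
                input: String::new(),
            }),
            StreamDelta::ToolInputDelta(chunk) => {
                // Input fragments that arrive before any ToolUseStart are ignored.
                if let Some(call) = calls.last_mut() {
                    call.input.push_str(&chunk);
                }
            }
            StreamDelta::Stop => break,
        }
    }
    (text, calls)
}
```

An empty list of calls in the result corresponds to step 8: the text is the final answer and the loop ends.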

9. Security Gate

Every tool call passes through the constitutional safety system before execution:

The tool is classified by tier (T0-T4) for audit and context purposes
For bash calls, injection detection runs a regex scan for dangerous patterns
Constitutional reasoning evaluates the action against the 7 safety principles
SafetyMemory provides experience-based context from prior decisions
No tool is ever silently blocked — every decision is reasoned and logged
The action proceeds with full audit logging

Unparseable bash commands are classified as T4 (critical), requiring explicit constitutional reasoning.
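The shape of the gate, which classifies and records but never silently blocks, can be sketched as below. Everything beyond the tier names is hypothetical: the pattern list is a plain substring scan standing in for the regex scan, "unparseable" is approximated as unbalanced quotes, the base tier is supplied by the caller, and the constitutional reasoning and SafetyMemory steps are left out.

```rust
/// Risk tiers named in the text; what each tier means is not specified there.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    T0,
    T1,
    T2,
    T3,
    T4,
}

/// Hypothetical stand-in for the regex scan: plain substring patterns.
const DANGEROUS_PATTERNS: [&str; 3] = ["rm -rf /", "| sh", "mkfs"];

/// Hypothetical notion of "unparseable": unbalanced single or double quotes.
fn is_parseable(command: &str) -> bool {
    command.matches('"').count() % 2 == 0 && command.matches('\'').count() % 2 == 0
}

/// What the gate records for the audit log (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct AuditRecord {
    pub tool: String,
    pub tier: Tier,
    pub flagged_patterns: Vec<String>,
}

/// Classifies a tool call and returns an audit record. It never blocks:
/// the record is context for later reasoning and logging.
pub fn gate(tool: &str, base_tier: Tier, bash_command: Option<&str>) -> AuditRecord {
    let mut tier = base_tier;
    let mut flagged_patterns = Vec::new();
    if let Some(command) = bash_command {
        if !is_parseable(command) {
            // Unparseable bash commands are treated as critical.
            tier = Tier::T4;
        }
        for pattern in DANGEROUS_PATTERNS.iter() {
            if command.contains(*pattern) {
                flagged_patterns.push(pattern.to_string());
            }
        }
    }
    AuditRecord {
        tool: tool.to_string(),
        tier,
        flagged_patterns,
    }
}
```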

10. Tool Execution

Approved tool calls are executed. If parallel_tools is enabled in config, independent tool calls within the same turn are executed concurrently using tokio::join!.

Each tool execution:

Publishes a ToolStart event
Runs the tool's execute() method with a ToolContext
Publishes a ToolEnd event
Returns a ToolResult (content string + is_error flag)

Long tool outputs are compacted via compact_tool_output() to stay within token limits.
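One plausible compaction strategy is to keep the head and tail of an oversized output and replace the middle with a marker. The text does not say how compact_tool_output() works, so the signature and the head-plus-tail approach here are assumptions.

```rust
/// Keeps the first and last `keep` characters of an oversized output and
/// replaces the middle with a marker stating how much was dropped.
/// Counts characters rather than bytes so multi-byte text is never split.
pub fn compact_tool_output(output: &str, keep: usize) -> String {
    let total = output.chars().count();
    if total <= keep * 2 {
        // Short enough already: return unchanged.
        return output.to_string();
    }
    let head: String = output.chars().take(keep).collect();
    let tail: String = output.chars().skip(total - keep).collect();
    format!(
        "{}\n[... {} characters omitted ...]\n{}",
        head,
        total - keep * 2,
        tail
    )
}
```

Keeping both ends tends to preserve the command echo at the start and the error or exit status at the end, which is usually what the model needs for its next step.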

11. Result Appending

Tool results are appended to the message history as ToolResult content blocks, paired with the original ToolUse blocks. This gives the LLM full visibility into what happened.

12. Guardian Watchdog

After each turn, the Guardian checks for problems:

Doom loop: Are the last N tool calls identical? (fingerprinting based on tool name + input hash)
Stall: Has there been no meaningful progress for stall_timeout_secs?
Token budget: Are we approaching the soft or hard limits?
Dollar budget: Has the monthly or per-run cost limit been reached?

If a problem is detected, the Guardian injects a corrective hint into the conversation (e.g., "You've called the same tool 3 times with identical arguments. Try a different approach.").
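The doom loop check can be sketched with a sliding window of fingerprints. DoomLoopDetector and its methods are hypothetical names; the fingerprint follows the description above (tool name plus a hash of the input), using the standard library's DefaultHasher as an arbitrary choice of hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::VecDeque;
use std::hash::{Hash, Hasher};

/// Fingerprint of one tool call: tool name plus a hash of its raw input.
fn fingerprint(name: &str, input: &str) -> (String, u64) {
    let mut hasher = DefaultHasher::new();
    input.hash(&mut hasher);
    (name.to_string(), hasher.finish())
}

/// Remembers the last `window` fingerprints and reports a doom loop when
/// the window is full and every entry is identical.
pub struct DoomLoopDetector {
    window: usize,
    recent: VecDeque<(String, u64)>,
}

impl DoomLoopDetector {
    pub fn new(window: usize) -> Self {
        // A window of zero would never fill, so clamp to at least one.
        let window = window.max(1);
        Self {
            window,
            recent: VecDeque::with_capacity(window),
        }
    }

    /// Records a call and returns a corrective hint if a loop is detected.
    pub fn observe(&mut self, name: &str, input: &str) -> Option<String> {
        if self.recent.len() == self.window {
            self.recent.pop_front();
        }
        self.recent.push_back(fingerprint(name, input));
        let first = self.recent.front()?;
        let looping =
            self.recent.len() == self.window && self.recent.iter().all(|fp| fp == first);
        if looping {
            Some(format!(
                "You've called the same tool {} times with identical arguments. Try a different approach.",
                self.window
            ))
        } else {
            None
        }
    }
}
```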

13. Judge Evaluation

If the run has a defined goal, the Judge evaluates progress:

Level 0 (fast): Checks deterministic criteria (OutputContains, OutputEquals)
Level 2 (slow): Sends the full conversation + goal to the LLM for evaluation

The Judge returns a Verdict:

Accept — Goal is met, stop the loop
Retry — Not yet met, continue with a hint
Escalate — Cannot be met, report failure
Continue — In progress, keep going
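A sketch of what the Level 0 check might look like. The criterion and verdict names come from the text; the payload fields, the judge_level0 function, and the mapping from unmet criteria to Retry are assumptions. Escalate is declared for completeness but never produced here, since deciding that a goal cannot be met would need the slower LLM evaluation.

```rust
/// Deterministic success criteria named in the text.
#[derive(Debug, Clone, PartialEq)]
pub enum Criterion {
    OutputContains(String),
    OutputEquals(String),
}

/// Verdicts named in the text; the payload fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Accept,
    Retry { hint: String },
    Escalate { reason: String },
    Continue,
}

/// Checks every criterion against the latest output without calling an LLM.
/// With no deterministic criteria, there is nothing to decide at this level.
pub fn judge_level0(criteria: &[Criterion], output: &str) -> Verdict {
    if criteria.is_empty() {
        return Verdict::Continue;
    }
    for criterion in criteria {
        let met = match criterion {
            Criterion::OutputContains(needle) => output.contains(needle.as_str()),
            // Surrounding whitespace is ignored when comparing for equality.
            Criterion::OutputEquals(expected) => output.trim() == expected.trim(),
        };
        if !met {
            return Verdict::Retry {
                hint: format!("unmet criterion: {:?}", criterion),
            };
        }
    }
    Verdict::Accept
}
```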

14. Memory Flush

At 85% of the token budget, Ryvos triggers a memory flush. The agent is prompted to:

Extract important facts from the conversation
Write them to persistent memory via memory_write
Once the facts are written, the old messages can be pruned safely.

This ensures that important context survives even as the conversation is compacted.
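The 85% trigger amounts to a threshold check. In this sketch MemoryFlushTrigger is a hypothetical name, and firing only once per run is an assumption; the text does not say whether the flush can repeat.

```rust
/// Fires once when token usage first reaches 85% of the context budget.
pub struct MemoryFlushTrigger {
    max_context_tokens: usize,
    fired: bool,
}

impl MemoryFlushTrigger {
    pub fn new(max_context_tokens: usize) -> Self {
        Self {
            max_context_tokens,
            fired: false,
        }
    }

    /// Returns true the first time `used_tokens` is at or above the threshold.
    pub fn check(&mut self, used_tokens: usize) -> bool {
        // Integer form of used / max >= 0.85, avoiding floating point.
        let over = used_tokens * 100 >= self.max_context_tokens * 85;
        if over && !self.fired {
            self.fired = true;
            return true;
        }
        false
    }
}
```

With a budget of 32000 tokens, the threshold falls at 27200 tokens.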

Streaming Flow

From the user's perspective, responses stream in real time:

User sends message
    ↓
[streaming begins]
    "I'll analyze the file..."     ← TextDelta events
    [tool: read src/auth.rs]       ← ToolStart event
    [246 lines read]               ← ToolEnd event
    "Found 3 issues..."            ← TextDelta events
    [tool: edit src/auth.rs]       ← ToolStart event
    [file updated]                 ← ToolEnd event
    "Fixed the issues. Here's..."  ← TextDelta events
[streaming ends]

Every delta is broadcast on the EventBus, allowing the TUI, Web UI, and channels to display progress in real time.

Checkpointing

After each turn, the CheckpointStore saves the current state to SQLite:

checkpoint.save(session_id, turn_number, messages, tools_called).await?;

This enables crash recovery. If Ryvos crashes mid-run, you can resume:

ryvos run --resume <session_id> "continue"

The checkpoint store uses SQLite in WAL mode, so checkpoints committed before a crash remain intact.

Cost Tracking

Every LLM call emits a CostEvent with input/output token counts. The CostStore records:

Per-run costs (for per-run budget limits)
Per-session costs (for reporting)
Per-date costs (for monthly budget enforcement)

Cost estimation uses the pricing.rs module with per-model rates, overridable in config.
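Per-model rates with config overrides could be modeled as below. This does not reflect the actual contents of pricing.rs: the types, the per-million-token unit, and the lookup order are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Price per million tokens, in US dollars (the unit is an assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ModelRate {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

/// Built-in rates plus overrides loaded from config (hypothetical layout).
pub struct Pricing {
    defaults: HashMap<String, ModelRate>,
    overrides: HashMap<String, ModelRate>,
}

impl Pricing {
    pub fn new(
        defaults: HashMap<String, ModelRate>,
        overrides: HashMap<String, ModelRate>,
    ) -> Self {
        Self { defaults, overrides }
    }

    /// An override wins over the built-in rate for the same model.
    pub fn rate_for(&self, model: &str) -> Option<ModelRate> {
        self.overrides
            .get(model)
            .or_else(|| self.defaults.get(model))
            .copied()
    }

    /// Returns None for unknown models instead of guessing a price.
    pub fn estimate_cost(&self, model: &str, input_tokens: u64, output_tokens: u64) -> Option<f64> {
        let rate = self.rate_for(model)?;
        let input_cost = input_tokens as f64 / 1_000_000.0 * rate.input_per_mtok;
        let output_cost = output_tokens as f64 / 1_000_000.0 * rate.output_per_mtok;
        Some(input_cost + output_cost)
    }
}
```

For a placeholder model priced at 3.0 and 15.0 per million input and output tokens, one million input tokens plus half a million output tokens would be estimated at 3.0 + 7.5 = 10.5.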

Run Logging

The RunLogger writes structured JSONL logs at three levels:

Level	Content	When
L1	Run summary (session, prompt, result, cost, duration)	End of run
L2	Per-turn details (messages, tool calls, token counts)	Each turn
L3	Per-step execution (tool input/output, timing)	Each tool call

Logs are stored in ~/.ryvos/logs/<session_id>/. Each entry is flushed immediately, so the log stays intact if the process crashes.
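The write-then-flush pattern behind JSONL logging can be sketched with the standard library alone. JsonlLogger and its three-field record are hypothetical and far simpler than the L1 to L3 records in the table; a real implementation would more likely use a JSON library than the hand-written escaping shown here.

```rust
use std::io::{self, Write};

/// Escapes a string for embedding inside a JSON string literal.
fn escape_json(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for c in raw.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Writes one JSON object per line and flushes after every entry.
pub struct JsonlLogger<W: Write> {
    sink: W,
}

impl<W: Write> JsonlLogger<W> {
    pub fn new(sink: W) -> Self {
        Self { sink }
    }

    /// Appends a single log line, then flushes so the entry is not left
    /// sitting in a buffer if the process dies.
    pub fn log(&mut self, level: &str, event: &str, detail: &str) -> io::Result<()> {
        writeln!(
            self.sink,
            "{{\"level\":\"{}\",\"event\":\"{}\",\"detail\":\"{}\"}}",
            escape_json(level),
            escape_json(event),
            escape_json(detail)
        )?;
        self.sink.flush()
    }

    /// Returns the underlying writer, which is useful in tests.
    pub fn into_inner(self) -> W {
        self.sink
    }
}
```

Because the logger is generic over Write, the same code can target a std::fs::File in production and a Vec<u8> in tests.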

Configuration

Key agent loop settings in config.toml:

[agent]
max_turns = 25                    # Maximum ReAct loop iterations
max_duration_secs = 600           # Hard timeout per run
max_context_tokens = 32000        # Context window budget
parallel_tools = true             # Execute independent tools concurrently
enable_summarization = true       # Summarize old messages when pruning
 
[agent.checkpoint]
enabled = true                    # Save state after each turn

Next Steps

Director — Goal-driven orchestration with DAG execution
Context Management — The 3-layer onion model
Guardian Watchdog — Doom loop and budget monitoring