r1 Agentic API — The Wire Surface for External Agents

> Every action a human can take through a UI MUST have a documented, idempotent, schema-validated agent equivalent reachable through MCP. The UI is a view over the API; never the reverse.

This document is the contract between the r1 daemon and external agents (Claude Code, Codex CLI, Stagehand, browser-use, custom MCP clients). If a UI button cannot be reached through this API, the UI button is broken — not the API.

1. Audience and promise

Audience: agent authors integrating with r1 over MCP, including but not limited to the clients named above: Claude Code, Codex CLI, Stagehand, browser-use, and custom MCP clients.

Promise: the catalog described here is the only surface r1 exposes for programmatic control. Every tool is schema-validated, every mutation tool listed in §5 is idempotent on its declared idempotency key, and no tool returns a raw Go error string: every error carries one of the 10 codes from internal/stokerr/.

2. Wire protocol

Connecting from Claude Code

{
  "mcpServers": {
    "r1": {
      "command": "r1",
      "args": ["mcp", "serve"]
    }
  }
}

Connecting from Codex CLI

[mcp_servers.r1]
command = "r1"
args = ["mcp", "serve"]
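Both configs launch r1 mcp serve as a child process. As a rough sketch of what a client then writes to that process, assuming the standard MCP stdio transport (one JSON-RPC 2.0 message per line), the snippet below builds a tools/list request and a tools/call request. The helper names rpc and call_tool are illustrative only, the workdir argument is borrowed from the Stagehand example in §11, and a real session would first perform the MCP initialize handshake, which is omitted here.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # JSON-RPC request ids; any value unique per connection works


def rpc(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line of JSON."""
    msg: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg, separators=(",", ":"))


def call_tool(name: str, arguments: Dict[str, Any]) -> str:
    """Build an MCP tools/call request for one catalog tool."""
    return rpc("tools/call", {"name": name, "arguments": arguments})


# Ask the server for its catalog, then start a session.
list_req = rpc("tools/list")
start_req = call_tool("r1.session.start", {"workdir": "/tmp/demo"})

print(list_req)
print(start_req)
```

In practice an MCP client library handles this framing; the sketch is only meant to show that every catalog tool is reached through the same tools/call shape.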

3. Tool catalog (38 tools across 10 categories)

The full catalog is generated from internal/mcp/r1servercatalog.go via:

r1 mcp serve --print-tools --markdown > docs/AGENTIC-API-CATALOG.md

Sections (counts):

| Category | Tool count | Tools |
| --- | --- | --- |
| Sessions | 6 | r1.session.{start,send,cancel,list,get,resume} |
| Lanes | 5 | r1.lanes.{list,subscribe,get,kill,pin} |
| Cortex | 5 | r1.cortex.{notes,publish,lobeslist,lobepause,loberesume} |
| Missions | 4 | r1.mission.{create,list,cancel,get} |
| Worktrees | 4 | r1.worktree.{list,diff,merge,clean} |
| Bus | 2 | r1.bus.{tail,replay} |
| Verify | 3 | r1.verify.{build,test,lint} |
| TUI | 4 | r1.tui.{presskey,snapshot,getmodel,focuslane} |
| Web | 4 | r1.web.{navigate,click,fill,snapshot} (Playwright MCP wrappers) |
| CLI | 1 | r1.cli.invoke |

The full per-tool JSON schemas are in specs/agentic-test-harness.md §4. Run r1 mcp serve --print-tools for the live machine-readable form.
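The brace notation in the table can be expanded mechanically, which gives a quick way to check the "38 tools across 10 categories" figure. The sketch below copies the patterns exactly as they are printed above; the CATALOG dictionary and the expand helper are illustrative and are not part of r1.

```python
import re
from typing import Dict, List

# Patterns copied from the catalog table, spelled exactly as printed there.
CATALOG: Dict[str, str] = {
    "Sessions": "r1.session.{start,send,cancel,list,get,resume}",
    "Lanes": "r1.lanes.{list,subscribe,get,kill,pin}",
    "Cortex": "r1.cortex.{notes,publish,lobeslist,lobepause,loberesume}",
    "Missions": "r1.mission.{create,list,cancel,get}",
    "Worktrees": "r1.worktree.{list,diff,merge,clean}",
    "Bus": "r1.bus.{tail,replay}",
    "Verify": "r1.verify.{build,test,lint}",
    "TUI": "r1.tui.{presskey,snapshot,getmodel,focuslane}",
    "Web": "r1.web.{navigate,click,fill,snapshot}",
    "CLI": "r1.cli.invoke",
}


def expand(pattern: str) -> List[str]:
    """Expand 'prefix.{a,b}' into ['prefix.a', 'prefix.b']; pass plain names through."""
    m = re.fullmatch(r"(.*)\{(.*)\}", pattern)
    if m is None:
        return [pattern]
    prefix, names = m.groups()
    return [prefix + name for name in names.split(",")]


tools = [tool for pattern in CATALOG.values() for tool in expand(pattern)]
print(len(CATALOG), len(tools))  # 10 38
```

For anything beyond a sanity check, the output of r1 mcp serve --print-tools remains the authoritative list.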

4. Streaming and replay

Two tools stream:

since_seq semantics: a 0 value replays from the start of the session; any positive integer replays from that exact sequence number. Reconnect-resume:

1. Client records the highest seq seen.

2. On disconnect (network blip, daemon restart), client reconnects with since_seq = lastSeen + 1.

3. Server replays missed events from the WAL, then resumes live tailing.

This is the D-D3 (durable replay) contract; clients that ignore it will see gaps under restart.
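The reconnect-resume steps above can be sketched as a small client-side cursor. This is a minimal illustration under stated assumptions: ReplayCursor and fake_replay are hypothetical names, the list of integers stands in for the server's WAL, and sequence numbers are assumed to start at 1 and increase by one.

```python
from typing import Iterable, List, Optional


class ReplayCursor:
    """Tracks the highest seq seen so a client can resume without gaps."""

    def __init__(self) -> None:
        self.last_seen: Optional[int] = None  # None until the first event arrives

    def observe(self, seq: int) -> bool:
        """Record seq; return False if the event was already seen (a duplicate)."""
        if self.last_seen is not None and seq <= self.last_seen:
            return False
        self.last_seen = seq
        return True

    def since_seq(self) -> int:
        """Value to send on (re)connect: 0 before any event, else lastSeen + 1."""
        return 0 if self.last_seen is None else self.last_seen + 1


def fake_replay(wal: Iterable[int], since_seq: int) -> List[int]:
    """Stand-in for the server: 0 replays everything, N replays from seq N."""
    return [seq for seq in wal if seq >= since_seq]


wal = [1, 2, 3, 4, 5, 6]  # hypothetical event sequence numbers
cursor = ReplayCursor()
seen: List[int] = []

# First connection: the client receives events 1..3, then the link drops.
for seq in fake_replay(wal, cursor.since_seq())[:3]:
    if cursor.observe(seq):
        seen.append(seq)

# Reconnect: request lastSeen + 1 and continue exactly where the stream stopped.
resume_from = cursor.since_seq()
for seq in fake_replay(wal, resume_from):
    if cursor.observe(seq):
        seen.append(seq)

print(resume_from, seen)  # 4 [1, 2, 3, 4, 5, 6]
```

Dropping duplicates in observe is a defensive choice rather than something the contract requires; it keeps the client correct even if a reconnect overlaps with events that were already delivered.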

5. Idempotency rules

Every mutation tool listed below is idempotent on its documented key:

| Tool | Idempotency key | Behavior |
| --- | --- | --- |
| r1.session.send | (session_id, client_message_id) | Re-sending the same message_id is a no-op; returns the original message_id. |
| r1.lanes.kill | (session_id, lane_id) | Killing a lane that is already cancelled returns ok=true with status=cancelled; no error. |
| r1.lanes.pin | (session_id, lane_id, pinned) | Re-pinning an already-pinned lane is a no-op. |
| r1.mission.cancel | mission_id | Re-cancelling is a no-op. |
| r1.worktree.clean | worktree_id | Cleaning an already-cleaned worktree returns ok=true. |
| r1.cortex.lobepause / r1.cortex.loberesume | (session_id, lobe) | Idempotent on the target state. |

Tools NOT in this table (e.g. r1.session.cancel, r1.mission.create) are NOT idempotent — calling them twice has user-visible side effects (a new mission_id, etc.).
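To make the r1.session.send rule concrete, here is a toy in-memory model of deduplication on the idempotency key. It models the documented behavior only and is not the daemon's implementation: FakeSessionStore, the m-N message id format, and the parameter spellings session_id and client_message_id are assumptions made for this sketch.

```python
import uuid
from typing import Dict, List, Tuple


class FakeSessionStore:
    """Toy stand-in for the daemon: deduplicates sends on the idempotency key."""

    def __init__(self) -> None:
        # (session_id, client_message_id) -> message_id assigned on first send
        self._by_key: Dict[Tuple[str, str], str] = {}
        self.delivered: List[str] = []  # texts actually appended to the session

    def send(self, session_id: str, client_message_id: str, text: str) -> dict:
        key = (session_id, client_message_id)
        if key in self._by_key:
            # Retry of a message already accepted: no-op, return the original id.
            return {"ok": True, "message_id": self._by_key[key]}
        message_id = f"m-{len(self.delivered) + 1}"  # hypothetical id format
        self._by_key[key] = message_id
        self.delivered.append(text)
        return {"ok": True, "message_id": message_id}


store = FakeSessionStore()

# Generate the client id once per logical message and reuse it on every retry.
cmid = str(uuid.uuid4())
first = store.send("s-1", cmid, "ping")
retry = store.send("s-1", cmid, "ping")  # e.g. after a timeout

print(first == retry, len(store.delivered))  # True 1
```

The client-side takeaway is that the key must be chosen before the first attempt. Generating a fresh id on each retry would defeat the deduplication and deliver the message twice.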

6. Error envelope

Every tool response wraps in the Slack-style envelope from internal/mcp/envelope.go:

{
  "ok": false,
  "error_code": "not_found",
  "error_message": "session s-9 not found",
  "links": {
    "self": "r1.session.cancel",
    "related": ["r1.session.list"],
    "deprecations": []
  }
}

The error_code is one of the 10 stokerr/ taxonomy values:

| Code | Meaning |
| --- | --- |
| validation | Malformed input (missing field, bad shape) |
| not_found | Resource ID does not exist |
| conflict | Concurrent-mutation collision or failed state precondition |
| append_only_violation | Attempt to mutate immutable storage |
| permission_denied | RBAC or sandbox policy blocked the call |
| budget_exceeded | Cost/token cap reached |
| timeout | Deadline tripped |
| crash_recovery | State restored from checkpoint; partial replay possible |
| schema_version | Data-format mismatch; migration required |
| internal | Unexpected invariant violation; this is a bug, file an issue |

See internal/mcp/stokerr_map.go for the mapping rules from arbitrary Go errors to taxonomy codes.
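On the client side, the envelope lends itself to a single unwrap step that turns ok=false into a typed exception. The sketch below is one possible shape, not an official client: R1Error and unwrap are hypothetical names, the success envelope is assumed to carry "ok": true, and falling back to internal when error_code is missing is this sketch's own choice.

```python
import json
from typing import Any, Dict, List, Optional


class R1Error(Exception):
    """Raised when a tool response envelope has ok=false."""

    def __init__(self, code: str, message: str, related: List[str]) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code        # one of the taxonomy codes, e.g. "not_found"
        self.message = message  # human-readable detail from error_message
        self.related = related  # tools suggested by links.related


def unwrap(raw: str) -> Dict[str, Any]:
    """Return the envelope on success; raise R1Error on ok=false."""
    env = json.loads(raw)
    if env.get("ok") is True:
        return env
    links = env.get("links") or {}
    raise R1Error(
        env.get("error_code", "internal"),  # assumption: a missing code is treated as internal
        env.get("error_message", ""),
        list(links.get("related", [])),
    )


# The error envelope shown earlier in this section.
raw = json.dumps({
    "ok": False,
    "error_code": "not_found",
    "error_message": "session s-9 not found",
    "links": {"self": "r1.session.cancel", "related": ["r1.session.list"], "deprecations": []},
})

caught: Optional[R1Error] = None
try:
    unwrap(raw)
except R1Error as err:
    caught = err
    print(err.code, err.related)  # not_found ['r1.session.list']
```

Branching on code rather than on the message text keeps client logic stable if the wording of error_message changes, and links.related gives an agent an obvious next call to try.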

7. Capability flags

Untrusted agents are read-only by default:

8. Test harness (this spec)

External agents write *.agent.feature.md files under tests/agent/ to assert behavior. The format is documented in specs/agentic-test-harness.md §6; in short:

## Scenario: User sends a message and sees a streamed response

- Given a fresh r1d daemon at "http://127.0.0.1:3948"
- When I fill the textbox with name "Message" with "ping"
- And I click the button with name "Send"
- Then within 5 seconds the chat log contains an assistant message matching "pong|ping"

Run the suite:

make agent-features              # execute every scenario
make agent-features-update       # re-record golden a11y snapshots
make agent-features-drift-check  # CI guard against accidental updates

The runner at tools/agent-feature-runner walks tests/agent/**/*.agent.feature.md, dispatches each step against the r1.* catalog (heuristics in dispatcher/heuristics.go, per-file overrides via ## Tool mapping), and writes failure context to .agent-failures/<scenario>/ per spec 8 §10.
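As an illustration of the file shape the runner consumes, the following sketch splits a feature file into scenarios and keyword-tagged steps. It is not the runner's actual parser (the authoritative format is in specs/agentic-test-harness.md §6); parse_feature and the rule that "And" inherits the preceding keyword are assumptions borrowed from common Gherkin practice.

```python
import re
from typing import Dict, List, Optional

STEP = re.compile(r"^- (Given|When|Then|And) (.+)$")
SCENARIO_PREFIX = "## Scenario:"


def parse_feature(text: str) -> List[Dict]:
    """Split a feature file into scenarios, each with (keyword, body) steps."""
    scenarios: List[Dict] = []
    current: Optional[Dict] = None
    last_kw: Optional[str] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith(SCENARIO_PREFIX):
            current = {"name": line[len(SCENARIO_PREFIX):].strip(), "steps": []}
            scenarios.append(current)
            last_kw = None
        elif line.startswith("## "):
            current = None  # another section (e.g. "## Tool mapping") ends the scenario
        elif current is not None:
            m = STEP.match(line)
            if m:
                kw, body = m.groups()
                if kw == "And":
                    kw = last_kw or "Given"  # "And" continues the previous keyword
                last_kw = kw
                current["steps"].append((kw, body))
    return scenarios


feature = """\
## Scenario: User sends a message and sees a streamed response

- Given a fresh r1d daemon at "http://127.0.0.1:3948"
- When I fill the textbox with name "Message" with "ping"
- And I click the button with name "Send"
- Then within 5 seconds the chat log contains an assistant message matching "pong|ping"
"""

parsed = parse_feature(feature)
print(parsed[0]["name"])
print([kw for kw, _ in parsed[0]["steps"]])  # ['Given', 'When', 'When', 'Then']
```

Mapping each parsed step to a catalog tool is the dispatcher's job and is deliberately left out of this sketch.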

9. UI-author guide

If you are adding a UI button (React, Bubble Tea, or Tauri), the lint at tools/lint-view-without-api/ will fail your PR unless both of the following hold.

The button is exposed to agents in a stable, addressable form:

- React: data-testid="...".

- Bubble Tea: implements tui.A11yEmitter with a non-empty StableID().

- Tauri: #[tauri::command] with a doc-comment mcp_tool annotation.

The button declares the MCP tool it invokes:

- React: data-mcp_tool="r1.lanes.kill" (or via the Storybook parameters.agentic.actionables block).

- Bubble Tea: emit the tool name in the case branch's a11y state (e.g. state["mcp_tool"] = "r1.lanes.kill").

- Tauri: /// mcp_tool: r1.lanes.kill in the function's doc comment.

Adding a new UI surface without a corresponding MCP tool is a build break.

Adding a new MCP tool without a UI surface is a WARN (catalog tools should be reachable from a human-visible surface unless explicitly headless-only in tools/lint-view-without-api/allowlist.yaml).

10. Versioning and deprecation

11. Examples

Claude Code MCP config

{
  "mcpServers": {
    "r1": { "command": "r1", "args": ["mcp", "serve"] }
  }
}

Codex CLI MCP config

[mcp_servers.r1]
command = "r1"
args = ["mcp", "serve"]

Stagehand snippet

import { Stagehand } from "@browserbasehq/stagehand";
const sh = new Stagehand({
  mcpClient: { command: "r1", args: ["mcp", "serve"] },
});
await sh.act({ tool: "r1.session.start", args: { workdir: "/tmp/demo" } });

browser-use snippet

from browser_use import Agent
agent = Agent(
    mcp_servers={"r1": {"command": "r1", "args": ["mcp", "serve"]}},
)
agent.run("send 'ping' to a fresh r1 session")

12. Non-goals


For the per-tool input schemas in machine-readable form, run:

r1 mcp serve --print-tools           # JSON
r1 mcp serve --print-tools --markdown # this document's tool catalog section

For the source of truth, see specs/agentic-test-harness.md.
