Skip to content

agent-stream

ai specs/ai/agent-stream.kmd

Specification body

Koder Agent Stream Protocol

Abstract

This specification defines the wire format for streaming an agent run from the gateway (services/ai/ai/gateway) to clients (CLI, web, mobile, desktop, third-party API consumers). It standardises seven event types that together describe reasoning, tool use, step boundaries, and run lifecycle — enough for UI surfaces to render a faithful timeline equivalent to Manus 1.6 Max, Devin 2.0, and Claude Code's transcript views.

The protocol is transport-agnostic: same event payload over Server-Sent Events (SSE), WebSocket, gRPC server-stream, or stored JSONL for replay (AICORE-117).

1. Design philosophy

  • Single envelope, typed payloads. Every event shares a small envelope (id, ts, type, run_id, child_id?) and discriminates by type. Consumers parse once.
  • Reasoning is a first-class segment, separate from text. Anthropic-style thinking blocks need their own renderer (collapsed by default, smaller font, italic). Text and reasoning never intermix in the same event.
  • Step boundaries are observable. AICORE-117 (step replay) and AICORE-121 (checkpoints) both need to know where steps end.
  • Tool calls have request + result events. Same call_id correlates them, even when the result arrives N events later.
  • Multi-child support from day one. AIGW-051 parallel dispatch embeds child_id so parent streams can intercalate sub-runs.
  • Bounded throughput. Server agreggates to ≤ 10 events/sec per run by default; clients can request raw via ?detail=full.

2. Envelope

Every event JSON is a single object:

{
  "id":       "evt_01HZ7K4P...",
  "ts":       "2026-05-14T19:42:13.124Z",
  "type":     "reasoning.delta",
  "run_id":   "run_01HZ7K3M...",
  "child_id": null,
  "seq":      127,
  "payload":  { /* type-specific */ }
}

Envelope fields

FieldTypeRequiredDescription
idstringyesULID, monotonic per stream
tsRFC3339 stringyesserver-side timestamp
typeenumyesone of the 7 types in §3
run_idstringyesthe agent run this event belongs to
child_idstring|nullnonon-null when this event comes from a sub-run (AIGW-051)
seqintyesper-run monotonic sequence; gap = dropped event
payloadobjectyestype-specific shape per §3

3. The 8 event types

3.1 reasoning.delta

Incremental "thinking" output — what model produces before/between visible text. Per Anthropic extended-thinking semantics.

{ "type": "reasoning.delta", "payload": { "text": "Let me check..." } }

Multiple deltas concatenate to form the full reasoning block of the current step. Clients render in a collapsed block (default closed), smaller font, italic. Reasoning is not assistant-visible output — treat as scratch.

3.2 text.delta

Incremental assistant-visible text. Concatenates to the message body.

{ "type": "text.delta", "payload": { "text": "I'll compare the three..." } }

Tool-output streaming (AICORE-137c). Tools that implement StreamingTool (Go interface { Tool; CallStream(ctx, input, chunks chan<- string) (output, error) }) cause the runtime to emit one text.delta per chunk between tool.start and tool.end. The final tool.end.output still carries the canonical JSON. Clients SHOULD render the deltas inline (just like model-produced text) without distinguishing tool-streamed text from model-streamed text — from the consumer's perspective, both are assistant-visible content arriving incrementally.

3.3 tool.start

A tool invocation begins.

{
  "type": "tool.start",
  "payload": {
    "call_id":  "call_01HZ7K5R...",
    "tool":     "browser.navigate",
    "input":    { "url": "https://example.com" },
    "skill_id": "compare-products@0.2.0"
  }
}

skill_id (optional) marks the call as originating from a TOOLS-013 skill (versioned).

3.4 tool.end

The corresponding tool invocation completed.

{
  "type": "tool.end",
  "payload": {
    "call_id":  "call_01HZ7K5R...",
    "ok":       true,
    "output":   { "url": "https://example.com", "title": "Example" },
    "error":    null,
    "duration_ms": 1842
  }
}

When ok=false, error carries a KAI-* code per specs/errors/user-facing-messages.kmd. The output field MAY be omitted in error cases or when output is large (then a blob_ref points to kdrive).

3.5 step.boundary

The agent finished a step (a coherent unit of reasoning + tool calls

  • text). Triggers checkpoint capture (AICORE-121) and step-replay indexing (AICORE-117).
{
  "type": "step.boundary",
  "payload": {
    "step_index":  4,
    "step_kind":   "tool-roundtrip",
    "checkpoint_id": "ckpt_01HZ7K6T..."
  }
}

step_kind is one of: plan, tool-roundtrip, text-only, fan-out (parent emitting sub-runs), fan-in (parent merging results), done.

3.6 child.spawn

Parent run is dispatching a sub-run (AIGW-051).

{
  "type": "child.spawn",
  "payload": {
    "child_id": "run_01HZ7K7U...",
    "prompt": "...",
    "tools_allowed": ["browser", "web.extract"]
  }
}

After spawn, events from the child stream appear in the parent stream with child_id set. Consumers can group by child_id to render parallel tabs (Cursor 3 / Replit Agent 4 style).

3.8 plan.proposal (added v0.2.0, AICORE-126)

The agent generated a plan for human review. Emitted when the run was started with planning_gate=required or planning_gate=preview. For required, the run pauses in awaiting_approval until the user calls POST /v1/agent/runs/:id/approve_plan.

{
  "type": "plan.proposal",
  "payload": {
    "plan": {
      "id": "plan_01HZ...",
      "run_id": "run_01HZ7K3M...",
      "steps": [
        { "id": "s1", "title": "Research candidate libraries",
          "intent": "research", "est_tools": ["web.search"],
          "est_cost_usd": 0.02 },
        { "id": "s2", "title": "Write benchmark harness",
          "intent": "write", "est_tools": ["file.write", "shell"],
          "est_cost_usd": 0.05 }
      ],
      "est_total_cost_usd": 0.07
    }
  }
}

step.intent is one of: research, write, edit, execute, verify, report, other. UIs render an icon per intent. Cost estimation per step comes from services/foundation/billing token history.

3.7 run.lifecycle

Lifecycle state transition. One event per state change.

{
  "type": "run.lifecycle",
  "payload": {
    "state": "running",
    "reason": null
  }
}

State values:

StateWhen
planningrun accepted, plan being generated (AICORE-126)
awaiting_approvalplan proposed, waiting for user (AICORE-126)
runningactively executing
pauseduser paused (AICORE-122)
redirectinguser issued redirect (AICORE-122)
donecompleted successfully; final state
aborteduser cancelled; final state
errorunrecoverable failure; final state

reason is a human-readable string explaining the transition (e.g. "user clicked pause", "tool browser failed 3 times").

4. Throughput shaping (R1–R4)

R1 — Default aggregation

Server batches reasoning.delta and text.delta events to at most 10 events/sec per run by default. Aggregation window: 100ms. Multiple deltas within the window concatenate text.

R2 — Raw mode opt-in

Client may request ?detail=full (SSE) or send a Detail: full header (WebSocket handshake) to disable aggregation. Returns every underlying token boundary. Use for debugging, replay validation, or animations that need character-level granularity.

R3 — tool.start / tool.end / step.boundary are never batched

These are control-plane signals; deliver immediately even under aggregation pressure.

R4 — Backpressure

If client's read buffer fills, server drops reasoning.delta first, then text.delta, never tool/step/lifecycle. Dropped events are counted in a final run.lifecycle with state=done payload extending with dropped_count.

5. Transport profiles

5.1 SSE (HTTP/2 server-sent events) — primary

  • Endpoint: GET /v1/agent/runs/:id/stream
  • Content-Type: text/event-stream
  • One event per data: line, JSON-encoded
  • Reconnect with Last-Event-ID: <seq> header; server replays from seq+1
  • Heartbeat: empty : keepalive\n\n every 15s

5.2 WebSocket — for browser when needed

  • Endpoint: GET /v1/agent/runs/:id/stream with Upgrade: websocket
  • Binary frames forbidden; only text JSON
  • Same envelope; client opts in via subprotocol koder-agent-stream-v1

5.3 gRPC server-stream — internal

  • Service agent.v1.AgentRun method Stream(StreamRequest) returns (stream Event)
  • Same proto3 shape as JSON

5.4 JSONL replay — offline

  • One JSON event per line, id ordered ascending
  • Stored in kdrive blob per agent_runs.replay_uri
  • Backs AICORE-117 step-replay UI

6. Persistence

The full event stream is persisted in kdb (agent_run_events table) multi-tenant per policies/multi-tenant-by-default.kmd:

ColumnTypeNotes
idULIDPK
run_idULIDFK
koder_user_idstringtenant
child_idULID?when from sub-run
seqint64per-run monotonic
typeenumone of §3
tstimestampserver-side
payloadjsonbtype-specific

Retention: 90 days for completed runs (overrideable per workspace); 30 days for aborted/error. Per policies/identity-data-retention.kmd spirit — but this is operational data, not auth.

7. Test contract (T1–T5)

  • T1 — Round-trip: marshal → unmarshal → equal (Go + JS clients)
  • T2 — Ordering: client receiving events out-of-seq detects gap
  • T3 — Reconnect: send Last-Event-ID, get events ≥ seq+1
  • T4 — Throughput: 1000 deltas/run aggregated to ≤10/s by default
  • T5 — Multi-child: parent stream intercalates child events without corrupting ordering per child_id

Templates live in specs/ai/agent-stream-test-template.kmd (to be authored as sub-ticket of AICORE-120).

8. Error codes

Per specs/errors/user-facing-messages.kmd, namespace KAI-STREAM-:

CodeMeaning
KAI-STREAM-CONN-0001client disconnected mid-stream
KAI-STREAM-CONN-0002reconnect Last-Event-ID too old (replay window expired)
KAI-STREAM-PROTO-0001malformed event in raw mode
KAI-STREAM-PROTO-0002unknown event type
KAI-STREAM-BUF-0001backpressure drop occurred

9. Compatibility

  • OpenAI streaming API — superset; text.delta maps to OpenAI's content delta; we add reasoning + tool/step events
  • Anthropic streaming API — direct mapping for reasoning.delta (thinking blocks) and text.delta; tool events different
  • SSE general — strict subset; consumable by EventSource in any browser

10. References

  • AICORE-117 (step replay — consumes step.boundary + JSONL persistence)
  • AICORE-121 (checkpoints — consumes step.boundary)
  • AICORE-122 (pause/redirect — emits run.lifecycle)
  • AICORE-123 (follow-ups — runs after run.lifecycle state=done)
  • AICORE-126 (planning gate — emits planning/awaiting_approval/running)
  • AIGW-051 (parallel — emits child.spawn + child_id)
  • KIT-030 (KAiThinking, KAiToolCall, KAiStepBoundary widgets)
  • WEBKIT-010 (web custom elements)
  • specs/errors/user-facing-messages.kmd
  • policies/multi-tenant-by-default.kmd