agent-stream
Corpo da especificação
Koder Agent Stream Protocol
Abstract
This specification defines the wire format for streaming an agent run
from the gateway (services/ai/ai/gateway) to clients (CLI, web,
mobile, desktop, third-party API consumers). It standardises seven
event types that together describe reasoning, tool use, step
boundaries, and run lifecycle — enough for UI surfaces to render a
faithful timeline equivalent to Manus 1.6 Max, Devin 2.0, and Claude
Code's transcript views.
The protocol is transport-agnostic: same event payload over Server-Sent Events (SSE), WebSocket, gRPC server-stream, or stored JSONL for replay (AICORE-117).
1. Design philosophy
- Single envelope, typed payloads. Every event shares a small
envelope (id, ts, type, run_id, child_id?) and discriminates by
type. Consumers parse once. - Reasoning is a first-class segment, separate from text. Anthropic-style thinking blocks need their own renderer (collapsed by default, smaller font, italic). Text and reasoning never intermix in the same event.
- Step boundaries are observable. AICORE-117 (step replay) and AICORE-121 (checkpoints) both need to know where steps end.
- Tool calls have request + result events. Same
call_idcorrelates them, even when the result arrives N events later. - Multi-child support from day one. AIGW-051 parallel dispatch
embeds
child_idso parent streams can intercalate sub-runs. - Bounded throughput. Server agreggates to ≤ 10 events/sec per
run by default; clients can request raw via
?detail=full.
2. Envelope
Every event JSON is a single object:
{
"id": "evt_01HZ7K4P...",
"ts": "2026-05-14T19:42:13.124Z",
"type": "reasoning.delta",
"run_id": "run_01HZ7K3M...",
"child_id": null,
"seq": 127,
"payload": { /* type-specific */ }
}
Envelope fields
| Field | Type | Required | Description |
|---|---|---|---|
id | string | yes | ULID, monotonic per stream |
ts | RFC3339 string | yes | server-side timestamp |
type | enum | yes | one of the 7 types in §3 |
run_id | string | yes | the agent run this event belongs to |
child_id | string|null | no | non-null when this event comes from a sub-run (AIGW-051) |
seq | int | yes | per-run monotonic sequence; gap = dropped event |
payload | object | yes | type-specific shape per §3 |
3. The 8 event types
3.1 reasoning.delta
Incremental "thinking" output — what model produces before/between visible text. Per Anthropic extended-thinking semantics.
{ "type": "reasoning.delta", "payload": { "text": "Let me check..." } }
Multiple deltas concatenate to form the full reasoning block of the current step. Clients render in a collapsed block (default closed), smaller font, italic. Reasoning is not assistant-visible output — treat as scratch.
3.2 text.delta
Incremental assistant-visible text. Concatenates to the message body.
{ "type": "text.delta", "payload": { "text": "I'll compare the three..." } }
Tool-output streaming (AICORE-137c). Tools that implement
StreamingTool (Go interface { Tool; CallStream(ctx, input, chunks chan<- string) (output, error) }) cause the runtime to emit
one text.delta per chunk between tool.start and tool.end. The
final tool.end.output still carries the canonical JSON. Clients
SHOULD render the deltas inline (just like model-produced text)
without distinguishing tool-streamed text from model-streamed text
— from the consumer's perspective, both are assistant-visible
content arriving incrementally.
3.3 tool.start
A tool invocation begins.
{
"type": "tool.start",
"payload": {
"call_id": "call_01HZ7K5R...",
"tool": "browser.navigate",
"input": { "url": "https://example.com" },
"skill_id": "compare-products@0.2.0"
}
}
skill_id (optional) marks the call as originating from a TOOLS-013
skill (versioned).
3.4 tool.end
The corresponding tool invocation completed.
{
"type": "tool.end",
"payload": {
"call_id": "call_01HZ7K5R...",
"ok": true,
"output": { "url": "https://example.com", "title": "Example" },
"error": null,
"duration_ms": 1842
}
}
When ok=false, error carries a KAI-* code per
specs/errors/user-facing-messages.kmd. The output field MAY be
omitted in error cases or when output is large (then a blob_ref
points to kdrive).
3.5 step.boundary
The agent finished a step (a coherent unit of reasoning + tool calls
- text). Triggers checkpoint capture (AICORE-121) and step-replay indexing (AICORE-117).
{
"type": "step.boundary",
"payload": {
"step_index": 4,
"step_kind": "tool-roundtrip",
"checkpoint_id": "ckpt_01HZ7K6T..."
}
}
step_kind is one of: plan, tool-roundtrip, text-only,
fan-out (parent emitting sub-runs), fan-in (parent merging
results), done.
3.6 child.spawn
Parent run is dispatching a sub-run (AIGW-051).
{
"type": "child.spawn",
"payload": {
"child_id": "run_01HZ7K7U...",
"prompt": "...",
"tools_allowed": ["browser", "web.extract"]
}
}
After spawn, events from the child stream appear in the parent stream
with child_id set. Consumers can group by child_id to render
parallel tabs (Cursor 3 / Replit Agent 4 style).
3.8 plan.proposal (added v0.2.0, AICORE-126)
The agent generated a plan for human review. Emitted when the run
was started with planning_gate=required or planning_gate=preview.
For required, the run pauses in awaiting_approval until the user
calls POST /v1/agent/runs/:id/approve_plan.
{
"type": "plan.proposal",
"payload": {
"plan": {
"id": "plan_01HZ...",
"run_id": "run_01HZ7K3M...",
"steps": [
{ "id": "s1", "title": "Research candidate libraries",
"intent": "research", "est_tools": ["web.search"],
"est_cost_usd": 0.02 },
{ "id": "s2", "title": "Write benchmark harness",
"intent": "write", "est_tools": ["file.write", "shell"],
"est_cost_usd": 0.05 }
],
"est_total_cost_usd": 0.07
}
}
}
step.intent is one of: research, write, edit, execute,
verify, report, other. UIs render an icon per intent. Cost
estimation per step comes from services/foundation/billing token
history.
3.7 run.lifecycle
Lifecycle state transition. One event per state change.
{
"type": "run.lifecycle",
"payload": {
"state": "running",
"reason": null
}
}
State values:
| State | When |
|---|---|
planning | run accepted, plan being generated (AICORE-126) |
awaiting_approval | plan proposed, waiting for user (AICORE-126) |
running | actively executing |
paused | user paused (AICORE-122) |
redirecting | user issued redirect (AICORE-122) |
done | completed successfully; final state |
aborted | user cancelled; final state |
error | unrecoverable failure; final state |
reason is a human-readable string explaining the transition
(e.g. "user clicked pause", "tool browser failed 3 times").
4. Throughput shaping (R1–R4)
R1 — Default aggregation
Server batches reasoning.delta and text.delta events to at most
10 events/sec per run by default. Aggregation window: 100ms.
Multiple deltas within the window concatenate text.
R2 — Raw mode opt-in
Client may request ?detail=full (SSE) or send a Detail: full
header (WebSocket handshake) to disable aggregation. Returns every
underlying token boundary. Use for debugging, replay validation, or
animations that need character-level granularity.
R3 — tool.start / tool.end / step.boundary are never batched
These are control-plane signals; deliver immediately even under aggregation pressure.
R4 — Backpressure
If client's read buffer fills, server drops reasoning.delta first,
then text.delta, never tool/step/lifecycle. Dropped events are
counted in a final run.lifecycle with state=done payload
extending with dropped_count.
5. Transport profiles
5.1 SSE (HTTP/2 server-sent events) — primary
- Endpoint:
GET /v1/agent/runs/:id/stream - Content-Type:
text/event-stream - One event per
data:line, JSON-encoded - Reconnect with
Last-Event-ID: <seq>header; server replays fromseq+1 - Heartbeat: empty
: keepalive\n\nevery 15s
5.2 WebSocket — for browser when needed
- Endpoint:
GET /v1/agent/runs/:id/streamwithUpgrade: websocket - Binary frames forbidden; only text JSON
- Same envelope; client opts in via subprotocol
koder-agent-stream-v1
5.3 gRPC server-stream — internal
- Service
agent.v1.AgentRunmethodStream(StreamRequest) returns (stream Event) - Same proto3 shape as JSON
5.4 JSONL replay — offline
- One JSON event per line,
idordered ascending - Stored in kdrive blob per
agent_runs.replay_uri - Backs AICORE-117 step-replay UI
6. Persistence
The full event stream is persisted in kdb (agent_run_events table)
multi-tenant per policies/multi-tenant-by-default.kmd:
| Column | Type | Notes |
|---|---|---|
id | ULID | PK |
run_id | ULID | FK |
koder_user_id | string | tenant |
child_id | ULID? | when from sub-run |
seq | int64 | per-run monotonic |
type | enum | one of §3 |
ts | timestamp | server-side |
payload | jsonb | type-specific |
Retention: 90 days for completed runs (overrideable per workspace);
30 days for aborted/error. Per policies/identity-data-retention.kmd
spirit — but this is operational data, not auth.
7. Test contract (T1–T5)
- T1 — Round-trip: marshal → unmarshal → equal (Go + JS clients)
- T2 — Ordering: client receiving events out-of-seq detects gap
- T3 — Reconnect: send Last-Event-ID, get events ≥ seq+1
- T4 — Throughput: 1000 deltas/run aggregated to ≤10/s by default
- T5 — Multi-child: parent stream intercalates child events
without corrupting ordering per
child_id
Templates live in specs/ai/agent-stream-test-template.kmd (to be
authored as sub-ticket of AICORE-120).
8. Error codes
Per specs/errors/user-facing-messages.kmd, namespace KAI-STREAM-:
| Code | Meaning |
|---|---|
KAI-STREAM-CONN-0001 | client disconnected mid-stream |
KAI-STREAM-CONN-0002 | reconnect Last-Event-ID too old (replay window expired) |
KAI-STREAM-PROTO-0001 | malformed event in raw mode |
KAI-STREAM-PROTO-0002 | unknown event type |
KAI-STREAM-BUF-0001 | backpressure drop occurred |
9. Compatibility
- OpenAI streaming API — superset;
text.deltamaps to OpenAI'scontentdelta; we add reasoning + tool/step events - Anthropic streaming API — direct mapping for
reasoning.delta(thinking blocks) andtext.delta; tool events different - SSE general — strict subset; consumable by
EventSourcein any browser
10. References
- AICORE-117 (step replay — consumes step.boundary + JSONL persistence)
- AICORE-121 (checkpoints — consumes step.boundary)
- AICORE-122 (pause/redirect — emits run.lifecycle)
- AICORE-123 (follow-ups — runs after run.lifecycle state=done)
- AICORE-126 (planning gate — emits planning/awaiting_approval/running)
- AIGW-051 (parallel — emits child.spawn + child_id)
- KIT-030 (
KAiThinking,KAiToolCall,KAiStepBoundarywidgets) - WEBKIT-010 (web custom elements)
specs/errors/user-facing-messages.kmdpolicies/multi-tenant-by-default.kmd