AI agent step trace

ai-ui specs/ai-ui/agent-step-trace.kmd

Vertical step list + per-step expand (tool call, args, result, duration) + plan-and-execute (plan pre-render before run) + time-travel replay + branching for multi-step AI agents. Companion to OpenTelemetry trace export via services/ai/trace. Consumed by Kortex, Kode, Bot agents.

When this spec applies

Primary triggers

Display agent step-by-step reasoning + tool calls

All triggers

Render multi-step agent execution in any Koder client
Implement plan-and-execute agent (LangChain-style)
Add observability/debug UI for agent runs

Spec — AI agent step trace

Primary consumer: surfaces that run agents (Kortex agent mode, Kode platform, Bot autonomous flows). Each step hosts inline tool cards (mcp-tool-invocation.kmd) and thinking blocks (thinking-state.kmd). Implementation ticket for the web side: services/ai/ai#117 (already open).

Principles

Plan pre-render: when the agent exposes its plan upfront (plan-and-execute), render all steps immediately; statuses populate during execution.
Linear flow + optional branching: vertical list by default; a branching gesture creates forks.
Self-contained steps: each step encapsulates tool call + thinking + result + duration.
Replayable: the state at any point in the trace can be reconstructed.

R1 — Anatomy

Vertical step list:

┌──────────────────────────────────────┐
│ ✓ 1 · Search documentation           │
│       Duration: 1.2s                 │
│       ▸ tools.search('Koder Stack')  │
├──────────────────────────────────────┤
│ ✓ 2 · Read top result                │
│       Duration: 0.4s                 │
│       ▸ tools.fetch_url(...)         │
├──────────────────────────────────────┤
│ ⚙ 3 · Synthesize answer              │
│       (running…)                     │
├──────────────────────────────────────┤
│ ⏳ 4 · Format response                │
│       (waiting)                      │
└──────────────────────────────────────┘

Per step:

Slot	Content
Status icon	per R2
Number	step ordinal (1, 2, 3, …)
Title	one-line description (from agent plan OR auto-derived)
Duration	when complete; live timer when running
Body	(collapsed by default) tool call card + thinking block + result

Tap step header → expand body (R3).

R2 — Status states

State	Icon	Color (per `color-roles.kmd`)
pending	⏳	`text-muted` (waiting in queue)
running	⚙	`accent` (active, breathing dot)
done	✓	`success`
error	✗	`error` (with retry option in body)
skipped	⊘	`text-subtle` (plan said no-op or human-deferred)
paused	⏸	`warning` (user paused, awaiting resume)

Transitions:

pending → running → done | error | skipped
running → paused (manual) → running | error
error → running (manual retry, per the retry option in the table above)
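
The transitions can be enforced with a small lookup table. The following is a minimal sketch, not the reference implementation: `StepStatus`, `ALLOWED`, `can_transition` and `transition` are hypothetical names, and the error → running retry edge is inferred from the retry option in the status table and from T9.

```python
from enum import Enum


class StepStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    ERROR = "error"
    SKIPPED = "skipped"
    PAUSED = "paused"


# Allowed targets per state, read literally from the transition lines.
ALLOWED = {
    StepStatus.PENDING: {StepStatus.RUNNING},
    StepStatus.RUNNING: {
        StepStatus.DONE,
        StepStatus.ERROR,
        StepStatus.SKIPPED,
        StepStatus.PAUSED,
    },
    StepStatus.PAUSED: {StepStatus.RUNNING, StepStatus.ERROR},
    # Assumption: a manual retry re-enters running from error.
    StepStatus.ERROR: {StepStatus.RUNNING},
    StepStatus.DONE: set(),
    StepStatus.SKIPPED: set(),
}


def can_transition(current: StepStatus, target: StepStatus) -> bool:
    """Return True when the state machine allows current -> target."""
    return target in ALLOWED[current]


def transition(current: StepStatus, target: StepStatus) -> StepStatus:
    """Return the new state, or raise ValueError on an illegal move."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Rejecting illegal moves at this layer keeps the UI from ever showing, for example, a done step flipping back to running.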

R3 — Step body (expanded)

┌──────────────────────────────────────┐
│ ✓ 1 · Search documentation           │
│       Duration: 1.2s                 │
├──────────────────────────────────────┤
│ ▼ Thinking (cross-link #107)         │
│   "I need to find recent docs about  │
│    Koder Stack architecture..."      │
├──────────────────────────────────────┤
│ ▼ Tool call (cross-link #100)        │
│   [mcp-tool-card: tools.search]      │
├──────────────────────────────────────┤
│ ▼ Result                             │
│   Found 5 results...                 │
└──────────────────────────────────────┘

Sub-blocks expand independently:

Thinking — reasoning trace via thinking-state.kmd (#107).
Tool call — MCP tool card via mcp-tool-invocation.kmd (#100).
Result — text or content array (per chat-message-bubble.kmd R5 renderers).

R4 — Plan pre-render

When agent supports plan-and-execute pattern, expose plan via gateway:

{
  "agent_run": {
    "id": "...",
    "plan": [
      {"id": "1", "title": "Search documentation", "tool_hint": "tools.search"},
      {"id": "2", "title": "Read top result", "tool_hint": "tools.fetch_url"},
      {"id": "3", "title": "Synthesize answer"},
      {"id": "4", "title": "Format response"}
    ],
    "current_step_id": "3",
    "status": "running"
  }
}

Client renders all 4 steps immediately with statuses (1+2 done, 3 running, 4 pending). The user sees at a glance that the run is on step 3 of 4.
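
One way a client might derive those per-step statuses from the payload is sketched below. It assumes the payload carries only `plan` and `current_step_id` as shown, and that no earlier step errored or was skipped; a real gateway response may carry explicit per-step statuses instead. `derive_statuses` is a hypothetical helper name.

```python
def derive_statuses(agent_run: dict) -> dict:
    """Map each plan step id to a status based on the current step's position."""
    ids = [step["id"] for step in agent_run["plan"]]
    current = agent_run.get("current_step_id")
    if current not in ids:
        # No known current step yet: everything is still waiting.
        return {step_id: "pending" for step_id in ids}

    current_index = ids.index(current)
    statuses = {}
    for index, step_id in enumerate(ids):
        if index < current_index:
            statuses[step_id] = "done"      # assumption: earlier steps succeeded
        elif index == current_index:
            statuses[step_id] = "running"
        else:
            statuses[step_id] = "pending"
    return statuses


example_run = {
    "plan": [
        {"id": "1", "title": "Search documentation", "tool_hint": "tools.search"},
        {"id": "2", "title": "Read top result", "tool_hint": "tools.fetch_url"},
        {"id": "3", "title": "Synthesize answer"},
        {"id": "4", "title": "Format response"},
    ],
    "current_step_id": "3",
    "status": "running",
}
```

Calling `derive_statuses(example_run)` yields done for steps 1 and 2, running for step 3, and pending for step 4.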

Plans MAY mutate at runtime (agent adds/removes steps mid-run); client renders new steps with smooth insertion animation per themes/motion.kmd R9 spring.
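
To animate only the steps that changed, the client needs to know which step ids appeared or disappeared between two plan snapshots. A minimal sketch, assuming step ids are stable across mutations (`diff_plan` is a hypothetical helper):

```python
def diff_plan(old_plan: list, new_plan: list) -> tuple:
    """Return (added_ids, removed_ids), each in plan order."""
    old_ids = {step["id"] for step in old_plan}
    new_ids = {step["id"] for step in new_plan}
    added = [step["id"] for step in new_plan if step["id"] not in old_ids]
    removed = [step["id"] for step in old_plan if step["id"] not in new_ids]
    return added, removed


before = [{"id": "1"}, {"id": "2"}, {"id": "3"}, {"id": "4"}]
after = before + [{"id": "5"}]
added, removed = diff_plan(before, after)
```

Here `added` contains only step 5, so only that row would get the insertion animation; existing rows stay put.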

R5 — Replay / time-travel

Each step has unique id + timestamp + sequence number. Trace exportable as OpenTelemetry-compatible spans via services/ai/trace/.
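
As an illustration of what "OpenTelemetry-compatible" could mean in practice, the sketch below maps one step to a dictionary shaped like an OTLP/JSON span (`traceId`, `spanId`, `name`, `startTimeUnixNano`, `endTimeUnixNano`, `attributes`). The input field names, the attribute keys, and the hash-based id derivation are all assumptions for the example; the actual mapping lives in services/ai/trace/ and may differ.

```python
import hashlib


def _hex_id(seed: str, n_bytes: int) -> str:
    """Derive a deterministic hex id of n_bytes from a seed (illustrative only)."""
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[: n_bytes * 2]


def step_to_span(run_id: str, step: dict) -> dict:
    """Build an OTLP/JSON-shaped span dict for a single agent step."""
    start_ns = step["started_at_unix_nano"]        # hypothetical field name
    end_ns = start_ns + step["duration_ns"]        # hypothetical field name
    return {
        "traceId": _hex_id(run_id, 16),            # 16 bytes -> 32 hex chars
        "spanId": _hex_id(f"{run_id}/{step['id']}", 8),  # 8 bytes -> 16 hex chars
        "name": step["title"],
        # 64-bit integers are carried as strings in OTLP/JSON.
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
        "attributes": [
            # Attribute keys below are made up for this sketch.
            {"key": "agent.step.id", "value": {"stringValue": step["id"]}},
            {"key": "agent.step.sequence", "value": {"intValue": str(step["sequence"])}},
        ],
    }


span = step_to_span(
    "run-1",
    {
        "id": "1",
        "title": "Search documentation",
        "sequence": 1,
        "started_at_unix_nano": 1_700_000_000_000_000_000,
        "duration_ns": 1_200_000_000,
    },
)
```

Using one trace per agent run and one span per step keeps the step duration shown in the UI identical to the span duration seen in a tracing backend.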

Replay UI:

Pause/Play controls at bottom of trace.
Scrubber: drag to specific step → UI rewinds to that point-in-time (subsequent steps gray out as "future").
Branching: at any past step, "Fork from here" creates new conversation/agent-run starting fresh from that state. Cross-link chat-message-bubble.kmd R3.2 Branch action.
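
The scrubber behaviour can be reduced to a pure function over the step list. This is a sketch under the assumption that each step exposes its sequence number as `sequence`; `view_at` is a hypothetical name.

```python
def view_at(steps: list, scrub_sequence: int) -> list:
    """Return (step_id, is_future) pairs for a scrubber parked at scrub_sequence.

    Steps with a higher sequence number than the scrubber position are
    flagged as future so the UI can gray them out.
    """
    return [(step["id"], step["sequence"] > scrub_sequence) for step in steps]


trace = [{"id": str(n), "sequence": n} for n in range(1, 5)]
at_step_2 = view_at(trace, 2)
```

With the scrubber on step 2, steps 3 and 4 come back flagged as future, which matches the T6 scenario.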

R6 — Branching

Forking semantics:

Selecting "Fork from here" at step N: clone state up to step N; the new agent run begins with step N+1 (or with a new user-supplied prompt).
Forks tracked in conversation history (conversation-history.kmd #115) as separate entries with parent reference.
Visual: forks shown as tree below original trace; user can navigate between branches.
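
The forking semantics above can be sketched as a function that copies the parent run up to step N and records the parent link. The run shape (`run_id`, `steps`, `sequence`) and the `next_sequence` field are assumptions made for the example; only `parent_run_id` comes from this spec.

```python
import copy
import uuid


def fork_run(run: dict, step_n: int, new_run_id: str = None) -> dict:
    """Create a child run whose history is the parent's steps 1..step_n."""
    # Deep-copy so edits in the fork never leak back into the parent trace.
    kept = [copy.deepcopy(s) for s in run["steps"] if s["sequence"] <= step_n]
    return {
        "run_id": new_run_id or uuid.uuid4().hex,
        "parent_run_id": run["run_id"],
        "steps": kept,
        "next_sequence": step_n + 1,   # the fork resumes at step N+1
    }


parent = {
    "run_id": "run-a",
    "steps": [{"id": str(n), "sequence": n, "status": "done"} for n in range(1, 5)],
}
child = fork_run(parent, 2, new_run_id="run-b")
```

The child keeps steps 1 and 2 and will number its first new step 3, while the parent trace stays unchanged and remains navigable as the other branch.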

R7 — Surface bindings

Surface	API
Flutter	`KoderAgentStepTrace({required steps, onStepTap, onFork, onReplay})` in `koder_kit/lib/src/ai/agent_step_trace.dart`
Web	`<koder-agent-step-trace agent-run-id="...">` in `koder_web_kit` (impl tracker: `services/ai/ai#117`)
Compose Android	`KoderAgentStepTrace` in `koder-design-compose` (future)
SwiftUI iOS	same, in `koder-design-swift` (future)
CLI / TUI	Vertical list in the terminal; expansion via `bubbletea`; replay via arrow keys

R8 — Accessibility

Step list: <ol> (semantically ordered).
Status icons: aria-label per state ("Done", "Running", "Waiting", etc.).
Step body: <details><summary> on Web; equivalent in Flutter/Compose.
Live region: aria-live="polite" announces step state changes ("Step 3 done").
Keyboard: Tab cycles through steps, Enter expands, arrow keys navigate the replay scrubber.
Reduced-motion: skip insertion animation; smooth scrubbing replaced by stepped jumps.
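
The live-region text can be computed by comparing two status snapshots, so that only real changes get announced. A minimal sketch: `announcements` is a hypothetical helper, the labels reuse the en-US strings from R10, and the exact message wording beyond the "Step 3 done" example is an assumption.

```python
STATUS_LABELS = {
    "pending": "Waiting",
    "running": "Running",
    "done": "Done",
    "error": "Error",
    "skipped": "Skipped",
    "paused": "Paused",
}


def announcements(previous: dict, current: dict, numbers: dict) -> list:
    """Return one message per step whose status changed between snapshots.

    previous/current map step id -> status; numbers maps step id -> ordinal.
    """
    messages = []
    for step_id, status in current.items():
        if previous.get(step_id) != status:
            label = STATUS_LABELS[status].lower()
            messages.append(f"Step {numbers[step_id]} {label}")
    return messages


spoken = announcements(
    {"3": "running", "4": "pending"},
    {"3": "done", "4": "running"},
    {"3": 3, "4": 4},
)
```

Announcing only the diff avoids re-reading the whole list to screen-reader users on every update.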

R9 — Multi-tenant + persistence

Agent run + steps scoped per (koder_user_id, workspace_id, conversation_id, run_id).
Tool calls in steps respect permissions per mcp-permission-prompt.kmd.
Forks: the parent_run_id link is maintained across forks; cascade delete per identity-data-retention.kmd.
Audit log: every step transition emits an event to services/foundation/audit/.
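
The scoping tuple lends itself to a typed key that every read path filters on. The sketch below is illustrative only: `RunScope` and `visible_runs` are hypothetical names, and filtering on both user and workspace is an assumption about how the scope is enforced.

```python
from typing import List, NamedTuple


class RunScope(NamedTuple):
    koder_user_id: str
    workspace_id: str
    conversation_id: str
    run_id: str


def visible_runs(runs: List[RunScope], koder_user_id: str, workspace_id: str) -> List[RunScope]:
    """Return only the runs that belong to the caller's user and workspace."""
    return [
        run
        for run in runs
        if run.koder_user_id == koder_user_id and run.workspace_id == workspace_id
    ]


all_runs = [
    RunScope("u1", "ws-a", "c1", "r1"),
    RunScope("u1", "ws-b", "c2", "r2"),
]
in_workspace_b = visible_runs(all_runs, "u1", "ws-b")
```

A run created in workspace A never shows up in the workspace B listing, which is the behaviour N1 checks.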

R10 — i18n

Key	en-US	pt-BR
`ai.agent.step.status.pending`	"Waiting"	"Aguardando"
`ai.agent.step.status.running`	"Running"	"Em execução"
`ai.agent.step.status.done`	"Done"	"Concluído"
`ai.agent.step.status.error`	"Error"	"Erro"
`ai.agent.step.status.skipped`	"Skipped"	"Ignorado"
`ai.agent.step.status.paused`	"Paused"	"Pausado"
`ai.agent.step.action.fork`	"Fork from here"	"Ramificar daqui"
`ai.agent.step.action.pause`	"Pause"	"Pausar"
`ai.agent.step.action.resume`	"Resume"	"Continuar"
`ai.agent.step.action.replay`	"Replay"	"Repetir"
`ai.agent.step.duration`	"Duration: {time}"	"Duração: {time}"
`ai.agent.run.complete`	"Agent run complete in {time}"	"Execução do agente concluída em {time}"
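
A lookup over these keys might look like the sketch below. It copies a few rows from the table; the `translate` helper and the fallback to en-US for unknown locales are assumptions, not behaviour defined by this spec.

```python
MESSAGES = {
    "en-US": {
        "ai.agent.step.status.done": "Done",
        "ai.agent.step.action.fork": "Fork from here",
        "ai.agent.step.duration": "Duration: {time}",
        "ai.agent.run.complete": "Agent run complete in {time}",
    },
    "pt-BR": {
        "ai.agent.step.status.done": "Concluído",
        "ai.agent.step.action.fork": "Ramificar daqui",
        "ai.agent.step.duration": "Duração: {time}",
        "ai.agent.run.complete": "Execução do agente concluída em {time}",
    },
}


def translate(key: str, locale: str, **params: str) -> str:
    """Resolve key for locale and fill {placeholders}; fall back to en-US."""
    catalog = MESSAGES.get(locale, MESSAGES["en-US"])
    template = catalog.get(key) or MESSAGES["en-US"].get(key, key)
    return template.format(**params)
```

For example, `translate("ai.agent.step.duration", "pt-BR", time="1.2s")` fills the `{time}` placeholder in the pt-BR string.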

R11 — Per-preset variation

Preset	Trace style
`material3` / `material_expressive`	Default vertical list; spring insertion
`terminal_classic`	Plain text: `[1] ✓ Search` lines
`brutalist`	Sharp dividers; no animation
`cyberpunk_neon`	Glow active step; neon connectors between steps
`minimalist_mono`	Mono font; spartan dividers
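
The `terminal_classic` row is simple enough to sketch end to end. The icons come from the R2 table; the step dictionary shape (`number`, `status`, `title`) and the `render_plain` name are assumptions for the example.

```python
ICONS = {
    "pending": "⏳",
    "running": "⚙",
    "done": "✓",
    "error": "✗",
    "skipped": "⊘",
    "paused": "⏸",
}


def render_plain(steps: list) -> str:
    """Render one '[n] icon title' line per step, in list order."""
    return "\n".join(
        f"[{step['number']}] {ICONS[step['status']]} {step['title']}"
        for step in steps
    )


output = render_plain(
    [
        {"number": 1, "status": "done", "title": "Search"},
        {"number": 2, "status": "running", "title": "Read top result"},
    ]
)
```

The first line of `output` is `[1] ✓ Search`, matching the format named in the table.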

T-suite

T1 Mount: render trace with 4 steps; initial statuses are correct.
T2 Transitions: emit step state events → assert UI updates per R2.
T3 Expand step: tap header → body visible with thinking + tool call + result sub-blocks.
T4 Plan pre-render: receive plan with 4 steps → all 4 rendered immediately.
T5 Plan mutation: agent adds step 5 mid-run → smooth insertion (spring).
T6 Replay scrubber: drag to step 2 → steps 3+4 gray out; UI shows state at step 2.
T7 Fork: tap "Fork from here" at step 2 → new conversation entry created with parent_run_id link.
T8 Pause/resume: tap pause → step moves from running to paused; tap resume → step continues.
T9 Error retry: step error → retry button → step re-runs.
T10 OpenTelemetry export: trace export produces valid OTLP spans (validation via opentelemetry-cli).
T11 A11y: screen reader announces step transitions; keyboard nav functional.
N1 Cross-tenant: a run from workspace A is not visible in workspace B.
N2 Multi-tool permissions: step whose tool is denied via mcp-permission-prompt.kmd → step skipped; trace marks "Skipped (permission denied)".

Cross-link

Companion: mcp-tool-invocation.kmd, thinking-state.kmd, chat-message-bubble.kmd (the bubble hosts the trace as a content item), conversation-history.kmd (fork tracking)
Backend: services/ai/agents/, services/ai/trace/ (OpenTelemetry export)
Impl ticket web: services/ai/ai/backlog/pending/117-web-agent-run-step-replay.md
Policies: multi-tenant-by-default.kmd, identity-data-retention.kmd
Motion: themes/motion.kmd R9 spring (insertion animation)
Refs: LangChain agent observability, fuselabcreative agent UX 2026

References

meta/docs/stack/specs/ai-ui/mcp-tool-invocation.kmd
meta/docs/stack/specs/ai-ui/thinking-state.kmd
meta/docs/stack/specs/ai-ui/chat-message-bubble.kmd
meta/docs/stack/policies/multi-tenant-by-default.kmd