AI agent step trace

ai-ui specs/ai-ui/agent-step-trace.kmd

Vertical step list + per-step expand (tool call, args, result, duration) + plan-and-execute (plan pre-render before run) + time-travel replay + branching for multi-step AI agents. Companion to OpenTelemetry trace export via services/ai/trace. Consumed by Kortex, Kode, Bot agents.

Spec — AI agent step trace

Primary consumers: surfaces that run agents (Kortex agent mode, Kode platform, Bot autonomous flows). The trace hosts inline tool cards (mcp-tool-invocation.kmd) and thinking blocks (thinking-state.kmd) per step. Implementation ticket for the web side: services/ai/ai#117 (already open).

Principles

  1. Plan pre-render: when the agent exposes its plan upfront (plan-and-execute), render all steps immediately; statuses populate during execution.
  2. Linear flow + optional branching: vertical list by default; the branching gesture creates forks.
  3. Step is self-contained: each step encapsulates tool call + thinking + result + duration.
  4. Replayable: any point of the trace can be reconstructed point-in-time.

R1 — Anatomy

Vertical step list:

┌──────────────────────────────────────┐
│ ✓ 1 · Search documentation           │
│       Duration: 1.2s                 │
│       ▸ tools.search('Koder Stack')  │
├──────────────────────────────────────┤
│ ✓ 2 · Read top result                │
│       Duration: 0.4s                 │
│       ▸ tools.fetch_url(...)         │
├──────────────────────────────────────┤
│ ⚙ 3 · Synthesize answer              │
│       (running…)                     │
├──────────────────────────────────────┤
│ ⏳ 4 · Format response                │
│       (waiting)                      │
└──────────────────────────────────────┘

Per step:

Slot | Content
--- | ---
Status icon | per R2
Number | step ordinal (1, 2, 3, …)
Title | one-line description (from agent plan OR auto-derived)
Duration | shown when complete; live timer while running
Body | (collapsed by default) tool call card + thinking block + result

Tap step header → expand body (R3).

R2 — Status states

State | Icon | Color (per color-roles.kmd)
--- | --- | ---
pending | | text-muted (waiting in queue)
running | | accent (active, breathing dot)
done | | success
error | | error (with retry option in body)
skipped | | text-subtle (plan said no-op or human-deferred)
paused | | warning (user paused, awaiting resume)

Transitions:

pending → running → done | error | skipped
running → paused (manual) → running | error
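
The transitions above can be checked with a small lookup table. This is a minimal sketch, not the shipped implementation: the names `StepStatus`, `ALLOWED`, and `transition` are hypothetical, and the edges are read literally from the two lines above (so `pending` can only move to `running`, and no retry edge out of `error` is modeled).

```python
from enum import Enum
from typing import Dict, FrozenSet


class StepStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    ERROR = "error"
    SKIPPED = "skipped"
    PAUSED = "paused"


# Edges read literally from the two transition lines above. A retry edge
# (error -> running) is not listed there, so it is deliberately left out.
ALLOWED: Dict[StepStatus, FrozenSet[StepStatus]] = {
    StepStatus.PENDING: frozenset({StepStatus.RUNNING}),
    StepStatus.RUNNING: frozenset(
        {StepStatus.DONE, StepStatus.ERROR, StepStatus.SKIPPED, StepStatus.PAUSED}
    ),
    StepStatus.PAUSED: frozenset({StepStatus.RUNNING, StepStatus.ERROR}),
    StepStatus.DONE: frozenset(),
    StepStatus.ERROR: frozenset(),
    StepStatus.SKIPPED: frozenset(),
}


def transition(current: StepStatus, target: StepStatus) -> StepStatus:
    """Return target if the edge is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

A client could call `transition` before applying any incoming step event, so that an out-of-order event surfaces as an error instead of a silently wrong icon.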

R3 — Step body (expanded)

┌──────────────────────────────────────┐
│ ✓ 1 · Search documentation           │
│       Duration: 1.2s                 │
├──────────────────────────────────────┤
│ ▼ Thinking (cross-link #107)         │
│   "I need to find recent docs about  │
│    Koder Stack architecture..."      │
├──────────────────────────────────────┤
│ ▼ Tool call (cross-link #100)        │
│   [mcp-tool-card: tools.search]      │
├──────────────────────────────────────┤
│ ▼ Result                             │
│   Found 5 results...                 │
└──────────────────────────────────────┘

Sub-blocks (Thinking, Tool call, Result) are independently expandable.

R4 — Plan pre-render

When the agent supports the plan-and-execute pattern, the plan is exposed via the gateway:

{
  "agent_run": {
    "id": "...",
    "plan": [
      {"id": "1", "title": "Search documentation", "tool_hint": "tools.search"},
      {"id": "2", "title": "Read top result", "tool_hint": "tools.fetch_url"},
      {"id": "3", "title": "Synthesize answer"},
      {"id": "4", "title": "Format response"}
    ],
    "current_step_id": "3",
    "status": "running"
  }
}

The client renders all 4 steps immediately with their statuses (1 and 2 done, 3 running, 4 pending), so the user can see at a glance how far along the run is (step 3 of 4).
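
The status derivation for the payload above can be sketched as follows. The function name `derive_statuses` is hypothetical, and the rule it applies is an assumption: plan entries are in execution order, every step before `current_step_id` counts as done, and every step after it counts as pending. Errors and skipped steps would need per-step status from the gateway, which the payload above does not show.

```python
from typing import List, Optional, Tuple

# Payload copied from the example above; the run id is elided in the source.
EXAMPLE_RUN = {
    "agent_run": {
        "id": "...",
        "plan": [
            {"id": "1", "title": "Search documentation", "tool_hint": "tools.search"},
            {"id": "2", "title": "Read top result", "tool_hint": "tools.fetch_url"},
            {"id": "3", "title": "Synthesize answer"},
            {"id": "4", "title": "Format response"},
        ],
        "current_step_id": "3",
        "status": "running",
    }
}


def derive_statuses(payload: dict) -> List[Tuple[str, str]]:
    """Return (step id, status) pairs in plan order.

    Assumption: steps before the current one are done, the current one is
    running, and the rest are pending. With no current step, all are pending.
    """
    run = payload["agent_run"]
    current: Optional[str] = run.get("current_step_id")
    statuses = []
    reached_current = current is None
    for step in run["plan"]:
        if step["id"] == current:
            statuses.append((step["id"], "running"))
            reached_current = True
        elif reached_current:
            statuses.append((step["id"], "pending"))
        else:
            statuses.append((step["id"], "done"))
    return statuses
```

Called on `EXAMPLE_RUN`, this yields done, done, running, pending for steps 1 to 4, matching the rendering described above.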

Plans MAY mutate at runtime (agent adds/removes steps mid-run); client renders new steps with smooth insertion animation per themes/motion.kmd R9 spring.
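
One way to decide which steps need the insertion animation is to diff the previous plan against the new one by step id. This is only a sketch: `diff_plan` is a hypothetical helper, and it assumes step ids stay stable across plan updates.

```python
from typing import List, Tuple


def diff_plan(old_plan: List[dict], new_plan: List[dict]) -> Tuple[List[str], List[str]]:
    """Return (added ids, removed ids), each in plan order."""
    old_ids = {step["id"] for step in old_plan}
    new_ids = {step["id"] for step in new_plan}
    added = [step["id"] for step in new_plan if step["id"] not in old_ids]
    removed = [step["id"] for step in old_plan if step["id"] not in new_ids]
    return added, removed
```

Steps in the first list would be animated in; steps in the second list would be removed from the rendered trace.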

R5 — Replay / time-travel

Each step has a unique id, a timestamp, and a sequence number. The trace is exportable as OpenTelemetry-compatible spans via services/ai/trace/.
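
As an illustration of what one exported step might look like, the sketch below maps a step to a span dictionary using OTLP/JSON field names (`traceId`, `spanId`, `startTimeUnixNano`, `endTimeUnixNano`). Everything else is an assumption rather than the actual services/ai/trace/ format: the step field names (`start_ns`, `duration_ns`), the attribute keys, and deriving ids by hashing the run id and step id.

```python
import hashlib


def step_to_span(run_id: str, step: dict) -> dict:
    """Build an OTLP/JSON-style span dict for one step (illustrative only)."""
    # OTLP trace ids are 16 bytes (32 hex chars); span ids are 8 bytes (16 hex chars).
    # Hashing is an assumption made here so that ids are stable across exports.
    trace_id = hashlib.sha256(run_id.encode("utf-8")).hexdigest()[:32]
    span_id = hashlib.sha256(f"{run_id}/{step['id']}".encode("utf-8")).hexdigest()[:16]
    end_ns = step["start_ns"] + step["duration_ns"]
    return {
        "traceId": trace_id,
        "spanId": span_id,
        "name": step["title"],
        # 64-bit integers are carried as strings in the JSON encoding.
        "startTimeUnixNano": str(step["start_ns"]),
        "endTimeUnixNano": str(end_ns),
        "attributes": [
            # Hypothetical attribute keys, not defined by this spec.
            {"key": "agent.step.id", "value": {"stringValue": step["id"]}},
            {"key": "agent.step.sequence", "value": {"intValue": str(step["sequence"])}},
        ],
    }
```

All steps of a run share one `traceId` under this scheme, which lets a trace viewer group them into a single trace.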

Replay UI:

  • Pause/Play controls at bottom of trace.
  • Scrubber: drag to specific step → UI rewinds to that point-in-time (subsequent steps gray out as "future").
  • Branching: at any past step, "Fork from here" creates new conversation/agent-run starting fresh from that state. Cross-link chat-message-bubble.kmd R3.2 Branch action.
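
The scrubber behavior above can be sketched as a pure function from scrubber position to a per-step display state. The name `scrub` and the labels "past", "current", and "future" are illustrative assumptions; the spec only requires that steps after the scrubbed point gray out as "future".

```python
from typing import Dict, List


def scrub(steps: List[dict], position: int) -> Dict[str, str]:
    """Map step id to "past", "current", or "future" for a scrubber position.

    `position` is the sequence number of the step the scrubber points at.
    """
    view = {}
    for step in steps:
        if step["sequence"] < position:
            view[step["id"]] = "past"
        elif step["sequence"] == position:
            view[step["id"]] = "current"
        else:
            view[step["id"]] = "future"
    return view
```

Keeping this as a pure function of the recorded steps means rewinding never mutates the trace itself; only the rendered view changes.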

R6 — Branching

Forking semantics:

  • Selecting "Fork from here" at step N: clone state up to step N; new agent run begins with step N+1 (or user prompt-injection).
  • Forks tracked in conversation history (conversation-history.kmd #115) as separate entries with parent reference.
  • Visual: forks shown as tree below original trace; user can navigate between branches.
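
The forking semantics above can be sketched as a clone of the first N steps plus a parent reference. The `AgentRun` shape, the `forked_at_step` field, and `fork_run` are hypothetical; only `parent_run_id` is named elsewhere in this spec.

```python
import copy
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentRun:
    run_id: str
    steps: List[dict]
    parent_run_id: Optional[str] = None
    forked_at_step: Optional[int] = None  # hypothetical bookkeeping field


def fork_run(parent: AgentRun, step_n: int, new_run_id: str) -> AgentRun:
    """Clone state up to and including step N; the fork continues at step N+1."""
    if not 1 <= step_n <= len(parent.steps):
        raise ValueError(f"step {step_n} is outside 1..{len(parent.steps)}")
    # Deep copy so later edits in the fork never leak into the parent trace.
    cloned = copy.deepcopy(parent.steps[:step_n])
    return AgentRun(
        run_id=new_run_id,
        steps=cloned,
        parent_run_id=parent.run_id,
        forked_at_step=step_n,
    )
```

The parent reference is what lets the history view draw the fork as a branch under the original trace.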

R7 — Surface bindings

Surface | API
--- | ---
Flutter | KoderAgentStepTrace({required steps, onStepTap, onFork, onReplay}) in koder_kit/lib/src/ai/agent_step_trace.dart
Web | <koder-agent-step-trace agent-run-id="..."> in koder_web_kit (impl tracker: services/ai/ai#117)
Compose Android | KoderAgentStepTrace in koder-design-compose (future)
SwiftUI iOS | same, in koder-design-swift (future)
CLI / TUI | Vertical list in terminal; bubbletea expansion; replay via arrow keys

R8 — Accessibility

  • Step list: <ol> (semantically ordered).
  • Status icons: aria-label per state ("Done", "Running", "Waiting", etc.).
  • Step body: <details><summary> on Web; equivalent in Flutter/Compose.
  • Live region: aria-live="polite" announces step state changes ("Step 3 done").
  • Keyboard: Tab cycle, Enter expands, arrow keys navigate replay scrubber.
  • Reduced-motion: skip insertion animation; smooth scrubbing replaced by stepped jumps.

R9 — Multi-tenant + persistence

  • Agent run + steps scoped per (koder_user_id, workspace_id, conversation_id, run_id).
  • Tool calls in steps respect permissions per mcp-permission-prompt.kmd.
  • Forks: the parent_run_id link is maintained across forks; cascade delete per identity-data-retention.kmd.
  • Audit log: every step transition emits an event to services/foundation/audit/.
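
The scoping rule above can be sketched as a composite key plus a filter. `RunScope` and `visible_runs` are hypothetical names; the four key fields are the ones listed in the first bullet. A real service would enforce this in the storage layer, which is outside this sketch.

```python
from typing import List, NamedTuple


class RunScope(NamedTuple):
    koder_user_id: str
    workspace_id: str
    conversation_id: str
    run_id: str


def visible_runs(runs: List[RunScope], koder_user_id: str, workspace_id: str) -> List[RunScope]:
    """Return only the runs that belong to the given user and workspace."""
    return [
        run
        for run in runs
        if run.koder_user_id == koder_user_id and run.workspace_id == workspace_id
    ]
```

With this filter, a run recorded under workspace A is never returned for a query scoped to workspace B, even for the same user.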

R10 — i18n

Key | en-US | pt-BR
--- | --- | ---
ai.agent.step.status.pending | "Waiting" | "Aguardando"
ai.agent.step.status.running | "Running" | "Em execução"
ai.agent.step.status.done | "Done" | "Concluído"
ai.agent.step.status.error | "Error" | "Erro"
ai.agent.step.status.skipped | "Skipped" | "Ignorado"
ai.agent.step.status.paused | "Paused" | "Pausado"
ai.agent.step.action.fork | "Fork from here" | "Ramificar daqui"
ai.agent.step.action.pause | "Pause" | "Pausar"
ai.agent.step.action.resume | "Resume" | "Continuar"
ai.agent.step.action.replay | "Replay" | "Repetir"
ai.agent.step.duration | "Duration: {time}" | "Duração: {time}"
ai.agent.run.complete | "Agent run complete in {time}" | "Execução do agente concluída em {time}"
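
A lookup for these keys, including the {time} placeholder, might look like the sketch below. The catalog holds only a subset of the table, and both the `translate` helper and the fallback to en-US for unknown locales or keys are assumptions, not behavior defined by this spec.

```python
# Subset of the table above, keyed by locale.
MESSAGES = {
    "en-US": {
        "ai.agent.step.status.done": "Done",
        "ai.agent.step.duration": "Duration: {time}",
        "ai.agent.run.complete": "Agent run complete in {time}",
    },
    "pt-BR": {
        "ai.agent.step.status.done": "Concluído",
        "ai.agent.step.duration": "Duração: {time}",
        "ai.agent.run.complete": "Execução do agente concluída em {time}",
    },
}


def translate(key: str, locale: str, **params: str) -> str:
    """Resolve a key for a locale and fill in placeholders such as {time}."""
    # Assumption: unknown locales and missing keys fall back to en-US.
    catalog = MESSAGES.get(locale, MESSAGES["en-US"])
    template = catalog.get(key, MESSAGES["en-US"][key])
    return template.format(**params)
```

For example, `translate("ai.agent.step.duration", "pt-BR", time="1.2s")` returns "Duração: 1.2s".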

R11 — Per-preset variation

Preset | Trace style
--- | ---
material3 / material_expressive | Default vertical list; spring insertion
terminal_classic | Plain-text lines: [1] ✓ Search
brutalist | Sharp dividers; no animation
cyberpunk_neon | Glow on active step; neon connectors between steps
minimalist_mono | Mono font; spartan dividers

T-suite

  • T1 Mount: render trace with 4 steps; initial statuses correct.
  • T2 Transitions: emit step state events → assert UI updates per R2.
  • T3 Expand step: tap header → body visible with thinking + tool call + result sub-blocks.
  • T4 Plan pre-render: receive plan with 4 steps → all 4 rendered immediately.
  • T5 Plan mutation: agent adds step 5 mid-run → smooth insertion (spring).
  • T6 Replay scrubber: drag to step 2 → steps 3+4 gray out; UI shows state at step 2.
  • T7 Fork: tap "Fork from here" at step 2 → new conversation entry created with parent_run_id link.
  • T8 Pause/resume: tap pause → step running → paused; tap resume → continues.
  • T9 Error retry: step error → retry button → step re-runs.
  • T10 OpenTelemetry export: trace export produces valid OTLP spans (validation via opentelemetry-cli).
  • T11 A11y: screen reader announces step transitions; keyboard nav functional.
  • N1 Cross-tenant: a run from workspace A is not visible in workspace B.
  • N2 Multi-tool permissions: step with a tool denied via mcp-permission-prompt.kmd → step skipped; trace marks "Skipped (permission denied)".

References

  • Companion: mcp-tool-invocation.kmd, thinking-state.kmd, chat-message-bubble.kmd (the bubble hosts the trace as a content item), conversation-history.kmd (fork tracking)
  • Backend: services/ai/agents/, services/ai/trace/ (OpenTelemetry export)
  • Impl ticket (web): services/ai/ai/backlog/pending/117-web-agent-run-step-replay.md
  • Policies: multi-tenant-by-default.kmd, identity-data-retention.kmd
  • Motion: themes/motion.kmd R9 spring (insertion animation)
  • Refs: LangChain agent observability, fuselabcreative agent UX 2026