
AI streaming text

ai-ui specs/ai-ui/streaming-text.kmd

Token-by-token reveal of AI-generated text with cursor, Stop button, Retry, autoscroll, and incremental markdown rendering. Hosted by chat-message-bubble (#105) and any surface that streams gateway responses (inline-suggest, agent step trace, etc.).

Specification body

Spec — AI streaming text

Companion: chat-message-bubble.kmd is the primary consumer. Code blocks are deferred per R4 (cross-link code-block.kmd R7). Cursor animation respects motion.kmd R6 reduced-motion.

Principles

  1. Append-only render: tokens arrive, get appended to the buffer, and render incrementally. No full reflow.
  2. Cursor presence signals state: it blinks during the stream and disappears on done.
  3. Stop is prominent: never in a menu, always reachable in 1 tap.
  4. Defer code blocks: do NOT highlight while a fence is open (avoids re-highlighting per token).
  5. Autoscroll with escape: follow the tail by default; a user scroll up cancels it; it re-engages on return to bottom.

R1 — Token buffer

Input: stream of events from the gateway (SSE or WebSocket via chat-adapter).

event: token       data: "Hello"
event: token       data: " world"
event: token       data: "!"
event: done        data: {}

Buffer: append-only StringBuffer. Each token → append + trigger an incremental render.
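As a rough illustration of R1, the sketch below parses events in standard Server-Sent Events framing (fields on separate lines, events separated by a blank line) and appends token payloads to an append-only buffer. It assumes every `data:` field is JSON-encoded, as the sample above suggests; `parse_sse` and `TokenBuffer` are hypothetical names, not part of chat-adapter.

```python
import json
from typing import Iterator, List, Tuple


def parse_sse(raw: str) -> Iterator[Tuple[str, object]]:
    """Yield (event, data) pairs; assumes JSON-encoded data fields."""
    for block in raw.split("\n\n"):
        event = "message"  # SSE default when no event field is present
        data_lines: List[str] = []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            yield event, json.loads("\n".join(data_lines))


class TokenBuffer:
    """Append-only buffer: tokens are only ever added at the end."""

    def __init__(self) -> None:
        self._parts: List[str] = []
        self.done = False

    def feed(self, event: str, data: object) -> None:
        if event == "token":
            self._parts.append(str(data))
        elif event == "done":
            self.done = True

    @property
    def text(self) -> str:
        return "".join(self._parts)
```

Keeping the parts in a list and joining on read avoids repeated string copies while tokens are arriving.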

Performance gate: render coalescing at ≥16ms intervals (60fps). Multiple tokens arriving within 16ms are grouped into a single render frame.
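One possible way to meet this gate is sketched below: tokens mark the buffer dirty, and a render happens only when at least 16ms have passed since the previous one. `RenderCoalescer`, its `render` callback, and the `on_frame` tick are illustrative assumptions; a real surface would more likely hook into its own frame scheduler.

```python
from typing import Callable, List, Optional


class RenderCoalescer:
    FRAME_MS = 16.0  # minimum spacing between renders (about 60fps)

    def __init__(self, render: Callable[[str], None]) -> None:
        self._render = render  # hypothetical callback receiving the full text
        self._parts: List[str] = []
        self._dirty = False
        self._last_render_ms: Optional[float] = None

    def on_token(self, token: str, now_ms: float) -> None:
        self._parts.append(token)
        self._dirty = True
        self._maybe_render(now_ms)

    def on_frame(self, now_ms: float) -> None:
        """Call on every frame tick so trailing tokens still get flushed."""
        self._maybe_render(now_ms)

    def _maybe_render(self, now_ms: float) -> None:
        due = (self._last_render_ms is None
               or now_ms - self._last_render_ms >= self.FRAME_MS)
        if self._dirty and due:
            self._render("".join(self._parts))
            self._dirty = False
            self._last_render_ms = now_ms
```

With tokens at 0ms, 5ms, and 10ms followed by a frame tick at 16ms, this produces two renders instead of three.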

R2 — Cursor

Cursor is visible while the stream is active:

  • Glyph: default or block variant, configurable per preset.
  • Animation: opacity 1.0 → 0.3 → 1.0 with a period of 1000ms.
  • Position: immediately after the last rendered character.
  • Anchor: inline span; respects line wrap.

On done event: the cursor disappears (100ms fade-out via motion.kmd R9 motion-effect-fast).

On error event: the cursor disappears and an error icon appears.

Reduced-motion: static cursor (no blinking); still visible to signal "streaming".
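The blink rule can be expressed as a small pure function. The sketch below assumes a linear (triangle-wave) interpolation between the two opacities, since the spec gives only the endpoints and the period; the 100ms fade-out on done is left out for brevity, and `cursor_opacity` is a hypothetical helper name.

```python
def cursor_opacity(elapsed_ms: float, streaming: bool,
                   reduced_motion: bool = False,
                   period_ms: float = 1000.0,
                   high: float = 1.0, low: float = 0.3) -> float:
    """Opacity of the cursor at a given time since the stream started."""
    if not streaming:
        return 0.0  # cursor is gone once the stream has ended
    if reduced_motion:
        return high  # static cursor, still visible to signal streaming
    phase = (elapsed_ms % period_ms) / period_ms  # position in cycle, 0..1
    # Triangle wave: 0 at the start of the period, 1 at the midpoint.
    tri = 1.0 - abs(2.0 * phase - 1.0)
    return high - (high - low) * tri
```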

R3 — Stop button

Stop button rendered PROMINENTLY:

  • Position: floating action button, right-aligned in the composer area (mobile) OR inline, trailing the streaming bubble (desktop).
  • Visible: ALWAYS during stream (never hidden in a menu).
  • Action: emit SIGINT to the gateway via a WebSocket cancel message; wait 200ms; force-close if no ack arrives.
  • After stop: the cursor is gone; a "Retry" button replaces Stop (cross-link chat-message-bubble.kmd R4 error/stopped state).

Anti-pattern (forbidden): Stop inside an overflow menu. Long streams are frustrating; Stop MUST be 1-tap.
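The Stop action (cancel, wait 200ms for an ack, then force-close) could look roughly like the sketch below. The three coroutines passed in (`send_cancel`, `wait_for_ack`, `force_close`) are placeholders for whatever the WebSocket layer provides; their names and the return values are assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable


async def stop_stream(send_cancel: Callable[[], Awaitable[None]],
                      wait_for_ack: Callable[[], Awaitable[None]],
                      force_close: Callable[[], Awaitable[None]],
                      ack_timeout_s: float = 0.2) -> str:
    """Send the cancel message, then wait briefly for the gateway ack."""
    await send_cancel()
    try:
        await asyncio.wait_for(wait_for_ack(), timeout=ack_timeout_s)
        return "acked"
    except asyncio.TimeoutError:
        # No ack within the grace period: close the connection ourselves.
        await force_close()
        return "forced"
```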

R4 — Defer code block rendering

Markdown source may include fenced code:

Here is the code:
```python
def hello():
    pass
```

Without deferral, every token re-renders the markdown, so the code-block highlighter re-runs per token, which drains CPU and memory and causes flicker.

**Contract**: code blocks (` ``` ` opening fence detected) MUST stay as dimmed placeholder text until the closing fence is detected. ONLY after the closing fence: trigger syntax highlighting + render via [`code-block.kmd`](code-block.kmd).

Placeholder style:

    ┌────────────────────────────────────┐
    │ ```python                          │
    │ ▋ (writing code...)                │
    │                                    │
    └────────────────────────────────────┘

Once the fence closes: snap-render to the final code-block widget.
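To make the deferral contract concrete, here is a minimal sketch that splits the current buffer into prose, finished code blocks, and at most one trailing placeholder for a fence that is still open. `segment_buffer` and the tuple shapes are hypothetical; it only recognizes fences written as three backticks at the start of a line.

```python
from typing import List, Tuple


def segment_buffer(buffer: str) -> List[Tuple[str, ...]]:
    """Return ("text", s), ("code", lang, body) or ("placeholder", lang) items."""
    segments: List[Tuple[str, ...]] = []
    text_lines: List[str] = []
    code_lines: List[str] = []
    lang = ""
    in_fence = False

    def flush_text() -> None:
        text = "\n".join(text_lines)
        if text:
            segments.append(("text", text))
        text_lines.clear()

    for line in buffer.split("\n"):
        if line.startswith("```"):
            if not in_fence:
                flush_text()
                in_fence, lang = True, line[3:].strip()
            else:
                # Closing fence seen: the block may now be highlighted once.
                segments.append(("code", lang, "\n".join(code_lines)))
                code_lines.clear()
                in_fence = False
        elif in_fence:
            code_lines.append(line)
        else:
            text_lines.append(line)

    if in_fence:
        segments.append(("placeholder", lang))  # dimmed, no highlighting yet
    else:
        flush_text()
    return segments
```

A renderer would run the highlighter only for `"code"` segments, so the 50 tokens inside an open fence trigger no highlighting work at all.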

## R5 — Autoscroll with escape

Scroll behavior:

- Default: follow tail (autoscroll on append).
- User scrolls up (>50px from bottom): cancel autoscroll; show "Jump to bottom" floating chip.
- User scrolls back to bottom (<10px): re-engage autoscroll; chip hides.

Scroll velocity: smooth, not a jump; via [`motion.kmd`](../themes/motion.kmd) `motion-spatial-fast` spring.

Multi-bubble streaming (rare; cascade responses): autoscroll follows the LATEST bubble; older bubbles keep their absolute position.
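The two thresholds form a hysteresis band: between 10px and 50px from the bottom the current mode is kept. A sketch of that state machine follows; `AutoscrollController` and its method names are illustrative, and scroll animation is out of scope here.

```python
class AutoscrollController:
    ESCAPE_PX = 50    # scrolling further than this from the bottom cancels follow
    REENGAGE_PX = 10  # coming back closer than this re-engages follow

    def __init__(self) -> None:
        self.following = True

    @property
    def chip_visible(self) -> bool:
        """The "Jump to bottom" chip shows whenever autoscroll is cancelled."""
        return not self.following

    def on_user_scroll(self, distance_from_bottom_px: float) -> None:
        if distance_from_bottom_px > self.ESCAPE_PX:
            self.following = False
        elif distance_from_bottom_px < self.REENGAGE_PX:
            self.following = True
        # Between the two thresholds the previous state is kept.

    def on_append(self, current_offset: float, max_offset: float) -> float:
        """Return the scroll offset to use after new content was appended."""
        return max_offset if self.following else current_offset
```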

## R6 — Retry after stop

After user clicked Stop OR gateway emitted error:

- "Retry" button replaces Stop position.
- Click → re-invoke the gateway with the same context (last user message + history); a new stream starts.
- Buffer is cleared before the restart (never append to a partial buffer).
- Max retries client-side: 3 within 30s (avoids a runaway loop); past the threshold, show "Try again later".
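The retry cap can be sketched as a small limiter. This version assumes a sliding 30-second window over retry timestamps; the spec does not say whether the window is sliding or fixed, so treat that as an interpretation. `RetryLimiter` is a hypothetical name.

```python
from collections import deque
from typing import Deque


class RetryLimiter:
    def __init__(self, max_retries: int = 3, window_s: float = 30.0) -> None:
        self.max_retries = max_retries
        self.window_s = window_s
        self._stamps: Deque[float] = deque()  # times of recent retries

    def try_retry(self, now_s: float) -> bool:
        """Return True if a retry is allowed now, recording it if so."""
        # Drop retries that have aged out of the window.
        while self._stamps and now_s - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_retries:
            return False  # caller shows the "Try again later" message
        self._stamps.append(now_s)
        return True
```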

## R7 — Markdown incremental rendering

Tokens can split markdown constructs in the middle:

`"In **markdown** *italic*"`

Tokens received:
1. `"In "`
2. `"**mark"`  ← still incomplete
3. `"down**"`  ← now `**markdown**` can render
4. `" *italic*"`

Strategy:

- Render markdown UP TO the last complete construct. Partial constructs render as literal text (degrade gracefully).
- Re-evaluate AT NEWLINE OR at every paragraph boundary (whichever is more lenient on performance).
- Constructs deferred as in R4: fenced code, large tables, deeply nested blockquotes.

Library: per-surface (`flutter_markdown` incremental fork; `marked.js`-compatible on Web; etc.). Surfaces MUST pass the T-suite even when they use different libraries.
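A simplified sketch of the "render up to the last complete construct" rule, covering only `**bold**` and `*italic*`: the buffer is split into a prefix that is safe to pass to the markdown renderer and a tail that should be shown as literal text for now. `split_stable` is a hypothetical helper; nested emphasis and other constructs are deliberately ignored.

```python
from typing import Optional, Tuple


def split_stable(buffer: str) -> Tuple[str, str]:
    """Return (markdown_prefix, literal_tail) for the current buffer."""
    open_delim: Optional[Tuple[str, int]] = None  # (delimiter, start index)
    i = 0
    while i < len(buffer):
        if buffer.startswith("**", i):
            delim = "**"
        elif buffer[i] == "*":
            delim = "*"
        else:
            i += 1
            continue
        if open_delim is None:
            open_delim = (delim, i)      # emphasis opened here
        elif open_delim[0] == delim:
            open_delim = None            # matching closer found
        i += len(delim)
    if open_delim is None:
        return buffer, ""
    cut = open_delim[1]
    return buffer[:cut], buffer[cut:]    # tail starts at the unclosed opener
```

Feeding the four tokens from the example above, the second step yields the prefix `"In "` with `"**mark"` held back as literal text, and the third step releases the whole string for rendering.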

## R8 — Surface bindings

| Surface | API |
|---|---|
| Flutter | `KoderStreamingText({required stream, onStop, onRetry})` in `koder_kit/lib/src/ai/streaming_text.dart` |
| Web | `<koder-streaming-text source="event-source://...">` in `koder_web_kit` |
| Compose Android | `KoderStreamingText` in `koder-design-compose` (future) |
| SwiftUI iOS | same, in `koder-design-swift` (future) |
| CLI / TUI | Print incremental (`io.Writer.Write`); Stop via Ctrl+C; no cursor (terminal cursor já existe) |

API: `Stream<Token>` input → `Widget`/`Element` output + callbacks `onStop()` + `onRetry()` + `onDone(fullText)`.
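For the CLI / TUI row, the same contract (token stream in, callbacks out) can be approximated in a few lines. The sketch below is an assumption about how such a binding might behave, written in Python rather than against `io.Writer`: tokens are written as they arrive, Ctrl+C maps to `on_stop`, and `on_done` receives the full text. `stream_to_terminal` is a hypothetical name.

```python
import sys
from typing import Callable, Iterable, List, Optional, TextIO


def stream_to_terminal(tokens: Iterable[str],
                       out: TextIO = sys.stdout,
                       on_done: Optional[Callable[[str], None]] = None,
                       on_stop: Optional[Callable[[], None]] = None) -> str:
    """Print tokens incrementally; return whatever text was written."""
    parts: List[str] = []
    try:
        for token in tokens:
            parts.append(token)
            out.write(token)
            out.flush()  # show each token immediately, no line buffering
    except KeyboardInterrupt:
        # Ctrl+C plays the role of the Stop button on this surface.
        if on_stop is not None:
            on_stop()
        return "".join(parts)
    full_text = "".join(parts)
    if on_done is not None:
        on_done(full_text)
    return full_text
```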

## R9 — Acessibilidade

- Container: `aria-live="polite"` during the stream (announces increments without flooding).
- Cursor: `aria-hidden="true"`.
- Stop button: `aria-label="Stop generating"` (i18n).
- After done: announce "Done" once.
- After error: announce error description.
- Reduced-motion: static cursor; autoscroll is instant (not smooth).
- Touch: Stop target ≥48dp.

## R10 — i18n

| Key | en-US | pt-BR |
|---|---|---|
| `ai.streaming.stop` | "Stop generating" | "Parar geração" |
| `ai.streaming.retry` | "Retry" | "Tentar novamente" |
| `ai.streaming.jump_to_bottom` | "Jump to bottom" | "Ir pro fim" |
| `ai.streaming.code_placeholder` | "(writing code…)" | "(escrevendo código…)" |
| `ai.streaming.error_max_retries` | "Try again later" | "Tente novamente mais tarde" |

## T-suite

- **T1** Render tokens: stream 100 tokens → all visible incrementally; final text correct.
- **T2** Cursor visible: assert the cursor element is present during the stream.
- **T3** Cursor removed on done: emit `done` event → cursor disappears within 100ms.
- **T4** Stop button: tap stop → emit cancel; assert stream halted; Retry button visible.
- **T5** Retry: tap retry → new stream starts; buffer cleared.
- **T6** Defer code block: emit fence-open + 50 tokens + fence-close → code-block rendered ONCE (not per-token); assert no flicker via animation frame audit.
- **T7** Autoscroll engage: scroll bottom → tokens append → scroll stays at bottom.
- **T8** Autoscroll escape: scroll up 100px → tokens append → scroll position unchanged; "Jump to bottom" chip visible.
- **T9** Reduced-motion: enable prefers-reduced-motion → cursor static; autoscroll instant.
- **T10** A11y: screen reader announces partial text periodically (not per token); announces "Done" at end.
- **N1** Max retries: 3 retries within 30s → the 4th retry shows the max-retries message.
- **N2** Markdown partial: tokens split `**bold**` in the middle → no flicker; final render correct.

## Cross-link

- Companion: [`chat-message-bubble.kmd`](chat-message-bubble.kmd) (host), [`code-block.kmd`](code-block.kmd) (defer R4)
- Motion: [`themes/motion.kmd`](../themes/motion.kmd) R6 reduced-motion + R9 springs
- Backend: `services/ai/chat-adapter/` (SSE wire), `services/ai/gateway/` (provider)
