Wire/API compatibility — Koder Flow vs upstream Forgejo


Normative contract for the wire/API surfaces Koder Flow promises to keep byte- or semantically-equivalent with the upstream Forgejo it forks. Flow ships the same wire as Forgejo but never CI-proves it (koder.toml gates_pending=cross_impl_tests); every Koder-side delta therefore risks silently diverging from upstream and breaking external tools (`gh`, Forgejo/Gitea clients, repo mirroring, package clients) that depend on shared semantics. This spec pins WHICH surfaces are contractual, what "equivalent" means per surface, and the replay-diff harness that gates the promise in CI.

When this spec applies

Primary triggers

Change a Flow handler under routers/api/**
Add a Koder-specific field/endpoint to the Flow API

All triggers

Change a Koder Flow router/handler under routers/api/**
Modify a Forgejo-inherited API response shape or status code
Add a Koder-specific field/endpoint to the Flow API surface
Decide whether a Flow change is wire-breaking vs upstream
Wire the cross-impl compat CI job (flow-wire-compat.yml)

Spec — Koder Flow ↔ Forgejo wire/API compatibility

Koder Flow is a Forgejo derivative (GPL-3.0-or-later, see koder.toml). It inherits Forgejo's HTTP API, git wire protocol, and package-registry HTTP surfaces, and ships Koder-specific additions on top (Koder ID OAuth consumer, branding, a handful of routers). The value of staying a compatible fork is that the whole external ecosystem — gh, Forgejo/Gitea API clients, git itself, mirroring tools, helm/docker/npm/cargo package clients — keeps working unchanged against flow.koder.dev.

That compatibility is currently asserted but never proven. This spec defines the contractual surface, the equivalence rules per surface, and the replay-diff harness that turns the promise into a CI gate (closing koder.toml gates_pending=cross_impl_tests).

Scope

This spec governs only the surfaces below. Koder-specific additions (Koder ID OAuth, custom templates, branding) are explicitly out of contract — they may diverge from Forgejo freely, but they MUST NOT alter the shape of an in-contract surface for a request that does not opt into them.

R1 — Contractual surfaces

The following surfaces are contractual — Flow promises equivalence (per the rules in R2) with the upstream Forgejo release pinned in R5:

ID	Surface	Equivalence class (R2)
S1	`GET /api/v1/version`	semantic (version string differs; shape pinned)
S2	`/api/v1/repos/**` (read: get, list, contents, commits, branches, tags)	semantic
S3	`/api/v1/repos/{o}/{r}/issues/**` (list, get, comments)	semantic
S4	`/api/v1/repos/{o}/{r}/pulls/**` (list, get, files, commits)	semantic
S5	`/api/v1/orgs/`, `/api/v1/users/` (public read)	semantic
S6	`/api/v1/swagger.v1.json` (the OpenAPI document)	structural (paths+verbs+schemas; descriptions may differ)
S7	git smart-HTTP wire (`/{o}/{r}.git/info/refs?service=…`, upload-pack/receive-pack capability advertisement)	byte-equivalent on capability lines
S8	package registry HTTP APIs (`/api/packages/**`: at minimum the registries Flow advertises — see R1.1)	semantic

R1.1 — Surface enumeration is data, not prose

The authoritative machine-readable list of contractual endpoints lives at products/dev/flow/engine/tests/wire-compat/contract-surfaces.json (committed alongside the harness). This table is the human summary; the JSON is what the harness iterates. A new contractual endpoint is added by editing the JSON and this table in the same PR.
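As an illustration of "data, not prose", a loader for the surface list could look roughly like the sketch below. The JSON layout (an array of objects with `id`, `path`, and `class` keys) is an assumption made for this example; the spec does not define the schema of contract-surfaces.json, and `Surface` and `parseSurfaces` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Surface is a hypothetical shape for one entry in contract-surfaces.json.
// The JSON key names are assumptions; only the concepts (surface ID, path,
// equivalence class) come from the R1 table.
type Surface struct {
	ID    string `json:"id"`
	Path  string `json:"path"`
	Class string `json:"class"`
}

// validClasses holds the three equivalence classes named in R2.
var validClasses = map[string]bool{
	"byte-equivalent": true,
	"structural":      true,
	"semantic":        true,
}

// parseSurfaces decodes the surface list and rejects missing or duplicate
// IDs and classes that R2 does not define.
func parseSurfaces(data []byte) ([]Surface, error) {
	var surfaces []Surface
	if err := json.Unmarshal(data, &surfaces); err != nil {
		return nil, fmt.Errorf("decode surfaces: %w", err)
	}
	seen := make(map[string]bool)
	for _, s := range surfaces {
		if s.ID == "" || seen[s.ID] {
			return nil, fmt.Errorf("missing or duplicate surface id %q", s.ID)
		}
		if !validClasses[s.Class] {
			return nil, fmt.Errorf("surface %s: unknown class %q", s.ID, s.Class)
		}
		seen[s.ID] = true
	}
	return surfaces, nil
}

func main() {
	raw := []byte(`[
	  {"id": "S1", "path": "/api/v1/version", "class": "semantic"},
	  {"id": "S6", "path": "/api/v1/swagger.v1.json", "class": "structural"}
	]`)
	surfaces, err := parseSurfaces(raw)
	if err != nil {
		panic(err)
	}
	for _, s := range surfaces {
		fmt.Printf("%s %s (%s)\n", s.ID, s.Path, s.Class)
	}
}
```

Validating the class at load time means a typo in the JSON fails early instead of silently selecting no comparator.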

R2 — Equivalence classes

Each contractual surface declares one equivalence class. The harness (R4) enforces the matching comparator:

byte-equivalent: the two responses must be identical after stripping a fixed allowlist of volatile headers and tokens (Date, Set-Cookie, request-id, Server, the version token). Used for the git capability advertisement (S7), where wire bytes matter.
structural: JSON is compared by shape, meaning the same set of keys at every level, the same types, the same enum domains, and the same array element shapes. Values that are inherently instance-specific (ids, timestamps, URLs containing the host, the version string) are normalized to a placeholder before the diff. Used for the OpenAPI document (S6), and as the base comparison underneath the semantic class.
semantic: the structural comparison plus a per-surface allowlist of intended Koder divergences (e.g. an added koder_* field, an extra link relation). Any divergence NOT on the allowlist fails the diff. The allowlist is the audit trail of "things we changed on purpose"; an empty allowlist means the surface must match Forgejo's shape exactly. Used for the JSON APIs (S1 to S5, S8).
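The structural class can be sketched as a recursive shape walk over two decoded JSON documents. This is a minimal illustration under stated assumptions, not the harness's comparator: `shapeDiff` and `compareJSON` are hypothetical names, scalar values are ignored entirely (a blunt stand-in for placeholder normalization), enum domains are not checked, and only the first array element is used as the element shape.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// shapeDiff walks a reference and a candidate JSON value in parallel and
// returns the paths where their shapes differ. Scalar values are never
// compared, only their JSON types.
func shapeDiff(path string, ref, cand any) []string {
	switch r := ref.(type) {
	case map[string]any:
		c, ok := cand.(map[string]any)
		if !ok {
			return []string{path + ": type mismatch"}
		}
		keySet := make(map[string]bool)
		for k := range r {
			keySet[k] = true
		}
		for k := range c {
			keySet[k] = true
		}
		keys := make([]string, 0, len(keySet))
		for k := range keySet {
			keys = append(keys, k)
		}
		sort.Strings(keys) // deterministic report order
		var out []string
		for _, k := range keys {
			rv, inRef := r[k]
			cv, inCand := c[k]
			child := path + "." + k
			switch {
			case !inRef:
				out = append(out, child+": only in candidate")
			case !inCand:
				out = append(out, child+": only in reference")
			default:
				out = append(out, shapeDiff(child, rv, cv)...)
			}
		}
		return out
	case []any:
		c, ok := cand.([]any)
		if !ok {
			return []string{path + ": type mismatch"}
		}
		if len(r) == 0 || len(c) == 0 {
			return nil // no element available to compare
		}
		// Simplification: the first element stands in for the element shape.
		return shapeDiff(path+"[]", r[0], c[0])
	default:
		if fmt.Sprintf("%T", ref) != fmt.Sprintf("%T", cand) {
			return []string{path + ": type mismatch"}
		}
		return nil
	}
}

// compareJSON decodes both bodies and reports shape differences from "$".
func compareJSON(refBody, candBody []byte) ([]string, error) {
	var ref, cand any
	if err := json.Unmarshal(refBody, &ref); err != nil {
		return nil, fmt.Errorf("reference body: %w", err)
	}
	if err := json.Unmarshal(candBody, &cand); err != nil {
		return nil, fmt.Errorf("candidate body: %w", err)
	}
	return shapeDiff("$", ref, cand), nil
}

func main() {
	ref := []byte(`{"id": 1, "name": "repo", "owner": {"login": "a"}}`)
	cand := []byte(`{"id": 42, "name": "other", "owner": {"login": "b"}, "koder_tier": "pro"}`)
	diffs, err := compareJSON(ref, cand)
	if err != nil {
		panic(err)
	}
	for _, d := range diffs {
		fmt.Println(d) // reports the extra koder_tier key; differing ids and names are ignored
	}
}
```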

The per-surface allowlists live in tests/wire-compat/divergence-allowlist.json. Adding an entry is a deliberate, reviewable act — it is how a Koder-side API change is ratified as non-breaking.
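A minimal sketch of how an allowlist could gate the semantic class: given the divergences a comparator found for one surface, only those not ratified for that surface remain as failures. The map-of-paths layout and the `$.field` path notation are assumptions for illustration; the spec does not define the schema of divergence-allowlist.json, and `Allowlist` and `unallowlisted` are hypothetical names.

```go
package main

import "fmt"

// Allowlist maps a surface ID to the divergence paths ratified for it.
// This layout is an assumption about divergence-allowlist.json.
type Allowlist map[string][]string

// unallowlisted returns the divergences for a surface that have not been
// ratified. An empty result means the surface passes.
func unallowlisted(al Allowlist, surface string, divergences []string) []string {
	allowed := make(map[string]bool, len(al[surface]))
	for _, p := range al[surface] {
		allowed[p] = true
	}
	var out []string
	for _, d := range divergences {
		if !allowed[d] {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	al := Allowlist{"S2": {"$.koder_tier"}}
	found := []string{"$.koder_tier", "$.default_branch"}

	// On S2 the koder_tier divergence is ratified, so only the other one fails.
	fmt.Println(unallowlisted(al, "S2", found))

	// S3 has no entries, so both divergences fail there.
	fmt.Println(unallowlisted(al, "S3", found))
}
```

Scoping entries per surface keeps a divergence ratified for one endpoint from silently excusing the same field elsewhere.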

R3 — Canonical workload

Compatibility is proven by replaying one canonical workload — a recorded, deterministic transcript of requests — against both a Flow instance and a stock Forgejo instance seeded identically, then diffing the responses per R2.

The workload transcript lives at tests/wire-compat/canonical-workload.jsonl (one request per line: method, path, headers, body, the surface ID it exercises, and the equivalence class). It MUST:

Be seed-deterministic — runnable against a freshly seeded instance (the harness seeds both sides with the same fixtures: one org, one user, one repo with 3 commits / 1 issue / 1 PR / 1 package). No dependence on wall-clock, random ids, or live data.
Cover every surface in R1 at least once.
Be append-only in spirit — new endpoints add lines; existing lines are edited only when upstream Forgejo itself changes the contract (bump R5 in the same PR).
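For illustration, a parser for the transcript might look like the sketch below. The JSON key names (`method`, `path`, `headers`, `body`, `surface`, `class`) are assumptions derived from the field list above, not a confirmed schema, and `WorkloadLine` and `parseWorkload` are hypothetical names.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// WorkloadLine mirrors the fields the spec lists for one transcript line.
// The JSON key names are assumptions.
type WorkloadLine struct {
	Method  string            `json:"method"`
	Path    string            `json:"path"`
	Headers map[string]string `json:"headers"`
	Body    string            `json:"body"`
	Surface string            `json:"surface"`
	Class   string            `json:"class"`
}

// parseWorkload decodes a JSONL transcript, one request per line, skipping
// blank lines and reporting the 1-based line number of the first bad line.
func parseWorkload(jsonl string) ([]WorkloadLine, error) {
	var lines []WorkloadLine
	sc := bufio.NewScanner(strings.NewReader(jsonl))
	n := 0
	for sc.Scan() {
		n++
		text := strings.TrimSpace(sc.Text())
		if text == "" {
			continue
		}
		var wl WorkloadLine
		if err := json.Unmarshal([]byte(text), &wl); err != nil {
			return nil, fmt.Errorf("line %d: %w", n, err)
		}
		if wl.Method == "" || wl.Path == "" || wl.Surface == "" {
			return nil, fmt.Errorf("line %d: method, path, and surface are required", n)
		}
		lines = append(lines, wl)
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	return lines, nil
}

func main() {
	transcript := `{"method":"GET","path":"/api/v1/version","surface":"S1","class":"semantic"}
{"method":"GET","path":"/api/v1/swagger.v1.json","surface":"S6","class":"structural"}`
	lines, err := parseWorkload(transcript)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Printf("%s %s exercises %s as %s\n", l.Method, l.Path, l.Surface, l.Class)
	}
}
```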

R4 — Replay-diff harness

tests/wire-compat/ ships a Go harness (wire_compat_test.go, build tag wire_compat) that:

Stands up Flow and stock Forgejo side by side (containers or two local servers on distinct ports), each seeded with the R3 fixtures.
Replays canonical-workload.jsonl against both.
For each request, applies the R2 comparator for its declared equivalence class, consulting the R2 allowlists.
Fails with a readable per-surface diff on any unallowlisted divergence; passes when every line matches under its class.
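The replay step can be sketched with two in-process stub servers standing in for the seeded Flow and Forgejo instances. Everything here is an assumption for illustration: the real harness would target containers or local servers, and `Exchange`, `replay`, `replayBoth`, and `stubServer` are hypothetical names. Body comparison is deliberately left out, since that is the job of the R2 comparators.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// Exchange holds what a comparator needs from one side of a replay.
type Exchange struct {
	Status int
	Body   []byte
}

// replay sends one bodiless request to a base URL and captures the response.
func replay(baseURL, method, path string) (Exchange, error) {
	req, err := http.NewRequest(method, baseURL+path, nil)
	if err != nil {
		return Exchange{}, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return Exchange{}, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return Exchange{}, err
	}
	return Exchange{Status: resp.StatusCode, Body: body}, nil
}

// replayBoth replays the same request against both sides and returns the
// two exchanges for the class-specific comparator to diff.
func replayBoth(flowURL, refURL, method, path string) (flow, ref Exchange, err error) {
	if flow, err = replay(flowURL, method, path); err != nil {
		return flow, ref, fmt.Errorf("flow side: %w", err)
	}
	if ref, err = replay(refURL, method, path); err != nil {
		return flow, ref, fmt.Errorf("reference side: %w", err)
	}
	return flow, ref, nil
}

// stubServer stands in for a seeded instance in this sketch.
func stubServer(status int, body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(status)
		io.WriteString(w, body)
	}))
}

func main() {
	flow := stubServer(http.StatusOK, `{"version":"flow-build"}`)
	defer flow.Close()
	ref := stubServer(http.StatusOK, `{"version":"11.0.0"}`)
	defer ref.Close()

	f, r, err := replayBoth(flow.URL, ref.URL, http.MethodGet, "/api/v1/version")
	if err != nil {
		panic(err)
	}
	fmt.Printf("flow=%d reference=%d same status=%v\n", f.Status, r.Status, f.Status == r.Status)
}
```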

The harness is self-skipping when the Forgejo reference image / the wire_compat tag is absent, so it never breaks the default test run — it is opt-in via make test-wire-compat and the CI job in R6.

R5 — Upstream reference pin

The Forgejo release Flow promises compatibility with is pinned in tests/wire-compat/forgejo-ref.txt (a single version tag, e.g. v11.0.0). Bumping the fork's upstream base is the only reason to edit the canonical workload's existing lines; the bump and the workload edit land together. The pin is recorded in registries/self-hosted-pairs.md next to the Flow row.

R6 — CI gate

.gitea/workflows/flow-wire-compat.yml runs the harness on the self-hosted runner. Until the GUI/container-capable runner and the Forgejo reference image are provisioned (a dependency shared with FLOW-181's flow-e2e.yml and the infra/ci#001 runner-tooling gap), the job is workflow_dispatch-only and its run step may skip cleanly when the reference image is absent. It exists as a wired stub so that the gate becomes real the moment the runner lands. Promoting the job to on: push and dropping the skip is what clears koder.toml gates_pending=cross_impl_tests.

Test obligations

A component implementing this spec MUST ship:

T1: the canonical workload (canonical-workload.jsonl) parses and references only surfaces present in contract-surfaces.json.
T2: every surface in contract-surfaces.json is exercised by at least one workload line (coverage check — runnable WITHOUT a Forgejo reference, so it gates in the default suite).
T3: each comparator (byte / structural / semantic) round-trips a golden pair correctly (matching pair passes, mutated pair fails).
T4: a divergence not on the allowlist fails; the same divergence added to the allowlist passes (proves the allowlist is load-bearing).
T5: the harness self-skips (does not error) when the Forgejo reference image is absent.

T1–T4 run without a live Forgejo (they exercise the parser, the coverage check, and the comparators against golden fixtures), so they gate in the default suite today. T5 + the full replay are the opt-in half that lands with the runner.
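The T1 and T2 checks need only the two data files, which is why they can gate without a Forgejo reference. A minimal sketch, assuming the surface IDs have already been extracted from contract-surfaces.json and from the workload lines (`checkCoverage` is a hypothetical name):

```go
package main

import (
	"fmt"
	"sort"
)

// checkCoverage compares the contract's surface IDs with the surface IDs
// referenced by workload lines. It returns workload references to surfaces
// outside the contract (a T1 failure) and contractual surfaces that no
// workload line exercises (a T2 failure), both sorted and de-duplicated.
func checkCoverage(contract, workloadSurfaces []string) (unknown, uncovered []string) {
	inContract := make(map[string]bool, len(contract))
	for _, id := range contract {
		inContract[id] = true
	}
	exercised := make(map[string]bool)
	unknownSet := make(map[string]bool)
	for _, id := range workloadSurfaces {
		if inContract[id] {
			exercised[id] = true
		} else {
			unknownSet[id] = true
		}
	}
	for id := range unknownSet {
		unknown = append(unknown, id)
	}
	for id := range inContract {
		if !exercised[id] {
			uncovered = append(uncovered, id)
		}
	}
	sort.Strings(unknown)
	sort.Strings(uncovered)
	return unknown, uncovered
}

func main() {
	contract := []string{"S1", "S2", "S6", "S7"}
	workload := []string{"S1", "S6", "S6", "S9"} // S9 is not contractual

	unknown, uncovered := checkCoverage(contract, workload)
	fmt.Println("unknown surfaces in workload:", unknown)
	fmt.Println("contract surfaces never exercised:", uncovered)
}
```

Both result lists must be empty for the default suite to pass under this sketch.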

Anti-patterns

❌ Asserting "we ship the same wire" in prose with no replay proof. This spec exists precisely because that claim was untested.
❌ Adding a koder_* field to an in-contract JSON response without an entry in divergence-allowlist.json. Undocumented divergence is a silent break.
❌ Editing existing canonical-workload.jsonl lines for a Koder change. New behavior adds lines + an allowlist entry; only an upstream bump (R5) rewrites existing lines.
❌ Letting the harness hard-fail when Forgejo is absent. It MUST self-skip so it never blocks the default run (R4/T5).

Out of scope

gRPC / GraphQL — Flow exposes neither as a contractual surface.
Web UI HTML — the rendered HTML is Koder-branded and explicitly out of contract (the API is the contract, not the chrome).
Performance parity — that is G2 / gates_pending=performance, tracked separately in FLOW-184 + perf-baseline/forgejo-headtohead.
Koder ID OAuth, custom templates, branding — Koder additions, out of contract by R1 (must not alter in-contract surfaces, but free to diverge themselves).

Maturity

v0.1 Draft — defines the contract + harness shape for closing koder.toml gates_pending=cross_impl_tests. The spec doc, the machine-readable surface/allowlist tables, the canonical workload seed, and the CI stub land first (FLOW-185). Promote to v1.0 Ratified when the replay harness runs green against a pinned Forgejo reference on the provisioned runner and the gate flips off in koder.toml.

References

products/dev/flow/engine/koder.toml
meta/docs/stack/policies/self-hosted-first.kmd
meta/docs/stack/registries/self-hosted-pairs.md
products/dev/flow/engine/tests/wire-compat/