Multi-tenancy contract
multi-tenancy specs/multi-tenancy/contract.kmd
Specification body
Spec — Multi-tenancy contract
Concrete mechanisms for implementing
policies/multi-tenant-by-default.kmd. This spec is normative: every
Koder Stack module that stores user data must pass checks
T1–T9 below.
Identity model
koder_user_id BIGINT NOT NULL -- FK to services/foundation/id.user(id)
workspace_id BIGINT -- nullable; FK to id.workspace(id)
koder_user_id is part of the canonical composite PK on every table with PII.
workspace_id widens the scope: NULL means "personal";
non-null means "owned by the workspace; all members can see it via
membership".
Canonical membership table (Koder ID):
CREATE TABLE workspace_member (
workspace_id BIGINT NOT NULL,
koder_user_id BIGINT NOT NULL,
role TEXT NOT NULL, -- 'owner' | 'admin' | 'member' | 'viewer'
joined_unix BIGINT NOT NULL,
PRIMARY KEY (workspace_id, koder_user_id)
);
Every cross-workspace query goes through this table. Do not cache
membership client-side: it is on the auth hot path and lives in
services/foundation/id with a 60s server-side cache.
PAT scope grammar
A PAT (Personal Access Token) issued by Koder ID carries a scope. Canonical syntax (inherited from Flow RFC-003 credentials/backups):
<verb>:<resource>[:<modifier>]
verbs: read | write | admin
resources: user | workspace | repo | credentials | backups | …
modifier: optional, e.g. "self" or "<id>"
Examples:
- read:user → read own profile
- write:credentials → write credentials within the scope the PAT inherited
- read:workspace:<id> → read a specific workspace's data
- admin:user → privileged self-management
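The grammar above is small enough to parse with one split. A minimal sketch; `Scope` and `ParseScope` are illustrative names, not a confirmed Koder ID API:

```go
package main

import (
	"fmt"
	"strings"
)

// Scope mirrors <verb>:<resource>[:<modifier>].
type Scope struct {
	Verb     string // read | write | admin
	Resource string // user | workspace | repo | credentials | backups | ...
	Modifier string // optional: "self" or an id; empty if absent
}

var validVerbs = map[string]bool{"read": true, "write": true, "admin": true}

func ParseScope(s string) (Scope, error) {
	parts := strings.SplitN(s, ":", 3)
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return Scope{}, fmt.Errorf("scope %q: want <verb>:<resource>[:<modifier>]", s)
	}
	if !validVerbs[parts[0]] {
		return Scope{}, fmt.Errorf("scope %q: unknown verb %q", s, parts[0])
	}
	sc := Scope{Verb: parts[0], Resource: parts[1]}
	if len(parts) == 3 {
		sc.Modifier = parts[2]
	}
	return sc, nil
}

func main() {
	sc, _ := ParseScope("read:workspace:1337")
	fmt.Printf("%+v\n", sc) // {Verb:read Resource:workspace Modifier:1337}
}
```

SplitN with a limit of 3 keeps any colons inside the modifier intact, which matters if ids ever contain `:`.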
A PAT is scoped to a single koder_user_id (its owner). Workspace
access is resolved via workspace_member at request time, not
via PAT scope. PATs never cross tenants.
RLS template (Postgres / kdb-next)
Every table with PII gets RLS. Helper migration:
-- 1. Schema com tenant fields
CREATE TABLE my_resource (
id BIGSERIAL,
koder_user_id BIGINT NOT NULL REFERENCES koder_id.user(id),
workspace_id BIGINT REFERENCES koder_id.workspace(id),
payload JSONB NOT NULL,
created_unix BIGINT NOT NULL DEFAULT extract(epoch from now()),
PRIMARY KEY (koder_user_id, id)
);
-- 2. Index on tenant + recent
CREATE INDEX ix_my_resource_user_recent
ON my_resource (koder_user_id, created_unix DESC);
-- 3. RLS enable + policy
ALTER TABLE my_resource ENABLE ROW LEVEL SECURITY;
CREATE POLICY p_owner ON my_resource
USING (koder_user_id = current_setting('koder.uid')::BIGINT);
CREATE POLICY p_workspace_member ON my_resource
USING (workspace_id IS NOT NULL
AND EXISTS (
SELECT 1 FROM koder_id.workspace_member m
WHERE m.workspace_id = my_resource.workspace_id
AND m.koder_user_id = current_setting('koder.uid')::BIGINT
));
Connection setup (per request):
// SET does not accept bind parameters; use set_config(name, value, is_local).
// is_local=true scopes the setting to the current transaction, so run inside one.
tx.Exec(ctx, "SELECT set_config('koder.uid', $1, true)", fmt.Sprint(auth.UserID))
// queries on this tx thereafter are RLS-filtered automatically
Admin bypass path (rare): RESET koder.uid is a privilege of the
koder_admin role only. An audit log entry is mandatory for any reset.
KV / cache template (Redis-style)
Every key carries a tenant prefix:
<namespace>:<tenant-key>:<resource-key>
examples:
rate_limit:user:<uid>:5h_window → counter
session:user:<uid>:<session_id> → JSON
presence:workspace:<wid>:<uid> → boolean
Helper:
import "fmt"
import "strings"

func TenantKey(uid int64, parts ...string) string {
	return fmt.Sprintf("user:%d:%s", uid, strings.Join(parts, ":"))
}
A key without the tenant prefix is a critical bug (cross-tenant leak via the cache).
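The helper composes with the namespace examples above like this. `WorkspaceKey` is a hypothetical companion for the `workspace:`-prefixed pattern, not part of the spec:

```go
package main

import (
	"fmt"
	"strings"
)

// TenantKey is the user-scoped helper from the spec.
func TenantKey(uid int64, parts ...string) string {
	return fmt.Sprintf("user:%d:%s", uid, strings.Join(parts, ":"))
}

// WorkspaceKey (hypothetical) covers keys like presence:workspace:<wid>:<uid>.
func WorkspaceKey(wid int64, parts ...string) string {
	return fmt.Sprintf("workspace:%d:%s", wid, strings.Join(parts, ":"))
}

func main() {
	// rate_limit:user:<uid>:5h_window from the examples above.
	fmt.Println("rate_limit:" + TenantKey(42, "5h_window")) // rate_limit:user:42:5h_window
	// presence:workspace:<wid>:<uid>
	fmt.Println("presence:" + WorkspaceKey(7, "42")) // presence:workspace:7:42
}
```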
S3 / object storage template
Path:
<bucket>/<koder_user_id>/<workspace_id|"personal">/<resource_id>/<file>
IAM / signed-URL: per-request, restricted to the tenant prefix.
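A sketch of the path template, with `nil` workspace mapping to the literal "personal" segment; `ObjectPath` is an illustrative name, not an existing helper:

```go
package main

import "fmt"

// ObjectPath builds <bucket>/<koder_user_id>/<workspace_id|"personal">/<resource_id>/<file>.
func ObjectPath(bucket string, userID int64, workspaceID *int64, resourceID int64, file string) string {
	ws := "personal"
	if workspaceID != nil {
		ws = fmt.Sprintf("%d", *workspaceID)
	}
	return fmt.Sprintf("%s/%d/%s/%d/%s", bucket, userID, ws, resourceID, file)
}

func main() {
	fmt.Println(ObjectPath("blobs", 42, nil, 7, "report.pdf")) // blobs/42/personal/7/report.pdf
}
```

Because the koder_user_id segment comes first, a signed URL restricted to the `<bucket>/<koder_user_id>/` prefix can never reach another tenant's objects.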
Test contract — T1..T9
Every multi-tenant module ships a suite covering:
| ID | Test | Description |
|---|---|---|
| T1 | Auth required | GET /resource sem PAT → 401 |
| T2 | Self read | A's PAT, GET /my-resource → A's data only |
| T3 | Cross-tenant read denied | A's PAT, GET /resource/<B's id> → 404 (not 403) |
| T4 | Cross-tenant write denied | A's PAT, POST /resource setting koder_user_id=B → 400 or silent override to A |
| T5 | Workspace member read | A in workspace W, GET /resource?workspace=W → all members' data |
| T6 | Workspace non-member read | A not in W, GET /resource?workspace=W → 404 |
| T7 | RLS isolation | Direct DB query without SET LOCAL koder.uid → returns nothing (or error) |
| T8 | Index efficiency | EXPLAIN of A's read uses tenant index, not seq scan |
| T9 | Tenant deletion | When user A is deleted, all WHERE koder_user_id = A rows are removed within retention window |
Each implementation ships with tests/multi-tenant/T1..T9_test.go (or
equivalent). Audit: a PR without T1..T9 green is blocked from merging (see
policies/regression-tests.kmd co-enforcement).
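The shape of T1–T3 can be shown against a self-contained toy handler. The in-memory `owners` map and the `X-Debug-User` header are stand-ins for the real store and PAT auth; only the assertion shape (404, never 403, for cross-tenant reads) is the point:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// owners maps resource id → owning tenant (stand-in for the DB).
var owners = map[string]string{"res-1": "alice", "res-2": "bob"}

func resourceHandler(w http.ResponseWriter, r *http.Request) {
	caller := r.Header.Get("X-Debug-User") // stand-in for PAT auth
	if caller == "" {
		http.Error(w, `{"error": "auth required"}`, http.StatusUnauthorized) // T1
		return
	}
	id := strings.TrimPrefix(r.URL.Path, "/resource/")
	owner, exists := owners[id]
	if !exists || owner != caller {
		// T3: missing and cross-tenant are indistinguishable — 404, never 403.
		http.Error(w, `{"error": "not found"}`, http.StatusNotFound)
		return
	}
	w.Write([]byte(`{"ok": true}`)) // T2: owner reads own data
}

// statusFor drives the handler through httptest and returns the HTTP status.
func statusFor(user, id string) int {
	req := httptest.NewRequest("GET", "/resource/"+id, nil)
	if user != "" {
		req.Header.Set("X-Debug-User", user)
	}
	rec := httptest.NewRecorder()
	resourceHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("alice", "res-2")) // 404
}
```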
Error model
| Scenario | HTTP | gRPC | Body |
|---|---|---|---|
| No auth | 401 | UNAUTHENTICATED | {"error": "auth required"} |
| Invalid token | 401 | UNAUTHENTICATED | {"error": "invalid token"} |
| Resource missing OR belongs to another tenant | 404 | NOT_FOUND | {"error": "not found"} |
| Resource exists but role is insufficient (workspace member without write) | 403 | PERMISSION_DENIED | {"error": "insufficient role"} |
| Bad input | 400 | INVALID_ARGUMENT | {"error": "<details>"} |
| Server error | 500 | INTERNAL | {"error": "internal"} |
Critical: return 404, not 403, in cross-tenant cases. A 403 leaks existence ("this id exists but you may not read it", so the attacker learns it exists).
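Keeping the table in one mapping makes it hard for a handler to drift from it. A sketch; `ErrKind`, `wire`, and `errTable` are illustrative names, not an existing Koder package:

```go
package main

import "fmt"

// ErrKind enumerates the scenarios in the table above.
type ErrKind int

const (
	ErrNoAuth ErrKind = iota
	ErrBadToken
	ErrNotFoundOrForeignTenant // missing OR another tenant's: same answer
	ErrInsufficientRole
	ErrBadInput
	ErrInternal
)

type wire struct {
	HTTP int
	GRPC string
	Body string
}

var errTable = map[ErrKind]wire{
	ErrNoAuth:                  {401, "UNAUTHENTICATED", `{"error": "auth required"}`},
	ErrBadToken:                {401, "UNAUTHENTICATED", `{"error": "invalid token"}`},
	ErrNotFoundOrForeignTenant: {404, "NOT_FOUND", `{"error": "not found"}`},
	ErrInsufficientRole:        {403, "PERMISSION_DENIED", `{"error": "insufficient role"}`},
	ErrBadInput:                {400, "INVALID_ARGUMENT", `{"error": "<details>"}`},
	ErrInternal:                {500, "INTERNAL", `{"error": "internal"}`},
}

func main() {
	fmt.Println(errTable[ErrNotFoundOrForeignTenant].HTTP) // 404
}
```

One `ErrKind` deliberately covers both "does not exist" and "exists in another tenant", so the 404-not-403 rule is enforced by construction.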
Audit log
Every mutating operation that touches PII writes an audit row:
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
actor_user_id BIGINT NOT NULL, -- the PAT owner
target_user_id BIGINT, -- tenant being acted on (often = actor)
action TEXT NOT NULL, -- 'create' | 'update' | 'delete' | 'read_admin'
resource TEXT NOT NULL, -- 'credentials' | 'usage' | …
resource_id BIGINT,
payload JSONB,
created_unix BIGINT NOT NULL
);
CREATE INDEX ix_audit_actor ON audit_log (actor_user_id, created_unix DESC);
CREATE INDEX ix_audit_target ON audit_log (target_user_id, created_unix DESC);
The audit row is a best-effort write (a failure is logged but does not
abort the action; see the flow#056b policy).
Sharding model (future, hyperscale)
When a table grows past ~10M rows or ~100K active tenants:
- Range-shard by koder_user_id (TiKV PD does this automatically in kdb-next)
- Hash-shard via hash(koder_user_id) % N (Postgres alternative with Citus / pg_partman)
- Geo-shard by tenant region (multi-region future, see stack-RFC-001 §faseamento)
Do not pre-optimize. Trigger: monitoring flags p99 latency > 50ms or table size > 1TB.
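The hash-shard option is just a stable hash of the tenant id modulo the shard count. A sketch using FNV-1a, which is an arbitrary illustrative choice (Citus and pg_partman use their own hash functions):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ShardFor returns hash(koder_user_id) % n with a stable non-cryptographic hash,
// so a given tenant always lands on the same shard.
func ShardFor(koderUserID int64, n uint32) uint32 {
	h := fnv.New32a()
	var b [8]byte
	for i := 0; i < 8; i++ { // little-endian encode the id
		b[i] = byte(koderUserID >> (8 * i))
	}
	h.Write(b[:])
	return h.Sum32() % n
}

func main() {
	fmt.Println(ShardFor(42, 16))
}
```

Note the modulo scheme reshuffles most tenants when N changes, which is one reason range-sharding (kdb-next) is listed first.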
Edge cases
User rename / handle change
koder_user_id is immutable: the handle (@username) can change; the ID
cannot. Every cross-table reference uses koder_user_id (BIGINT),
never the handle.
Workspace transfer
When a workspace changes owner: workspace.owner_id changes; workspace_id
stays the same. Resources with workspace_id = X remain accessible
to the current members.
Account deletion (GDPR-style)
When a user requests deletion:
- Set user.deleted_unix = NOW() (soft delete)
- A retention cron job sweeps the tables and deletes rows WHERE koder_user_id = X AND <table-specific retention>
- An audit row in audit_log records "user_deleted" before the cleanup
- Default retention: 30 days (configurable per-tenant for compliance).
Account merge
Out of scope: the Koder Stack does not support automatic account merging. It is an admin-only manual operation if ever needed.
Spec audit
Automated applicability (future: koder-spec-audit multi-tenancy):
- Scans migrations: tables with PII columns (email, name, password*, key*) but no koder_user_id → flag
- Scans routers: endpoints without auth middleware → flag
- Scans code: SELECT * FROM <pii-table> without a WHERE → flag
- Scans env vars: shared cache keys without a tenant prefix → flag
Severity: error (blocks release) starting with the first release that adopts the audit; advisory before that.
References
policies/multi-tenant-by-default.kmd
rfcs/stack-RFC-001-kdb-as-unified-data-plane.kmd
services/foundation/id/engine/docs/rfcs/RFC-004-oauth2-oidc-service.md