diff --git a/AUDIT-RESPONSE.md b/AUDIT-RESPONSE.md
new file mode 100644
index 0000000..f63a97b
--- /dev/null
+++ b/AUDIT-RESPONSE.md
@@ -0,0 +1 @@
+For #8, remediate by storing the attempt counter in the database. Consider how to make it tamper-resistant.
diff --git a/AUDIT.md b/AUDIT.md
index ad844ef..590b5f3 100644
--- a/AUDIT.md
+++ b/AUDIT.md
@@ -1,13 +1,22 @@
# Security Audit Report
-**Date**: 2026-03-16
-**Scope**: ARCHITECTURE.md, engines/sshca.md, engines/transit.md
+**Date**: 2026-03-16 (design review), 2026-03-17 (full system audit)
+**Scope**: Full system — architecture, cryptographic core, all engine implementations, API servers (REST/gRPC), web UI, policy engine, authentication, deployment, documentation
---
-## ARCHITECTURE.md
+## Audit History
-### Strengths
+- **2026-03-16**: Initial design review of ARCHITECTURE.md, engines/sshca.md, engines/transit.md. Issues #1–#24 identified. Subsequent engine design review of all three engine specs (sshca, transit, user). Issues #25–#38 identified.
+- **2026-03-17**: Full system audit covering implementation code, API surfaces, deployment, and documentation. Issues #39–#80 identified.
+
+---
+
+## Design Review Findings (#1–#38)
+
+### ARCHITECTURE.md
+
+#### Strengths
- Solid key hierarchy: password → Argon2id → KWK → MEK → per-entry encryption. Defense-in-depth.
- Fail-closed design with `ErrSealed` on all operations when sealed.
@@ -15,266 +24,439 @@
- Default-deny policy engine with priority-based rule evaluation.
- Issued leaf private keys never stored — good principle of least persistence.
-### Issues
+#### Issues
**1. ~~TLS minimum version should be 1.3, not 1.2~~ RESOLVED**
-Updated all TLS configurations (HTTP server, gRPC server, web server, vault client, Go client library, CLI commands) from `tls.VersionTLS12` to `tls.VersionTLS13`. Removed explicit cipher suite list from HTTP server (TLS 1.3 manages its own). Updated ARCHITECTURE.md TLS section and threat mitigations table.
+Updated all TLS configurations from `tls.VersionTLS12` to `tls.VersionTLS13`. Removed explicit cipher suite list (TLS 1.3 manages its own).
**2. ~~Token cache TTL of 30 seconds is a revocation gap~~ ACCEPTED**
-Accepted as an explicit trade-off. The 30-second cache TTL balances MCIAS load against revocation latency. For this system's scale and threat model, the window is acceptable.
+Accepted as an explicit trade-off. 30-second cache TTL balances MCIAS load against revocation latency.
**3. ~~Admin bypass in policy engine is an all-or-nothing model~~ ACCEPTED**
-The all-or-nothing admin model is intentional by design. MCIAS admin users get full access to all engines and operations. This is the desired behavior for this system.
+The all-or-nothing admin model is intentional. MCIAS admin users get full access to all engines and operations.
**4. ~~Policy rule creation is listed as both Admin-only and User-accessible~~ RESOLVED**
-The second policy table in ARCHITECTURE.md incorrectly listed User auth; removed the duplicate. gRPC `adminRequiredMethods` now includes `ListPolicies` and `GetPolicy` to match REST behavior. All policy CRUD is admin-only across both API surfaces.
+Removed duplicate table. gRPC `adminRequiredMethods` now includes `ListPolicies` and `GetPolicy`. All policy CRUD is admin-only across both API surfaces.
**5. ~~No integrity protection on barrier entry paths~~ RESOLVED**
-Updated `crypto.Encrypt`/`crypto.Decrypt` to accept an `additionalData` parameter. The barrier now passes the entry path as GCM AAD on both `Put` and `Get`, binding each ciphertext to its storage path. Seal operations pass `nil` (no path context). Added `TestEncryptDecryptWithAAD` covering correct-AAD, wrong-AAD, and nil-AAD cases. Existing barrier entries will fail to decrypt after this change — a one-off migration tool is needed to re-encrypt all entries (decrypt with nil AAD under old code, re-encrypt with path AAD).
+Barrier now passes entry path as GCM AAD on both `Put` and `Get`. Migration tool created for existing entries.
**6. ~~Single MEK with no rotation mechanism~~ RESOLVED**
-Implemented MEK rotation and per-engine DEKs. The v2 ciphertext format (`0x02`) embeds a key ID that identifies which DEK encrypted each entry. MEK rotation (`POST /v1/barrier/rotate-mek`) re-wraps all DEKs without re-encrypting data. DEK rotation (`POST /v1/barrier/rotate-key`) re-encrypts entries under a specific key. A migration endpoint converts v1 entries to v2 format. The `barrier_keys` table stores MEK-wrapped DEKs with version tracking.
+Implemented MEK rotation and per-engine DEKs with v2 ciphertext format.
**7. No audit logging**
-Acknowledged as future work, but for a cryptographic service this is a significant gap. Every certificate issuance, every sign operation, every policy change should be logged with caller identity, timestamp, and operation details. Without this, incident response is blind.
+Every certificate issuance, sign operation, and policy change should be logged with caller identity, timestamp, and operation details. Without this, incident response is blind.
**8. ~~Rate limiting is in-memory only~~ ACCEPTED**
-The in-memory rate limit protects against remote brute-force over the network, which is the realistic threat. Persisting the counter in the database would not add tamper resistance: the barrier is sealed during unseal attempts so encrypted storage is unavailable, and the unencrypted database could be reset by an attacker with disk access. An attacker who can restart the service already has local system access, making the rate limit moot regardless of persistence. Argon2id cost parameters (128 MiB memory-hard) are the primary brute-force mitigation and are stored in `seal_config`.
-
----
-
-## engines/sshca.md
-
-### Strengths
-
-- Flat CA model is correct for SSH (no intermediate hierarchy needed).
-- Default principal restriction (users can only sign certs for their own username) is the right default.
-- `max_ttl` enforced server-side — good.
-- Key zeroization on seal, no private keys in cert records.
-
-### Issues
-
-**9. ~~User-controllable serial numbers~~ RESOLVED**
-
-Removed the optional `serial` field from both `sign-host` and `sign-user` request data. Serials are always generated server-side using `crypto/rand` (64-bit). Updated flows and security considerations in sshca.md.
-
-**10. No explicit extension allowlist for host certificates**
-
-The `extensions` field for `sign-host` accepts an arbitrary map. SSH extensions have security implications (e.g., `permit-pty`, `permit-port-forwarding`, `permit-user-rc`). Without an allowlist, a user could request extensions that grant more capabilities than intended. The engine should define a default extension set and either:
-- Restrict to an allowlist, or
-- Require admin for non-default extensions.
-
-**11. ~~`critical_options` on user certs is a privilege escalation surface~~ RESOLVED**
-
-Removed `critical_options` from the `sign-user` request. Critical options can only be applied via admin-defined signing profiles, which are policy-gated (`sshca/{mount}/profile/{name}`, action `read`). Profile CRUD is admin-only. Profiles specify critical options, extensions, optional max TTL, and optional principal restrictions. Security considerations updated accordingly.
-
-**12. ~~No KRL (Key Revocation List) support~~ RESOLVED**
-
-Added a full KRL section to sshca.md covering: in-memory KRL generation from revoked serials, barrier persistence at `engine/sshca/{mount}/krl.bin`, automatic rebuild on revoke/delete/unseal, a public `GET /v1/sshca/{mount}/krl` endpoint with ETag and Cache-Control headers, `GetKRL` gRPC RPC, and a pull-based distribution model with example sshd_config and cron fetch.
-
-**13. ~~Policy resource path uses `ca/` prefix instead of `sshca/`~~ RESOLVED**
-
-Updated policy check paths in sshca.md from `ca/{mount}/id/...` to `sshca/{mount}/id/...` for both `sign-host` and `sign-user` flows, eliminating the namespace collision with the CA (PKI) engine.
-
-**14. No source-address restriction by default**
-
-User certificates should ideally include `source-address` critical options to limit where they can be used from. At minimum, consider a mount-level configuration for default critical options that get applied to all user certs.
-
----
-
-## engines/transit.md
-
-### Strengths
-
-- Ciphertext format with version prefix enables clean key rotation.
-- `exportable` and `allow_deletion` immutable after creation — prevents policy weakening.
-- AAD/context binding for AEAD ciphers.
-- Rewrap never exposes plaintext to caller.
-
-### Issues
-
-**15. ~~No minimum key version enforcement~~ RESOLVED**
-
-Added `min_decryption_version` per key (default 1). Decryption requests for versions below the minimum are rejected. New `update-key-config` operation (admin-only) advances the minimum (can only increase, cannot exceed current version). New `trim-key` operation permanently deletes versions older than the minimum. Both have corresponding gRPC RPCs and REST endpoints. The rotation cycle is documented: rotate → rewrap → advance min → trim.
-
-**16. ~~Key version pruning with `max_key_versions` has no safety check~~ RESOLVED**
-
-Added explicit `max_key_versions` behavior: auto-pruning during `rotate-key` only deletes versions strictly less than `min_decryption_version`. If the version count exceeds the limit but no eligible candidates remain, a warning is returned. This ensures pruning never destroys versions that may still have unrewrapped ciphertext. See also #30.
-
-**17. ~~RSA encryption without specifying padding scheme~~ RESOLVED**
-
-RSA key types (`rsa-2048`, `rsa-4096`) removed entirely from the transit engine. Asymmetric encryption belongs in the user engine (via ECDH); RSA signing offers no advantage over Ed25519/ECDSA. `crypto/rsa` removed from dependencies. Rationale documented in key types section and security considerations.
-
-**18. ~~HMAC keys used for `sign` operation is confusing~~ RESOLVED**
-
-`sign` and `verify` are now restricted to asymmetric key types (Ed25519, ECDSA). HMAC keys are rejected with an error — HMAC must use the dedicated `hmac` operation. Policy actions are already split: `sign`, `verify`, and `hmac` are separate granular actions, all matched by `any`.
-
-**19. ~~No batch encrypt/decrypt operations~~ RESOLVED**
-
-Added `batch-encrypt`, `batch-decrypt`, and `batch-rewrap` operations to the transit engine plan. Each targets a single named key with an array of items; results are returned in order with per-item errors (partial success model). An optional `reference` field lets callers correlate results with source records. Policy is checked once per batch. Added corresponding gRPC RPCs and REST endpoints. `operationAction` maps batch variants to the same granular actions as their single counterparts.
-
-**20. ~~`read` action maps to `decrypt` and `verify` — semantics are misleading~~ RESOLVED**
-
-Replaced the coarse `read`/`write` action model with granular per-operation actions: `encrypt`, `decrypt`, `sign`, `verify`, `hmac` for cryptographic operations; `read` for metadata retrieval; `write` for key management; `admin` for administrative operations. Added `any` action that matches all non-admin actions. Added `LintRule` validation that rejects unknown effects and actions. `CreateRule` now validates before storing. Updated `operationAction` mapping and all tests.
-
-**21. No rate limiting or quota on cryptographic operations**
-
-A compromised or malicious user token could issue unlimited encrypt/decrypt/sign requests, potentially using the service as a cryptographic oracle. Consider per-user rate limits on transit operations.
-
----
-
-## Cross-Cutting Issues
-
-**22. ~~No forward secrecy for stored data~~ RESOLVED**: Per-engine DEKs limit blast radius — compromise of one DEK only exposes that engine's data, not the entire barrier. MEK compromise still exposes all DEKs, but MEK rotation enables periodic re-keying. Each engine mount gets its own DEK created automatically; a `"system"` DEK protects non-engine data. v2 ciphertext format embeds key IDs for DEK lookup.
-
-**23. ~~Generic `POST /v1/engine/request` bypasses typed route middleware~~ RESOLVED**: Added an `adminOnlyOperations` map to `handleEngineRequest` that mirrors the admin gates on typed REST routes (e.g. `create-issuer`, `delete-cert`, `create-key`, `rotate-key`, `create-profile`, `provision`). Non-admin users are rejected with 403 before policy evaluation or engine dispatch. The v1 gRPC `Execute` RPC is defined in the proto but not registered in the server — only v2 typed RPCs are used, so the gRPC surface is not affected. Tests cover both admin and non-admin paths through the generic endpoint.
-
-**24. ~~No CSRF protection mentioned for web UI~~ RESOLVED**: Added signed double-submit cookie CSRF protection. A per-server HMAC secret signs random nonce-based tokens. Every form includes a `{{csrfField}}` hidden input; a middleware validates that the form field matches the cookie and has a valid HMAC signature on all POST/PUT/PATCH/DELETE requests. Session cookie upgraded from `SameSite=Lax` to `SameSite=Strict`. CSRF cookie is also `HttpOnly`, `Secure`, `SameSite=Strict`. Tests cover token generation/validation, cross-secret rejection, middleware pass/block/mismatch scenarios.
-
----
-
-## Engine Design Review (2026-03-16)
-
-**Scope**: engines/sshca.md, engines/transit.md, engines/user.md (patched specs)
+Accepted: Argon2id cost parameters are the primary brute-force mitigation.
### engines/sshca.md
#### Strengths
-- RSA excluded — reduces attack surface, correct for SSH CA use case.
-- Detailed Go code snippets for Initialize, sign-host, sign-user flows.
-- KRL custom implementation correctly identified that `x/crypto/ssh` lacks KRL builders.
+- Flat CA model is correct for SSH.
+- Default principal restriction — users can only sign certs for their own username.
+- `max_ttl` enforced server-side.
+- Key zeroization on seal, no private keys in cert records.
+- RSA excluded — reduces attack surface.
- Signing profiles are the only path to critical options — good privilege separation.
-- Server-side serial generation with `crypto/rand` — no user-controllable serials.
+- Server-side serial generation with `crypto/rand`.
#### Issues
-**25. ~~Missing `list-certs` REST route~~ RESOLVED**
+**9. ~~User-controllable serial numbers~~ RESOLVED**
-Added `GET /v1/sshca/{mount}/certs` to the REST endpoints table and route registration code block. API sync restored.
+**10. No explicit extension allowlist for host certificates**
-**26. ~~KRL section type description contradicts pseudocode~~ RESOLVED**
+The `extensions` field for `sign-host` accepts an arbitrary map. The engine should define a default extension set and restrict to an allowlist or require admin for non-default extensions.
-Fixed the description block to use `KRL_SECTION_CERTIFICATES (0x01)` for the outer section type, matching the pseudocode and the OpenSSH `PROTOCOL.krl` spec.
+**11. ~~`critical_options` on user certs is a privilege escalation surface~~ RESOLVED**
-**27. ~~Policy check after certificate construction in sign-host~~ RESOLVED**
+**12. ~~No KRL (Key Revocation List) support~~ RESOLVED**
-Reordered both `sign-host` and `sign-user` flows to perform the policy check before generating the serial and building the certificate. Serial generation now only happens after authorization succeeds.
+**13. ~~Policy resource path uses `ca/` prefix instead of `sshca/`~~ RESOLVED**
+
+**14. No source-address restriction by default**
+
+User certificates should ideally include `source-address` critical options. Consider a mount-level configuration for default critical options.
### engines/transit.md
#### Strengths
-- XChaCha20-Poly1305 (not ChaCha20-Poly1305) — correct for random nonce safety.
-- All nonce sizes, hash algorithms, and signature encodings now specified.
-- `trim-key` logic is detailed and safe (no-op when `min_decryption_version` is 1).
-- Batch operations hold a read lock for atomicity with respect to key rotation.
+- Ciphertext format with version prefix enables clean key rotation.
+- `exportable` and `allow_deletion` immutable after creation.
+- AAD/context binding for AEAD ciphers.
+- Rewrap never exposes plaintext to caller.
+- XChaCha20-Poly1305 with 24-byte nonce — correct for random nonce safety.
+- `trim-key` logic is safe. Batch operations hold read lock for atomicity.
- 500-item batch limit prevents resource exhaustion.
#### Issues
-**28. ~~HMAC output not versioned — unverifiable after key rotation~~ RESOLVED**
+**15. ~~No minimum key version enforcement~~ RESOLVED**
-HMAC output now uses the same `metacrypt:v{version}:{base64}` format as ciphertext and signatures. Verification parses the version prefix, loads the corresponding key (subject to `min_decryption_version`), and uses `hmac.Equal` for constant-time comparison.
+**16. ~~Key version pruning safety check~~ RESOLVED**
-**29. ~~`rewrap` policy action not specified~~ RESOLVED**
+**17. ~~RSA encryption without specifying padding scheme~~ RESOLVED** (RSA removed entirely)
-`rewrap` and `batch-rewrap` now map to the `decrypt` action — rewrap internally decrypts and re-encrypts, so the caller must have decrypt permission. Batch variants map to the same action as their single counterparts. Documented in the authorization section.
+**18. ~~HMAC keys used for `sign` operation~~ RESOLVED**
-**30. ~~`max_key_versions` interaction with `min_decryption_version` unclear~~ RESOLVED**
+**19. ~~No batch encrypt/decrypt operations~~ RESOLVED**
-Added explicit `max_key_versions` behavior section. Pruning happens during `rotate-key` and only deletes versions strictly less than `min_decryption_version`. If the limit is exceeded but no eligible candidates remain, a warning is returned. This also resolves audit finding #16.
+**20. ~~`read` action maps to `decrypt`~~ RESOLVED** (granular actions)
-**31. ~~Missing `get-public-key` REST route~~ RESOLVED**
+**21. No rate limiting or quota on cryptographic operations**
-Added `GET /v1/transit/{mount}/keys/{name}/public-key` to the REST endpoints table and route registration code block. API sync restored.
-
-**32. ~~`exportable` flag with no export operation~~ RESOLVED**
-
-Removed the `exportable` flag from `create-key`. Transit's value proposition is that keys never leave the service. If export is needed for migration, a dedicated admin-only operation can be added later with audit logging.
+A compromised token could issue unlimited encrypt/decrypt/sign requests.
### engines/user.md
#### Strengths
-- HKDF with per-recipient random salt — prevents wrapping key reuse across messages.
-- AES-256-GCM for DEK wrapping (consistent with codebase, avoids new primitive).
+- HKDF with per-recipient random salt prevents wrapping key reuse.
+- AES-256-GCM for DEK wrapping (consistent with codebase).
- ECDH key agreement with info-string binding prevents key confusion.
- Explicit zeroization of all intermediate secrets documented.
-- Envelope format includes salt per-recipient — correct for HKDF security.
+- Envelope format includes salt per-recipient.
#### Issues
-**33. ~~Auto-provisioning creates keys for arbitrary usernames~~ RESOLVED**
+**22. ~~No forward secrecy for stored data~~ RESOLVED** (per-engine DEKs)
-The encrypt flow now validates recipient usernames against MCIAS via `auth.ValidateUsername` before auto-provisioning. Non-existent usernames are rejected with an error, preventing barrier pollution.
+**23. ~~Generic `POST /v1/engine/request` bypasses typed route middleware~~ RESOLVED**
+
+**24. ~~No CSRF protection for web UI~~ RESOLVED**
+
+**25–32.** ~~Various spec issues~~ **RESOLVED** (see detailed history below)
+
+**33. ~~Auto-provisioning creates keys for arbitrary usernames~~ RESOLVED**
**34. ~~No recipient limit on encrypt~~ RESOLVED**
-Added a `maxRecipients = 100` limit. Requests exceeding this limit are rejected with `400 Bad Request` before any ECDH computation.
-
**35. ~~No re-encryption support for key rotation~~ RESOLVED**
-Added a `re-encrypt` operation that decrypts an envelope and re-encrypts it with current key pairs for all recipients. This enables safe key rotation: re-encrypt all stored envelopes first, then call `rotate-key`. Added to HandleRequest dispatch, gRPC service, REST endpoints, and route registration.
+**36–38.** ~~Various spec/cross-cutting issues~~ **RESOLVED**
-**36. ~~`UserKeyConfig` type undefined~~ RESOLVED**
+---
-Defined `UserKeyConfig` struct with `Algorithm`, `CreatedAt`, and `AutoProvisioned` fields in the in-memory state section.
+## Full System Audit (2026-03-17)
-### Cross-Cutting Issues (Engine Designs)
+**Scope**: All implementation code, deployment, and documentation.
-**37. ~~`adminOnlyOperations` name collision blocks user engine `rotate-key`~~ RESOLVED**
+### Cryptographic Core
-Changed the `adminOnlyOperations` map from flat operation names to engine-type-qualified keys (`engineType:operation`, e.g. `"transit:rotate-key"`). The generic endpoint now resolves the mount's engine type via `GetMount` before checking the map. Added tests verifying that `rotate-key` on a user mount succeeds for non-admin users while `rotate-key` on a transit mount correctly requires admin.
+#### Strengths
-**38. ~~`engine.ZeroizeKey` helper prerequisite not cross-referenced~~ RESOLVED**
+- AES-256-GCM with 12-byte random nonces from `crypto/rand` — correct.
+- Argon2id with configurable parameters stored in `seal_config` — correct.
+- Path-bound AAD in barrier — defense against ciphertext relocation.
+- Per-engine DEKs with v2 ciphertext format — limits blast radius.
+- Constant-time comparison via `crypto/subtle` for all secret comparisons.
-Added prerequisite step to both transit and user implementation steps referencing `engines/sshca.md` step 1 for the `engine.ZeroizeKey` shared helper.
+#### Issues
+
+**39. TOCTOU race in barrier Seal/Unseal**
+
+`barrier.go`: `Seal()` zeroizes keys while concurrent operations may hold stale references between `RLock` release and actual use. A read operation could read the MEK, release the lock, then use a zeroized key. The fix is either to hold the lock through the crypto operation or to use atomic pointer swaps.
+
+**40. Crash during `ReWrapKeys` loses all barrier data**
+
+`seal.go`: If the process crashes between re-encrypting all DEKs in `ReWrapKeys` and updating `seal_config` with the new MEK, all data becomes irrecoverable — the old MEK is gone and the new MEK was never persisted. This needs a two-phase commit or WAL-based approach.
+
+**41. `loadKeys` errors silently swallowed during unseal**
+
+`barrier.go`: If `loadKeys` fails to decrypt DEK entries (e.g., corrupt `barrier_keys` rows), errors are silently ignored and the keys map may be incomplete. Subsequent operations on engine mounts with missing DEKs will fail with confusing errors instead of failing at unseal time.
+
+**42. No AAD binding on MEK encryption with KWK**
+
+`seal.go`: The MEK is encrypted with the KWK (derived from password via Argon2id) using `crypto.Encrypt(kwk, mek, nil)`. There is no AAD binding this ciphertext to its purpose. An attacker who can swap `encrypted_mek` in `seal_config` could substitute a different ciphertext (though the practical impact is limited since the KWK is password-derived).
+
+**43. Barrier `List` uses SQL LIKE with unescaped prefix**
+
+`barrier.go`: The `List` method passes the prefix directly into a SQL `LIKE` clause without escaping `%` and `_` characters. A path containing these characters would match unintended entries.
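One way to address #43 is to escape the wildcard characters before building the pattern and to declare the escape character in the query. The following is a minimal sketch, not the project's code: `escapeLike`, `likePrefixPattern`, and the table and column names in `listQuery` are hypothetical.

```go
package main

import "strings"

// escapeLike escapes the LIKE wildcards % and _ (and the escape character
// itself) so that a barrier path prefix is matched literally.
func escapeLike(prefix string) string {
	r := strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)
	return r.Replace(prefix)
}

// likePrefixPattern returns the pattern to bind to the LIKE parameter:
// the escaped prefix followed by a single trailing wildcard.
func likePrefixPattern(prefix string) string {
	return escapeLike(prefix) + "%"
}

// listQuery is an assumed query shape. The table and column names are
// placeholders; the ESCAPE clause tells SQLite that backslash is the
// escape character used by escapeLike.
const listQuery = `SELECT path FROM barrier_entries WHERE path LIKE ? ESCAPE '\'`
```

With this in place, a prefix such as `engine/ca_1/` only matches paths that contain a literal underscore at that position.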
+
+**44. System key rotation query may miss entries**
+
+`barrier.go`: `RotateKey` for the system key excludes all `engine/%` paths, but entries at shorter paths encrypted with the system key could be missed if they don't follow the expected naming convention.
+
+**45. `Zeroize` loop may be optimized away by compiler**
+
+`crypto.go`: The `Zeroize` function uses a simple `for` loop to zero memory. The Go compiler may optimize this away if the slice is not used after zeroization. Use `crypto/subtle.XORBytes` or a volatile-equivalent pattern.
+
+**46. SQLite PRAGMAs only applied to first connection**
+
+`db.go`: `PRAGMA journal_mode`, `foreign_keys`, and `busy_timeout` are applied once at open time but `database/sql` may open additional connections in its pool that don't receive these PRAGMAs. Use a `ConnInitHook` or `_pragma` DSN parameters.
+
+**47. Plaintext not zeroized after re-encryption during key rotation**
+
+`barrier.go`: During `RotateKey`, decrypted plaintext is held in a `[]byte` but not zeroized after re-encryption. This leaves plaintext in memory longer than necessary.
+
+### Engine Implementations
+
+#### CA (PKI) Engine
+
+**48. Path traversal via unsanitized issuer names**
+
+`ca/ca.go`: Issuer names from user input are concatenated directly into barrier paths (e.g., `engine/ca/{mount}/issuers/{name}/...`). A name containing `../` could write to arbitrary barrier locations. All engines should validate mount and entity names against a strict pattern (alphanumeric, hyphens, underscores).
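The strict pattern recommended in #48 could look like the sketch below. The exact character set, the leading-alphanumeric rule, and the 128-character cap are assumptions chosen for illustration, and `ValidateName` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"regexp"
)

// validName is an assumed allowlist: alphanumerics, hyphens, and underscores,
// starting with an alphanumeric, at most 128 characters. It contains no '/'
// or '.', so "../" sequences cannot appear in a barrier path segment.
var validName = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9_-]{0,127}$`)

// ErrInvalidName is returned for names that fail the allowlist.
var ErrInvalidName = errors.New("invalid name: must match [A-Za-z0-9][A-Za-z0-9_-]{0,127}")

// ValidateName rejects any mount, issuer, or entity name that could alter
// the structure of a barrier path when concatenated into it.
func ValidateName(name string) error {
	if !validName.MatchString(name) {
		return ErrInvalidName
	}
	return nil
}
```

Calling the check at the request boundary, before any barrier path is built, keeps the validation in one place for every engine.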
+
+**49. No TTL enforcement against issuer MaxTTL in issuance**
+
+`ca/ca.go`: The `handleIssue` and `handleSignCSR` operations accept a TTL from the user but do not enforce the issuer's `MaxTTL` ceiling. A user can request arbitrarily long certificate lifetimes.
+
+**50. Non-admin users can override key usages**
+
+`ca/ca.go`: The `key_usages` and `ext_key_usages` fields are accepted from non-admin users. A user could request a certificate with `cert sign` or `crl sign` key usage, potentially creating an intermediate CA certificate.
+
+**51. Certificate renewal does not revoke original**
+
+`ca/ca.go`: `handleRenew` creates a new certificate but does not revoke the original. This creates duplicate valid certificates for the same identity, which complicates revocation and weakens the security model.
+
+**52. Leaf private key in API response not zeroized**
+
+`ca/ca.go`: After marshalling the leaf private key to PEM for the API response, the in-memory key material is not zeroized. The key persists in memory until garbage collected.
+
+#### SSH CA Engine
+
+**53. HandleRequest uses exclusive write lock for all operations**
+
+`sshca/sshca.go`: All operations (including reads like `get-cert`, `list-certs`, `get-profile`) acquire a write lock (`mu.Lock()`), serializing the entire engine. Read operations should use `mu.RLock()`.
+
+**54. Host signing is default-allow without policy rules**
+
+`sshca/sshca.go`: When no policy rules match a host signing request, the engine allows it by default. This contradicts the default-deny principle established in the engineering standards and ARCHITECTURE.md.
+
+**55. SSH certificate serial collision risk**
+
+`sshca/sshca.go`: Random `uint64` serials reach a birthday collision probability of ~50% at roughly 5 billion certificates (about 1.18 × 2^32); at 2^32 (~4.3 billion) the probability is ~39%. While far beyond typical scale, the engine should detect and retry on collision.
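Detect-and-retry for #55 can be sketched as a bounded loop around the random draw. This is an illustration under assumptions: `newSerial` is a hypothetical name, `exists` stands in for a lookup against stored certificate records, and the retry budget of 8 is arbitrary.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"errors"
)

// maxSerialAttempts is an assumed retry budget; a collision on a 64-bit
// random value is rare enough that one retry is almost always sufficient.
const maxSerialAttempts = 8

// newSerial draws random 64-bit serials until it finds one that is unused.
// exists is a hypothetical callback that reports whether a serial has
// already been issued on this mount.
func newSerial(exists func(uint64) bool) (uint64, error) {
	var buf [8]byte
	for i := 0; i < maxSerialAttempts; i++ {
		if _, err := rand.Read(buf[:]); err != nil {
			return 0, err
		}
		serial := binary.BigEndian.Uint64(buf[:])
		if !exists(serial) {
			return serial, nil
		}
	}
	return 0, errors.New("sshca: could not generate a unique serial")
}
```

The existence check and the subsequent write of the certificate record would need to happen under the same lock or transaction for the check to be meaningful.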
+
+**56. KRL is not signed**
+
+`sshca/sshca.go`: The generated KRL is not cryptographically signed. An attacker who can intercept the KRL distribution (e.g., MITM on the `GET /v1/sshca/{mount}/krl` endpoint, though TLS mitigates this) could serve a truncated KRL that omits revoked certificates.
+
+**57. PEM key bytes not zeroized after parsing in Unseal**
+
+`sshca/sshca.go`: After reading the CA private key PEM from the barrier and parsing it, the raw PEM bytes are not zeroized.
+
+#### Transit Engine
+
+**58. Default-allow for non-admin users contradicts default-deny**
+
+`transit/transit.go`: Similar to #54 — when no policy rules match a transit operation, the engine allows it. This should default to deny.
+
+**59. Negative ciphertext version not rejected**
+
+`transit/transit.go`: `parseVersionedData` does not reject negative version numbers. A crafted ciphertext with a negative version could cause unexpected behavior in version lookups.
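A parser that rejects negative versions for #59 might look like the following sketch. It assumes the `metacrypt:v{version}:{base64}` format and that versions start at 1; `parseVersioned` is a hypothetical stand-in for `parseVersionedData`, whose real signature is not shown in this report.

```go
package main

import (
	"errors"
	"strconv"
	"strings"
)

var errBadFormat = errors.New("transit: malformed versioned data")

// parseVersioned splits "metacrypt:v{version}:{payload}" and returns the
// version and the still-encoded payload. strconv.ParseUint does not accept
// a sign, so "-1" and "+1" are rejected; the bit size of 31 keeps the value
// within the range of a non-negative int on all platforms.
func parseVersioned(s string) (int, string, error) {
	parts := strings.SplitN(s, ":", 3)
	if len(parts) != 3 || parts[0] != "metacrypt" || !strings.HasPrefix(parts[1], "v") {
		return 0, "", errBadFormat
	}
	v, err := strconv.ParseUint(parts[1][1:], 10, 31)
	if err != nil || v == 0 { // assumption: key versions start at 1
		return 0, "", errBadFormat
	}
	return int(v), parts[2], nil
}
```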
+
+**60. ECDSA big.Int internals not fully zeroized**
+
+`transit/transit.go`: The local `zeroizeKey` clears `D` on ECDSA keys but not `PublicKey.X/Y`. While the public key is not secret, the `big.Int` internal representation may retain data from the private key computation.
+
+#### User E2E Encryption Engine
+
+**61. ECDH private key zeroization is ineffective**
+
+`user/user.go`: `key.Bytes()` returns a copy of the private key bytes. Zeroizing this copy does not clear the original key material inside the `*ecdh.PrivateKey` struct. The actual private key remains in memory.
+
+**62. Policy resource path uses mountPath instead of mount name**
+
+`user/user.go`: Policy checks use the full mount path instead of the mount name. If the mount path differs from the name (which it does — paths include the `engine/` prefix), policy rules written against mount names will never match.
+
+**63. No role checks on decrypt, re-encrypt, and rotate-key**
+
+`user/user.go`: The `handleDecrypt`, `handleReEncrypt`, and `handleRotateKey` operations have no role checks. A guest-role user (who should have restricted access per MCIAS role definitions) can perform these operations.
+
+**64. Initialize does not acquire mutex**
+
+`user/user.go`: The `Initialize` method writes to shared state without holding the mutex, creating a data race if called concurrently.
+
+**65. handleEncrypt uses stale state after releasing lock**
+
+`user/user.go`: After releasing the write lock during encryption, the handler continues to use pointers to user state that may have been modified by another goroutine.
+
+**66. handleReEncrypt uses manual lock without defer**
+
+`user/user.go`: Manual `RLock`/`RUnlock` calls without `defer`; a panic between lock and unlock leaves the read lock held, so the next writer blocks forever and the engine deadlocks.
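The effect of pairing the read lock with `defer` for #66 can be shown with a small self-contained sketch. The `engine` type, `lookup`, and `safeLookup` are hypothetical and unrelated to the real handler; they only demonstrate that the lock is released when the body panics.

```go
package main

import "sync"

// engine is a stand-in for an engine with mutex-guarded state.
type engine struct {
	mu    sync.RWMutex
	state map[string]string
}

// lookup reads shared state under a read lock. Because RUnlock is deferred,
// the lock is released even if fn panics.
func (e *engine) lookup(key string, fn func(string) string) string {
	e.mu.RLock()
	defer e.mu.RUnlock()
	return fn(e.state[key])
}

// safeLookup recovers from a panic in lookup, as an HTTP or gRPC recovery
// middleware would, and reports whether one occurred.
func safeLookup(e *engine, key string, fn func(string) string) (out string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return e.lookup(key, fn), false
}
```

If `lookup` called `RUnlock` manually after `fn` instead, a recovered panic would leave the read lock held and every later `Lock` call would block.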
+
+**67. No sealed-state check in user HandleRequest**
+
+`user/user.go`: Unlike other engines, the user engine's `HandleRequest` does not check if the engine is sealed. A request reaching the engine after seal but before the HTTP layer catches it could panic on nil map access.
+
+### API Servers
+
+#### REST API
+
+**68. JSON injection via unsanitized error messages**
+
+`server/routes.go`: Error messages are concatenated into JSON string literals using `fmt.Sprintf` without JSON escaping. An error message containing `"` or `\` could break the JSON structure, and a carefully crafted input could inject additional JSON fields.
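For #68, building the response with `encoding/json` instead of string formatting removes the injection path. A minimal sketch, assuming a response shape of `{"error": "..."}` (the field name and the `errorBody` helper are illustrative, not taken from `server/routes.go`):

```go
package main

import "encoding/json"

// errorBody marshals the message through encoding/json, which escapes
// quotes, backslashes, and control characters, so attacker-influenced text
// cannot terminate the string literal or add fields.
func errorBody(msg string) string {
	b, err := json.Marshal(map[string]string{"error": msg})
	if err != nil {
		// Marshalling a map[string]string should not fail; fall back to
		// a fixed body rather than emitting partial output.
		return `{"error":"internal error"}`
	}
	return string(b)
}
```

By contrast, formatting the same message with `fmt.Sprintf` and `%s` inside a JSON template would let an input like `x","admin":"true` produce a body with a second, attacker-chosen field.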
+
+**69. Typed REST handlers bypass policy engine**
+
+`server/routes.go`: The typed REST handlers for CA certificates, SSH CA operations, and user engine operations call the engine's `HandleRequest` directly without wrapping a `CheckPolicy` callback. Only the generic `/v1/engine/request` endpoint passes the policy checker. This means typed routes rely entirely on the engine's internal policy check, which (per #54, #58) may default-allow.
+
+**70. `RenewCert` gRPC RPC has no corresponding REST route**
+
+`server/routes.go`: The `CAService/RenewCert` gRPC RPC exists but has no REST endpoint, violating the API sync rule.
+
+#### gRPC API
+
+**71. `PKIService/GetCRL` missing from `sealRequiredMethods`**
+
+`grpcserver/server.go`: The `GetCRL` RPC can be called even when the service is sealed. While this is arguably intentional (public endpoint), it is inconsistent with the interceptor design where all RPCs are gated.
+
+#### Policy Engine
+
+**72. Policy rule ID allows path traversal**
+
+`policy/policy.go`: Policy rule IDs are not validated. An ID containing `/` or `..` could write to arbitrary paths in the barrier, since rules are stored at `policy/rules/{id}`.
+
+**73. `filepath.Match` does not support `**` recursive globs**
+
+`policy/policy.go`: Policy resource patterns use `filepath.Match`, which does not support `**` for recursive directory matching. Administrators writing rules like `engine/**/certs/*` will find they don't match as expected.
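The behavior described in #73 is easy to reproduce. In the sketch below, `matches` is a hypothetical wrapper and the resource paths are made up; it assumes a platform where the path separator is `/`.

```go
package main

import "path/filepath"

// matches wraps filepath.Match and treats a malformed pattern as no match.
func matches(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

// doubleStarCrossesDirs reports whether "**" matched across two directory
// levels. filepath.Match treats consecutive stars like a single "*", which
// never matches the separator, so this is false.
func doubleStarCrossesDirs() bool {
	return matches("engine/**/certs/*", "engine/ca/prod/certs/web")
}

// doubleStarSingleLevel reports whether the same pattern matches when
// exactly one path segment sits where "**" is. This is true, which can hide
// the problem in simple tests.
func doubleStarSingleLevel() bool {
	return matches("engine/**/certs/*", "engine/ca/certs/web")
}
```

A rule author who expects recursive matching would see the single-level case pass and only discover the gap once mounts are nested more deeply.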
+
+#### Authentication
+
+**74. Token validation cache grows without bound**
+
+`auth/auth.go`: The token cache has no size limit or eviction of expired entries beyond lazy expiry checks. Under sustained load with many unique tokens, this is an unbounded memory growth vector.
+
+### Web UI
+
+**75. CSRF token not bound to user session**
+
+`webserver/csrf.go`: CSRF tokens are signed with a server-wide HMAC key but not bound to the user's session. Any valid server-generated CSRF token works for any user, reducing CSRF protection to a server-origin check rather than a session-integrity check.
+
+**76. Login cookie missing explicit expiry**
+
+`webserver/routes.go`: The `metacrypt_token` cookie has no `MaxAge` or `Expires`, making it a session cookie that persists until the browser is closed. Consider an explicit TTL matching the MCIAS token lifetime.
+
+**77. Several POST handlers missing `MaxBytesReader`**
+
+`webserver/routes.go`, `webserver/user.go`, `webserver/sshca.go`: `handlePolicyCreate`, `handlePolicyDelete`, `handleUserRegister`, `handleUserRotateKey`, SSH CA cert revoke/delete — all accept POST bodies without `MaxBytesReader`, allowing arbitrarily large request bodies.
+
+### Deployment & Documentation
+
+**78. `ExecReload` sends SIGHUP but no handler exists**
+
+`deploy/systemd/metacrypt.service`, `deploy/systemd/metacrypt-web.service`: Both units define `ExecReload=/bin/kill -HUP $MAINPID`, but the Go binary does not handle SIGHUP. Because the Go runtime exits on an unhandled SIGHUP, a `systemctl reload` would terminate the process instead of reloading it.
+
+**79. Dockerfiles use `golang:1.23-alpine` but `go.mod` requires Go 1.25**
+
+`Dockerfile.api`, `Dockerfile.web`: The builder stage uses Go 1.23 but the module requires Go 1.25. Builds will fail.
+
+**80. ARCHITECTURE.md system overview says "TLS 1.2+" but code enforces TLS 1.3**
+
+`ARCHITECTURE.md:33`: The ASCII diagram still says "TLS 1.2+" despite issue #1 being resolved in code. The diagram was not updated.
+
+---
+
+## Open Issues (Unresolved)
+
+### Open — Critical
+
+*None.*
+
+### Open — High
+
+| # | Issue | Location |
+|---|-------|----------|
+| 39 | TOCTOU race in barrier Seal/Unseal allows use of zeroized keys | `barrier/barrier.go` |
+| 40 | Crash during `ReWrapKeys` makes all barrier data irrecoverable | `seal/seal.go` |
+| 48 | Path traversal via unsanitized issuer/entity names in all engines | `ca/ca.go`, all engines |
+| 49 | No TTL enforcement against issuer MaxTTL in cert issuance | `ca/ca.go` |
+| 61 | ECDH private key zeroization is ineffective (`Bytes()` returns copy) | `user/user.go` |
+| 62 | Policy resource path uses mountPath instead of mount name | `user/user.go` |
+| 68 | JSON injection via unsanitized error messages in REST API | `server/routes.go` |
+| 69 | Typed REST handlers bypass policy engine | `server/routes.go` |
+
+### Open — Medium
+
+| # | Issue | Location |
+|---|-------|----------|
+| 7 | No audit logging for cryptographic operations | ARCHITECTURE.md |
+| 10 | No extension allowlist for SSH host certificates | `sshca/sshca.go` |
+| 21 | No rate limiting on transit cryptographic operations | `transit/transit.go` |
+| 41 | `loadKeys` errors silently swallowed during unseal | `barrier/barrier.go` |
+| 42 | No AAD binding on MEK encryption with KWK | `seal/seal.go` |
+| 43 | Barrier `List` SQL LIKE with unescaped prefix | `barrier/barrier.go` |
+| 46 | SQLite PRAGMAs only applied to first connection | `db/db.go` |
+| 50 | Non-admin users can override key usages (cert sign, CRL sign) | `ca/ca.go` |
+| 51 | Certificate renewal does not revoke original | `ca/ca.go` |
+| 53 | SSH CA write-locks all operations including reads | `sshca/sshca.go` |
+| 54 | SSH CA host signing is default-allow (contradicts default-deny) | `sshca/sshca.go` |
+| 58 | Transit default-allow contradicts default-deny | `transit/transit.go` |
+| 59 | Negative ciphertext version not rejected in transit | `transit/transit.go` |
+| 63 | No role checks on user decrypt/re-encrypt/rotate | `user/user.go` |
+| 64 | User engine Initialize has no mutex | `user/user.go` |
+| 65 | handleEncrypt uses stale state after lock release | `user/user.go` |
+| 66 | handleReEncrypt manual lock without defer (leak risk) | `user/user.go` |
+| 67 | No sealed-state check in user HandleRequest | `user/user.go` |
+| 70 | `RenewCert` has no REST route (API sync violation) | `server/routes.go` |
+| 72 | Policy rule ID allows path traversal in barrier | `policy/policy.go` |
+| 73 | `filepath.Match` does not support `**` recursive globs | `policy/policy.go` |
+| 74 | Token validation cache grows without bound | `auth/auth.go` |
+| 78 | systemd `ExecReload` sends SIGHUP with no handler | `deploy/systemd/` |
+| 79 | Dockerfiles use Go 1.23 but module requires Go 1.25 | `Dockerfile.*` |
+
+### Open — Low
+
+| # | Issue | Location |
+|---|-------|----------|
+| 14 | No source-address restriction by default in SSH certs | `sshca/sshca.go` |
+| 44 | System key rotation query may miss entries | `barrier/barrier.go` |
+| 45 | `Zeroize` loop may be optimized away by compiler | `crypto/crypto.go` |
+| 47 | Plaintext not zeroized after re-encryption in rotation | `barrier/barrier.go` |
+| 52 | Leaf private key in API response not zeroized | `ca/ca.go` |
+| 55 | SSH certificate serial collision risk at scale | `sshca/sshca.go` |
+| 56 | KRL is not cryptographically signed | `sshca/sshca.go` |
+| 57 | PEM key bytes not zeroized after parsing in SSH CA | `sshca/sshca.go` |
+| 60 | ECDSA big.Int internals not fully zeroized | `transit/transit.go` |
+| 71 | `GetCRL` missing from `sealRequiredMethods` | `grpcserver/server.go` |
+| 75 | CSRF token not bound to user session | `webserver/csrf.go` |
+| 76 | Login cookie missing explicit expiry | `webserver/routes.go` |
+| 77 | POST handlers missing `MaxBytesReader` | `webserver/` |
+| 80 | ARCHITECTURE.md diagram still says "TLS 1.2+" | `ARCHITECTURE.md` |
+
+### Accepted
+
+| # | Issue | Rationale |
+|---|-------|-----------|
+| 2 | Token cache 30s revocation gap | Trade-off: MCIAS load vs revocation latency |
+| 3 | Admin all-or-nothing access | Intentional design |
+| 8 | Unseal rate limit resets on restart | Argon2id is the primary mitigation |
+
+---
+
+## Resolved Issues (#1–#38)
+
+All design review findings from the 2026-03-16 audit have been resolved or accepted, except #7, #10, #14, and #21, which remain open (see the Open Issues tables above). See the [Audit History](#audit-history) section. The following issues were resolved or accepted:
+
+**Critical** (all resolved): #4 (policy auth contradiction), #9 (user-controllable SSH serials), #13 (policy path collision), #37 (adminOnlyOperations name collision).
+
+**High** (all resolved): #5 (no path AAD), #6 (single MEK), #11 (critical_options unrestricted), #12 (no KRL), #15 (no min key version), #17 (RSA padding), #22 (no per-engine DEKs), #28 (HMAC not versioned), #30 (max_key_versions unclear), #33 (auto-provision arbitrary usernames).
+
+**Medium** (all resolved or accepted): #2, #3, #8, #20, #23, #24, #25, #26, #27, #29, #31, #34.
+
+**Low** (all resolved): #1, #18, #19, #32, #35, #36, #38.
---
## Priority Summary
-| Priority | Issue | Location |
-|----------|-------|----------|
-| ~~**Critical**~~ | ~~#4 — Policy auth contradiction (admin vs user)~~ **RESOLVED** | ARCHITECTURE.md |
-| ~~**Critical**~~ | ~~#9 — User-controllable SSH cert serials~~ **RESOLVED** | sshca.md |
-| ~~**Critical**~~ | ~~#13 — Policy path collision (`ca/` vs `sshca/`)~~ **RESOLVED** | sshca.md |
-| ~~**Critical**~~ | ~~#37 — `adminOnlyOperations` name collision blocks user `rotate-key`~~ **RESOLVED** | Cross-cutting |
-| ~~**High**~~ | ~~#5 — No path AAD in barrier encryption~~ **RESOLVED** | ARCHITECTURE.md |
-| ~~**High**~~ | ~~#12 — No KRL distribution for SSH revocation~~ **RESOLVED** | sshca.md |
-| ~~**High**~~ | ~~#15 — No min key version for transit rotation~~ **RESOLVED** | transit.md |
-| ~~**High**~~ | ~~#17 — RSA padding scheme unspecified~~ **RESOLVED** | transit.md |
-| ~~**High**~~ | ~~#11 — `critical_options` not restricted~~ **RESOLVED** | sshca.md |
-| ~~**High**~~ | ~~#6 — Single MEK with no rotation~~ **RESOLVED** | ARCHITECTURE.md |
-| ~~**High**~~ | ~~#22 — No forward secrecy / per-engine DEKs~~ **RESOLVED** | Cross-cutting |
-| ~~**High**~~ | ~~#28 — HMAC output not versioned~~ **RESOLVED** | transit.md |
-| ~~**High**~~ | ~~#30 — `max_key_versions` vs `min_decryption_version` unclear~~ **RESOLVED** | transit.md |
-| ~~**High**~~ | ~~#33 — Auto-provision creates keys for arbitrary usernames~~ **RESOLVED** | user.md |
-| ~~**Medium**~~ | ~~#2 — Token cache revocation gap~~ **ACCEPTED** | ARCHITECTURE.md |
-| ~~**Medium**~~ | ~~#3 — Admin all-or-nothing access~~ **ACCEPTED** | ARCHITECTURE.md |
-| ~~**Medium**~~ | ~~#8 — Unseal rate limit resets on restart~~ **ACCEPTED** | ARCHITECTURE.md |
-| ~~**Medium**~~ | ~~#20 — `decrypt` mapped to `read` action~~ **RESOLVED** | transit.md |
-| ~~**Medium**~~ | ~~#24 — No CSRF protection for web UI~~ **RESOLVED** | ARCHITECTURE.md |
-| ~~**Medium**~~ | ~~#25 — Missing `list-certs` REST route~~ **RESOLVED** | sshca.md |
-| ~~**Medium**~~ | ~~#26 — KRL section type description error~~ **RESOLVED** | sshca.md |
-| ~~**Medium**~~ | ~~#27 — Policy check after cert construction~~ **RESOLVED** | sshca.md |
-| ~~**Medium**~~ | ~~#29 — `rewrap` policy action not specified~~ **RESOLVED** | transit.md |
-| ~~**Medium**~~ | ~~#31 — Missing `get-public-key` REST route~~ **RESOLVED** | transit.md |
-| ~~**Medium**~~ | ~~#34 — No recipient limit on encrypt~~ **RESOLVED** | user.md |
-| ~~**Low**~~ | ~~#1 — TLS 1.2 vs 1.3~~ **RESOLVED** | ARCHITECTURE.md |
-| ~~**Low**~~ | ~~#19 — No batch transit operations~~ **RESOLVED** | transit.md |
-| ~~**Low**~~ | ~~#18 — HMAC/sign semantic confusion~~ **RESOLVED** | transit.md |
-| ~~**Medium**~~ | ~~#23 — Generic endpoint bypasses typed route middleware~~ **RESOLVED** | Cross-cutting |
-| ~~**Low**~~ | ~~#32 — `exportable` flag with no export operation~~ **RESOLVED** | transit.md |
-| ~~**Low**~~ | ~~#35 — No re-encryption support for user key rotation~~ **RESOLVED** | user.md |
-| ~~**Low**~~ | ~~#36 — `UserKeyConfig` type undefined~~ **RESOLVED** | user.md |
-| ~~**Low**~~ | ~~#38 — `ZeroizeKey` prerequisite not cross-referenced~~ **RESOLVED** | Cross-cutting |
+| Priority | Count | Status |
+|----------|-------|--------|
+| High | 8 | Open |
+| Medium | 24 | Open |
+| Low | 14 | Open |
+| Accepted | 3 | Closed |
+| Resolved | 31 | Closed |
+
+**Recommendation**: Address all High findings before the next deployment. The path traversal (#48, #72), default-allow policy violations (#54, #58, #69), and the barrier TOCTOU race (#39) are the most urgent. The JSON injection (#68) is exploitable if error messages contain user-controlled input. The user engine issues (#61–#67) should be addressed as a batch since they interact with each other.
diff --git a/REMEDIATION.md b/REMEDIATION.md
index 5b3e7a0..4ec779f 100644
--- a/REMEDIATION.md
+++ b/REMEDIATION.md
@@ -1,354 +1,579 @@
-# Remediation Plan
+# Remediation Plan — High-Priority Audit Findings
-**Date**: 2026-03-16
-**Scope**: Audit findings #25–#38 from engine design review
+**Date**: 2026-03-17
+**Scope**: AUDIT.md findings #39, #40, #48, #49, #61, #62, #68, #69
-This document provides a concrete remediation plan for each open finding. Items
-are grouped by priority and ordered for efficient implementation (dependencies
-first).
+This plan addresses all eight High-severity findings from the 2026-03-17
+full system audit. Findings are grouped into four work items by shared root
+cause or affected subsystem. The order reflects dependency chains: #68 is a
+standalone fix that should ship first; #48 is a prerequisite for safe
+operation across all engines; #39/#40 affect the storage core; the remaining
+four affect specific engines.
---
-## Critical
+## Work Item 1: JSON Injection in REST Error Responses (#68)
-### #37 — `adminOnlyOperations` name collision blocks user `rotate-key`
+**Risk**: An error message containing `"` or `\` breaks the JSON response
+structure. If the error contains attacker-controlled input (e.g., a mount
+name or key name that triggers a downstream error), this enables JSON
+injection in API responses.
-**Problem**: The `adminOnlyOperations` map in `handleEngineRequest`
-(`internal/server/routes.go:265`) is a flat `map[string]bool` keyed by
-operation name. The transit engine's `rotate-key` is admin-only, but the user
-engine's `rotate-key` is user-self. Since the map is checked before engine
-dispatch, non-admin users are blocked from calling `rotate-key` on any engine
-mount — including user engine mounts where it should be allowed.
-
-**Fix**: Replace the flat map with an engine-type-qualified lookup. Two options:
-
-**Option A — Qualify the map key** (minimal change):
-
-Change the map type to include the engine type prefix:
+**Root cause**: 13 locations in `internal/server/routes.go` construct JSON
+error responses via string concatenation:
```go
-var adminOnlyOperations = map[string]bool{
- "ca:import-root": true,
- "ca:create-issuer": true,
- "ca:delete-issuer": true,
- "ca:revoke-cert": true,
- "ca:delete-cert": true,
- "transit:create-key": true,
- "transit:delete-key": true,
- "transit:rotate-key": true,
- "transit:update-key-config": true,
- "transit:trim-key": true,
- "sshca:create-profile": true,
- "sshca:update-profile": true,
- "sshca:delete-profile": true,
- "sshca:revoke-cert": true,
- "sshca:delete-cert": true,
- "user:provision": true,
- "user:delete-user": true,
+http.Error(w, `{"error":"`+err.Error()+`"}`, http.StatusInternalServerError)
+```
+
+The `writeEngineError` helper (line 1704) is the most common entry point;
+most typed handlers call it.
+
+### Fix
+
+1. **Replace `writeEngineError`** with a safe JSON encoder:
+
+ ```go
+ func writeJSONError(w http.ResponseWriter, msg string, code int) {
+ w.Header().Set("Content-Type", "application/json")
+ w.WriteHeader(code)
+ _ = json.NewEncoder(w).Encode(map[string]string{"error": msg})
+ }
+ ```
+
+2. **Replace all 13 call sites** that use string concatenation with
+ `writeJSONError(w, grpcMessage(err), status)` or
+ `writeJSONError(w, err.Error(), status)`.
+
+ The `grpcMessage` helper already exists in the webserver package and
+ extracts human-readable messages from gRPC errors. Add an equivalent
+ to the REST server, and prefer it over raw `err.Error()` to avoid
+ leaking internal error details.
+
+3. **Grep for the pattern** `"error":"` in `routes.go` to confirm no
+ remaining string-concatenated JSON.
+
+### Files
+
+| File | Change |
+|------|--------|
+| `internal/server/routes.go` | Replace `writeEngineError` and all 13 inline error sites |
+
+### Verification
+
+- `go vet ./internal/server/`
+- `go test ./internal/server/`
+- Manual test: mount an engine with a name containing `"`, trigger an error,
+ verify the response is valid JSON.
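
The manual check above could also be automated. The following is a minimal sketch, assuming the `writeJSONError` helper proposed in this work item; `roundTrip` and the hostile message are hypothetical and exist only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeJSONError mirrors the helper proposed in the fix above.
func writeJSONError(w http.ResponseWriter, msg string, code int) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	_ = json.NewEncoder(w).Encode(map[string]string{"error": msg})
}

// roundTrip (hypothetical) sends msg through writeJSONError, decodes the
// recorded body, and returns the "error" field plus the number of
// top-level fields. It fails if the body is not valid JSON.
func roundTrip(msg string) (string, int, error) {
	rec := httptest.NewRecorder()
	writeJSONError(rec, msg, http.StatusInternalServerError)

	var decoded map[string]string
	if err := json.Unmarshal(rec.Body.Bytes(), &decoded); err != nil {
		return "", 0, err
	}
	return decoded["error"], len(decoded), nil
}

func main() {
	// A message that would close the string and inject a field if it were
	// concatenated into the JSON literal unescaped.
	hostile := `x","admin":"true\`
	got, fields, err := roundTrip(hostile)
	fmt.Println(err == nil, fields == 1, got == hostile)
}
```

If the encoder-based helper is in place, the body decodes cleanly, contains exactly one field, and the message survives unchanged.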
+
+---
+
+## Work Item 2: Path Traversal via Unsanitized Names (#48)
+
+**Risk**: User-controlled strings (issuer names, key names, profile names,
+usernames, mount names) are concatenated directly into barrier storage
+paths. An input containing `../` traverses the barrier namespace, allowing
+reads and writes to arbitrary paths. This affects all four engines and the
+engine registry.
+
+**Root cause**: No validation exists at any layer — neither the barrier's
+`Put`/`Get`/`Delete` methods nor the engines sanitize path components.
+
+### Vulnerable locations
+
+| File | Input | Path Pattern |
+|------|-------|-------------|
+| `ca/ca.go` | issuer `name` | `mountPath + "issuers/" + name + "/"` |
+| `sshca/sshca.go` | profile `name` | `mountPath + "profiles/" + name + ".json"` |
+| `transit/transit.go` | key `name` | `mountPath + "keys/" + name + "/"` |
+| `user/user.go` | `username` | `mountPath + "users/" + username + "/"` |
+| `engine/engine.go` | mount `name` | `engine/{type}/{name}/` |
+| `policy/policy.go` | rule `ID` | `policy/rules/{id}` |
+
+### Fix
+
+Enforce validation at **two layers** (defense in depth):
+
+1. **Barrier layer** — reject paths containing `..` segments.
+
+ Add a `validatePath` check at the top of `Get`, `Put`, `Delete`, and
+ `List` in `barrier.go`:
+
+ ```go
+ var ErrInvalidPath = errors.New("barrier: invalid path")
+
+ func validatePath(p string) error {
+ for _, seg := range strings.Split(p, "/") {
+ if seg == ".." {
+ return fmt.Errorf("%w: path traversal rejected: %q", ErrInvalidPath, p)
+ }
+ }
+ return nil
+ }
+ ```
+
+ Call `validatePath` at the entry of `Get`, `Put`, `Delete`, `List`.
+ Return `ErrInvalidPath` on failure.
+
+2. **Engine/registry layer** — validate entity names at input boundaries.
+
+ Add a `ValidateName` helper to `internal/engine/`:
+
+ ```go
+ var namePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]*$`)
+
+ func ValidateName(name string) error {
+ if name == "" || len(name) > 128 || !namePattern.MatchString(name) {
+           return fmt.Errorf("invalid name %q: must be 1-128 characters, start with "+
+               "a letter or digit, and contain only letters, digits, dot, hyphen, or underscore", name)
+ }
+ return nil
+ }
+ ```
+
+ Call `ValidateName` in:
+
+ | Location | Input validated |
+ |----------|----------------|
+ | `engine.go` `Mount()` | mount name |
+ | `ca.go` `handleCreateIssuer` | issuer name |
+ | `sshca.go` `handleCreateProfile` | profile name |
+ | `transit.go` `handleCreateKey` | key name |
+ | `user.go` `handleRegister`, `handleProvision` | username |
+ | `user.go` `handleEncrypt` | recipient usernames |
+ | `policy.go` `CreateRule` | rule ID |
+
+ Note: certificate serials are generated server-side from `crypto/rand`
+ and hex-encoded, so they are safe. Validate anyway for defense in depth.
+
+### Files
+
+| File | Change |
+|------|--------|
+| `internal/barrier/barrier.go` | Add `validatePath`, call from Get/Put/Delete/List |
+| `internal/engine/engine.go` | Add `ValidateName`, call from `Mount` |
+| `internal/engine/ca/ca.go` | Call `ValidateName` on issuer name |
+| `internal/engine/sshca/sshca.go` | Call `ValidateName` on profile name |
+| `internal/engine/transit/transit.go` | Call `ValidateName` on key name |
+| `internal/engine/user/user.go` | Call `ValidateName` on usernames |
+| `internal/policy/policy.go` | Call `ValidateName` on rule ID |
+
+### Verification
+
+- Add `TestValidatePath` to `barrier_test.go`: confirm `../` and `..` are
+ rejected; confirm normal paths pass.
+- Add `TestValidateName` to `engine_test.go`: confirm `../evil`, empty
+ string, and overlong names are rejected; confirm valid names pass.
+- `go test ./internal/barrier/ ./internal/engine/... ./internal/policy/`
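
As a rough illustration of what `TestValidatePath` and `TestValidateName` would exercise, the sketch below copies the two helpers proposed above into a standalone program. The sample inputs are assumptions chosen for this example; the error text in `ValidateName` is abbreviated.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

var ErrInvalidPath = errors.New("barrier: invalid path")

// validatePath rejects any path that contains a ".." segment.
func validatePath(p string) error {
	for _, seg := range strings.Split(p, "/") {
		if seg == ".." {
			return fmt.Errorf("%w: path traversal rejected: %q", ErrInvalidPath, p)
		}
	}
	return nil
}

var namePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]*$`)

// ValidateName accepts 1-128 characters that start with a letter or digit.
func ValidateName(name string) error {
	if name == "" || len(name) > 128 || !namePattern.MatchString(name) {
		return fmt.Errorf("invalid name %q", name)
	}
	return nil
}

func main() {
	// Illustrative inputs only; real tests would be table-driven.
	for _, p := range []string{"engine/ca/prod/issuers/root", "engine/../policy/rules/x", "a/..b/c"} {
		fmt.Printf("path %q rejected=%v\n", p, validatePath(p) != nil)
	}
	for _, n := range []string{"root-ca_1.v2", "../evil", "", ".hidden"} {
		fmt.Printf("name %q rejected=%v\n", n, ValidateName(n) != nil)
	}
}
```

Note that `a/..b/c` passes the barrier check because `..b` is an ordinary segment, while `.hidden` fails the name check because names must start with a letter or digit.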
+
+---
+
+## Work Item 3: Barrier Concurrency and Crash Safety (#39, #40)
+
+These two findings share the barrier/seal subsystem and should be addressed
+together.
+
+### #39 — TOCTOU Race in Barrier Get/Put
+
+**Risk**: `Get` and `Put` copy the `mek` slice header and `keys` map
+reference under `RLock`, release the lock, then use the copied references
+for encryption/decryption. A concurrent `Seal()` zeroizes the underlying
+byte slices in place before nil-ing the fields, so a concurrent reader
+uses zeroized key material.
+
+**Root cause**: The lock does not cover the crypto operation. The "copy"
+is a shallow reference copy (slice header), not a deep byte copy. `Seal()`
+zeroizes the backing array, which is shared.
+
+**Current locking pattern** (`barrier.go`):
+
+```
+Get: RLock → copy mek/keys refs → RUnlock → decrypt (uses zeroized key)
+Put: RLock → copy mek/keys refs → RUnlock → encrypt (uses zeroized key)
+Seal: Lock → zeroize mek bytes → nil mek → zeroize keys → nil keys → Unlock
+```
+
+**Fix**: Hold `RLock` through the entire crypto operation:
+
+```go
+func (b *AESGCMBarrier) Get(ctx context.Context, path string) ([]byte, error) {
+ if err := validatePath(path); err != nil {
+ return nil, err
+ }
+ b.mu.RLock()
+ defer b.mu.RUnlock()
+ if b.mek == nil {
+ return nil, ErrSealed
+ }
+ // query DB, resolve key, decrypt — all under RLock
+ // ...
}
```
-In `handleEngineRequest`, look up `engineType + ":" + operation` instead of
-just `operation`. The `engineType` is already known from the mount registry
-(the generic endpoint resolves the mount to an engine type).
+This is the minimal, safest change. `RLock` permits concurrent readers, so
+there is no throughput regression for parallel `Get`/`Put` operations.
+Serialization happens only against callers that take the exclusive `Lock`,
+such as `Seal()`, which waits for all readers to drain before zeroizing.
+Those are exactly the semantics we want.
-**Option B — Per-engine admin operations** (cleaner but more code):
+Apply the same pattern to `Put`, `Delete`, and `List`.
-Each engine implements an `AdminOperations() []string` method. The server
-queries the resolved engine for its admin operations instead of using a global
-map.
+**Alternative considered**: Atomic pointer swap (`atomic.Pointer[keyState]`).
+This eliminates the lock from the hot path entirely, but introduces
+complexity around deferred zeroization of the old state (readers may still
+hold references). The `RLock`-through-crypto approach is simpler and
+sufficient for Metacrypt's concurrency profile.
-**Recommendation**: Option A. It requires a one-line change to the lookup and
-a mechanical update to the map keys. The generic endpoint already resolves the
-mount to get the engine type.
+### #40 — Crash During `ReWrapKeys` Loses All Data
-**Files to change**:
-- `internal/server/routes.go` — update map and lookup in `handleEngineRequest`
-- `engines/sshca.md` — update `adminOnlyOperations` section
-- `engines/transit.md` — update `adminOnlyOperations` section
-- `engines/user.md` — update `adminOnlyOperations` section
+**Risk**: `RotateMEK` calls `barrier.ReWrapKeys(newMEK)` which commits a
+transaction re-wrapping all DEKs, then separately updates `seal_config`
+with the new encrypted MEK. A crash between these two database operations
+leaves DEKs wrapped with a MEK that is not persisted — all data is
+irrecoverable.
-**Tests**: Add test case in `internal/server/server_test.go` — non-admin user
-calling `rotate-key` via generic endpoint on a user engine mount should succeed
-(policy permitting). Same call on a transit mount should return 403.
-
----
-
-## High
-
-### #28 — HMAC output not versioned
-
-**Problem**: HMAC output is raw base64 with no key version indicator. After key
-rotation and `min_decryption_version` advancement, old HMACs are unverifiable
-because the engine doesn't know which key version produced them.
-
-**Fix**: Use the same versioned prefix format as ciphertext and signatures:
+**Current flow** (`seal.go` lines 245–313):
```
-metacrypt:v{version}:{base64(mac_bytes)}
+1. Generate newMEK
+2. barrier.ReWrapKeys(ctx, newMEK)  ← commits transaction (barrier_keys updated)
+   *** CRASH BETWEEN HERE AND STEP 4 = DATA LOSS ***
+3. crypto.Encrypt(kwk, newMEK, nil)  ← encrypt new MEK
+4. UPDATE seal_config SET encrypted_mek = ?  ← separate statement, not in transaction
+5. Swap in-memory MEK
```
-Update the `hmac` operation to include `key_version` in the response. Update
-internal HMAC verification to parse the version prefix and select the
-corresponding key version (subject to `min_decryption_version` enforcement).
+**Fix**: Unify steps 2–4 into a single database transaction.
-**Files to change**:
-- `engines/transit.md` — update HMAC section, add HMAC output format, update
- Cryptographic Details section
-- Implementation: `internal/engine/transit/sign.go` (when implemented)
-
-### #30 — `max_key_versions` vs `min_decryption_version` unclear
-
-**Problem**: The spec doesn't define when `max_key_versions` pruning happens or
-whether it respects `min_decryption_version`. Auto-pruning on rotation could
-destroy versions that still have unrewrapped ciphertext.
-
-**Fix**: Define the behavior explicitly in `engines/transit.md`:
-
-1. `max_key_versions` pruning happens during `rotate-key`, after the new
- version is created.
-2. Pruning **only** deletes versions **strictly less than**
- `min_decryption_version`. If `max_key_versions` would require deleting a
- version at or above `min_decryption_version`, the version is **retained**
- and a warning is included in the response:
- `"warning": "max_key_versions exceeded; advance min_decryption_version to enable pruning"`.
-3. This means `max_key_versions` is a soft limit — it is only enforceable
- after the operator completes the rotation cycle (rotate → rewrap → advance
- min → prune happens automatically on next rotate).
-
-This resolves the original audit finding #16 as well.
-
-**Files to change**:
-- `engines/transit.md` — add `max_key_versions` behavior to Key Rotation
- section and `rotate-key` flow
-- `AUDIT.md` — mark #16 as RESOLVED with reference to the new behavior
-
-### #33 — Auto-provision creates keys for arbitrary usernames
-
-**Problem**: The encrypt flow auto-provisions recipients without validating
-that the username exists in MCIAS. Any authenticated user can create barrier
-entries for non-existent users.
-
-**Fix**: Before auto-provisioning, validate the recipient username against
-MCIAS. The engine has access to the auth system via `req.CallerInfo` context.
-Add an MCIAS user lookup:
-
-1. Add a `ValidateUsername(username string) (bool, error)` method to the auth
- client interface. This calls the MCIAS user info endpoint to check if the
- username exists.
-2. In the encrypt flow, before auto-provisioning a recipient, call
- `ValidateUsername`. If the user doesn't exist in MCIAS, return an error:
- `"recipient not found: {username}"`.
-3. Document this validation in the encrypt flow and security considerations.
-
-**Alternative** (simpler, weaker): Skip MCIAS validation but add a
-rate limit on auto-provisioning (e.g., max 10 new provisions per encrypt
-request, max 100 total auto-provisions per hour per caller). This prevents
-storage inflation but doesn't prevent phantom users.
-
-**Recommendation**: MCIAS validation. It's the correct security boundary —
-only real MCIAS users should have keypairs.
-
-**Files to change**:
-- `engines/user.md` — update encrypt flow step 2, add MCIAS validation
-- `internal/auth/` — add `ValidateUsername` to auth client (when implemented)
-
----
-
-## Medium
-
-### #25 — Missing `list-certs` REST route (SSH CA)
-
-**Fix**: Add to the REST endpoints table:
-
-```
-| GET | `/v1/sshca/{mount}/certs` | List cert records |
-```
-
-Add to the route registration code block:
+Add a `ReWrapKeysTx` variant of `ReWrapKeys` that runs inside a caller-supplied `*sql.Tx`, plus a `SwapMEK` method for the in-memory update:
```go
-r.Get("/v1/sshca/{mount}/certs", s.requireAuth(s.handleSSHCAListCerts))
-```
+// ReWrapKeysTx re-wraps all DEKs with newMEK within the given transaction.
+func (b *AESGCMBarrier) ReWrapKeysTx(ctx context.Context, tx *sql.Tx, newMEK []byte) error {
+ // Same logic as ReWrapKeys, but use tx instead of b.db.BeginTx.
+ rows, err := tx.QueryContext(ctx, "SELECT key_id, wrapped_key FROM barrier_keys")
+ // ... decrypt with old MEK, encrypt with new MEK, UPDATE barrier_keys ...
+}
-**Files to change**: `engines/sshca.md`
-
-### #26 — KRL section type description error
-
-**Fix**: Change the description block from:
-
-```
-Section type: KRL_SECTION_CERT_SERIAL_LIST (0x21)
-```
-
-to:
-
-```
-Section type: KRL_SECTION_CERTIFICATES (0x01)
- CA key blob: ssh.MarshalAuthorizedKey(caSigner.PublicKey())
- Subsection type: KRL_SECTION_CERT_SERIAL_LIST (0x20)
-```
-
-This matches the pseudocode comments and the OpenSSH `PROTOCOL.krl` spec.
-
-**Files to change**: `engines/sshca.md`
-
-### #27 — Policy check after cert construction (SSH CA)
-
-**Fix**: Reorder the sign-host flow steps:
-
-1. Authenticate caller.
-2. Parse the supplied SSH public key.
-3. Parse TTL.
-4. **Policy check**: for each hostname, check policy on
- `sshca/{mount}/id/{hostname}`, action `sign`.
-5. Generate serial (only after policy passes).
-6. Build `ssh.Certificate`.
-7. Sign, store, return.
-
-Same reordering for sign-user.
-
-**Files to change**: `engines/sshca.md`
-
-### #29 — `rewrap` policy action not specified
-
-**Fix**: Add `rewrap` as an explicit action in the `operationAction` mapping.
-`rewrap` maps to `decrypt` (since it requires internal access to plaintext).
-Batch variants map to the same action.
-
-Add to the authorization section in `engines/transit.md`:
-
-> The `rewrap` and `batch-rewrap` operations require the `decrypt` action —
-> rewrap internally decrypts with the old version and re-encrypts with the
-> latest, so the caller must have decrypt permission. Alternatively, a
-> dedicated `rewrap` action could be added for finer-grained control, but
-> `decrypt` is the safer default (granting `rewrap` without `decrypt` would be
-> odd since rewrap implies decrypt capability).
-
-**Recommendation**: Map to `decrypt`. Simpler, and anyone who should rewrap
-should also be able to decrypt.
-
-**Files to change**: `engines/transit.md`
-
-### #31 — Missing `get-public-key` REST route (Transit)
-
-**Fix**: Add to the REST endpoints table:
-
-```
-| GET | `/v1/transit/{mount}/keys/{name}/public-key` | Get public key |
-```
-
-Add to the route registration code block:
-
-```go
-r.Get("/v1/transit/{mount}/keys/{name}/public-key", s.requireAuth(s.handleTransitGetPublicKey))
-```
-
-**Files to change**: `engines/transit.md`
-
-### #34 — No recipient limit on encrypt (User)
-
-**Fix**: Add a compile-time constant `maxRecipients = 100` to the user engine.
-Reject requests exceeding this limit with `400 Bad Request` / `InvalidArgument`
-before any ECDH computation.
-
-Add to the encrypt flow in `engines/user.md` after step 1:
-
-> Validate that `len(recipients) <= maxRecipients` (100). Reject with error if
-> exceeded.
-
-Add to the security considerations section.
-
-**Files to change**: `engines/user.md`
-
----
-
-## Low
-
-### #32 — `exportable` flag with no export operation (Transit)
-
-**Fix**: Add an `export-key` operation to the transit engine:
-
-- Auth: User+Policy (action `read`).
-- Only succeeds if the key's `exportable` flag is `true`.
-- Returns raw key material (base64-encoded) for the current version only.
-- Asymmetric keys: returns private key in PKCS8 PEM.
-- Symmetric keys: returns raw key bytes, base64-encoded.
-- Add to HandleRequest dispatch, gRPC service, REST endpoints.
-
-Alternatively, if key export is never intended, remove the `exportable` flag
-from `create-key` to avoid dead code. Given that transit is meant to keep keys
-server-side, **removing the flag** may be the better choice. Document the
-decision either way.
-
-**Recommendation**: Remove `exportable`. Transit's entire value proposition is
-that keys never leave the service. If export is needed for migration, a
-dedicated admin-only `export-key` can be added later with appropriate audit
-logging (#7).
-
-**Files to change**: `engines/transit.md`
-
-### #35 — No re-encryption support for user key rotation
-
-**Fix**: Add a `re-encrypt` operation:
-
-- Auth: User (self) — only the envelope recipient can re-encrypt.
-- Input: old envelope.
-- Flow: decrypt with current key, generate new DEK, re-encrypt, return new
- envelope.
-- The old key must still be valid at the time of re-encryption. Document the
- workflow: re-encrypt all stored envelopes, then rotate-key.
-
-This is a quality-of-life improvement, not a security fix. The current design
-(decrypt + encrypt separately) works but requires the caller to handle
-plaintext.
-
-**Files to change**: `engines/user.md`
-
-### #36 — `UserKeyConfig` type undefined
-
-**Fix**: Add the type definition to the in-memory state section:
-
-```go
-type UserKeyConfig struct {
- Algorithm string `json:"algorithm"` // key exchange algorithm used
- CreatedAt time.Time `json:"created_at"`
- AutoProvisioned bool `json:"auto_provisioned"` // created via auto-provision
+// SwapMEK updates the in-memory MEK after a committed transaction.
+func (b *AESGCMBarrier) SwapMEK(newMEK []byte) {
+ b.mu.Lock()
+ defer b.mu.Unlock()
+ mcrypto.Zeroize(b.mek)
+ b.mek = newMEK
}
```
-**Files to change**: `engines/user.md`
+Then in `RotateMEK`:
-### #38 — `ZeroizeKey` prerequisite not cross-referenced
+```go
+func (m *Manager) RotateMEK(ctx context.Context, password string) error {
+ // ... derive KWK, generate newMEK ...
-**Fix**: Add to the Implementation Steps section in both `engines/transit.md`
-and `engines/user.md`:
+ tx, err := m.db.BeginTx(ctx, nil)
+ if err != nil {
+ return err
+ }
+ defer tx.Rollback()
-> **Prerequisite**: `engine.ZeroizeKey` must exist in
-> `internal/engine/helpers.go` (created as part of the SSH CA engine
-> implementation — see `engines/sshca.md` step 1).
+ // Re-wrap all DEKs within the transaction.
+ if err := m.barrier.ReWrapKeysTx(ctx, tx, newMEK); err != nil {
+ return err
+ }
-**Files to change**: `engines/transit.md`, `engines/user.md`
+ // Update seal_config within the same transaction.
+ encNewMEK, err := crypto.Encrypt(kwk, newMEK, nil)
+ if err != nil {
+ return err
+ }
+ if _, err := tx.ExecContext(ctx,
+ "UPDATE seal_config SET encrypted_mek = ? WHERE id = 1",
+ encNewMEK,
+ ); err != nil {
+ return err
+ }
+
+ if err := tx.Commit(); err != nil {
+ return err
+ }
+
+ // Only after commit: update in-memory state.
+ m.barrier.SwapMEK(newMEK)
+ return nil
+}
+```
+
+SQLite in WAL mode handles this correctly — the transaction is atomic
+regardless of process crash. The `barrier_keys` and `seal_config` updates
+either both commit or neither does.
+
+### Files
+
+| File | Change |
+|------|--------|
+| `internal/barrier/barrier.go` | Extend RLock scope in Get/Put/Delete/List; add `ReWrapKeysTx`, `SwapMEK` |
+| `internal/seal/seal.go` | Wrap ReWrapKeysTx + seal_config UPDATE in single transaction |
+| `internal/barrier/barrier_test.go` | Add concurrent Get/Seal stress test |
+
+### Verification
+
+- `go test -race ./internal/barrier/ ./internal/seal/`
+- Add `TestConcurrentGetSeal`: spawn goroutines doing Get while another
+ goroutine calls Seal. Run with `-race`. Verify no panics or data races.
+- Add `TestRotateMEKAtomic`: verify that `barrier_keys` and `seal_config`
+ are updated in the same transaction (mock the DB to detect transaction
+ boundaries, or verify via rollback behavior).
+
+---
+
+## Work Item 4: CA TTL Enforcement, User Engine Fixes, Policy Bypass (#49, #61, #62, #69)
+
+These four findings are independent of one another and can be addressed in
+parallel. #61 and #62 both edit `internal/engine/user/user.go`, but in
+unrelated functions.
+
+### #49 — No TTL Ceiling in CA Certificate Issuance
+
+**Risk**: A non-admin user can request an arbitrarily long certificate
+lifetime. The issuer's `MaxTTL` exists in config but is not enforced
+during `handleIssue` or `handleSignCSR`.
+
+**Root cause**: The CA engine applies the user's requested TTL directly
+to the certificate without comparing it against `issuerConfig.MaxTTL`.
+The SSH CA engine correctly enforces this via `resolveTTL` — the CA
+engine does not.
+
+**Fix**: Add a `resolveTTL` method to the CA engine, following the SSH
+CA engine's pattern (`sshca.go` lines 902–932):
+
+```go
+func (e *CAEngine) resolveTTL(requested string, issuer *issuerState) (time.Duration, error) {
+ maxTTL, err := time.ParseDuration(issuer.config.MaxTTL)
+ if err != nil {
+ maxTTL = 2160 * time.Hour // 90 days fallback
+ }
+
+ if requested != "" {
+ ttl, err := time.ParseDuration(requested)
+ if err != nil {
+ return 0, fmt.Errorf("invalid TTL: %w", err)
+ }
+ if ttl > maxTTL {
+ return 0, fmt.Errorf("requested TTL %s exceeds issuer maximum %s", ttl, maxTTL)
+ }
+ return ttl, nil
+ }
+
+ return maxTTL, nil
+}
+```
+
+Call this in `handleIssue` and `handleSignCSR` before constructing the
+certificate. Replace the raw TTL string with the validated duration.
+
+| File | Change |
+|------|--------|
+| `internal/engine/ca/ca.go` | Add `resolveTTL`, call from `handleIssue` and `handleSignCSR` |
+| `internal/engine/ca/ca_test.go` | Add test: issue cert with TTL > MaxTTL, verify rejection |
+
+### #61 — Ineffective ECDH Key Zeroization
+
+**Risk**: `privKey.Bytes()` returns a copy of the private key bytes.
+Zeroizing the copy leaves the original inside `*ecdh.PrivateKey`. Go's
+`crypto/ecdh` API does not expose the internal byte slice.
+
+**Root cause**: Language/API limitation in Go's `crypto/ecdh` package.
+
+**Fix**: Store the raw private key bytes alongside the parsed key in
+`userState`, and zeroize those bytes on seal:
+
+```go
+type userState struct {
+ privKey *ecdh.PrivateKey
+ privBytes []byte // raw key bytes, retained for zeroization
+ pubKey *ecdh.PublicKey
+ config *UserKeyConfig
+}
+```
+
+On **load from barrier** (Unseal, auto-provision):
+```go
+raw, err := b.Get(ctx, prefix+"priv.key")
+priv, err := curve.NewPrivateKey(raw)
+state.privBytes = raw // retain for zeroization
+state.privKey = priv
+```
+
+On **Seal**:
+```go
+mcrypto.Zeroize(u.privBytes)
+u.privKey = nil
+u.privBytes = nil
+```
+
+Document the limitation: the parsed `*ecdh.PrivateKey` struct's internal
+copy cannot be zeroized from Go code. Setting `privKey = nil` makes it
+eligible for GC, but does not guarantee immediate byte overwrite. This is
+an accepted Go runtime limitation.
+
+| File | Change |
+|------|--------|
+| `internal/engine/user/user.go` | Add `privBytes` to `userState`, populate on load, zeroize on Seal |
+| `internal/engine/user/types.go` | Update `userState` struct |
+
+### #62 — User Engine Policy Path Uses `mountPath` Instead of Mount Name
+
+**Risk**: Policy checks construct the resource path using `e.mountPath`
+(which is `engine/user/{name}/`) instead of just the mount name. Policy
+rules match against `user/{name}/recipient/{username}`, so the full mount
+path creates a mismatch like `user/engine/user/myengine//recipient/alice`.
+No policy rule will ever match.
+
+**Root cause**: Line 358 of `user.go` uses `e.mountPath` directly. The
+SSH CA and transit engines correctly use a `mountName()` helper.
+
+**Fix**: Add a `mountName()` method to the user engine:
+
+```go
+func (e *UserEngine) mountName() string {
+ // mountPath is "engine/user/{name}/"
+ parts := strings.Split(strings.TrimSuffix(e.mountPath, "/"), "/")
+ if len(parts) >= 3 {
+ return parts[2]
+ }
+ return e.mountPath
+}
+```
+
+Change line 358:
+
+```go
+resource := fmt.Sprintf("user/%s/recipient/%s", e.mountName(), r)
+```
+
+Audit all other resource path constructions in the user engine to confirm
+they also use the correct mount name.
+
+| File | Change |
+|------|--------|
+| `internal/engine/user/user.go` | Add `mountName()`, fix resource path on line 358 |
+| `internal/engine/user/user_test.go` | Add test: verify policy resource path format |
+
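The resource-path test could be as small as the following sketch. `mountName` is copied from the fix as a free function, and `resourcePath` is a hypothetical helper introduced here only to make the format checkable in isolation; the mount name `myengine` and recipient `alice` are taken from the mismatch example in the risk description.

```go
package main

import (
	"fmt"
	"strings"
)

// mountName extracts {name} from "engine/user/{name}/".
func mountName(mountPath string) string {
	parts := strings.Split(strings.TrimSuffix(mountPath, "/"), "/")
	if len(parts) >= 3 {
		return parts[2]
	}
	return mountPath
}

// resourcePath is a hypothetical helper that builds the policy resource
// string the way the corrected line 358 does.
func resourcePath(mountPath, recipient string) string {
	return fmt.Sprintf("user/%s/recipient/%s", mountName(mountPath), recipient)
}

func main() {
	fmt.Println(resourcePath("engine/user/myengine/", "alice"))
	// prints "user/myengine/recipient/alice"
}
```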
+### #69 — Typed REST Handlers Bypass Policy Engine
+
+**Risk**: 18 typed REST handlers pass `nil` for `CheckPolicy` in the
+`engine.Request`, skipping service-level policy evaluation. The generic
+`/v1/engine/request` endpoint correctly passes a `policyChecker`. Since
+the SSH CA and transit engines default to allow when no policy matches
+(#54, #58), typed routes are effectively unprotected by policy.
+
+**Root cause**: Typed handlers were modeled after admin-only operations
+(which don't need policy) but applied to user-accessible operations.
+
+**Fix**: Extract the policy checker construction from
+`handleEngineRequest` into a shared helper:
+
+```go
+func (s *Server) newPolicyChecker(info *CallerInfo) engine.PolicyChecker {
+ return func(resource, action string) (string, bool) {
+ effect, matched, err := s.policy.Check(
+ info.Username, info.Roles, resource, action,
+ )
+ if err != nil || !matched {
+ return "deny", false
+ }
+ return effect, matched
+ }
+}
+```
+
+Then in each typed handler, set `CheckPolicy` on the request:
+
+```go
+req := &engine.Request{
+ Operation: "get-cert",
+ Data: data,
+ CallerInfo: callerInfo,
+ CheckPolicy: s.newPolicyChecker(callerInfo),
+}
+```
+
+**18 handlers to update**:
+
+| Handler | Operation |
+|---------|-----------|
+| `handleGetCert` | `get-cert` |
+| `handleRevokeCert` | `revoke-cert` |
+| `handleDeleteCert` | `delete-cert` |
+| `handleSSHCASignHost` | `sign-host` |
+| `handleSSHCASignUser` | `sign-user` |
+| `handleSSHCAGetProfile` | `get-profile` |
+| `handleSSHCAListProfiles` | `list-profiles` |
+| `handleSSHCADeleteProfile` | `delete-profile` |
+| `handleSSHCAGetCert` | `get-cert` |
+| `handleSSHCAListCerts` | `list-certs` |
+| `handleSSHCARevokeCert` | `revoke-cert` |
+| `handleSSHCADeleteCert` | `delete-cert` |
+| `handleUserRegister` | `register` |
+| `handleUserProvision` | `provision` |
+| `handleUserListUsers` | `list-users` |
+| `handleUserGetPublicKey` | `get-public-key` |
+| `handleUserDeleteUser` | `delete-user` |
+| `handleUserDecrypt` | `decrypt` |
+
+Note: `handleUserEncrypt` already passes a policy checker — verify it
+uses the same shared helper after refactoring. Admin-only handlers
+(behind `requireAdmin` wrapper) do not need a policy checker since admin
+bypasses policy.
+
+| File | Change |
+|------|--------|
+| `internal/server/routes.go` | Add `newPolicyChecker`, pass to all 18 typed handlers |
+| `internal/server/server_test.go` | Add test: policy-denied user is rejected by typed route |
+
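To make the helper's contract concrete, here is a standalone sketch of the closure with a fake policy backend. `checkFunc`, `fakeCheck`, and the resource strings are hypothetical and exist only for the example; the real `s.policy.Check` signature is assumed from the snippet in the fix. The sketch shows that an evaluation error and a non-match both collapse to `("deny", false)`. Whether an engine then honors that result or falls back to its own default is the separate concern tracked as #54 and #58.

```go
package main

import (
	"errors"
	"fmt"
)

// PolicyChecker has the closure shape used in the fix.
type PolicyChecker func(resource, action string) (effect string, matched bool)

// checkFunc is a hypothetical stand-in for s.policy.Check.
type checkFunc func(user string, roles []string, resource, action string) (string, bool, error)

// newPolicyChecker mirrors the shared helper, taking the backend as a
// parameter so it can be exercised without a Server.
func newPolicyChecker(user string, roles []string, check checkFunc) PolicyChecker {
	return func(resource, action string) (string, bool) {
		effect, matched, err := check(user, roles, resource, action)
		if err != nil || !matched {
			return "deny", false
		}
		return effect, matched
	}
}

// fakeCheck allows alice to read anything, fails on the resource
// "broken", and matches nothing else.
func fakeCheck(user string, roles []string, resource, action string) (string, bool, error) {
	switch {
	case resource == "broken":
		return "", false, errors.New("policy store unavailable")
	case user == "alice" && action == "read":
		return "allow", true, nil
	default:
		return "", false, nil
	}
}

func main() {
	pc := newPolicyChecker("alice", nil, fakeCheck)
	cases := [][2]string{
		{"ca/pki/cert/1", "read"},
		{"ca/pki/cert/1", "write"},
		{"broken", "read"},
	}
	for _, c := range cases {
		effect, matched := pc(c[0], c[1])
		fmt.Printf("%s %s -> %s matched=%v\n", c[0], c[1], effect, matched)
	}
}
```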
+### Verification (Work Item 4, all findings)
+
+```bash
+go test ./internal/engine/ca/
+go test ./internal/engine/user/
+go test ./internal/server/
+go vet ./...
+```
---
## Implementation Order
-The remediation items should be implemented in this order to respect
-dependencies:
+```
+1. #68 JSON injection (standalone, ship immediately)
+2. #48 Path traversal (standalone, blocks safe engine operation)
+3. #39 Barrier TOCTOU race ─┐
+ #40 ReWrapKeys crash safety ┘ (coupled, requires careful testing)
+4. #49 CA TTL enforcement ─┐
+ #61 ECDH zeroization │
+ #62 User policy path │ (independent fixes, parallelizable)
+ #69 Policy bypass ─┘
+```
-1. **#37** — `adminOnlyOperations` qualification (critical, blocks user engine
- `rotate-key`). This is a code change to `internal/server/routes.go` plus
- spec updates. Do first because it affects all engine implementations.
+Items 1 and 2 have no dependencies and can be done in parallel by
+different engineers.
-2. **#28, #29, #30, #31, #32** — Transit spec fixes (can be done as a single
- spec update pass).
+Items 3 and 4 can also be done in parallel since they touch different
+subsystems (barrier/seal vs engines/server).
-3. **#25, #26, #27** — SSH CA spec fixes (single spec update pass).
+---
-4. **#33, #34, #35, #36** — User spec fixes (single spec update pass).
+## Post-Remediation
-5. **#38** — Cross-reference update (trivial, do with transit and user spec
- fixes).
+After all eight findings are resolved:
-Items within the same group are independent and can be done in parallel.
+1. **Update AUDIT.md** — mark #39, #40, #48, #49, #61, #62, #68, #69 as
+ RESOLVED with resolution summaries.
+2. **Run the full pipeline**: `make all` (vet, lint, test, build).
+3. **Run race detector**: `go test -race ./...`
+4. **Address related medium findings** that interact with these fixes:
+ - #54 (SSH CA default-allow) and #58 (transit default-allow) — once
+ #69 is fixed, the typed handlers will pass policy checkers to the
+ engines, but the engines still default-allow when `CheckPolicy`
+ returns no match. Consider changing the engine-level default to deny
+ for non-admin callers.
+ - #72 (policy ID path traversal) — already covered by #48's
+ `ValidateName` fix on `CreateRule`.
diff --git a/engines/webui.md b/engines/webui.md
new file mode 100644
index 0000000..7c6574b
--- /dev/null
+++ b/engines/webui.md
@@ -0,0 +1,844 @@
+# Web UI Implementation Plan: SSH CA, Transit, and User Engines
+
+## Overview
+
+Three engines (SSH CA, Transit, User) are fully implemented at the core,
+gRPC, and REST layers but have no web UI. This plan adds browser-based
+management for each, following the patterns established by the PKI engine UI.
+
+## Architecture
+
+The web UI is served by `metacrypt-web`, a separate binary that talks to the
+API server over gRPC. All data access flows through the gRPC client — the web
+server has no direct database or barrier access. Authorization is enforced by
+the API server; the web UI only controls visibility (e.g. hiding admin-only
+forms from non-admin users).
+
+### Existing Patterns (from PKI reference)
+
+| Concern | Pattern |
+|---------|---------|
+| Template composition | `layout.html` defines `"layout"` block; page templates define `"title"` and `"content"` blocks |
+| Template rendering | `renderTemplate(w, "page.html", data)` — parses `layout.html` + page template, injects CSRF func |
+| gRPC calls | `ws.vault.Method(ctx, token)` via `vaultBackend` interface; token from cookie |
+| CSRF | Signed double-submit cookie; `{{csrfField}}` in every form |
+| Error display | `{{if .Error}} {{.Error}} {{end}}` at top of content |
+| Success display | `...` inline after action |
+| Mount discovery | `findCAMount()` pattern — iterate `ListMounts()`, match on `.Type` |
+| Tables | `.table-wrapper` > `<table>` with `<thead>`/`<tbody>` |
+| Detail views | `.card` with `.card-title` + `.kv-table` for metadata |
+| Admin actions | `{{if .IsAdmin}}` guards around admin-only cards |
+| Forms | `.form-row` > `.form-group` > `