# Remediation Plan **Date**: 2026-03-16 **Scope**: Audit findings #25–#38 from engine design review This document provides a concrete remediation plan for each open finding. Items are grouped by priority and ordered for efficient implementation (dependencies first). --- ## Critical ### #37 — `adminOnlyOperations` name collision blocks user `rotate-key` **Problem**: The `adminOnlyOperations` map in `handleEngineRequest` (`internal/server/routes.go:265`) is a flat `map[string]bool` keyed by operation name. The transit engine's `rotate-key` is admin-only, but the user engine's `rotate-key` is user-self. Since the map is checked before engine dispatch, non-admin users are blocked from calling `rotate-key` on any engine mount — including user engine mounts where it should be allowed. **Fix**: Replace the flat map with an engine-type-qualified lookup. Two options: **Option A — Qualify the map key** (minimal change): Change the map type to include the engine type prefix: ```go var adminOnlyOperations = map[string]bool{ "ca:import-root": true, "ca:create-issuer": true, "ca:delete-issuer": true, "ca:revoke-cert": true, "ca:delete-cert": true, "transit:create-key": true, "transit:delete-key": true, "transit:rotate-key": true, "transit:update-key-config": true, "transit:trim-key": true, "sshca:create-profile": true, "sshca:update-profile": true, "sshca:delete-profile": true, "sshca:revoke-cert": true, "sshca:delete-cert": true, "user:provision": true, "user:delete-user": true, } ``` In `handleEngineRequest`, look up `engineType + ":" + operation` instead of just `operation`. The `engineType` is already known from the mount registry (the generic endpoint resolves the mount to an engine type). **Option B — Per-engine admin operations** (cleaner but more code): Each engine implements an `AdminOperations() []string` method. The server queries the resolved engine for its admin operations instead of using a global map. **Recommendation**: Option A. It requires a one-line change to the lookup and a mechanical update to the map keys. The generic endpoint already resolves the mount to get the engine type. **Files to change**: - `internal/server/routes.go` — update map and lookup in `handleEngineRequest` - `engines/sshca.md` — update `adminOnlyOperations` section - `engines/transit.md` — update `adminOnlyOperations` section - `engines/user.md` — update `adminOnlyOperations` section **Tests**: Add test case in `internal/server/server_test.go` — non-admin user calling `rotate-key` via generic endpoint on a user engine mount should succeed (policy permitting). Same call on a transit mount should return 403. --- ## High ### #28 — HMAC output not versioned **Problem**: HMAC output is raw base64 with no key version indicator. After key rotation and `min_decryption_version` advancement, old HMACs are unverifiable because the engine doesn't know which key version produced them. **Fix**: Use the same versioned prefix format as ciphertext and signatures: ``` metacrypt:v{version}:{base64(mac_bytes)} ``` Update the `hmac` operation to include `key_version` in the response. Update internal HMAC verification to parse the version prefix and select the corresponding key version (subject to `min_decryption_version` enforcement). **Files to change**: - `engines/transit.md` — update HMAC section, add HMAC output format, update Cryptographic Details section - Implementation: `internal/engine/transit/sign.go` (when implemented) ### #30 — `max_key_versions` vs `min_decryption_version` unclear **Problem**: The spec doesn't define when `max_key_versions` pruning happens or whether it respects `min_decryption_version`. Auto-pruning on rotation could destroy versions that still have unrewrapped ciphertext. **Fix**: Define the behavior explicitly in `engines/transit.md`: 1. `max_key_versions` pruning happens during `rotate-key`, after the new version is created. 2. Pruning **only** deletes versions **strictly less than** `min_decryption_version`. If `max_key_versions` would require deleting a version at or above `min_decryption_version`, the version is **retained** and a warning is included in the response: `"warning": "max_key_versions exceeded; advance min_decryption_version to enable pruning"`. 3. This means `max_key_versions` is a soft limit — it is only enforceable after the operator completes the rotation cycle (rotate → rewrap → advance min → prune happens automatically on next rotate). This resolves the original audit finding #16 as well. **Files to change**: - `engines/transit.md` — add `max_key_versions` behavior to Key Rotation section and `rotate-key` flow - `AUDIT.md` — mark #16 as RESOLVED with reference to the new behavior ### #33 — Auto-provision creates keys for arbitrary usernames **Problem**: The encrypt flow auto-provisions recipients without validating that the username exists in MCIAS. Any authenticated user can create barrier entries for non-existent users. **Fix**: Before auto-provisioning, validate the recipient username against MCIAS. The engine has access to the auth system via `req.CallerInfo` context. Add an MCIAS user lookup: 1. Add a `ValidateUsername(username string) (bool, error)` method to the auth client interface. This calls the MCIAS user info endpoint to check if the username exists. 2. In the encrypt flow, before auto-provisioning a recipient, call `ValidateUsername`. If the user doesn't exist in MCIAS, return an error: `"recipient not found: {username}"`. 3. Document this validation in the encrypt flow and security considerations. **Alternative** (simpler, weaker): Skip MCIAS validation but add a rate limit on auto-provisioning (e.g., max 10 new provisions per encrypt request, max 100 total auto-provisions per hour per caller). This prevents storage inflation but doesn't prevent phantom users. **Recommendation**: MCIAS validation. It's the correct security boundary — only real MCIAS users should have keypairs. **Files to change**: - `engines/user.md` — update encrypt flow step 2, add MCIAS validation - `internal/auth/` — add `ValidateUsername` to auth client (when implemented) --- ## Medium ### #25 — Missing `list-certs` REST route (SSH CA) **Fix**: Add to the REST endpoints table: ``` | GET | `/v1/sshca/{mount}/certs` | List cert records | ``` Add to the route registration code block: ```go r.Get("/v1/sshca/{mount}/certs", s.requireAuth(s.handleSSHCAListCerts)) ``` **Files to change**: `engines/sshca.md` ### #26 — KRL section type description error **Fix**: Change the description block from: ``` Section type: KRL_SECTION_CERT_SERIAL_LIST (0x21) ``` to: ``` Section type: KRL_SECTION_CERTIFICATES (0x01) CA key blob: ssh.MarshalAuthorizedKey(caSigner.PublicKey()) Subsection type: KRL_SECTION_CERT_SERIAL_LIST (0x20) ``` This matches the pseudocode comments and the OpenSSH `PROTOCOL.krl` spec. **Files to change**: `engines/sshca.md` ### #27 — Policy check after cert construction (SSH CA) **Fix**: Reorder the sign-host flow steps: 1. Authenticate caller. 2. Parse the supplied SSH public key. 3. Parse TTL. 4. **Policy check**: for each hostname, check policy on `sshca/{mount}/id/{hostname}`, action `sign`. 5. Generate serial (only after policy passes). 6. Build `ssh.Certificate`. 7. Sign, store, return. Same reordering for sign-user. **Files to change**: `engines/sshca.md` ### #29 — `rewrap` policy action not specified **Fix**: Add `rewrap` as an explicit action in the `operationAction` mapping. `rewrap` maps to `decrypt` (since it requires internal access to plaintext). Batch variants map to the same action. Add to the authorization section in `engines/transit.md`: > The `rewrap` and `batch-rewrap` operations require the `decrypt` action — > rewrap internally decrypts with the old version and re-encrypts with the > latest, so the caller must have decrypt permission. Alternatively, a > dedicated `rewrap` action could be added for finer-grained control, but > `decrypt` is the safer default (granting `rewrap` without `decrypt` would be > odd since rewrap implies decrypt capability). **Recommendation**: Map to `decrypt`. Simpler, and anyone who should rewrap should also be able to decrypt. **Files to change**: `engines/transit.md` ### #31 — Missing `get-public-key` REST route (Transit) **Fix**: Add to the REST endpoints table: ``` | GET | `/v1/transit/{mount}/keys/{name}/public-key` | Get public key | ``` Add to the route registration code block: ```go r.Get("/v1/transit/{mount}/keys/{name}/public-key", s.requireAuth(s.handleTransitGetPublicKey)) ``` **Files to change**: `engines/transit.md` ### #34 — No recipient limit on encrypt (User) **Fix**: Add a compile-time constant `maxRecipients = 100` to the user engine. Reject requests exceeding this limit with `400 Bad Request` / `InvalidArgument` before any ECDH computation. Add to the encrypt flow in `engines/user.md` after step 1: > Validate that `len(recipients) <= maxRecipients` (100). Reject with error if > exceeded. Add to the security considerations section. **Files to change**: `engines/user.md` --- ## Low ### #32 — `exportable` flag with no export operation (Transit) **Fix**: Add an `export-key` operation to the transit engine: - Auth: User+Policy (action `read`). - Only succeeds if the key's `exportable` flag is `true`. - Returns raw key material (base64-encoded) for the current version only. - Asymmetric keys: returns private key in PKCS8 PEM. - Symmetric keys: returns raw key bytes, base64-encoded. - Add to HandleRequest dispatch, gRPC service, REST endpoints. Alternatively, if key export is never intended, remove the `exportable` flag from `create-key` to avoid dead code. Given that transit is meant to keep keys server-side, **removing the flag** may be the better choice. Document the decision either way. **Recommendation**: Remove `exportable`. Transit's entire value proposition is that keys never leave the service. If export is needed for migration, a dedicated admin-only `export-key` can be added later with appropriate audit logging (#7). **Files to change**: `engines/transit.md` ### #35 — No re-encryption support for user key rotation **Fix**: Add a `re-encrypt` operation: - Auth: User (self) — only the envelope recipient can re-encrypt. - Input: old envelope. - Flow: decrypt with current key, generate new DEK, re-encrypt, return new envelope. - The old key must still be valid at the time of re-encryption. Document the workflow: re-encrypt all stored envelopes, then rotate-key. This is a quality-of-life improvement, not a security fix. The current design (decrypt + encrypt separately) works but requires the caller to handle plaintext. **Files to change**: `engines/user.md` ### #36 — `UserKeyConfig` type undefined **Fix**: Add the type definition to the in-memory state section: ```go type UserKeyConfig struct { Algorithm string `json:"algorithm"` // key exchange algorithm used CreatedAt time.Time `json:"created_at"` AutoProvisioned bool `json:"auto_provisioned"` // created via auto-provision } ``` **Files to change**: `engines/user.md` ### #38 — `ZeroizeKey` prerequisite not cross-referenced **Fix**: Add to the Implementation Steps section in both `engines/transit.md` and `engines/user.md`: > **Prerequisite**: `engine.ZeroizeKey` must exist in > `internal/engine/helpers.go` (created as part of the SSH CA engine > implementation — see `engines/sshca.md` step 1). **Files to change**: `engines/transit.md`, `engines/user.md` --- ## Implementation Order The remediation items should be implemented in this order to respect dependencies: 1. **#37** — `adminOnlyOperations` qualification (critical, blocks user engine `rotate-key`). This is a code change to `internal/server/routes.go` plus spec updates. Do first because it affects all engine implementations. 2. **#28, #29, #30, #31, #32** — Transit spec fixes (can be done as a single spec update pass). 3. **#25, #26, #27** — SSH CA spec fixes (single spec update pass). 4. **#33, #34, #35, #36** — User spec fixes (single spec update pass). 5. **#38** — Cross-reference update (trivial, do with transit and user spec fixes). Items within the same group are independent and can be done in parallel.