Remediation Plan
Date: 2026-03-16
Scope: Audit findings #25–#38 from engine design review
This document provides a concrete remediation plan for each open finding. Items are grouped by priority and ordered for efficient implementation (dependencies first).
Critical
#37 — adminOnlyOperations name collision blocks user rotate-key
Problem: The adminOnlyOperations map in handleEngineRequest
(internal/server/routes.go:265) is a flat map[string]bool keyed by
operation name. The transit engine's rotate-key is admin-only, but the user
engine's rotate-key is user-self. Since the map is checked before engine
dispatch, non-admin users are blocked from calling rotate-key on any engine
mount — including user engine mounts where it should be allowed.
Fix: Replace the flat map with an engine-type-qualified lookup. Two options:
Option A — Qualify the map key (minimal change):
Change the map type to include the engine type prefix:
var adminOnlyOperations = map[string]bool{
	"ca:import-root":            true,
	"ca:create-issuer":          true,
	"ca:delete-issuer":          true,
	"ca:revoke-cert":            true,
	"ca:delete-cert":            true,
	"transit:create-key":        true,
	"transit:delete-key":        true,
	"transit:rotate-key":        true,
	"transit:update-key-config": true,
	"transit:trim-key":          true,
	"sshca:create-profile":      true,
	"sshca:update-profile":      true,
	"sshca:delete-profile":      true,
	"sshca:revoke-cert":         true,
	"sshca:delete-cert":         true,
	"user:provision":            true,
	"user:delete-user":          true,
}
In handleEngineRequest, look up engineType + ":" + operation instead of
just operation. The engineType is already known from the mount registry
(the generic endpoint resolves the mount to an engine type).
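The qualified lookup described for Option A can be sketched as follows. This is a minimal illustration, not the actual handler code: the helper name isAdminOnly is hypothetical, and only a few map entries are repeated to keep the example short.

```go
package main

import "fmt"

// adminOnlyOperations is keyed by "<engineType>:<operation>".
// Only a few entries from the full map are repeated here.
var adminOnlyOperations = map[string]bool{
	"transit:rotate-key": true,
	"user:provision":     true,
	"user:delete-user":   true,
}

// isAdminOnly is a hypothetical helper: it builds the qualified key and
// consults the map. Keys that are absent yield false (the map's zero value).
func isAdminOnly(engineType, operation string) bool {
	return adminOnlyOperations[engineType+":"+operation]
}

func main() {
	fmt.Println(isAdminOnly("transit", "rotate-key")) // true: admin-only on transit
	fmt.Println(isAdminOnly("user", "rotate-key"))    // false: user-self on the user engine
}
```

Because the key includes the engine type, the same operation name can carry different privilege requirements on different engines without any further branching in the handler.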
Option B — Per-engine admin operations (cleaner but more code):
Each engine implements an AdminOperations() []string method. The server
queries the resolved engine for its admin operations instead of using a global
map.
Recommendation: Option A. It requires a one-line change to the lookup and a mechanical update to the map keys. The generic endpoint already resolves the mount to get the engine type.
Files to change:
- internal/server/routes.go: update map and lookup in handleEngineRequest
- engines/sshca.md: update adminOnlyOperations section
- engines/transit.md: update adminOnlyOperations section
- engines/user.md: update adminOnlyOperations section
Tests: Add test case in internal/server/server_test.go — non-admin user
calling rotate-key via generic endpoint on a user engine mount should succeed
(policy permitting). Same call on a transit mount should return 403.
High
#28 — HMAC output not versioned
Problem: HMAC output is raw base64 with no key version indicator. After key
rotation and min_decryption_version advancement, old HMACs are unverifiable
because the engine doesn't know which key version produced them.
Fix: Use the same versioned prefix format as ciphertext and signatures:
metacrypt:v{version}:{base64(mac_bytes)}
Update the hmac operation to include key_version in the response. Update
internal HMAC verification to parse the version prefix and select the
corresponding key version (subject to min_decryption_version enforcement).
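A rough sketch of formatting and parsing the versioned HMAC string is shown below. The function names formatHMAC and parseHMAC, the error messages, and the choice of standard (padded) base64 are assumptions made for illustration. The spec only fixes the metacrypt:v{version}:{base64(mac_bytes)} layout.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

const versionedPrefix = "metacrypt:v"

// formatHMAC renders mac as metacrypt:v{version}:{base64(mac_bytes)}.
func formatHMAC(version int, mac []byte) string {
	return versionedPrefix + strconv.Itoa(version) + ":" + base64.StdEncoding.EncodeToString(mac)
}

// parseHMAC splits a versioned HMAC string into its key version and raw MAC
// bytes, rejecting versions below minDecryptionVersion.
func parseHMAC(s string, minDecryptionVersion int) (int, []byte, error) {
	if !strings.HasPrefix(s, versionedPrefix) {
		return 0, nil, errors.New("hmac: missing version prefix")
	}
	parts := strings.SplitN(strings.TrimPrefix(s, versionedPrefix), ":", 2)
	if len(parts) != 2 {
		return 0, nil, errors.New("hmac: malformed value")
	}
	version, err := strconv.Atoi(parts[0])
	if err != nil || version < 1 {
		return 0, nil, errors.New("hmac: invalid key version")
	}
	if version < minDecryptionVersion {
		return 0, nil, fmt.Errorf("hmac: key version %d is below min_decryption_version %d",
			version, minDecryptionVersion)
	}
	mac, err := base64.StdEncoding.DecodeString(parts[1])
	if err != nil {
		return 0, nil, fmt.Errorf("hmac: decode: %w", err)
	}
	return version, mac, nil
}

func main() {
	s := formatHMAC(3, []byte{0xde, 0xad, 0xbe, 0xef})
	fmt.Println(s) // metacrypt:v3:3q2+7w==
	version, mac, err := parseHMAC(s, 2)
	fmt.Println(version, mac, err)
}
```

After parsing, verification would recompute the MAC with the selected key version and compare the two values in constant time, for example with hmac.Equal from crypto/hmac.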
Files to change:
- engines/transit.md: update HMAC section, add HMAC output format, update Cryptographic Details section
- Implementation: internal/engine/transit/sign.go (when implemented)
#30 — max_key_versions vs min_decryption_version unclear
Problem: The spec doesn't define when max_key_versions pruning happens or
whether it respects min_decryption_version. Auto-pruning on rotation could
destroy versions that still have unrewrapped ciphertext.
Fix: Define the behavior explicitly in engines/transit.md:
- max_key_versions pruning happens during rotate-key, after the new version is created.
- Pruning only deletes versions strictly less than min_decryption_version. If max_key_versions would require deleting a version at or above min_decryption_version, the version is retained and a warning is included in the response: "warning": "max_key_versions exceeded; advance min_decryption_version to enable pruning".
- This means max_key_versions is a soft limit: it is only enforceable after the operator completes the rotation cycle (rotate → rewrap → advance min → prune happens automatically on next rotate).
This resolves the original audit finding #16 as well.
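The pruning rule can be illustrated with a small sketch. It assumes that pruning removes only as many versions as needed to get back under max_key_versions, oldest first, and that a value of 0 means "no limit". Neither detail is stated above, and the function name pruneVersions is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// pruneVersions decides which key versions to keep and which to delete after
// a rotation. Only versions strictly below minDecryptionVersion may be pruned.
// warn is true when the limit is still exceeded after pruning.
func pruneVersions(versions []int, maxKeyVersions, minDecryptionVersion int) (keep, pruned []int, warn bool) {
	sorted := append([]int(nil), versions...) // copy so the caller's slice is untouched
	sort.Ints(sorted)

	excess := 0
	if maxKeyVersions > 0 { // assumption: 0 means unlimited
		excess = len(sorted) - maxKeyVersions
	}
	for _, v := range sorted {
		if excess > 0 && v < minDecryptionVersion {
			pruned = append(pruned, v)
			excess--
			continue
		}
		keep = append(keep, v)
	}
	return keep, pruned, excess > 0
}

func main() {
	// Limit of 3 with min_decryption_version = 2: only v1 is prunable, so the
	// limit stays exceeded and the response would carry the warning.
	fmt.Println(pruneVersions([]int{1, 2, 3, 4, 5}, 3, 2)) // [2 3 4 5] [1] true

	// After advancing min_decryption_version to 4, the limit can be met.
	fmt.Println(pruneVersions([]int{1, 2, 3, 4, 5}, 3, 4)) // [3 4 5] [1 2] false
}
```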
Files to change:
- engines/transit.md: add max_key_versions behavior to Key Rotation section and rotate-key flow
- AUDIT.md: mark #16 as RESOLVED with reference to the new behavior
#33 — Auto-provision creates keys for arbitrary usernames
Problem: The encrypt flow auto-provisions recipients without validating that the username exists in MCIAS. Any authenticated user can create barrier entries for non-existent users.
Fix: Before auto-provisioning, validate the recipient username against
MCIAS. The engine has access to the auth system via req.CallerInfo context.
Add an MCIAS user lookup:
- Add a ValidateUsername(username string) (bool, error) method to the auth client interface. This calls the MCIAS user info endpoint to check if the username exists.
- In the encrypt flow, before auto-provisioning a recipient, call ValidateUsername. If the user doesn't exist in MCIAS, return an error: "recipient not found: {username}".
- Document this validation in the encrypt flow and security considerations.
Alternative (simpler, weaker): Skip MCIAS validation but add a rate limit on auto-provisioning (e.g., max 10 new provisions per encrypt request, max 100 total auto-provisions per hour per caller). This prevents storage inflation but doesn't prevent phantom users.
Recommendation: MCIAS validation. It's the correct security boundary — only real MCIAS users should have keypairs.
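As a sketch of the recommended validation, the check could sit in front of auto-provisioning roughly like this. Only the ValidateUsername signature comes from the plan. The interface name UsernameValidator, the in-memory staticDirectory stand-in for the MCIAS client, the hasKey lookup, and checkRecipients are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// UsernameValidator is the part of the auth client interface proposed above.
type UsernameValidator interface {
	ValidateUsername(username string) (bool, error)
}

// staticDirectory is an in-memory stand-in for the MCIAS-backed client.
type staticDirectory map[string]bool

func (d staticDirectory) ValidateUsername(username string) (bool, error) {
	return d[username], nil
}

var errRecipientNotFound = errors.New("recipient not found")

// checkRecipients validates every recipient that would need auto-provisioning.
// Recipients that already have a keypair (hasKey) are skipped.
func checkRecipients(v UsernameValidator, hasKey map[string]bool, recipients []string) error {
	for _, r := range recipients {
		if hasKey[r] {
			continue // already provisioned, no lookup needed
		}
		ok, err := v.ValidateUsername(r)
		if err != nil {
			return fmt.Errorf("validate recipient %s: %w", r, err)
		}
		if !ok {
			return fmt.Errorf("%w: %s", errRecipientNotFound, r)
		}
	}
	return nil
}

func main() {
	mcias := staticDirectory{"alice": true, "bob": true}
	hasKey := map[string]bool{"alice": true}
	fmt.Println(checkRecipients(mcias, hasKey, []string{"alice", "bob"}))     // <nil>
	fmt.Println(checkRecipients(mcias, hasKey, []string{"alice", "mallory"})) // recipient not found: mallory
}
```

Running the check for all recipients before provisioning any of them means a request with one unknown recipient creates no barrier entries at all.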
Files to change:
- engines/user.md: update encrypt flow step 2, add MCIAS validation
- internal/auth/: add ValidateUsername to auth client (when implemented)
Medium
#25 — Missing list-certs REST route (SSH CA)
Fix: Add to the REST endpoints table:
| GET | `/v1/sshca/{mount}/certs` | List cert records |
Add to the route registration code block:
r.Get("/v1/sshca/{mount}/certs", s.requireAuth(s.handleSSHCAListCerts))
Files to change: engines/sshca.md
#26 — KRL section type description error
Fix: Change the description block from:
Section type: KRL_SECTION_CERT_SERIAL_LIST (0x21)
to:
Section type: KRL_SECTION_CERTIFICATES (0x01)
CA key blob: ssh.MarshalAuthorizedKey(caSigner.PublicKey())
Subsection type: KRL_SECTION_CERT_SERIAL_LIST (0x20)
This matches the pseudocode comments and the OpenSSH PROTOCOL.krl spec.
Files to change: engines/sshca.md
#27 — Policy check after cert construction (SSH CA)
Fix: Reorder the sign-host flow steps:
- Authenticate caller.
- Parse the supplied SSH public key.
- Parse TTL.
- Policy check: for each hostname, check policy on sshca/{mount}/id/{hostname}, action sign.
- Generate serial (only after policy passes).
- Build ssh.Certificate.
- Sign, store, return.
Same reordering for sign-user.
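The effect of the reordering can be sketched in isolation. In the example below the policyCheck type, the authorizeThenAllocate helper, and the counter-based serial source are hypothetical stand-ins. The real engine's policy interface and serial generation are not shown in this plan.

```go
package main

import "fmt"

// policyCheck reports whether the caller may perform action on resource.
type policyCheck func(resource, action string) bool

// authorizeThenAllocate checks policy for every hostname before any serial is
// generated, so a denied request consumes no serial and builds no certificate.
func authorizeThenAllocate(mount string, hostnames []string, allowed policyCheck, nextSerial func() uint64) (uint64, error) {
	for _, h := range hostnames {
		resource := "sshca/" + mount + "/id/" + h
		if !allowed(resource, "sign") {
			return 0, fmt.Errorf("policy denied: %s", resource)
		}
	}
	return nextSerial(), nil
}

func main() {
	var issued uint64
	nextSerial := func() uint64 { issued++; return issued } // stand-in serial source
	allowOnlyWeb := func(resource, action string) bool {
		return resource == "sshca/prod/id/web01" && action == "sign"
	}

	_, err := authorizeThenAllocate("prod", []string{"web01", "db01"}, allowOnlyWeb, nextSerial)
	fmt.Println(err, issued) // denied on db01, no serial consumed

	serial, err := authorizeThenAllocate("prod", []string{"web01"}, allowOnlyWeb, nextSerial)
	fmt.Println(serial, err)
}
```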
Files to change: engines/sshca.md
#29 — rewrap policy action not specified
Fix: Add rewrap as an explicit action in the operationAction mapping.
rewrap maps to decrypt (since it requires internal access to plaintext).
Batch variants map to the same action.
Add to the authorization section in engines/transit.md:
The rewrap and batch-rewrap operations require the decrypt action: rewrap internally decrypts with the old version and re-encrypts with the latest, so the caller must have decrypt permission. Alternatively, a dedicated rewrap action could be added for finer-grained control, but decrypt is the safer default (granting rewrap without decrypt would be odd since rewrap implies decrypt capability).
Recommendation: Map to decrypt. Simpler, and anyone who should rewrap
should also be able to decrypt.
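A minimal sketch of the recommended mapping follows. It assumes operationAction is a map from operation name to policy action, which is a guess at its shape. Only the rewrap and batch-rewrap entries are taken from this plan, and actionFor is a hypothetical helper.

```go
package main

import "fmt"

// operationAction maps an operation name to the policy action it requires.
// The map shape is an assumption; the rewrap entries follow the recommendation.
var operationAction = map[string]string{
	"decrypt":      "decrypt",
	"rewrap":       "decrypt", // rewrap decrypts internally
	"batch-rewrap": "decrypt", // batch variant maps to the same action
}

// actionFor returns the policy action for an operation and whether one is defined.
func actionFor(operation string) (string, bool) {
	action, ok := operationAction[operation]
	return action, ok
}

func main() {
	fmt.Println(actionFor("rewrap"))       // decrypt true
	fmt.Println(actionFor("batch-rewrap")) // decrypt true
}
```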
Files to change: engines/transit.md
#31 — Missing get-public-key REST route (Transit)
Fix: Add to the REST endpoints table:
| GET | `/v1/transit/{mount}/keys/{name}/public-key` | Get public key |
Add to the route registration code block:
r.Get("/v1/transit/{mount}/keys/{name}/public-key", s.requireAuth(s.handleTransitGetPublicKey))
Files to change: engines/transit.md
#34 — No recipient limit on encrypt (User)
Fix: Add a compile-time constant maxRecipients = 100 to the user engine.
Reject requests exceeding this limit with 400 Bad Request / InvalidArgument
before any ECDH computation.
Add to the encrypt flow in engines/user.md after step 1:
Validate that len(recipients) <= maxRecipients (100). Reject with an error if exceeded.
Add to the security considerations section.
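The limit check itself is small. A sketch, assuming a helper named validateRecipientCount and an illustrative error message (neither is part of the spec), might look like this:

```go
package main

import (
	"errors"
	"fmt"
)

// maxRecipients is the compile-time limit proposed above.
const maxRecipients = 100

var errTooManyRecipients = errors.New("too many recipients")

// validateRecipientCount rejects oversized requests before any ECDH work is done.
func validateRecipientCount(recipients []string) error {
	if len(recipients) > maxRecipients {
		return fmt.Errorf("%w: got %d, limit %d", errTooManyRecipients, len(recipients), maxRecipients)
	}
	return nil
}

func main() {
	fmt.Println(validateRecipientCount(make([]string, 100))) // <nil>
	fmt.Println(validateRecipientCount(make([]string, 101))) // too many recipients: got 101, limit 100
}
```

The handler layer would then translate this error into 400 Bad Request on REST or InvalidArgument on gRPC.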
Files to change: engines/user.md
Low
#32 — exportable flag with no export operation (Transit)
Fix: Add an export-key operation to the transit engine:
- Auth: User+Policy (action read).
- Only succeeds if the key's exportable flag is true.
- Returns raw key material (base64-encoded) for the current version only.
- Asymmetric keys: returns private key in PKCS8 PEM.
- Symmetric keys: returns raw key bytes, base64-encoded.
- Add to HandleRequest dispatch, gRPC service, REST endpoints.
Alternatively, if key export is never intended, remove the exportable flag
from create-key to avoid dead code. Given that transit is meant to keep keys
server-side, removing the flag may be the better choice. Document the
decision either way.
Recommendation: Remove exportable. Transit's entire value proposition is
that keys never leave the service. If export is needed for migration, a
dedicated admin-only export-key can be added later with appropriate audit
logging (#7).
Files to change: engines/transit.md
#35 — No re-encryption support for user key rotation
Fix: Add a re-encrypt operation:
- Auth: User (self) — only the envelope recipient can re-encrypt.
- Input: old envelope.
- Flow: decrypt with current key, generate new DEK, re-encrypt, return new envelope.
- The old key must still be valid at the time of re-encryption. Document the workflow: re-encrypt all stored envelopes, then rotate-key.
This is a quality-of-life improvement, not a security fix. The current design (decrypt + encrypt separately) works but requires the caller to handle plaintext.
Files to change: engines/user.md
#36 — UserKeyConfig type undefined
Fix: Add the type definition to the in-memory state section:
type UserKeyConfig struct {
	Algorithm       string    `json:"algorithm"` // key exchange algorithm used
	CreatedAt       time.Time `json:"created_at"`
	AutoProvisioned bool      `json:"auto_provisioned"` // created via auto-provision
}
Files to change: engines/user.md
#38 — ZeroizeKey prerequisite not cross-referenced
Fix: Add to the Implementation Steps section in both engines/transit.md
and engines/user.md:
Prerequisite: engine.ZeroizeKey must exist in internal/engine/helpers.go (created as part of the SSH CA engine implementation; see engines/sshca.md step 1).
Files to change: engines/transit.md, engines/user.md
Implementation Order
The remediation items should be implemented in this order to respect dependencies:
- #37: adminOnlyOperations qualification (critical, blocks user engine rotate-key). This is a code change to internal/server/routes.go plus spec updates. Do first because it affects all engine implementations.
- #28, #29, #30, #31, #32: Transit spec fixes (can be done as a single spec update pass).
- #25, #26, #27: SSH CA spec fixes (single spec update pass).
- #33, #34, #35, #36: User spec fixes (single spec update pass).
- #38: Cross-reference update (trivial, do with transit and user spec fixes).
Items within the same group are independent and can be done in parallel.