Merge transit engine branch, resolve conflicts in shared files
This commit is contained in:
354
REMEDIATION.md
Normal file
354
REMEDIATION.md
Normal file
@@ -0,0 +1,354 @@
|
||||
# Remediation Plan
|
||||
|
||||
**Date**: 2026-03-16
|
||||
**Scope**: Audit findings #25–#38 from engine design review
|
||||
|
||||
This document provides a concrete remediation plan for each open finding. Items
|
||||
are grouped by priority and ordered for efficient implementation (dependencies
|
||||
first).
|
||||
|
||||
---
|
||||
|
||||
## Critical
|
||||
|
||||
### #37 — `adminOnlyOperations` name collision blocks user `rotate-key`
|
||||
|
||||
**Problem**: The `adminOnlyOperations` map in `handleEngineRequest`
|
||||
(`internal/server/routes.go:265`) is a flat `map[string]bool` keyed by
|
||||
operation name. The transit engine's `rotate-key` is admin-only, but the user
|
||||
engine's `rotate-key` is user-self. Since the map is checked before engine
|
||||
dispatch, non-admin users are blocked from calling `rotate-key` on any engine
|
||||
mount — including user engine mounts where it should be allowed.
|
||||
|
||||
**Fix**: Replace the flat map with an engine-type-qualified lookup. Two options:
|
||||
|
||||
**Option A — Qualify the map key** (minimal change):
|
||||
|
||||
Change the map type to include the engine type prefix:
|
||||
|
||||
```go
|
||||
var adminOnlyOperations = map[string]bool{
|
||||
"ca:import-root": true,
|
||||
"ca:create-issuer": true,
|
||||
"ca:delete-issuer": true,
|
||||
"ca:revoke-cert": true,
|
||||
"ca:delete-cert": true,
|
||||
"transit:create-key": true,
|
||||
"transit:delete-key": true,
|
||||
"transit:rotate-key": true,
|
||||
"transit:update-key-config": true,
|
||||
"transit:trim-key": true,
|
||||
"sshca:create-profile": true,
|
||||
"sshca:update-profile": true,
|
||||
"sshca:delete-profile": true,
|
||||
"sshca:revoke-cert": true,
|
||||
"sshca:delete-cert": true,
|
||||
"user:provision": true,
|
||||
"user:delete-user": true,
|
||||
}
|
||||
```
|
||||
|
||||
In `handleEngineRequest`, look up `engineType + ":" + operation` instead of
|
||||
just `operation`. The `engineType` is already known from the mount registry
|
||||
(the generic endpoint resolves the mount to an engine type).
|
||||
|
||||
**Option B — Per-engine admin operations** (cleaner but more code):
|
||||
|
||||
Each engine implements an `AdminOperations() []string` method. The server
|
||||
queries the resolved engine for its admin operations instead of using a global
|
||||
map.
|
||||
|
||||
**Recommendation**: Option A. It requires a one-line change to the lookup and
|
||||
a mechanical update to the map keys. The generic endpoint already resolves the
|
||||
mount to get the engine type.
|
||||
|
||||
**Files to change**:
|
||||
- `internal/server/routes.go` — update map and lookup in `handleEngineRequest`
|
||||
- `engines/sshca.md` — update `adminOnlyOperations` section
|
||||
- `engines/transit.md` — update `adminOnlyOperations` section
|
||||
- `engines/user.md` — update `adminOnlyOperations` section
|
||||
|
||||
**Tests**: Add test case in `internal/server/server_test.go` — non-admin user
|
||||
calling `rotate-key` via generic endpoint on a user engine mount should succeed
|
||||
(policy permitting). Same call on a transit mount should return 403.
|
||||
|
||||
---
|
||||
|
||||
## High
|
||||
|
||||
### #28 — HMAC output not versioned
|
||||
|
||||
**Problem**: HMAC output is raw base64 with no key version indicator. After key
|
||||
rotation and `min_decryption_version` advancement, old HMACs are unverifiable
|
||||
because the engine doesn't know which key version produced them.
|
||||
|
||||
**Fix**: Use the same versioned prefix format as ciphertext and signatures:
|
||||
|
||||
```
|
||||
metacrypt:v{version}:{base64(mac_bytes)}
|
||||
```
|
||||
|
||||
Update the `hmac` operation to include `key_version` in the response. Update
|
||||
internal HMAC verification to parse the version prefix and select the
|
||||
corresponding key version (subject to `min_decryption_version` enforcement).
|
||||
|
||||
**Files to change**:
|
||||
- `engines/transit.md` — update HMAC section, add HMAC output format, update
|
||||
Cryptographic Details section
|
||||
- Implementation: `internal/engine/transit/sign.go` (when implemented)
|
||||
|
||||
### #30 — `max_key_versions` vs `min_decryption_version` unclear
|
||||
|
||||
**Problem**: The spec doesn't define when `max_key_versions` pruning happens or
|
||||
whether it respects `min_decryption_version`. Auto-pruning on rotation could
|
||||
destroy versions that still have unrewrapped ciphertext.
|
||||
|
||||
**Fix**: Define the behavior explicitly in `engines/transit.md`:
|
||||
|
||||
1. `max_key_versions` pruning happens during `rotate-key`, after the new
|
||||
version is created.
|
||||
2. Pruning **only** deletes versions **strictly less than**
|
||||
`min_decryption_version`. If `max_key_versions` would require deleting a
|
||||
version at or above `min_decryption_version`, the version is **retained**
|
||||
and a warning is included in the response:
|
||||
`"warning": "max_key_versions exceeded; advance min_decryption_version to enable pruning"`.
|
||||
3. This means `max_key_versions` is a soft limit — it is only enforceable
|
||||
after the operator completes the rotation cycle (rotate → rewrap → advance
|
||||
min → prune happens automatically on next rotate).
|
||||
|
||||
This resolves the original audit finding #16 as well.
|
||||
|
||||
**Files to change**:
|
||||
- `engines/transit.md` — add `max_key_versions` behavior to Key Rotation
|
||||
section and `rotate-key` flow
|
||||
- `AUDIT.md` — mark #16 as RESOLVED with reference to the new behavior
|
||||
|
||||
### #33 — Auto-provision creates keys for arbitrary usernames
|
||||
|
||||
**Problem**: The encrypt flow auto-provisions recipients without validating
|
||||
that the username exists in MCIAS. Any authenticated user can create barrier
|
||||
entries for non-existent users.
|
||||
|
||||
**Fix**: Before auto-provisioning, validate the recipient username against
|
||||
MCIAS. The engine has access to the auth system via `req.CallerInfo` context.
|
||||
Add an MCIAS user lookup:
|
||||
|
||||
1. Add a `ValidateUsername(username string) (bool, error)` method to the auth
|
||||
client interface. This calls the MCIAS user info endpoint to check if the
|
||||
username exists.
|
||||
2. In the encrypt flow, before auto-provisioning a recipient, call
|
||||
`ValidateUsername`. If the user doesn't exist in MCIAS, return an error:
|
||||
`"recipient not found: {username}"`.
|
||||
3. Document this validation in the encrypt flow and security considerations.
|
||||
|
||||
**Alternative** (simpler, weaker): Skip MCIAS validation but add a
|
||||
rate limit on auto-provisioning (e.g., max 10 new provisions per encrypt
|
||||
request, max 100 total auto-provisions per hour per caller). This prevents
|
||||
storage inflation but doesn't prevent phantom users.
|
||||
|
||||
**Recommendation**: MCIAS validation. It's the correct security boundary —
|
||||
only real MCIAS users should have keypairs.
|
||||
|
||||
**Files to change**:
|
||||
- `engines/user.md` — update encrypt flow step 2, add MCIAS validation
|
||||
- `internal/auth/` — add `ValidateUsername` to auth client (when implemented)
|
||||
|
||||
---
|
||||
|
||||
## Medium
|
||||
|
||||
### #25 — Missing `list-certs` REST route (SSH CA)
|
||||
|
||||
**Fix**: Add to the REST endpoints table:
|
||||
|
||||
```
|
||||
| GET | `/v1/sshca/{mount}/certs` | List cert records |
|
||||
```
|
||||
|
||||
Add to the route registration code block:
|
||||
|
||||
```go
|
||||
r.Get("/v1/sshca/{mount}/certs", s.requireAuth(s.handleSSHCAListCerts))
|
||||
```
|
||||
|
||||
**Files to change**: `engines/sshca.md`
|
||||
|
||||
### #26 — KRL section type description error
|
||||
|
||||
**Fix**: Change the description block from:
|
||||
|
||||
```
|
||||
Section type: KRL_SECTION_CERT_SERIAL_LIST (0x21)
|
||||
```
|
||||
|
||||
to:
|
||||
|
||||
```
|
||||
Section type: KRL_SECTION_CERTIFICATES (0x01)
|
||||
CA key blob: ssh.MarshalAuthorizedKey(caSigner.PublicKey())
|
||||
Subsection type: KRL_SECTION_CERT_SERIAL_LIST (0x20)
|
||||
```
|
||||
|
||||
This matches the pseudocode comments and the OpenSSH `PROTOCOL.krl` spec.
|
||||
|
||||
**Files to change**: `engines/sshca.md`
|
||||
|
||||
### #27 — Policy check after cert construction (SSH CA)
|
||||
|
||||
**Fix**: Reorder the sign-host flow steps:
|
||||
|
||||
1. Authenticate caller.
|
||||
2. Parse the supplied SSH public key.
|
||||
3. Parse TTL.
|
||||
4. **Policy check**: for each hostname, check policy on
|
||||
`sshca/{mount}/id/{hostname}`, action `sign`.
|
||||
5. Generate serial (only after policy passes).
|
||||
6. Build `ssh.Certificate`.
|
||||
7. Sign, store, return.
|
||||
|
||||
Same reordering for sign-user.
|
||||
|
||||
**Files to change**: `engines/sshca.md`
|
||||
|
||||
### #29 — `rewrap` policy action not specified
|
||||
|
||||
**Fix**: Add `rewrap` as an explicit action in the `operationAction` mapping.
|
||||
`rewrap` maps to `decrypt` (since it requires internal access to plaintext).
|
||||
Batch variants map to the same action.
|
||||
|
||||
Add to the authorization section in `engines/transit.md`:
|
||||
|
||||
> The `rewrap` and `batch-rewrap` operations require the `decrypt` action —
|
||||
> rewrap internally decrypts with the old version and re-encrypts with the
|
||||
> latest, so the caller must have decrypt permission. Alternatively, a
|
||||
> dedicated `rewrap` action could be added for finer-grained control, but
|
||||
> `decrypt` is the safer default (granting `rewrap` without `decrypt` would be
|
||||
> odd since rewrap implies decrypt capability).
|
||||
|
||||
**Recommendation**: Map to `decrypt`. Simpler, and anyone who should rewrap
|
||||
should also be able to decrypt.
|
||||
|
||||
**Files to change**: `engines/transit.md`
|
||||
|
||||
### #31 — Missing `get-public-key` REST route (Transit)
|
||||
|
||||
**Fix**: Add to the REST endpoints table:
|
||||
|
||||
```
|
||||
| GET | `/v1/transit/{mount}/keys/{name}/public-key` | Get public key |
|
||||
```
|
||||
|
||||
Add to the route registration code block:
|
||||
|
||||
```go
|
||||
r.Get("/v1/transit/{mount}/keys/{name}/public-key", s.requireAuth(s.handleTransitGetPublicKey))
|
||||
```
|
||||
|
||||
**Files to change**: `engines/transit.md`
|
||||
|
||||
### #34 — No recipient limit on encrypt (User)
|
||||
|
||||
**Fix**: Add a compile-time constant `maxRecipients = 100` to the user engine.
|
||||
Reject requests exceeding this limit with `400 Bad Request` / `InvalidArgument`
|
||||
before any ECDH computation.
|
||||
|
||||
Add to the encrypt flow in `engines/user.md` after step 1:
|
||||
|
||||
> Validate that `len(recipients) <= maxRecipients` (100). Reject with error if
|
||||
> exceeded.
|
||||
|
||||
Add to the security considerations section.
|
||||
|
||||
**Files to change**: `engines/user.md`
|
||||
|
||||
---
|
||||
|
||||
## Low
|
||||
|
||||
### #32 — `exportable` flag with no export operation (Transit)
|
||||
|
||||
**Fix**: Add an `export-key` operation to the transit engine:
|
||||
|
||||
- Auth: User+Policy (action `read`).
|
||||
- Only succeeds if the key's `exportable` flag is `true`.
|
||||
- Returns raw key material (base64-encoded) for the current version only.
|
||||
- Asymmetric keys: returns private key in PKCS8 PEM.
|
||||
- Symmetric keys: returns raw key bytes, base64-encoded.
|
||||
- Add to HandleRequest dispatch, gRPC service, REST endpoints.
|
||||
|
||||
Alternatively, if key export is never intended, remove the `exportable` flag
|
||||
from `create-key` to avoid dead code. Given that transit is meant to keep keys
|
||||
server-side, **removing the flag** may be the better choice. Document the
|
||||
decision either way.
|
||||
|
||||
**Recommendation**: Remove `exportable`. Transit's entire value proposition is
|
||||
that keys never leave the service. If export is needed for migration, a
|
||||
dedicated admin-only `export-key` can be added later with appropriate audit
|
||||
logging (#7).
|
||||
|
||||
**Files to change**: `engines/transit.md`
|
||||
|
||||
### #35 — No re-encryption support for user key rotation
|
||||
|
||||
**Fix**: Add a `re-encrypt` operation:
|
||||
|
||||
- Auth: User (self) — only the envelope recipient can re-encrypt.
|
||||
- Input: old envelope.
|
||||
- Flow: decrypt with current key, generate new DEK, re-encrypt, return new
|
||||
envelope.
|
||||
- The old key must still be valid at the time of re-encryption. Document the
|
||||
workflow: re-encrypt all stored envelopes, then rotate-key.
|
||||
|
||||
This is a quality-of-life improvement, not a security fix. The current design
|
||||
(decrypt + encrypt separately) works but requires the caller to handle
|
||||
plaintext.
|
||||
|
||||
**Files to change**: `engines/user.md`
|
||||
|
||||
### #36 — `UserKeyConfig` type undefined
|
||||
|
||||
**Fix**: Add the type definition to the in-memory state section:
|
||||
|
||||
```go
|
||||
type UserKeyConfig struct {
|
||||
Algorithm string `json:"algorithm"` // key exchange algorithm used
|
||||
CreatedAt time.Time `json:"created_at"`
|
||||
AutoProvisioned bool `json:"auto_provisioned"` // created via auto-provision
|
||||
}
|
||||
```
|
||||
|
||||
**Files to change**: `engines/user.md`
|
||||
|
||||
### #38 — `ZeroizeKey` prerequisite not cross-referenced
|
||||
|
||||
**Fix**: Add to the Implementation Steps section in both `engines/transit.md`
|
||||
and `engines/user.md`:
|
||||
|
||||
> **Prerequisite**: `engine.ZeroizeKey` must exist in
|
||||
> `internal/engine/helpers.go` (created as part of the SSH CA engine
|
||||
> implementation — see `engines/sshca.md` step 1).
|
||||
|
||||
**Files to change**: `engines/transit.md`, `engines/user.md`
|
||||
|
||||
---
|
||||
|
||||
## Implementation Order
|
||||
|
||||
The remediation items should be implemented in this order to respect
|
||||
dependencies:
|
||||
|
||||
1. **#37** — `adminOnlyOperations` qualification (critical, blocks user engine
|
||||
`rotate-key`). This is a code change to `internal/server/routes.go` plus
|
||||
spec updates. Do first because it affects all engine implementations.
|
||||
|
||||
2. **#28, #29, #30, #31, #32** — Transit spec fixes (can be done as a single
|
||||
spec update pass).
|
||||
|
||||
3. **#25, #26, #27** — SSH CA spec fixes (single spec update pass).
|
||||
|
||||
4. **#33, #34, #35, #36** — User spec fixes (single spec update pass).
|
||||
|
||||
5. **#38** — Cross-reference update (trivial, do with transit and user spec
|
||||
fixes).
|
||||
|
||||
Items within the same group are independent and can be done in parallel.
|
||||
Reference in New Issue
Block a user