Implement a two-level key hierarchy: the MEK now wraps per-engine DEKs stored in a new barrier_keys table, rather than encrypting all barrier entries directly. A v2 ciphertext format (0x02) embeds the key ID so the barrier can resolve which DEK to use on decryption. v1 ciphertext remains supported for backward compatibility.

Key changes:
- crypto: EncryptV2/DecryptV2/ExtractKeyID for v2 ciphertext with key IDs
- barrier: key registry (CreateKey, RotateKey, ListKeys, MigrateToV2, ReWrapKeys)
- seal: RotateMEK re-wraps DEKs without re-encrypting data
- engine: Mount auto-creates per-engine DEK
- REST + gRPC: barrier/keys, barrier/rotate-mek, barrier/rotate-key, barrier/migrate
- proto: BarrierService (v1 + v2) with ListKeys, RotateMEK, RotateKey, Migrate
- db: migration v2 adds barrier_keys table

Also includes: security audit report, CSRF protection, engine design specs (sshca, transit, user), path-bound AAD migration tool, policy engine enhancements, and ARCHITECTURE.md updates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security Audit Report
Date: 2026-03-16
Scope: ARCHITECTURE.md, engines/sshca.md, engines/transit.md
ARCHITECTURE.md
Strengths
- Solid key hierarchy: password → Argon2id → KWK → MEK → per-entry encryption. Defense-in-depth.
- Fail-closed design with ErrSealed on all operations when sealed.
- Fresh nonce per write, constant-time comparisons, explicit zeroization: all correct fundamentals.
- Default-deny policy engine with priority-based rule evaluation.
- Issued leaf private keys never stored — good principle of least persistence.
Issues
1. TLS minimum version should be 1.3, not 1.2 RESOLVED
Updated all TLS configurations (HTTP server, gRPC server, web server, vault client, Go client library, CLI commands) from tls.VersionTLS12 to tls.VersionTLS13. Removed explicit cipher suite list from HTTP server (TLS 1.3 manages its own). Updated ARCHITECTURE.md TLS section and threat mitigations table.
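A minimal sketch of what a TLS 1.3-only configuration looks like in Go. The function name newTLSConfig is hypothetical and not taken from the codebase; only the MinVersion setting reflects the change described above.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// newTLSConfig (hypothetical name) returns a configuration that refuses
// any protocol version older than TLS 1.3. No CipherSuites list is set:
// Go does not make TLS 1.3 cipher suites configurable, so an explicit
// list would have no effect on TLS 1.3 connections.
func newTLSConfig() *tls.Config {
	return &tls.Config{MinVersion: tls.VersionTLS13}
}

func main() {
	cfg := newTLSConfig()
	fmt.Printf("minimum TLS version: 0x%04x\n", cfg.MinVersion)
}
```

The same struct can be assigned to http.Server.TLSConfig or wrapped with credentials.NewTLS for gRPC, so one constructor can serve every listener and client.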
2. Token cache TTL of 30 seconds is a revocation gap ACCEPTED
Accepted as an explicit trade-off. The 30-second cache TTL balances MCIAS load against revocation latency. For this system's scale and threat model, the window is acceptable.
3. Admin bypass in policy engine is an all-or-nothing model ACCEPTED
The all-or-nothing admin model is intentional by design. MCIAS admin users get full access to all engines and operations. This is the desired behavior for this system.
4. Policy rule creation is listed as both Admin-only and User-accessible RESOLVED
The second policy table in ARCHITECTURE.md incorrectly listed User auth; removed the duplicate. gRPC adminRequiredMethods now includes ListPolicies and GetPolicy to match REST behavior. All policy CRUD is admin-only across both API surfaces.
5. No integrity protection on barrier entry paths RESOLVED
Updated crypto.Encrypt/crypto.Decrypt to accept an additionalData parameter. The barrier now passes the entry path as GCM AAD on both Put and Get, binding each ciphertext to its storage path. Seal operations pass nil (no path context). Added TestEncryptDecryptWithAAD covering correct-AAD, wrong-AAD, and nil-AAD cases. Existing barrier entries will fail to decrypt after this change — a one-off migration tool is needed to re-encrypt all entries (decrypt with nil AAD under old code, re-encrypt with path AAD).
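The path binding can be illustrated with a small standalone sketch. It assumes AES-256-GCM with the nonce prepended to the ciphertext; the function names, key size, and example paths are illustrative assumptions, not the project's actual crypto.Encrypt/crypto.Decrypt signatures.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16-, 24-, or 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt seals plaintext under key with a fresh random nonce, which is
// prepended to the output. aad is authenticated but not encrypted; the
// barrier would pass the entry path here.
func encrypt(key, plaintext, aad []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, aad), nil
}

// decrypt splits off the nonce and opens the ciphertext. It fails if aad
// differs from the value supplied at encryption time.
func decrypt(key, ciphertext, aad []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(ciphertext) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, body := ciphertext[:gcm.NonceSize()], ciphertext[gcm.NonceSize():]
	return gcm.Open(nil, nonce, body, aad)
}

func main() {
	key := make([]byte, 32) // random demo key
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	// Hypothetical barrier paths, used only for illustration.
	ct, err := encrypt(key, []byte("secret"), []byte("engine/ca/prod/config"))
	if err != nil {
		panic(err)
	}
	_, err = decrypt(key, ct, []byte("engine/ca/staging/config"))
	fmt.Println("relocated entry rejected:", err != nil)
}
```

An attacker with write access to the database can still copy a ciphertext to another row, but the copy no longer decrypts, because the authentication tag covers the original path.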
6. Single MEK with no rotation mechanism RESOLVED
Implemented MEK rotation and per-engine DEKs. The v2 ciphertext format (0x02) embeds a key ID that identifies which DEK encrypted each entry. MEK rotation (POST /v1/barrier/rotate-mek) re-wraps all DEKs without re-encrypting data. DEK rotation (POST /v1/barrier/rotate-key) re-encrypts entries under a specific key. A migration endpoint converts v1 entries to v2 format. The barrier_keys table stores MEK-wrapped DEKs with version tracking.
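To make the key ID lookup concrete, here is a sketch of one possible v2 framing. The report only states that the format starts with 0x02 and embeds a key ID; the one-byte length prefix, the function names frameV2 and extractKeyID, and the example key ID are assumptions made for illustration and may differ from the real layout.

```go
package main

import (
	"errors"
	"fmt"
)

const versionV2 = 0x02

// frameV2 prepends an assumed header to an already sealed payload:
// [0x02][key ID length (1 byte)][key ID][sealed bytes].
func frameV2(keyID string, sealed []byte) ([]byte, error) {
	if len(keyID) == 0 || len(keyID) > 255 {
		return nil, errors.New("key ID must be 1 to 255 bytes")
	}
	out := make([]byte, 0, 2+len(keyID)+len(sealed))
	out = append(out, versionV2, byte(len(keyID)))
	out = append(out, keyID...)
	return append(out, sealed...), nil
}

// extractKeyID parses the assumed header and returns the key ID and the
// remaining sealed bytes, so the caller can look up the matching DEK.
func extractKeyID(ct []byte) (string, []byte, error) {
	if len(ct) < 2 || ct[0] != versionV2 {
		return "", nil, errors.New("not a v2 ciphertext")
	}
	n := int(ct[1])
	if n == 0 || len(ct) < 2+n {
		return "", nil, errors.New("truncated key ID")
	}
	return string(ct[2 : 2+n]), ct[2+n:], nil
}

func main() {
	// "engine/transit/main" is a made-up key ID.
	ct, err := frameV2("engine/transit/main", []byte("nonce-and-sealed-bytes"))
	if err != nil {
		panic(err)
	}
	id, rest, err := extractKeyID(ct)
	fmt.Println(id, len(rest), err)
}
```

Because the header names the DEK rather than containing it, rotating the MEK only requires re-wrapping rows in barrier_keys; the entries themselves stay untouched.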
7. No audit logging
Acknowledged as future work, but for a cryptographic service this is a significant gap. Every certificate issuance, every sign operation, every policy change should be logged with caller identity, timestamp, and operation details. Without this, incident response is blind.
8. Rate limiting is in-memory only ACCEPTED
The in-memory rate limit protects against remote brute-force over the network, which is the realistic threat. Persisting the counter in the database would not add tamper resistance: the barrier is sealed during unseal attempts so encrypted storage is unavailable, and the unencrypted database could be reset by an attacker with disk access. An attacker who can restart the service already has local system access, making the rate limit moot regardless of persistence. Argon2id cost parameters (128 MiB memory-hard) are the primary brute-force mitigation and are stored in seal_config.
engines/sshca.md
Strengths
- Flat CA model is correct for SSH (no intermediate hierarchy needed).
- Default principal restriction (users can only sign certs for their own username) is the right default.
- max_ttl enforced server-side (good).
- Key zeroization on seal, no private keys in cert records.
Issues
9. User-controllable serial numbers RESOLVED
Removed the optional serial field from both sign-host and sign-user request data. Serials are always generated server-side using crypto/rand (64-bit). Updated flows and security considerations in sshca.md.
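Server-side serial generation along these lines is short in Go. This is a sketch under the stated design (64 bits from crypto/rand); the function name newSerial is hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// newSerial (hypothetical name) draws 8 bytes from the operating system's
// CSPRNG and interprets them as an unsigned 64-bit serial number.
func newSerial() (uint64, error) {
	var buf [8]byte
	if _, err := rand.Read(buf[:]); err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint64(buf[:]), nil
}

func main() {
	serial, err := newSerial()
	if err != nil {
		panic(err)
	}
	fmt.Printf("serial: %d\n", serial)
}
```

The resulting value fits the Serial field of ssh.Certificate in golang.org/x/crypto/ssh, which is also a uint64.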
10. No explicit extension allowlist for host certificates
The extensions field for sign-host accepts an arbitrary map. SSH extensions have security implications (e.g., permit-pty, permit-port-forwarding, permit-user-rc). Without an allowlist, a user could request extensions that grant more capabilities than intended. The engine should define a default extension set and either:
- Restrict to an allowlist, or
- Require admin for non-default extensions.
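One way to implement the second option is sketched below: non-admin callers may only request extensions on a mount-level allowlist, while admins may request anything. The function names and the example allowlist contents are assumptions; the engine does not currently define either.

```go
package main

import (
	"fmt"
	"sort"
)

// disallowedExtensions returns, in sorted order, the requested extension
// names that are missing from the allowlist.
func disallowedExtensions(requested map[string]string, allowed map[string]bool) []string {
	var bad []string
	for name := range requested {
		if !allowed[name] {
			bad = append(bad, name)
		}
	}
	sort.Strings(bad)
	return bad
}

// authorizeExtensions lets admins request any extension and restricts
// everyone else to the allowlist.
func authorizeExtensions(requested map[string]string, allowed map[string]bool, isAdmin bool) error {
	if isAdmin {
		return nil
	}
	if bad := disallowedExtensions(requested, allowed); len(bad) > 0 {
		return fmt.Errorf("extensions not permitted: %v", bad)
	}
	return nil
}

func main() {
	allowed := map[string]bool{"permit-pty": true} // hypothetical default set
	req := map[string]string{"permit-pty": "", "permit-port-forwarding": ""}
	fmt.Println(authorizeExtensions(req, allowed, false))
	fmt.Println(authorizeExtensions(req, allowed, true))
}
```

Rejecting the whole request, rather than silently dropping unknown extensions, keeps the caller from receiving a certificate that differs from what was asked for.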
11. critical_options on user certs is a privilege escalation surface RESOLVED
Removed critical_options from the sign-user request. Critical options can only be applied via admin-defined signing profiles, which are policy-gated (sshca/{mount}/profile/{name}, action read). Profile CRUD is admin-only. Profiles specify critical options, extensions, optional max TTL, and optional principal restrictions. Security considerations updated accordingly.
12. No KRL (Key Revocation List) support RESOLVED
Added a full KRL section to sshca.md covering: in-memory KRL generation from revoked serials, barrier persistence at engine/sshca/{mount}/krl.bin, automatic rebuild on revoke/delete/unseal, a public GET /v1/sshca/{mount}/krl endpoint with ETag and Cache-Control headers, GetKRL gRPC RPC, and a pull-based distribution model with example sshd_config and cron fetch.
13. Policy resource path uses ca/ prefix instead of sshca/ RESOLVED
Updated policy check paths in sshca.md from ca/{mount}/id/... to sshca/{mount}/id/... for both sign-host and sign-user flows, eliminating the namespace collision with the CA (PKI) engine.
14. No source-address restriction by default
User certificates should ideally include source-address critical options to limit where they can be used from. At minimum, consider a mount-level configuration for default critical options that get applied to all user certs.
engines/transit.md
Strengths
- Ciphertext format with version prefix enables clean key rotation.
- exportable and allow_deletion immutable after creation, which prevents policy weakening.
- AAD/context binding for AEAD ciphers.
- Rewrap never exposes plaintext to caller.
Issues
15. No minimum key version enforcement RESOLVED
Added min_decryption_version per key (default 1). Decryption requests for versions below the minimum are rejected. New update-key-config operation (admin-only) advances the minimum (can only increase, cannot exceed current version). New trim-key operation permanently deletes versions older than the minimum. Both have corresponding gRPC RPCs and REST endpoints. The rotation cycle is documented: rotate → rewrap → advance min → trim.
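The two rules for min_decryption_version (decryption rejects older versions; the minimum only moves forward and never past the current version) can be sketched as follows. The struct and method names are hypothetical and stand in for whatever the transit engine uses internally.

```go
package main

import (
	"errors"
	"fmt"
)

// keyConfig is a hypothetical stand-in for a transit key's version state.
type keyConfig struct {
	currentVersion       int
	minDecryptionVersion int // defaults to 1 for a new key
}

// canDecrypt reports whether ciphertext produced by the given key version
// may still be decrypted.
func (k *keyConfig) canDecrypt(version int) bool {
	return version >= k.minDecryptionVersion && version <= k.currentVersion
}

// advanceMin raises the minimum. Lowering it is rejected, and so is any
// value beyond the current version; setting the same value is a no-op.
func (k *keyConfig) advanceMin(newMin int) error {
	if newMin < k.minDecryptionVersion {
		return errors.New("min_decryption_version can only increase")
	}
	if newMin > k.currentVersion {
		return errors.New("min_decryption_version cannot exceed current version")
	}
	k.minDecryptionVersion = newMin
	return nil
}

func main() {
	k := &keyConfig{currentVersion: 3, minDecryptionVersion: 1}
	fmt.Println(k.advanceMin(2), k.canDecrypt(1), k.canDecrypt(2))
	fmt.Println(k.advanceMin(1))
	fmt.Println(k.advanceMin(4))
}
```

In the documented cycle, advanceMin would run only after every stored ciphertext has been rewrapped to a version at or above the new minimum, and trim-key would then delete the versions below it.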
16. Key version pruning with max_key_versions has no safety check
If max_key_versions is set and data encrypted with an old version hasn't been re-wrapped, pruning that version makes the data permanently unrecoverable. There should be either:
- A warning/confirmation mechanism, or
- A way to scan for ciphertext referencing a version before pruning, or
- At minimum, clear documentation that pruning is destructive.
17. RSA encryption without specifying padding scheme RESOLVED
RSA key types (rsa-2048, rsa-4096) removed entirely from the transit engine. Asymmetric encryption belongs in the user engine (via ECDH); RSA signing offers no advantage over Ed25519/ECDSA. crypto/rsa removed from dependencies. Rationale documented in key types section and security considerations.
18. HMAC keys used for sign operation is confusing RESOLVED
sign and verify are now restricted to asymmetric key types (Ed25519, ECDSA). HMAC keys are rejected with an error — HMAC must use the dedicated hmac operation. Policy actions are already split: sign, verify, and hmac are separate granular actions, all matched by any.
19. No batch encrypt/decrypt operations RESOLVED
Added batch-encrypt, batch-decrypt, and batch-rewrap operations to the transit engine plan. Each targets a single named key with an array of items; results are returned in order with per-item errors (partial success model). An optional reference field lets callers correlate results with source records. Policy is checked once per batch. Added corresponding gRPC RPCs and REST endpoints. operationAction maps batch variants to the same granular actions as their single counterparts.
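The partial success model can be sketched as a loop that records a result per item instead of aborting on the first error. All type and function names here are hypothetical, and toyOp is a placeholder for a real encrypt, decrypt, or rewrap call.

```go
package main

import (
	"errors"
	"fmt"
)

// batchItem and batchResult are hypothetical request/response shapes.
// Reference is echoed back so callers can match results to their records.
type batchItem struct {
	Reference string
	Input     string
}

type batchResult struct {
	Reference string
	Output    string
	Error     string // empty on success
}

// runBatch applies op to every item and returns results in input order.
// A failing item records its error and does not stop the batch.
func runBatch(items []batchItem, op func(string) (string, error)) []batchResult {
	results := make([]batchResult, len(items))
	for i, it := range items {
		results[i].Reference = it.Reference
		out, err := op(it.Input)
		if err != nil {
			results[i].Error = err.Error()
			continue
		}
		results[i].Output = out
	}
	return results
}

// toyOp stands in for a cryptographic operation; it rejects empty input.
func toyOp(in string) (string, error) {
	if in == "" {
		return "", errors.New("empty input")
	}
	return "enc:" + in, nil
}

func main() {
	items := []batchItem{{"row-1", "alpha"}, {"row-2", ""}, {"row-3", "gamma"}}
	for _, r := range runBatch(items, toyOp) {
		fmt.Printf("%+v\n", r)
	}
}
```

Since the policy check happens once per batch, it would sit before the call to runBatch rather than inside the loop.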
20. read action maps to decrypt and verify: semantics are misleading RESOLVED
Replaced the coarse read/write action model with granular per-operation actions: encrypt, decrypt, sign, verify, hmac for cryptographic operations; read for metadata retrieval; write for key management; admin for administrative operations. Added any action that matches all non-admin actions. Added LintRule validation that rejects unknown effects and actions. CreateRule now validates before storing. Updated operationAction mapping and all tests.
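A sketch of how the any action and rule linting might behave. The action names come from the list above; the effect names allow and deny, and the function names actionMatches and lintRule, are assumptions rather than the policy engine's actual identifiers.

```go
package main

import "fmt"

// Action names taken from the report; effect names are assumed.
var knownActions = map[string]bool{
	"encrypt": true, "decrypt": true, "sign": true, "verify": true,
	"hmac": true, "read": true, "write": true, "admin": true, "any": true,
}

var knownEffects = map[string]bool{"allow": true, "deny": true}

// actionMatches reports whether a rule action covers the requested one.
// "any" covers every action except "admin".
func actionMatches(ruleAction, requested string) bool {
	if ruleAction == "any" {
		return requested != "admin"
	}
	return ruleAction == requested
}

// lintRule rejects rules with an unknown effect or unknown actions, so a
// typo cannot produce a rule that silently never matches.
func lintRule(effect string, actions []string) error {
	if !knownEffects[effect] {
		return fmt.Errorf("unknown effect %q", effect)
	}
	if len(actions) == 0 {
		return fmt.Errorf("rule has no actions")
	}
	for _, a := range actions {
		if !knownActions[a] {
			return fmt.Errorf("unknown action %q", a)
		}
	}
	return nil
}

func main() {
	fmt.Println(actionMatches("any", "decrypt"), actionMatches("any", "admin"))
	fmt.Println(lintRule("allow", []string{"encrypt", "decyrpt"}))
}
```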
21. No rate limiting or quota on cryptographic operations
A compromised or malicious user token could issue unlimited encrypt/decrypt/sign requests, potentially using the service as a cryptographic oracle. Consider per-user rate limits on transit operations.
Cross-Cutting Issues
22. No forward secrecy for stored data RESOLVED: Per-engine DEKs limit blast radius — compromise of one DEK only exposes that engine's data, not the entire barrier. MEK compromise still exposes all DEKs, but MEK rotation enables periodic re-keying. Each engine mount gets its own DEK created automatically; a "system" DEK protects non-engine data. v2 ciphertext format embeds key IDs for DEK lookup.
23. Generic POST /v1/engine/request bypasses typed route middleware RESOLVED: Added an adminOnlyOperations map to handleEngineRequest that mirrors the admin gates on typed REST routes (e.g. create-issuer, delete-cert, create-key, rotate-key, create-profile, provision). Non-admin users are rejected with 403 before policy evaluation or engine dispatch. The v1 gRPC Execute RPC is defined in the proto but not registered in the server; only v2 typed RPCs are used, so the gRPC surface is not affected. Tests cover both admin and non-admin paths through the generic endpoint.
24. No CSRF protection mentioned for web UI RESOLVED: Added signed double-submit cookie CSRF protection. A per-server HMAC secret signs random nonce-based tokens. Every form includes a {{csrfField}} hidden input; a middleware validates that the form field matches the cookie and has a valid HMAC signature on all POST/PUT/PATCH/DELETE requests. Session cookie upgraded from SameSite=Lax to SameSite=Strict. CSRF cookie is also HttpOnly, Secure, SameSite=Strict. Tests cover token generation/validation, cross-secret rejection, middleware pass/block/mismatch scenarios.
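The token scheme described above can be sketched as follows. The token encoding (base64url nonce, a dot, base64url HMAC-SHA256 tag), the 32-byte nonce, and all function names are assumptions for illustration; the actual web UI code may encode tokens differently.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
)

var b64 = base64.RawURLEncoding

// sign computes HMAC-SHA256 over the nonce with the per-server secret.
func sign(secret, nonce []byte) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write(nonce)
	return mac.Sum(nil)
}

// newToken returns "<nonce>.<signature>", both base64url-encoded.
func newToken(secret []byte) (string, error) {
	nonce := make([]byte, 32)
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	return b64.EncodeToString(nonce) + "." + b64.EncodeToString(sign(secret, nonce)), nil
}

// validToken checks that the signature matches the nonce under secret.
func validToken(secret []byte, token string) bool {
	nonceB64, sigB64, ok := strings.Cut(token, ".")
	if !ok {
		return false
	}
	nonce, err := b64.DecodeString(nonceB64)
	if err != nil {
		return false
	}
	sig, err := b64.DecodeString(sigB64)
	if err != nil {
		return false
	}
	return hmac.Equal(sig, sign(secret, nonce))
}

// validRequest is the double-submit check: the form field must equal the
// cookie (compared in constant time) and the token must be signed by us.
func validRequest(secret []byte, cookieToken, formToken string) bool {
	if !hmac.Equal([]byte(cookieToken), []byte(formToken)) {
		return false
	}
	return validToken(secret, cookieToken)
}

func main() {
	secret := []byte("demo-secret-not-for-production")
	tok, err := newToken(secret)
	if err != nil {
		panic(err)
	}
	fmt.Println(validRequest(secret, tok, tok), validRequest(secret, tok, "forged"))
}
```

The signature is what distinguishes this from a plain double-submit cookie: an attacker who can plant a cookie on the victim's browser still cannot mint a token that validates without the server secret.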
Priority Summary
| Priority | Issue | Location |
|---|---|---|
| | | ARCHITECTURE.md |
| | | sshca.md |
| | … (ca/ vs sshca/) | sshca.md |
| | | ARCHITECTURE.md |
| | | sshca.md |
| | | transit.md |
| | | transit.md |
| | critical_options not restricted | sshca.md |
| | | ARCHITECTURE.md |
| | | Cross-cutting |
| | | ARCHITECTURE.md |
| | | ARCHITECTURE.md |
| | | ARCHITECTURE.md |
| | decrypt mapped to read action | transit.md |
| | | ARCHITECTURE.md |
| | | ARCHITECTURE.md |
| | | transit.md |
| | | transit.md |
| | | Cross-cutting |