Add MEK rotation, per-engine DEKs, and v2 ciphertext format (audit #6, #22)

Implement a two-level key hierarchy: the MEK now wraps per-engine DEKs
stored in a new barrier_keys table, rather than encrypting all barrier
entries directly. A v2 ciphertext format (0x02) embeds the key ID so the
barrier can resolve which DEK to use on decryption. v1 ciphertext remains
supported for backward compatibility.

Key changes:
- crypto: EncryptV2/DecryptV2/ExtractKeyID for v2 ciphertext with key IDs
- barrier: key registry (CreateKey, RotateKey, ListKeys, MigrateToV2, ReWrapKeys)
- seal: RotateMEK re-wraps DEKs without re-encrypting data
- engine: Mount auto-creates per-engine DEK
- REST + gRPC: barrier/keys, barrier/rotate-mek, barrier/rotate-key, barrier/migrate
- proto: BarrierService (v1 + v2) with ListKeys, RotateMEK, RotateKey, Migrate
- db: migration v2 adds barrier_keys table

Also includes: security audit report, CSRF protection, engine design specs
(sshca, transit, user), path-bound AAD migration tool, policy engine
enhancements, and ARCHITECTURE.md updates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 18:27:44 -07:00
parent ac4577f778
commit 64d921827e
44 changed files with 5184 additions and 90 deletions


@@ -100,7 +100,8 @@ deploy/ Docker Compose, example configs
| Salt size | 256 bits | Argon2id salt |
| CSPRNG | `crypto/rand` | Keys, salts, nonces |
| Constant-time comparison | `crypto/subtle` | Password & token comparison |
| Zeroization | Explicit overwrite | MEK, KWK, passwords in memory |
| DEK wrapping | AES-256-GCM | MEK wraps per-engine DEKs |
| Zeroization | Explicit overwrite | MEK, KWK, DEKs, passwords in memory |
### Key Hierarchy
@@ -123,24 +124,76 @@ User Password (not stored)
Master Encryption Key (MEK) 256-bit, held in memory only when unsealed
┌──────────────────────┐
│ Barrier Entries │ Each entry encrypted individually
│ ├── Policy rules │ with MEK via AES-256-GCM
│ ├── Engine configs │
│ └── Engine DEKs │ Per-engine data encryption keys
└──────────────────────┘
┌──────────────────────────┐
│ barrier_keys table │ MEK wraps per-engine DEKs
│ ├── "system" DEK │ policy rules, mount metadata
│ ├── "engine/ca/prod" DEK │ per-engine data encryption
│ ├── "engine/ca/dev" DEK │
│ └── ... │
└────────────┬─────────────┘
┌──────────────────────────┐
│ Barrier Entries │ Each entry encrypted with its
│ ├── Policy rules │ engine's DEK via AES-256-GCM
│ ├── Engine configs │ (v2 ciphertext format)
│ └── Engine secrets │
└──────────────────────────┘
```
Each engine mount gets its own Data Encryption Key (DEK) stored in the
`barrier_keys` table, wrapped by the MEK. Non-engine data (policy rules,
mount metadata) uses a `"system"` DEK. This limits blast radius: compromise
of a single DEK only exposes one engine's data.
### Ciphertext Format
All encrypted values use a versioned binary format:
Two versioned binary formats are supported:
**v1** (legacy, `0x01`):
```
[version: 1 byte][nonce: 12 bytes][ciphertext + GCM tag]
```
The version byte (currently `0x01`) enables future algorithm migration,
including post-quantum hybrid schemes.
**v2** (current, `0x02`):
```
[version: 1 byte][key_id_len: 1 byte][key_id: N bytes][nonce: 12 bytes][ciphertext + GCM tag]
```
The v2 format embeds a key identifier in the ciphertext, allowing the
barrier to determine which DEK to use for decryption without external
metadata. Key IDs are short path-like strings:
- `"system"` — system DEK (policy rules, mount metadata)
- `"engine/{type}/{mount}"` — per-engine DEK (e.g. `"engine/ca/prod"`)
v1 ciphertext is still accepted for backward compatibility: it is
decrypted with the MEK directly (no key ID lookup). The `migrate-barrier`
command converts all v1 entries to v2 format with per-engine DEKs.
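
As an illustration of the v2 layout above, the following is a minimal Go sketch of how a reader could split a v2 blob into its key ID, nonce, and sealed payload. The helper name `splitV2` and the constants are hypothetical and are not the project's `ExtractKeyID`; the 16-byte tag length is the standard AES-GCM tag size and is assumed here. Since `key_id_len` is a single byte, a key ID can be at most 255 bytes long.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	versionV1 = 0x01
	versionV2 = 0x02
	nonceSize = 12 // nonce length stated in both formats
	tagSize   = 16 // standard AES-GCM tag length (assumed)
)

// splitV2 is a hypothetical helper that parses
// [version][key_id_len][key_id][nonce][ciphertext + GCM tag].
func splitV2(blob []byte) (keyID string, nonce, sealed []byte, err error) {
	if len(blob) < 2 || blob[0] != versionV2 {
		return "", nil, nil, errors.New("not a v2 ciphertext")
	}
	n := int(blob[1]) // key ID length, 0..255
	rest := blob[2:]
	if len(rest) < n+nonceSize+tagSize {
		return "", nil, nil, errors.New("v2 ciphertext truncated")
	}
	return string(rest[:n]), rest[n : n+nonceSize], rest[n+nonceSize:], nil
}

func main() {
	// Build a placeholder blob: header + key ID + zeroed nonce and tag.
	id := "engine/ca/prod"
	blob := []byte{versionV2, byte(len(id))}
	blob = append(blob, id...)
	blob = append(blob, make([]byte, nonceSize+tagSize)...)

	keyID, nonce, sealed, err := splitV2(blob)
	fmt.Println(keyID, len(nonce), len(sealed), err)
}
```

A caller would look up the DEK for the returned key ID and pass the nonce and sealed payload to AES-256-GCM; a blob whose first byte is `0x01` would instead take the legacy MEK path.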
### Key Rotation
**MEK rotation** (`POST /v1/barrier/rotate-mek`): Generates a new MEK,
re-wraps all DEKs in `barrier_keys` with the new MEK, and updates
`seal_config`. Requires the unseal password for verification. This is
O(number of engines) — no data re-encryption is needed.
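
The re-wrap step can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's `RotateMEK` or `ReWrapKeys`: the function names, the in-memory map standing in for `barrier_keys`, and the `nonce || ciphertext` wrapping layout are all hypothetical. It shows why the cost depends on the number of DEKs rather than on the amount of stored data.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// wrap seals a DEK under key and returns nonce || ciphertext+tag.
func wrap(key, dek []byte) ([]byte, error) {
	aead, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, dek, nil), nil
}

// unwrap reverses wrap.
func unwrap(key, wrapped []byte) ([]byte, error) {
	aead, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(wrapped) < n {
		return nil, fmt.Errorf("wrapped key too short")
	}
	return aead.Open(nil, wrapped[:n], wrapped[n:], nil)
}

// rewrapAll re-wraps every DEK (key_id -> wrapped DEK) under newMEK.
// The data encrypted with those DEKs is never touched.
func rewrapAll(oldMEK, newMEK []byte, keys map[string][]byte) (map[string][]byte, error) {
	out := make(map[string][]byte, len(keys))
	for id, w := range keys {
		dek, err := unwrap(oldMEK, w)
		if err != nil {
			return nil, fmt.Errorf("unwrap %s: %w", id, err)
		}
		nw, err := wrap(newMEK, dek)
		for i := range dek {
			dek[i] = 0 // zeroize the plaintext DEK once re-wrapped
		}
		if err != nil {
			return nil, fmt.Errorf("wrap %s: %w", id, err)
		}
		out[id] = nw
	}
	return out, nil
}

func main() {
	oldMEK, newMEK, dek := make([]byte, 32), make([]byte, 32), make([]byte, 32)
	for _, b := range [][]byte{oldMEK, newMEK, dek} {
		if _, err := rand.Read(b); err != nil {
			panic(err)
		}
	}
	w, err := wrap(oldMEK, dek)
	if err != nil {
		panic(err)
	}
	rotated, err := rewrapAll(oldMEK, newMEK, map[string][]byte{"system": w})
	fmt.Println(len(rotated), err)
}
```

In a real store the new wrapped DEKs and the new `seal_config` would presumably need to be committed in a single transaction, so that a crash cannot leave DEKs wrapped under a MEK that was never persisted.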
**DEK rotation** (`POST /v1/barrier/rotate-key`): Generates a new DEK for
a specific key ID, re-encrypts all barrier entries under that key's prefix
with the new DEK, and updates `barrier_keys`. This is O(entries per engine).
**Migration** (`POST /v1/barrier/migrate`): Converts all v1 (MEK-encrypted)
barrier entries to v2 format with per-engine DEKs. Creates DEKs on demand
for each engine mount. Idempotent — entries already in v2 format are skipped.
The `barrier_keys` table schema:
```sql
CREATE TABLE barrier_keys (
key_id TEXT PRIMARY KEY,
version INTEGER NOT NULL DEFAULT 1,
encrypted_dek BLOB NOT NULL,
created_at DATETIME DEFAULT (datetime('now')),
rotated_at DATETIME DEFAULT (datetime('now'))
);
```
---
@@ -167,8 +220,9 @@ Metacrypt operates as a state machine with four states:
│ │ • derive KWK = Argon2id(password, salt)
│ │ • MEK = Decrypt(KWK, encrypted_mek)
│ │ • barrier.Unseal(MEK)
│ │ • load & decrypt DEKs from barrier_keys
│ ┌──────────────────┐
└─────────────►│ Sealed │ MEK zeroized; barrier locked
└─────────────►│ Sealed │ MEK + DEKs zeroized; barrier locked
└──────────────────┘
```
@@ -182,7 +236,7 @@ Unseal attempts are rate-limited to mitigate online brute-force:
### Sealing
Calling `Seal()` immediately:
1. Zeroizes the MEK from memory
1. Zeroizes all DEKs and the MEK from memory
2. Seals the storage barrier (all reads/writes return `ErrSealed`)
3. Seals all mounted engines
4. Flushes the authentication token cache
@@ -221,6 +275,9 @@ type Barrier interface {
### Properties
- **Encryption at rest**: All values encrypted with MEK before database write
- **Path-bound integrity**: The entry path is included as GCM additional
authenticated data (AAD), preventing an attacker with database access from
swapping encrypted blobs between paths
- **Fresh nonce per write**: Every `Put` generates a new random nonce
- **Atomic upsert**: Uses `INSERT ... ON CONFLICT UPDATE` for Put
- **Glob listing**: `List(prefix)` returns relative paths matching the prefix
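
The path-bound integrity property can be demonstrated with Go's standard AES-GCM API. The sketch below is an assumption-laden illustration, not the barrier's actual code: `sealAt` and `openAt` are hypothetical names, and the output omits the version and key ID header for brevity. It shows that a blob sealed for one path fails authentication when opened under another.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

func gcmFor(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealAt encrypts plaintext with a fresh random nonce and binds the
// ciphertext to path by passing it as additional authenticated data.
func sealAt(key []byte, path string, plaintext []byte) ([]byte, error) {
	aead, err := gcmFor(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, []byte(path)), nil
}

// openAt decrypts blob and fails if it was sealed for a different path.
func openAt(key []byte, path string, blob []byte) ([]byte, error) {
	aead, err := gcmFor(key)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(blob) < n {
		return nil, errors.New("ciphertext too short")
	}
	return aead.Open(nil, blob[:n], blob[n:], []byte(path))
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	blob, err := sealAt(key, "policy/rules/a", []byte("secret"))
	if err != nil {
		panic(err)
	}
	_, errSame := openAt(key, "policy/rules/a", blob)
	_, errSwap := openAt(key, "policy/rules/b", blob)
	fmt.Println(errSame == nil, errSwap != nil)
}
```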
@@ -262,7 +319,7 @@ type Rule struct {
Usernames []string // match specific users (optional)
Roles []string // match roles (optional)
Resources []string // glob patterns, e.g. "engine/transit/*" (optional)
Actions []string // e.g. "read", "write", "admin" (optional)
Actions []string // "any", "read", "write", "encrypt", "decrypt", "sign", "verify", "hmac", "admin" (optional)
}
```
@@ -275,7 +332,10 @@ type Rule struct {
5. **Default deny** if no rules match
Matching is case-insensitive for usernames and roles. Resources use glob
patterns. Empty fields in a rule match everything.
patterns. Empty fields in a rule match everything. The special action `any`
matches all actions except `admin` — admin must always be granted explicitly.
Rules are validated on creation: effects must be `allow` or `deny`, and
actions must be from the recognized set.
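
The `any` semantics and the creation-time validation could look roughly like the sketch below. The function names are hypothetical, and the recognized action set is taken from the `Actions` comment on `Rule`. One assumption to flag: following the sentence "Empty fields in a rule match everything", this sketch lets an empty action list match every action; the text does not say whether `admin` is excluded in that case.

```go
package main

import "fmt"

// recognized mirrors the action names listed on Rule.Actions.
var recognized = map[string]bool{
	"any": true, "read": true, "write": true, "encrypt": true,
	"decrypt": true, "sign": true, "verify": true, "hmac": true,
	"admin": true,
}

// actionMatches reports whether a rule's action list covers requested.
func actionMatches(ruleActions []string, requested string) bool {
	if len(ruleActions) == 0 {
		return true // empty field matches everything (assumption, see lead-in)
	}
	for _, a := range ruleActions {
		if a == requested {
			return true
		}
		if a == "any" && requested != "admin" {
			return true // "any" never implies admin
		}
	}
	return false
}

// validateRule rejects unknown effects and unknown actions.
func validateRule(effect string, actions []string) error {
	if effect != "allow" && effect != "deny" {
		return fmt.Errorf("invalid effect %q", effect)
	}
	for _, a := range actions {
		if !recognized[a] {
			return fmt.Errorf("unrecognized action %q", a)
		}
	}
	return nil
}

func main() {
	fmt.Println(actionMatches([]string{"any"}, "sign"))  // any covers sign
	fmt.Println(actionMatches([]string{"any"}, "admin")) // but not admin
	fmt.Println(validateRule("allow", []string{"read", "bogus"}))
}
```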
---
@@ -457,6 +517,15 @@ kept in sync — every operation available via REST has a corresponding gRPC RPC
|--------|-------------|-------------------------|
| POST | `/v1/seal` | Seal service & engines |
### Barrier Key Management (Admin Only)
| Method | Path | Description |
|--------|----------------------------|------------------------------------------|
| GET | `/v1/barrier/keys` | List DEKs with version + rotation times |
| POST | `/v1/barrier/rotate-mek` | Rotate MEK (re-wraps all DEKs) |
| POST | `/v1/barrier/rotate-key` | Rotate a specific DEK (re-encrypts data) |
| POST | `/v1/barrier/migrate` | Migrate v1 entries to v2 with per-engine DEKs |
### Authentication
| Method | Path | Description | Auth Required |
@@ -509,15 +578,6 @@ must be of type `ca`; returns 404 otherwise.
| POST | `/v1/ca/{mount}/cert/{serial}/revoke` | Revoke a certificate | Admin |
| DELETE | `/v1/ca/{mount}/cert/{serial}` | Delete a certificate record | Admin |
### Policy (Authenticated)
| Method | Path | Description | Auth |
|--------|-----------------------|---------------------|-------|
| GET | `/v1/policy/rules` | List all rules | User |
| POST | `/v1/policy/rules` | Create a rule | User |
| GET | `/v1/policy/rule?id=` | Get rule by ID | User |
| DELETE | `/v1/policy/rule?id=` | Delete rule by ID | User |
### ACME (RFC 8555)
ACME protocol endpoints are mounted per CA engine instance and require no
@@ -606,6 +666,8 @@ from v1:
`google.protobuf.Timestamp` instead of RFC3339 strings.
- **Message types**: `CertRecord` (full certificate data) and `CertSummary`
(lightweight, for list responses) replace the generic struct maps.
- **`BarrierService`**: `ListKeys`, `RotateMEK`, `RotateKey`, `Migrate`
admin-only key management RPCs for MEK/DEK rotation and v1→v2 migration.
- **`ACMEService`**: `CreateEAB`, `SetConfig`, `ListAccounts`, `ListOrders`.
- **`AuthService`**: String timestamps replaced by `google.protobuf.Timestamp`.
@@ -758,9 +820,7 @@ ENTRYPOINT ["metacrypt", "server", "--config", "/data/metacrypt.toml"]
### TLS Configuration
- Minimum TLS version: 1.2
- Cipher suites: `TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384`,
`TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384`
- Minimum TLS version: 1.3 (cipher suites managed by Go's TLS 1.3 implementation)
- Timeouts: read 30s, write 30s, idle 120s
### CLI Commands
@@ -793,7 +853,7 @@ closing connections before exit.
| Nonce reuse | Fresh random nonce per encryption operation |
| Timing attacks | Constant-time comparison for passwords and tokens |
| Unauthorized access at rest | Database file permissions 0600; non-root container user |
| TLS downgrade | Minimum TLS 1.2; only AEAD cipher suites |
| TLS downgrade | Minimum TLS 1.3 |
| CA key compromise | CA/issuer keys encrypted in barrier; zeroized on seal; two-tier PKI limits blast radius |
| Leaf key leakage via storage | Issued cert private keys never persisted; only returned to requester |