diff --git a/AUDIT.md b/AUDIT.md index 6c0607e..ad844ef 100644 --- a/AUDIT.md +++ b/AUDIT.md @@ -105,12 +105,9 @@ User certificates should ideally include `source-address` critical options to li Added `min_decryption_version` per key (default 1). Decryption requests for versions below the minimum are rejected. New `update-key-config` operation (admin-only) advances the minimum (can only increase, cannot exceed current version). New `trim-key` operation permanently deletes versions older than the minimum. Both have corresponding gRPC RPCs and REST endpoints. The rotation cycle is documented: rotate → rewrap → advance min → trim. -**16. Key version pruning with `max_key_versions` has no safety check** +**16. ~~Key version pruning with `max_key_versions` has no safety check~~ RESOLVED** -If `max_key_versions` is set and data encrypted with an old version hasn't been re-wrapped, pruning that version makes the data permanently unrecoverable. There should be either: -- A warning/confirmation mechanism, or -- A way to scan for ciphertext referencing a version before pruning, or -- At minimum, clear documentation that pruning is destructive. +Added explicit `max_key_versions` behavior: auto-pruning during `rotate-key` only deletes versions strictly less than `min_decryption_version`. If the version count exceeds the limit but no eligible candidates remain, a warning is returned. This ensures pruning never destroys versions that may still have unrewrapped ciphertext. See also #30. **17. ~~RSA encryption without specifying padding scheme~~ RESOLVED** @@ -144,6 +141,106 @@ A compromised or malicious user token could issue unlimited encrypt/decrypt/sign --- +## Engine Design Review (2026-03-16) + +**Scope**: engines/sshca.md, engines/transit.md, engines/user.md (patched specs) + +### engines/sshca.md + +#### Strengths + +- RSA excluded — reduces attack surface, correct for SSH CA use case. +- Detailed Go code snippets for Initialize, sign-host, sign-user flows. 
+- KRL custom implementation correctly identified that `x/crypto/ssh` lacks KRL builders. +- Signing profiles are the only path to critical options — good privilege separation. +- Server-side serial generation with `crypto/rand` — no user-controllable serials. + +#### Issues + +**25. ~~Missing `list-certs` REST route~~ RESOLVED** + +Added `GET /v1/sshca/{mount}/certs` to the REST endpoints table and route registration code block. API sync restored. + +**26. ~~KRL section type description contradicts pseudocode~~ RESOLVED** + +Fixed the description block to use `KRL_SECTION_CERTIFICATES (0x01)` for the outer section type, matching the pseudocode and the OpenSSH `PROTOCOL.krl` spec. + +**27. ~~Policy check after certificate construction in sign-host~~ RESOLVED** + +Reordered both `sign-host` and `sign-user` flows to perform the policy check before generating the serial and building the certificate. Serial generation now only happens after authorization succeeds. + +### engines/transit.md + +#### Strengths + +- XChaCha20-Poly1305 (not ChaCha20-Poly1305) — correct for random nonce safety. +- All nonce sizes, hash algorithms, and signature encodings now specified. +- `trim-key` logic is detailed and safe (no-op when `min_decryption_version` is 1). +- Batch operations hold a read lock for atomicity with respect to key rotation. +- 500-item batch limit prevents resource exhaustion. + +#### Issues + +**28. ~~HMAC output not versioned — unverifiable after key rotation~~ RESOLVED** + +HMAC output now uses the same `metacrypt:v{version}:{base64}` format as ciphertext and signatures. Verification parses the version prefix, loads the corresponding key (subject to `min_decryption_version`), and uses `hmac.Equal` for constant-time comparison. + +**29. ~~`rewrap` policy action not specified~~ RESOLVED** + +`rewrap` and `batch-rewrap` now map to the `decrypt` action — rewrap internally decrypts and re-encrypts, so the caller must have decrypt permission. 
Batch variants map to the same action as their single counterparts. Documented in the authorization section. + +**30. ~~`max_key_versions` interaction with `min_decryption_version` unclear~~ RESOLVED** + +Added explicit `max_key_versions` behavior section. Pruning happens during `rotate-key` and only deletes versions strictly less than `min_decryption_version`. If the limit is exceeded but no eligible candidates remain, a warning is returned. This also resolves audit finding #16. + +**31. ~~Missing `get-public-key` REST route~~ RESOLVED** + +Added `GET /v1/transit/{mount}/keys/{name}/public-key` to the REST endpoints table and route registration code block. API sync restored. + +**32. ~~`exportable` flag with no export operation~~ RESOLVED** + +Removed the `exportable` flag from `create-key`. Transit's value proposition is that keys never leave the service. If export is needed for migration, a dedicated admin-only operation can be added later with audit logging. + +### engines/user.md + +#### Strengths + +- HKDF with per-recipient random salt — prevents wrapping key reuse across messages. +- AES-256-GCM for DEK wrapping (consistent with codebase, avoids new primitive). +- ECDH key agreement with info-string binding prevents key confusion. +- Explicit zeroization of all intermediate secrets documented. +- Envelope format includes salt per-recipient — correct for HKDF security. + +#### Issues + +**33. ~~Auto-provisioning creates keys for arbitrary usernames~~ RESOLVED** + +The encrypt flow now validates recipient usernames against MCIAS via `auth.ValidateUsername` before auto-provisioning. Non-existent usernames are rejected with an error, preventing barrier pollution. + +**34. ~~No recipient limit on encrypt~~ RESOLVED** + +Added a `maxRecipients = 100` limit. Requests exceeding this limit are rejected with `400 Bad Request` before any ECDH computation. + +**35. 
~~No re-encryption support for key rotation~~ RESOLVED** + +Added a `re-encrypt` operation that decrypts an envelope and re-encrypts it with current key pairs for all recipients. This enables safe key rotation: re-encrypt all stored envelopes first, then call `rotate-key`. Added to HandleRequest dispatch, gRPC service, REST endpoints, and route registration. + +**36. ~~`UserKeyConfig` type undefined~~ RESOLVED** + +Defined `UserKeyConfig` struct with `Algorithm`, `CreatedAt`, and `AutoProvisioned` fields in the in-memory state section. + +### Cross-Cutting Issues (Engine Designs) + +**37. ~~`adminOnlyOperations` name collision blocks user engine `rotate-key`~~ RESOLVED** + +Changed the `adminOnlyOperations` map from flat operation names to engine-type-qualified keys (`engineType:operation`, e.g. `"transit:rotate-key"`). The generic endpoint now resolves the mount's engine type via `GetMount` before checking the map. Added tests verifying that `rotate-key` on a user mount succeeds for non-admin users while `rotate-key` on a transit mount correctly requires admin. + +**38. ~~`engine.ZeroizeKey` helper prerequisite not cross-referenced~~ RESOLVED** + +Added prerequisite step to both transit and user implementation steps referencing `engines/sshca.md` step 1 for the `engine.ZeroizeKey` shared helper. 
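The engine-type-qualified lookup described in #37 can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `mountTypes` map is a hypothetical stand-in for the `GetMount` lookup, the mount names are invented, and `requiresAdmin` is an assumed helper name.

```go
package main

import "fmt"

// adminOnlyOperations is keyed by "engineType:operation" so that the same
// operation name can carry different requirements on different engines.
var adminOnlyOperations = map[string]bool{
	"transit:rotate-key":        true,
	"transit:update-key-config": true,
}

// mountTypes is a hypothetical stand-in for resolving a mount's engine type
// (the audit text says this goes through GetMount).
var mountTypes = map[string]string{
	"transit-main": "transit",
	"user-main":    "user",
}

// requiresAdmin resolves the mount's engine type first, then checks the
// qualified key. Unknown mounts are reported as errors rather than allowed.
func requiresAdmin(mount, operation string) (bool, error) {
	engineType, ok := mountTypes[mount]
	if !ok {
		return false, fmt.Errorf("unknown mount %q", mount)
	}
	return adminOnlyOperations[engineType+":"+operation], nil
}

func main() {
	for _, mount := range []string{"transit-main", "user-main"} {
		admin, err := requiresAdmin(mount, "rotate-key")
		fmt.Println(mount, admin, err)
	}
}
```

With flat operation names, `rotate-key` on the user mount would have matched the transit entry. The qualified key is what lets the two engines differ.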
+ +--- + ## Priority Summary | Priority | Issue | Location | @@ -151,6 +248,7 @@ A compromised or malicious user token could issue unlimited encrypt/decrypt/sign | ~~**Critical**~~ | ~~#4 — Policy auth contradiction (admin vs user)~~ **RESOLVED** | ARCHITECTURE.md | | ~~**Critical**~~ | ~~#9 — User-controllable SSH cert serials~~ **RESOLVED** | sshca.md | | ~~**Critical**~~ | ~~#13 — Policy path collision (`ca/` vs `sshca/`)~~ **RESOLVED** | sshca.md | +| ~~**Critical**~~ | ~~#37 — `adminOnlyOperations` name collision blocks user `rotate-key`~~ **RESOLVED** | Cross-cutting | | ~~**High**~~ | ~~#5 — No path AAD in barrier encryption~~ **RESOLVED** | ARCHITECTURE.md | | ~~**High**~~ | ~~#12 — No KRL distribution for SSH revocation~~ **RESOLVED** | sshca.md | | ~~**High**~~ | ~~#15 — No min key version for transit rotation~~ **RESOLVED** | transit.md | @@ -158,12 +256,25 @@ A compromised or malicious user token could issue unlimited encrypt/decrypt/sign | ~~**High**~~ | ~~#11 — `critical_options` not restricted~~ **RESOLVED** | sshca.md | | ~~**High**~~ | ~~#6 — Single MEK with no rotation~~ **RESOLVED** | ARCHITECTURE.md | | ~~**High**~~ | ~~#22 — No forward secrecy / per-engine DEKs~~ **RESOLVED** | Cross-cutting | +| ~~**High**~~ | ~~#28 — HMAC output not versioned~~ **RESOLVED** | transit.md | +| ~~**High**~~ | ~~#30 — `max_key_versions` vs `min_decryption_version` unclear~~ **RESOLVED** | transit.md | +| ~~**High**~~ | ~~#33 — Auto-provision creates keys for arbitrary usernames~~ **RESOLVED** | user.md | | ~~**Medium**~~ | ~~#2 — Token cache revocation gap~~ **ACCEPTED** | ARCHITECTURE.md | | ~~**Medium**~~ | ~~#3 — Admin all-or-nothing access~~ **ACCEPTED** | ARCHITECTURE.md | | ~~**Medium**~~ | ~~#8 — Unseal rate limit resets on restart~~ **ACCEPTED** | ARCHITECTURE.md | | ~~**Medium**~~ | ~~#20 — `decrypt` mapped to `read` action~~ **RESOLVED** | transit.md | | ~~**Medium**~~ | ~~#24 — No CSRF protection for web UI~~ **RESOLVED** | ARCHITECTURE.md | +| 
~~**Medium**~~ | ~~#25 — Missing `list-certs` REST route~~ **RESOLVED** | sshca.md | +| ~~**Medium**~~ | ~~#26 — KRL section type description error~~ **RESOLVED** | sshca.md | +| ~~**Medium**~~ | ~~#27 — Policy check after cert construction~~ **RESOLVED** | sshca.md | +| ~~**Medium**~~ | ~~#29 — `rewrap` policy action not specified~~ **RESOLVED** | transit.md | +| ~~**Medium**~~ | ~~#31 — Missing `get-public-key` REST route~~ **RESOLVED** | transit.md | +| ~~**Medium**~~ | ~~#34 — No recipient limit on encrypt~~ **RESOLVED** | user.md | | ~~**Low**~~ | ~~#1 — TLS 1.2 vs 1.3~~ **RESOLVED** | ARCHITECTURE.md | | ~~**Low**~~ | ~~#19 — No batch transit operations~~ **RESOLVED** | transit.md | | ~~**Low**~~ | ~~#18 — HMAC/sign semantic confusion~~ **RESOLVED** | transit.md | | ~~**Medium**~~ | ~~#23 — Generic endpoint bypasses typed route middleware~~ **RESOLVED** | Cross-cutting | +| ~~**Low**~~ | ~~#32 — `exportable` flag with no export operation~~ **RESOLVED** | transit.md | +| ~~**Low**~~ | ~~#35 — No re-encryption support for user key rotation~~ **RESOLVED** | user.md | +| ~~**Low**~~ | ~~#36 — `UserKeyConfig` type undefined~~ **RESOLVED** | user.md | +| ~~**Low**~~ | ~~#38 — `ZeroizeKey` prerequisite not cross-referenced~~ **RESOLVED** | Cross-cutting | diff --git a/docs/engineering-standards.md b/docs/engineering-standards.md index 6f97a86..99d19c8 100644 --- a/docs/engineering-standards.md +++ b/docs/engineering-standards.md @@ -1,12 +1,31 @@ # Metacircular Dynamics — Engineering Standards +Source: https://metacircular.net/roam/20260314210051-metacircular_dynamics.html + This document describes the standard repository layout, tooling, and software -development lifecycle (SDLC) for services built at Metacircular Dynamics. It is -derived from the conventions established in Metacrypt and codifies them as the -baseline for all new and existing services. +development lifecycle (SDLC) for services built at Metacircular Dynamics. 
It +incorporates the platform-wide project guidelines and codifies the conventions +established in Metacrypt as the baseline for all services. + +## Platform Rules + +These four rules apply to every Metacircular service: + +1. **Data Storage**: All service data goes in `/srv//` to enable + straightforward migration across systems. +2. **Deployment Architecture**: Services require systemd unit files but + prioritize container-first design to support deployment via the + Metacircular Control Plane (MCP). +3. **Identity Management**: Services must integrate with MCIAS (Metacircular + Identity and Access Service) for user management and access control. Three + role levels: `admin` (full administrative access), `user` (full + non-administrative access), `guest` (service-dependent restrictions). +4. **API Design**: Services expose both gRPC and REST interfaces, kept in + sync. Web UIs are built with htmx. ## Table of Contents +0. [Platform Rules](#platform-rules) 1. [Repository Layout](#repository-layout) 2. [Language & Toolchain](#language--toolchain) 3. 
[Build System](#build-system) @@ -559,10 +578,35 @@ Services handle `SIGINT` and `SIGTERM`, shutting down cleanly: | File | Purpose | Audience | |------|---------|----------| +| `README.md` | Project overview, quick-start, and contributor guide | Everyone | | `CLAUDE.md` | AI-assisted development context | Claude Code | | `ARCHITECTURE.md` | Full system specification | Engineers | +| `RUNBOOK.md` | Operational procedures and incident response | Operators | | `deploy/examples/.toml` | Example configuration | Operators | +### Suggested Files + +These are not required for every project but should be created where applicable: + +| File | When to Include | Purpose | +|------|-----------------|---------| +| `AUDIT.md` | Services handling cryptography, secrets, PII, or auth | Security audit findings with issue tracking and resolution status | +| `POLICY.md` | Services with fine-grained access control | Policy engine documentation: rule structure, evaluation algorithm, resource paths, action classification, common patterns | + +### README.md + +The README is the front door. A new engineer or user should be able to +understand what the service does and get it running from this file alone. +It should contain: + +- Project name and one-paragraph description. +- Quick-start instructions (build, configure, run). +- Link to `ARCHITECTURE.md` for full technical details. +- Link to `RUNBOOK.md` for operational procedures. +- License and contribution notes (if applicable). + +Keep it concise. The README is not the spec — that's `ARCHITECTURE.md`. + ### CLAUDE.md This file provides context for AI-assisted development. It should contain: @@ -596,6 +640,56 @@ This is the canonical specification for the service. It should cover: This document is the source of truth. When the code and the spec disagree, one of them has a bug. +### RUNBOOK.md + +The runbook is written for operators, not developers. It covers what to do +when things go wrong and how to perform routine maintenance. 
It should +contain: + +1. **Service overview** — what the service does, in one paragraph. +2. **Health checks** — how to verify the service is healthy (endpoints, + CLI commands, expected responses). +3. **Common operations** — start, stop, restart, seal/unseal, backup, + restore, log inspection. +4. **Alerting** — what alerts exist, what they mean, and how to respond. +5. **Incident procedures** — step-by-step playbooks for known failure + modes (database corruption, certificate expiry, MCIAS outage, disk + full, etc.). +6. **Escalation** — when and how to escalate beyond the runbook. + +Write runbook entries as numbered steps, not prose. An operator at 3 AM +should be able to follow them without thinking. + +### AUDIT.md (Suggested) + +For services that handle cryptography, secrets, PII, or authentication, +maintain a security audit log. Each finding gets a numbered entry with: + +- Description of the issue. +- Severity (critical, high, medium, low). +- Resolution status: open, resolved (with summary), or accepted (with + rationale for accepting the risk). + +The priority summary table at the bottom provides a scannable overview. +Resolved and accepted items are struck through but retained for history. +See Metacrypt's `AUDIT.md` for the reference format. + +### POLICY.md (Suggested) + +For services with a policy engine or fine-grained access control, document +the policy model separately from the architecture spec. It should cover: + +- Rule structure (fields, types, semantics). +- Evaluation algorithm (match logic, priority, default effect). +- Resource path conventions and glob patterns. +- Action classification. +- API endpoints for policy CRUD. +- Common policy patterns with examples. +- Role summary (what each MCIAS role gets by default). + +This document is aimed at administrators who need to write policy rules, +not engineers who need to understand the implementation. 
+ ### Engine/Feature Design Documents For services with a modular architecture, each module gets its own design diff --git a/engines/sshca.md b/engines/sshca.md index b88da69..6b531a6 100644 --- a/engines/sshca.md +++ b/engines/sshca.md @@ -17,11 +17,13 @@ Passed as `config` at mount time: | Field | Default | Description | |-----------------|------------------|------------------------------------------| -| `key_algorithm` | `"ed25519"` | CA key type: ed25519, ecdsa, rsa | -| `key_size` | `0` | Key size (ignored for ed25519; 256/384/521 for ECDSA, 2048/4096 for RSA) | +| `key_algorithm` | `"ed25519"` | CA key type: `ed25519`, `ecdsa-p256`, `ecdsa-p384` | | `max_ttl` | `"87600h"` | Maximum certificate validity | | `default_ttl` | `"24h"` | Default certificate validity | +RSA is intentionally excluded — Ed25519 and ECDSA are preferred for SSH CAs. +This avoids the need for a `key_size` parameter and simplifies key generation. + ## Barrier Storage Layout ``` @@ -30,19 +32,20 @@ engine/sshca/{mount}/ca/key.pem CA private key (PEM, PKCS8) engine/sshca/{mount}/ca/pubkey.pub CA public key (SSH authorized_keys format) engine/sshca/{mount}/profiles/{name}.json Signing profiles engine/sshca/{mount}/certs/{serial}.json Signed cert records -engine/sshca/{mount}/krl.bin Current KRL (OpenSSH format) +engine/sshca/{mount}/krl_version.json KRL version counter ``` ## In-Memory State ```go type SSHCAEngine struct { - barrier barrier.Barrier - config *SSHCAConfig - caKey crypto.PrivateKey // CA signing key - caSigner ssh.Signer // ssh.Signer wrapping caKey - mountPath string - mu sync.RWMutex + barrier barrier.Barrier + config *SSHCAConfig + caKey crypto.PrivateKey // CA signing key + caSigner ssh.Signer // ssh.Signer wrapping caKey + mountPath string + krlVersion uint64 // monotonically increasing + mu sync.RWMutex } ``` @@ -52,20 +55,34 @@ Key material (`caKey`, `caSigner`) is zeroized on `Seal()`. ### Initialize -1. Parse and store config in barrier as `config.json`. -2. 
Generate CA key pair using the configured algorithm. -3. Store private key PEM and SSH public key in barrier. -4. Load key into memory as `ssh.Signer`. +1. Parse and validate config: ensure `key_algorithm` is one of `ed25519`, + `ecdsa-p256`, `ecdsa-p384`. Parse `max_ttl` and `default_ttl` as + `time.Duration`. +2. Store config in barrier as `{mountPath}config.json`. +3. Generate CA key pair: + - `ed25519`: `ed25519.GenerateKey(rand.Reader)` + - `ecdsa-p256`: `ecdsa.GenerateKey(elliptic.P256(), rand.Reader)` + - `ecdsa-p384`: `ecdsa.GenerateKey(elliptic.P384(), rand.Reader)` +4. Marshal private key to PEM using `x509.MarshalPKCS8PrivateKey` → + `pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der})`. +5. Store private key PEM in barrier at `{mountPath}ca/key.pem`. +6. Generate SSH public key via `ssh.NewPublicKey(pubKey)`, marshal with + `ssh.MarshalAuthorizedKey`. Store at `{mountPath}ca/pubkey.pub`. +7. Load key into memory: `ssh.NewSignerFromKey(caKey)` → `caSigner`. +8. Initialize `krlVersion` to 0, store in barrier. ### Unseal -1. Load config from barrier. -2. Load CA private key from barrier, parse into `crypto.PrivateKey`. -3. Wrap as `ssh.Signer`. +1. Load config JSON from barrier, unmarshal into `*SSHCAConfig`. +2. Load `{mountPath}ca/key.pem` from barrier, decode PEM, parse with + `x509.ParsePKCS8PrivateKey` → `caKey`. +3. Create `caSigner` via `ssh.NewSignerFromKey(caKey)`. +4. Load `krl_version.json` from barrier → `krlVersion`. ### Seal -1. Zeroize `caKey` (same `zeroizeKey` helper used by CA engine). +1. Zeroize `caKey` using the shared `zeroizeKey` helper (see Implementation + References below). 2. Nil out `caSigner`, `config`. ## Operations @@ -85,6 +102,43 @@ Key material (`caKey`, `caSigner`) is zeroized on `Seal()`. 
| `revoke-cert` | Admin | Revoke a certificate (soft flag) | | `delete-cert` | Admin | Delete a certificate record | +### HandleRequest dispatch + +Follow the CA engine's pattern (`internal/engine/ca/ca.go:284-317`): + +```go +func (e *SSHCAEngine) HandleRequest(ctx context.Context, req *engine.Request) (*engine.Response, error) { + switch req.Operation { + case "get-ca-pubkey": + return e.handleGetCAPublicKey(ctx) + case "sign-host": + return e.handleSignHost(ctx, req) + case "sign-user": + return e.handleSignUser(ctx, req) + case "create-profile": + return e.handleCreateProfile(ctx, req) + case "update-profile": + return e.handleUpdateProfile(ctx, req) + case "get-profile": + return e.handleGetProfile(ctx, req) + case "list-profiles": + return e.handleListProfiles(ctx, req) + case "delete-profile": + return e.handleDeleteProfile(ctx, req) + case "get-cert": + return e.handleGetCert(ctx, req) + case "list-certs": + return e.handleListCerts(ctx, req) + case "revoke-cert": + return e.handleRevokeCert(ctx, req) + case "delete-cert": + return e.handleDeleteCert(ctx, req) + default: + return nil, fmt.Errorf("sshca: unknown operation: %s", req.Operation) + } +} +``` + ### sign-host Request data: @@ -97,15 +151,30 @@ Request data: | `extensions` | No | Map of extensions to include | Flow: -1. Authenticate caller (`IsUser()`); admins bypass policy/ownership checks. -2. Parse the supplied SSH public key. -3. Generate a 64-bit serial using `crypto/rand`. -4. Build `ssh.Certificate` with `CertType: ssh.HostCert`, principals, validity, serial. -5. Policy check: `sshca/{mount}/id/{hostname}` for each principal, with ownership - rules (same as CA engine — hostname not held by another user's active cert). -6. Sign with `caSigner`. -7. Store `CertRecord` in barrier (certificate bytes, metadata; **no private key**). -8. Return signed certificate in OpenSSH format. +1. Authenticate caller (`req.CallerInfo.IsUser()`); admins bypass policy checks. +2. 
Parse the supplied SSH public key with `ssh.ParseAuthorizedKey(...)`, which already returns an `ssh.PublicKey`. +3. Parse TTL: if provided, parse as `time.Duration`, cap at `config.MaxTTL`. + If not provided, use `config.DefaultTTL`. +4. Policy check: for each hostname, check policy on + `sshca/{mount}/id/{hostname}`, action `sign`. Use `req.CheckPolicy`. + Fail early before generating a serial or building the cert. +5. Generate a 64-bit serial: `var buf [8]byte; rand.Read(buf[:]); serial := binary.BigEndian.Uint64(buf[:])`. +6. Build `ssh.Certificate`: + ```go + cert := &ssh.Certificate{ + Key: parsedPubKey, + Serial: serial, + CertType: ssh.HostCert, + KeyId: fmt.Sprintf("host:%s:%d", hostnames[0], serial), + ValidPrincipals: hostnames, + ValidAfter: uint64(time.Now().Unix()), + ValidBefore: uint64(time.Now().Add(ttl).Unix()), + Permissions: ssh.Permissions{Extensions: extensions}, + } + ``` +7. Sign: `cert.SignCert(rand.Reader, e.caSigner)`. +8. Store `CertRecord` in barrier at `{mountPath}certs/{serial}.json`. +9. Return: `{"certificate": ssh.MarshalAuthorizedKey(cert), "serial": serial}`. ### sign-user @@ -126,20 +195,26 @@ setting security-sensitive options like `force-command` or `source-address`. Flow: 1. Authenticate caller (`IsUser()`); admins bypass. 2. Parse the supplied SSH public key. -3. If `profile` is specified, load the signing profile and check policy - (`sshca/{mount}/profile/{profile_name}`, action `read`). Merge the +3. If `profile` is specified, load the signing profile from barrier and check + policy (`sshca/{mount}/profile/{profile_name}`, action `read`). Merge the profile's critical options and extensions into the certificate. Any extensions in the request are merged with profile extensions; conflicts - are resolved in favor of the profile. -4. Generate a 64-bit serial using `crypto/rand`. -5. Build `ssh.Certificate` with `CertType: ssh.UserCert`, principals, validity, serial. -6. If the profile specifies `max_ttl`, enforce it (cap the requested TTL). -7. 
Policy check: `sshca/{mount}/id/{principal}` for each principal. + are resolved in favor of the profile. If the profile specifies + `allowed_principals`, verify all requested principals are in the list. +4. If the profile specifies `max_ttl`, enforce it (cap the requested TTL). +5. Policy check: `sshca/{mount}/id/{principal}` for each principal, action `sign`. Default rule: a user can only sign certs for their own username as principal, - unless a policy grants access to other principals. -8. Sign with `caSigner`. -9. Store `CertRecord` in barrier (includes profile name if used). -10. Return signed certificate in OpenSSH format. + unless a policy grants access to other principals. Implement by checking + `req.CallerInfo.Username == principal` as the default-allow case. + Fail early before generating a serial or building the cert. +6. Generate a 64-bit serial using `crypto/rand`. +7. Build `ssh.Certificate` with `CertType: ssh.UserCert`, principals, validity. +8. Set `Permissions.CriticalOptions` from profile (if any) and + `Permissions.Extensions` from merged extensions. Default extensions when + none specified: `{"permit-pty": ""}`. +9. Sign with `caSigner`. +10. Store `CertRecord` in barrier (includes profile name if used). +11. Return signed certificate in OpenSSH format + serial. ### Signing Profiles @@ -151,11 +226,11 @@ options, and access to each profile is policy-gated. ```go type SigningProfile struct { - Name string `json:"name"` - CriticalOptions map[string]string `json:"critical_options"` // e.g. {"force-command": "/usr/bin/rsync", "source-address": "10.0.0.0/8"} - Extensions map[string]string `json:"extensions"` // merged with request extensions - MaxTTL string `json:"max_ttl,omitempty"` // overrides engine max_ttl if shorter - AllowedPrincipals []string `json:"allowed_principals,omitempty"` // if set, restricts principals + Name string `json:"name"` + CriticalOptions map[string]string `json:"critical_options"` // e.g. 
{"force-command": "/usr/bin/rsync", "source-address": "10.0.0.0/8"} + Extensions map[string]string `json:"extensions"` // merged with request extensions + MaxTTL string `json:"max_ttl,omitempty"` // overrides engine max_ttl if shorter + AllowedPrincipals []string `json:"allowed_principals,omitempty"` // if set, restricts principals } ``` @@ -165,16 +240,6 @@ type SigningProfile struct { engine/sshca/{mount}/profiles/{name}.json ``` -#### Operations - -| Operation | Auth Required | Description | -|------------------|---------------|------------------------------------------| -| `create-profile` | Admin | Create a signing profile | -| `update-profile` | Admin | Update a signing profile | -| `get-profile` | User/Admin | Get profile details | -| `list-profiles` | User/Admin | List available profiles | -| `delete-profile` | Admin | Delete a signing profile | - #### Policy Gating Access to a profile is controlled via policy on resource @@ -194,7 +259,9 @@ type CertRecord struct { Serial uint64 `json:"serial"` CertType string `json:"cert_type"` // "host" or "user" Principals []string `json:"principals"` - CertData string `json:"cert_data"` // OpenSSH format + CertData string `json:"cert_data"` // OpenSSH authorized_keys format + KeyID string `json:"key_id"` // certificate KeyId field + Profile string `json:"profile,omitempty"` // signing profile used (if any) IssuedBy string `json:"issued_by"` IssuedAt time.Time `json:"issued_at"` ExpiresAt time.Time `json:"expires_at"` @@ -204,31 +271,71 @@ type CertRecord struct { } ``` +Serial is stored as `uint64` (not string) since SSH certificate serials are +uint64 natively. Barrier path uses the decimal string representation: +`fmt.Sprintf("%d", serial)`. + ## Key Revocation List (KRL) SSH servers cannot query Metacrypt in real time to check whether a certificate -has been revoked. 
Instead, the SSH CA engine generates an OpenSSH-format KRL -(Key Revocation List) that SSH servers fetch periodically and reference via -`RevokedKeys` in `sshd_config`. +has been revoked. Instead, the SSH CA engine generates a KRL that SSH servers +fetch periodically and reference via `RevokedKeys` in `sshd_config`. -### KRL Generation +### KRL Generation — Custom Implementation -The engine maintains a KRL in memory, rebuilt whenever a certificate is revoked -or deleted. The KRL is a binary blob in OpenSSH KRL format -(`golang.org/x/crypto/ssh` provides marshalling helpers), containing: +**Important**: `golang.org/x/crypto/ssh` does **not** provide KRL generation +helpers. It can parse KRLs but not build them. The engine must implement KRL +serialization directly per the OpenSSH KRL format specification +(`PROTOCOL.krl` in the OpenSSH source). -- **Serial revocations**: Revoked certificate serial numbers, keyed to the CA - public key. This is the most compact representation. -- **KRL version**: Monotonically increasing counter, incremented on each - rebuild. SSH servers can use this to detect stale KRLs. -- **Generated-at timestamp**: Included in the KRL for freshness checking. +The KRL format is a binary structure: -The KRL is stored in the barrier at `engine/sshca/{mount}/krl.bin` and cached -in memory. It is rebuilt on: -- `revoke-cert` — adds the serial to the KRL. -- `delete-cert` — if the cert was revoked, the KRL is regenerated from all - remaining revoked certs. -- Engine unseal — loaded from barrier into memory. +``` +MAGIC = uint64 0x5353484b524c0a00 ("SSHKRL\n\0", 8 bytes) +VERSION = uint32 (format version, always 1) +KRL_VERSION = uint64 (monotonically increasing per rebuild) +GENERATED_DATE = uint64 (Unix timestamp) +FLAGS = uint64 (0) +RESERVED = string (empty) +COMMENT = string (empty) +SECTIONS... 
(one or more typed sections) +``` + +For serial-based revocation (the simplest and most compact representation): + +``` +Section type: KRL_SECTION_CERTIFICATES (0x01) + CA key blob: caSigner.PublicKey().Marshal() (SSH wire format, not authorized_keys text) + Subsection type: KRL_SECTION_CERT_SERIAL_LIST (0x20) + Revoked serials: sorted list of uint64 serials +``` + +Implement as a `buildKRL` function: + +```go +func (e *SSHCAEngine) buildKRL(revokedSerials []uint64) []byte { + // 1. Sort serials. + // 2. Write MAGIC header and format VERSION (1). + // 3. Write KRL_VERSION (e.krlVersion), GENERATED_DATE (now), FLAGS (0). + // 4. Write RESERVED (empty string), COMMENT (empty string). + // 5. Write section header: type=0x01 (KRL_SECTION_CERTIFICATES). + // 6. Write CA public key blob. + // 7. Write subsection: type=0x20 (KRL_SECTION_CERT_SERIAL_LIST), + // followed by each serial as uint64 big-endian. + // 8. Return assembled bytes. +} +``` + +Use `encoding/binary` with `binary.BigEndian` for all integer encoding. +SSH strings are length-prefixed: `uint32(len) + bytes`. + +The KRL version counter is persisted in barrier at `{mountPath}krl_version.json` +and incremented on each rebuild. On unseal, the counter is loaded from barrier. + +The KRL is rebuilt (not stored in barrier — it's a derived artifact) on: +- `revoke-cert` — collects all revoked serials, rebuilds. +- `delete-cert` — if the cert was revoked, rebuilds from remaining revoked certs. +- Engine unseal — rebuilds from all revoked certs. ### Distribution @@ -237,13 +344,12 @@ unauthenticated endpoint (analogous to the public CA key endpoint): | Method | Path | Description | |--------|-------------------------------------|--------------------------------| -| GET | `/v1/sshca/{mount}/krl` | Current KRL (binary, OpenSSH format) | +| GET | `/v1/sshca/{mount}/krl` | Current KRL (binary) | The response includes: - `Content-Type: application/octet-stream` -- `ETag` header derived from the KRL version, enabling conditional fetches. 
-- `Cache-Control: max-age=60` to encourage periodic refresh without - overwhelming the server. +- `ETag` header: `fmt.Sprintf("%d", e.krlVersion)`, enabling conditional fetches. +- `Cache-Control: max-age=60` to encourage periodic refresh. SSH servers should be configured to fetch the KRL on a cron schedule (e.g. every 1–5 minutes) and write it to a local file referenced by `sshd_config`: @@ -252,19 +358,6 @@ every 1–5 minutes) and write it to a local file referenced by `sshd_config`: RevokedKeys /etc/ssh/metacrypt_krl ``` -A helper script or systemd timer can fetch the KRL: - -```bash -curl -s -o /etc/ssh/metacrypt_krl \ - https://metacrypt.example.com:8443/v1/sshca/ssh/krl -``` - -### Operations - -| Operation | Auth Required | Description | -|------------|---------------|----------------------------------------------| -| `get-krl` | None | Return the current KRL in OpenSSH format | - ## gRPC Service (proto/metacrypt/v2/sshca.proto) ```protobuf @@ -292,7 +385,7 @@ Public (unseal required, no auth): | Method | Path | Description | |--------|-------------------------------------|--------------------------------| | GET | `/v1/sshca/{mount}/ca` | CA public key (SSH format) | -| GET | `/v1/sshca/{mount}/krl` | Current KRL (OpenSSH format) | +| GET | `/v1/sshca/{mount}/krl` | Current KRL (binary) | Typed endpoints (auth required): @@ -305,12 +398,96 @@ Typed endpoints (auth required): | GET | `/v1/sshca/{mount}/profiles/{name}` | Get profile | | PUT | `/v1/sshca/{mount}/profiles/{name}` | Update profile | | DELETE | `/v1/sshca/{mount}/profiles/{name}` | Delete profile | +| GET | `/v1/sshca/{mount}/certs` | List cert records | | GET | `/v1/sshca/{mount}/cert/{serial}` | Get cert record | | POST | `/v1/sshca/{mount}/cert/{serial}/revoke` | Revoke cert | | DELETE | `/v1/sshca/{mount}/cert/{serial}` | Delete cert record | +### REST Route Registration + +Add to `internal/server/routes.go` in `registerRoutes`, following the CA +engine's pattern with `chi.URLParam`: + +```go 
+// SSH CA public routes (no auth, unseal required). +r.Get("/v1/sshca/{mount}/ca", s.requireUnseal(s.handleSSHCAPublicKey)) +r.Get("/v1/sshca/{mount}/krl", s.requireUnseal(s.handleSSHCAKRL)) + +// SSH CA typed routes (auth required). +r.Post("/v1/sshca/{mount}/sign-host", s.requireAuth(s.handleSSHCASignHost)) +r.Post("/v1/sshca/{mount}/sign-user", s.requireAuth(s.handleSSHCASignUser)) +r.Post("/v1/sshca/{mount}/profiles", s.requireAdmin(s.handleSSHCACreateProfile)) +r.Get("/v1/sshca/{mount}/profiles", s.requireAuth(s.handleSSHCAListProfiles)) +r.Get("/v1/sshca/{mount}/profiles/{name}", s.requireAuth(s.handleSSHCAGetProfile)) +r.Put("/v1/sshca/{mount}/profiles/{name}", s.requireAdmin(s.handleSSHCAUpdateProfile)) +r.Delete("/v1/sshca/{mount}/profiles/{name}", s.requireAdmin(s.handleSSHCADeleteProfile)) +r.Get("/v1/sshca/{mount}/certs", s.requireAuth(s.handleSSHCAListCerts)) +r.Get("/v1/sshca/{mount}/cert/{serial}", s.requireAuth(s.handleSSHCAGetCert)) +r.Post("/v1/sshca/{mount}/cert/{serial}/revoke", s.requireAdmin(s.handleSSHCARevokeCert)) +r.Delete("/v1/sshca/{mount}/cert/{serial}", s.requireAdmin(s.handleSSHCADeleteCert)) +``` + +Each handler extracts `chi.URLParam(r, "mount")`, builds an `engine.Request` +with the appropriate operation name and data, and calls +`s.engines.HandleRequest(...)`. Follow the `handleGetCert`/`handleRevokeCert` +pattern in the existing code. + All operations are also accessible via the generic `POST /v1/engine/request`. 
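The `ETag` and `Cache-Control` behaviour specified for the KRL endpoint can be sketched with the standard library alone. This is a minimal illustration, not the engine's handler: `krlHandler`, `krlETag`, and `fetchStatus` are hypothetical names, the KRL version and bytes are passed in rather than read from engine state, and only an exact `If-None-Match` match is handled (no tag lists, weak tags, or `*`). The sketch also wraps the version number in double quotes, since HTTP entity-tags are quoted strings; the format string in the spec shows only the bare number.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// krlETag formats a KRL version as a quoted HTTP entity-tag.
func krlETag(version uint64) string {
	return fmt.Sprintf("\"%d\"", version)
}

// krlHandler is a hypothetical stand-in for handleSSHCAKRL. It serves the
// KRL bytes, or 304 Not Modified when the client already holds this version.
func krlHandler(version uint64, krl []byte) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		etag := krlETag(version)
		w.Header().Set("ETag", etag)
		w.Header().Set("Cache-Control", "max-age=60")
		// Exact-match comparison only; a full implementation would also
		// accept comma-separated tag lists.
		if r.Header.Get("If-None-Match") == etag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("Content-Type", "application/octet-stream")
		w.Write(krl)
	}
}

// fetchStatus performs an in-process GET and returns the status code.
func fetchStatus(h http.Handler, ifNoneMatch string) int {
	req := httptest.NewRequest(http.MethodGet, "/v1/sshca/ssh/krl", nil)
	if ifNoneMatch != "" {
		req.Header.Set("If-None-Match", ifNoneMatch)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := krlHandler(7, []byte("krl-bytes"))
	fmt.Println(fetchStatus(h, ""))         // first fetch, no cached copy
	fmt.Println(fetchStatus(h, krlETag(7))) // client already has version 7
	fmt.Println(fetchStatus(h, krlETag(6))) // client holds a stale version
}
```

A cron-driven fetcher that remembers the last `ETag` can then skip rewriting the local KRL file whenever the server answers 304.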
+### gRPC Interceptor Maps + +Add to `sealRequiredMethods`, `authRequiredMethods`, and `adminRequiredMethods` +in `internal/grpcserver/server.go`: + +```go +// sealRequiredMethods: +"/metacrypt.v2.SSHCAService/GetCAPublicKey": true, +"/metacrypt.v2.SSHCAService/SignHost": true, +"/metacrypt.v2.SSHCAService/SignUser": true, +"/metacrypt.v2.SSHCAService/CreateProfile": true, +"/metacrypt.v2.SSHCAService/UpdateProfile": true, +"/metacrypt.v2.SSHCAService/GetProfile": true, +"/metacrypt.v2.SSHCAService/ListProfiles": true, +"/metacrypt.v2.SSHCAService/DeleteProfile": true, +"/metacrypt.v2.SSHCAService/GetCert": true, +"/metacrypt.v2.SSHCAService/ListCerts": true, +"/metacrypt.v2.SSHCAService/RevokeCert": true, +"/metacrypt.v2.SSHCAService/DeleteCert": true, +"/metacrypt.v2.SSHCAService/GetKRL": true, + +// authRequiredMethods (all except GetCAPublicKey and GetKRL): +"/metacrypt.v2.SSHCAService/SignHost": true, +"/metacrypt.v2.SSHCAService/SignUser": true, +"/metacrypt.v2.SSHCAService/CreateProfile": true, +"/metacrypt.v2.SSHCAService/UpdateProfile": true, +"/metacrypt.v2.SSHCAService/GetProfile": true, +"/metacrypt.v2.SSHCAService/ListProfiles": true, +"/metacrypt.v2.SSHCAService/DeleteProfile": true, +"/metacrypt.v2.SSHCAService/GetCert": true, +"/metacrypt.v2.SSHCAService/ListCerts": true, +"/metacrypt.v2.SSHCAService/RevokeCert": true, +"/metacrypt.v2.SSHCAService/DeleteCert": true, + +// adminRequiredMethods: +"/metacrypt.v2.SSHCAService/CreateProfile": true, +"/metacrypt.v2.SSHCAService/UpdateProfile": true, +"/metacrypt.v2.SSHCAService/DeleteProfile": true, +"/metacrypt.v2.SSHCAService/RevokeCert": true, +"/metacrypt.v2.SSHCAService/DeleteCert": true, +``` + +Also add SSH CA operations to `adminOnlyOperations` in `routes.go` (keys are +`engineType:operation` to avoid cross-engine name collisions): + +```go +// SSH CA engine. 
+"sshca:create-profile": true, +"sshca:update-profile": true, +"sshca:delete-profile": true, +"sshca:revoke-cert": true, +"sshca:delete-cert": true, +``` + ## Web UI Add to `/dashboard` the ability to mount an SSH CA engine. @@ -322,20 +499,54 @@ Add an `/sshca` page (or section on the existing PKI page) displaying: ## Implementation Steps -1. **`internal/engine/sshca/`** — Implement `SSHCAEngine` (types, lifecycle, - operations). Reuse `zeroizeKey` from `internal/engine/ca/` (move to shared - helper or duplicate). -2. **Register factory** in `cmd/metacrypt/main.go`: - `registry.RegisterFactory(engine.EngineTypeSSHCA, sshca.NewSSHCAEngine)`. -3. **Proto definitions** — `proto/metacrypt/v2/sshca.proto`, run `make proto`. -4. **gRPC handlers** — `internal/grpcserver/sshca.go`. -5. **REST routes** — Add to `internal/server/routes.go`. -6. **Web UI** — Add template + webserver routes. -7. **Tests** — Unit tests with in-memory barrier following the CA test pattern. +1. **Move `zeroizeKey` to shared location**: Copy the `zeroizeKey` function + from `internal/engine/ca/ca.go` (lines 1481–1498) to a new file + `internal/engine/helpers.go` in the `engine` package. Export it as + `engine.ZeroizeKey`. Update the CA engine to call `engine.ZeroizeKey` + instead of its local copy. This avoids a circular import (sshca cannot + import ca). + +2. **`internal/engine/sshca/`** — Implement `SSHCAEngine`: + - `types.go` — `SSHCAConfig`, `CertRecord`, `SigningProfile` structs. + - `sshca.go` — `NewSSHCAEngine` factory, lifecycle methods (`Type`, + `Initialize`, `Unseal`, `Seal`), `HandleRequest` dispatch. + - `sign.go` — `handleSignHost`, `handleSignUser`. + - `profiles.go` — Profile CRUD handlers. + - `certs.go` — `handleGetCert`, `handleListCerts`, `handleRevokeCert`, + `handleDeleteCert`. + - `krl.go` — `buildKRL`, `rebuildKRL`, `handleGetKRL`, + `collectRevokedSerials`. + +3. 
**Register factory** in `cmd/metacrypt/server.go` (line 76): + ```go + engineRegistry.RegisterFactory(engine.EngineTypeSSHCA, sshca.NewSSHCAEngine) + ``` + +4. **Proto definitions** — `proto/metacrypt/v2/sshca.proto`, run `make proto`. + +5. **gRPC handlers** — `internal/grpcserver/sshca.go`. Follow + `internal/grpcserver/ca.go` pattern: `sshcaServer` struct wrapping + `GRPCServer`, helper function for error mapping, typed RPC methods. + Register with `pb.RegisterSSHCAServiceServer(s.srv, &sshcaServer{s: s})` + in `server.go`. + +6. **REST routes** — Add to `internal/server/routes.go` per the route + registration section above. + +7. **Tests** — `internal/engine/sshca/sshca_test.go`: unit tests with + in-memory barrier following the CA test pattern. Test: + - Initialize + unseal lifecycle + - sign-host: valid signing, TTL enforcement, serial uniqueness + - sign-user: own-principal default, profile merging, profile TTL cap + - Profile CRUD + - Certificate list/get/revoke/delete + - KRL rebuild correctness (revoked serials present, unrevoked absent) + - Seal zeroizes key material ## Dependencies - `golang.org/x/crypto/ssh` (already in `go.mod` via transitive deps) +- `encoding/binary` (stdlib, for KRL serialization) ## Security Considerations @@ -351,3 +562,19 @@ Add an `/sshca` page (or section on the existing PKI page) displaying: prevents unprivileged users from bypassing `sshd_config` restrictions. - Profile access is policy-gated: a user must have policy access to `sshca/{mount}/profile/{name}` to use a profile. +- RSA keys are excluded to reduce attack surface and simplify the implementation. 
+ +## Implementation References + +These existing code patterns should be followed exactly: + +| Pattern | Reference File | Lines | +|---------|---------------|-------| +| HandleRequest switch dispatch | `internal/engine/ca/ca.go` | 284–317 | +| zeroizeKey helper | `internal/engine/ca/ca.go` | 1481–1498 | +| CertRecord storage (JSON in barrier) | `internal/engine/ca/ca.go` | cert storage pattern | +| REST route registration with chi | `internal/server/routes.go` | 38–50 | +| gRPC handler structure | `internal/grpcserver/ca.go` | full file | +| gRPC interceptor maps | `internal/grpcserver/server.go` | 107–192 | +| Engine factory registration | `cmd/metacrypt/server.go` | 76 | +| adminOnlyOperations map | `internal/server/routes.go` | 259–279 | diff --git a/engines/transit.md b/engines/transit.md index 003925f..ea379d5 100644 --- a/engines/transit.md +++ b/engines/transit.md @@ -38,7 +38,7 @@ The transit engine manages **named encryption keys**. Each key has: | Type | Algorithm | Operations | |-----------------|-------------------|------------------| | `aes256-gcm` | AES-256-GCM | Encrypt, Decrypt | -| `chacha20-poly` | ChaCha20-Poly1305 | Encrypt, Decrypt | +| `chacha20-poly` | XChaCha20-Poly1305 | Encrypt, Decrypt | | `ed25519` | Ed25519 | Sign, Verify | | `ecdsa-p256` | ECDSA P-256 | Sign, Verify | | `ecdsa-p384` | ECDSA P-384 | Sign, Verify | @@ -49,6 +49,71 @@ RSA key types are intentionally excluded. The transit engine is not the right place for RSA — asymmetric encryption belongs in the user engine (via ECDH), and RSA signing offers no advantage over Ed25519/ECDSA for this use case. +### Cryptographic Details + +**Nonce sizes:** +- `aes256-gcm`: 12-byte nonce via `cipher.AEAD.NonceSize()` (standard GCM). +- `chacha20-poly`: 24-byte nonce via `chacha20poly1305.NewX()` (XChaCha20- + Poly1305). The `X` variant is used specifically because it has a large + enough nonce (192-bit) for safe random generation without birthday-bound + concerns. 
Use `chacha20poly1305.NonceSizeX` (24). + +**Nonce generation:** Always `crypto/rand.Read(nonce)`. Never use a counter — +keys may be used concurrently from multiple goroutines. + +**Signing algorithms:** +- `ed25519`: Direct Ed25519 signing (`ed25519.Sign`). The input is the raw + message — Ed25519 performs its own internal SHA-512 hashing. No prehash. +- `ecdsa-p256`: SHA-256 hash of input, then `ecdsa.SignASN1(rand, key, + hash)`. Signature is ASN.1 DER encoded (the standard Go representation). +- `ecdsa-p384`: SHA-384 hash of input, then `ecdsa.SignASN1(rand, key, + hash)`. Signature is ASN.1 DER encoded. + +The `algorithm` field in sign requests is currently unused (reserved for +future prehash options). Each key type has exactly one hash algorithm; there +is no caller choice. + +**Signature format:** +``` +metacrypt:v{version}:{base64(signature_bytes)} +``` +The `v{version}` identifies which key version was used for signing. For +Ed25519, `signature_bytes` is the raw 64-byte signature. For ECDSA, +`signature_bytes` is the ASN.1 DER encoding. + +**Verification:** `verify` parses the version from the signature string, loads +the corresponding public key version, and calls `ed25519.Verify` or +`ecdsa.VerifyASN1` as appropriate. + +**HMAC:** `hmac-sha256` uses `hmac.New(sha256.New, key)`, `hmac-sha512` uses +`hmac.New(sha512.New, key)`. Output uses the same versioned prefix format as +ciphertext and signatures: + +``` +metacrypt:v{version}:{base64(mac_bytes)} +``` + +The `v{version}` identifies which HMAC key version produced the MAC. This is +essential for HMAC verification after key rotation — without the version +prefix, the engine would not know which key version to use for recomputation. +HMAC verification parses the version, loads the corresponding key (subject to +`min_decryption_version` enforcement), recomputes the MAC, and compares using +`hmac.Equal` for constant-time comparison. + +**Key material sizes:** +- `aes256-gcm`: 32 bytes (`crypto/rand`). 
+- `chacha20-poly`: 32 bytes (`crypto/rand`). +- `ed25519`: `ed25519.GenerateKey(rand.Reader)` — 64-byte private key. +- `ecdsa-p256`: `ecdsa.GenerateKey(elliptic.P256(), rand.Reader)`. +- `ecdsa-p384`: `ecdsa.GenerateKey(elliptic.P384(), rand.Reader)`. +- `hmac-sha256`: 32 bytes (`crypto/rand`). +- `hmac-sha512`: 64 bytes (`crypto/rand`). + +**Key serialization in barrier:** +- Symmetric keys: raw bytes. +- Ed25519: `ed25519.PrivateKey` raw bytes (64 bytes). +- ECDSA: PKCS8 DER via `x509.MarshalPKCS8PrivateKey`. + ### Key Rotation Each key has a current version and may retain older versions. Encryption always @@ -67,6 +132,26 @@ lets operators complete a rotation cycle: Until `min_decryption_version` is advanced, old versions must be retained. +### `max_key_versions` Behavior + +When `max_key_versions` is set (> 0), the engine enforces a soft limit on the +number of retained versions. Pruning happens automatically during `rotate-key`, +after the new version is created: + +1. Count total versions. If `<= max_key_versions`, no pruning needed. +2. Identify candidate versions for pruning: versions **strictly less than** + `min_decryption_version`. +3. Delete candidates (oldest first) until the total count is within the limit + or no more candidates remain. +4. If the total still exceeds `max_key_versions` after pruning all eligible + candidates, include a warning in the response: + `"warning": "max_key_versions exceeded; advance min_decryption_version to enable pruning"`. + +This ensures `max_key_versions` **never** deletes a version at or above +`min_decryption_version`. The operator must complete the rotation cycle +(rotate → rewrap → advance min) before old versions become prunable. +`max_key_versions` is a safety net, not a foot-gun. + ### Ciphertext Format Transit ciphertexts use a versioned prefix: @@ -114,19 +199,32 @@ type keyVersion struct { ### Initialize -1. Parse and store config in barrier. -2. 
No keys are created at init time (keys are created on demand). +1. Parse and validate config: parse `max_key_versions` as integer (must be ≥ 0). +2. Store config in barrier as `{mountPath}config.json`: + ```go + configJSON, _ := json.Marshal(config) + barrier.Put(ctx, mountPath+"config.json", configJSON) + ``` +3. No keys are created at init time (keys are created on demand via + `create-key`). ### Unseal -1. Load config from barrier. -2. Discover and load all named keys and their versions from the barrier. +1. Load config JSON from barrier, unmarshal into `*TransitConfig`. +2. List all key directories under `{mountPath}keys/`. +3. For each key, load `config.json` and all `v{N}.key` entries: + - Symmetric keys (`aes256-gcm`, `chacha20-poly`, `hmac-*`): raw 32-byte + or 64-byte key material. + - Ed25519: `ed25519.PrivateKey` (64 bytes), derive public key. + - ECDSA: parse PKCS8 DER → `*ecdsa.PrivateKey`, extract `PublicKey`. +4. Populate `keys` map with all loaded key states. ### Seal -1. Zeroize all key material (symmetric keys overwritten with zeros, - asymmetric keys via `zeroizeKey`). -2. Nil out all maps. +1. Zeroize all key material: symmetric keys overwritten with zeros via + `crypto.Zeroize(key)`, asymmetric keys via `engine.ZeroizeKey(privKey)` + (shared helper, see sshca.md Implementation References). +2. Nil out `keys` map and `config`. 
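The `max_key_versions` pruning rule described under Key Rotation can be expressed as a pure function, which makes the "never delete at or above `min_decryption_version`" guarantee easy to unit-test. The sketch below is an assumption about shape only: `pruneVersions` is a hypothetical helper that works on plain version numbers and leaves barrier deletion and zeroization to the caller.

```go
package main

import (
	"fmt"
	"sort"
)

// pruneVersions returns the versions that auto-pruning may delete and
// whether the max_key_versions warning should be attached to the response.
// Only versions strictly below minDecryptionVersion are candidates.
func pruneVersions(versions []int, minDecryptionVersion, maxKeyVersions int) (pruned []int, warn bool) {
	// Limit disabled, or already within the limit: nothing to do.
	if maxKeyVersions <= 0 || len(versions) <= maxKeyVersions {
		return nil, false
	}
	sorted := append([]int(nil), versions...)
	sort.Ints(sorted) // oldest first

	remaining := len(sorted)
	for _, v := range sorted {
		// Stop once within the limit or once candidates are exhausted.
		if remaining <= maxKeyVersions || v >= minDecryptionVersion {
			break
		}
		pruned = append(pruned, v)
		remaining--
	}
	return pruned, remaining > maxKeyVersions
}

func main() {
	// Rotation cycle completed: versions 1 and 2 are below the minimum.
	fmt.Println(pruneVersions([]int{1, 2, 3, 4, 5}, 3, 3))
	// Minimum never advanced: nothing is eligible, so a warning is due.
	fmt.Println(pruneVersions([]int{1, 2, 3, 4, 5}, 1, 3))
}
```

Keeping the decision separate from the deletion lets `rotate-key` compute the list under its lock and then reuse the same per-version deletion steps that `trim-key` follows.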
## Operations @@ -150,6 +248,53 @@ type keyVersion struct { | `hmac` | User+Policy | Compute HMAC with an HMAC key | | `get-public-key` | User/Admin | Get public key for asymmetric keys | +### HandleRequest dispatch + +Follow the CA engine's pattern (`internal/engine/ca/ca.go:284-317`): + +```go +func (e *TransitEngine) HandleRequest(ctx context.Context, req *engine.Request) (*engine.Response, error) { + switch req.Operation { + case "create-key": + return e.handleCreateKey(ctx, req) + case "delete-key": + return e.handleDeleteKey(ctx, req) + case "get-key": + return e.handleGetKey(ctx, req) + case "list-keys": + return e.handleListKeys(ctx, req) + case "rotate-key": + return e.handleRotateKey(ctx, req) + case "update-key-config": + return e.handleUpdateKeyConfig(ctx, req) + case "trim-key": + return e.handleTrimKey(ctx, req) + case "encrypt": + return e.handleEncrypt(ctx, req) + case "decrypt": + return e.handleDecrypt(ctx, req) + case "rewrap": + return e.handleRewrap(ctx, req) + case "batch-encrypt": + return e.handleBatchEncrypt(ctx, req) + case "batch-decrypt": + return e.handleBatchDecrypt(ctx, req) + case "batch-rewrap": + return e.handleBatchRewrap(ctx, req) + case "sign": + return e.handleSign(ctx, req) + case "verify": + return e.handleVerify(ctx, req) + case "hmac": + return e.handleHmac(ctx, req) + case "get-public-key": + return e.handleGetPublicKey(ctx, req) + default: + return nil, fmt.Errorf("transit: unknown operation: %s", req.Operation) + } +} +``` + ### create-key Request data: @@ -158,9 +303,14 @@ Request data: |-------------------|----------|----------------|----------------------------------| | `name` | Yes | | Key name | | `type` | Yes | | Key type (see table above) | -| `exportable` | No | `false` | Whether raw key material can be exported | | `allow_deletion` | No | `false` | Whether key can be deleted | +The `exportable` flag has been intentionally omitted. 
Transit's value +proposition is that keys never leave the service — all cryptographic operations +happen server-side. If key export is ever needed (e.g., for migration), a +dedicated admin-only export operation can be added with appropriate audit +logging. + The key is created at version 1 with `min_decryption_version` = 1. ### encrypt @@ -199,11 +349,12 @@ Request data: |-------------|----------|--------------------------------------------| | `key` | Yes | Named key (Ed25519 or ECDSA type) | | `input` | Yes | Base64-encoded data to sign | -| `algorithm` | No | Hash algorithm (default varies by key type) | +| `algorithm` | No | Reserved for future prehash options (currently ignored) | -The engine rejects `sign` requests for HMAC key types with an error. +The engine rejects `sign` requests for HMAC and symmetric key types with an +error. Only Ed25519 and ECDSA keys are accepted. -Response: `{ "signature": "metacrypt:v1:..." }` +Response: `{ "signature": "metacrypt:v{version}:...", "key_version": N }` ### verify @@ -236,9 +387,9 @@ exceed the current version (you must always be able to decrypt with the latest). ### trim-key -Admin-only. Permanently deletes key versions older than `min_decryption_version`. -This is irreversible — ciphertext encrypted with trimmed versions can never be -decrypted. +Admin-only. Permanently deletes key versions **strictly less than** +`min_decryption_version`. This is irreversible — ciphertext encrypted with +trimmed versions can never be decrypted. Request data: @@ -246,6 +397,24 @@ Request data: |-------|----------|-------------| | `key` | Yes | Named key | +Deletion logic: +1. Load the key's `min_decryption_version` (must be > 1, otherwise no-op). +2. Enumerate all version files: `{mountPath}keys/{name}/v{N}.key`. +3. For each version `N` where `N < min_decryption_version`: + - Zeroize the in-memory key material (`crypto.Zeroize` for symmetric, + `engine.ZeroizeKey` for asymmetric). 
+ - Delete the version from the barrier: `barrier.Delete(ctx, versionPath)`. + - Remove from the in-memory `versions` map. +4. Return the list of trimmed version numbers. + +If `min_decryption_version` is 1 (the default), trim-key is a no-op and +returns an empty list. This ensures you cannot accidentally trim all versions +without first explicitly advancing the minimum. + +The current version is **never** trimmable — `min_decryption_version` cannot +exceed the current version (enforced by `update-key-config`), so the latest +version is always retained. + Response: `{ "trimmed_versions": [1, 2, ...] }` ## Batch Operations @@ -346,13 +515,25 @@ Each result: | `reference` | Echoed from the request item (if provided) | | `error` | Error message on failure, empty on success | +### Batch Size Limits + +Each batch request is limited to **500 items**. Requests exceeding this limit +are rejected before processing with a `400 Bad Request` / `InvalidArgument` +error. This prevents a single request from monopolizing the engine's lock and +memory. + +The limit is a compile-time constant (`maxBatchSize = 500`) in the engine +package. It can be tuned if needed but should not be exposed as user- +configurable — it exists as a safety valve, not a feature. + ### Implementation Notes Batch operations are handled inside the transit engine's `HandleRequest` as three additional operation cases (`batch-encrypt`, `batch-decrypt`, `batch-rewrap`). No changes to the `Engine` interface are needed. The engine -loops over items internally, loading the key once and reusing it for all items -in the batch. +acquires a read lock once, loads the key once, and processes all items in the +batch while holding the lock. This ensures atomicity with respect to key +rotation (all items in a batch use the same key version). The `reference` field is opaque to the engine — it allows callers to correlate results with their source records (e.g. 
a database row ID) without maintaining @@ -367,6 +548,10 @@ Follows the same model as the CA engine: `encrypt`, `decrypt`, `sign`, `verify`, `hmac` for cryptographic operations; `read` for metadata (get-key, list-keys, get-public-key); `write` for management (create-key, delete-key, rotate-key, update-key-config, trim-key). + `rewrap` maps to the `decrypt` action — rewrap internally decrypts with the + old version and re-encrypts with the latest, so the caller must have decrypt + permission. Batch variants (`batch-encrypt`, `batch-decrypt`, `batch-rewrap`) + map to the same action as their single counterparts. The `any` action matches all of the above (but never `admin`). - No ownership concept (transit keys are shared resources); access is purely policy-based. @@ -417,48 +602,151 @@ All auth required: | POST | `/v1/transit/{mount}/sign/{key}` | Sign | | POST | `/v1/transit/{mount}/verify/{key}` | Verify | | POST | `/v1/transit/{mount}/hmac/{key}` | HMAC | +| GET | `/v1/transit/{mount}/keys/{name}/public-key` | Get public key | All operations are also accessible via the generic `POST /v1/engine/request`. +### REST Route Registration + +Add to `internal/server/routes.go` in `registerRoutes`, following the CA +engine's pattern with `chi.URLParam`: + +```go +// Transit key management routes (admin). +r.Post("/v1/transit/{mount}/keys", s.requireAdmin(s.handleTransitCreateKey)) +r.Get("/v1/transit/{mount}/keys", s.requireAuth(s.handleTransitListKeys)) +r.Get("/v1/transit/{mount}/keys/{name}", s.requireAuth(s.handleTransitGetKey)) +r.Delete("/v1/transit/{mount}/keys/{name}", s.requireAdmin(s.handleTransitDeleteKey)) +r.Post("/v1/transit/{mount}/keys/{name}/rotate", s.requireAdmin(s.handleTransitRotateKey)) +r.Patch("/v1/transit/{mount}/keys/{name}/config", s.requireAdmin(s.handleTransitUpdateKeyConfig)) +r.Post("/v1/transit/{mount}/keys/{name}/trim", s.requireAdmin(s.handleTransitTrimKey)) + +// Transit crypto operations (auth + policy). 
+r.Post("/v1/transit/{mount}/encrypt/{key}", s.requireAuth(s.handleTransitEncrypt)) +r.Post("/v1/transit/{mount}/decrypt/{key}", s.requireAuth(s.handleTransitDecrypt)) +r.Post("/v1/transit/{mount}/rewrap/{key}", s.requireAuth(s.handleTransitRewrap)) +r.Post("/v1/transit/{mount}/batch/encrypt/{key}", s.requireAuth(s.handleTransitBatchEncrypt)) +r.Post("/v1/transit/{mount}/batch/decrypt/{key}", s.requireAuth(s.handleTransitBatchDecrypt)) +r.Post("/v1/transit/{mount}/batch/rewrap/{key}", s.requireAuth(s.handleTransitBatchRewrap)) +r.Post("/v1/transit/{mount}/sign/{key}", s.requireAuth(s.handleTransitSign)) +r.Post("/v1/transit/{mount}/verify/{key}", s.requireAuth(s.handleTransitVerify)) +r.Post("/v1/transit/{mount}/hmac/{key}", s.requireAuth(s.handleTransitHmac)) +r.Get("/v1/transit/{mount}/keys/{name}/public-key", s.requireAuth(s.handleTransitGetPublicKey)) +``` + +Each handler extracts `chi.URLParam(r, "mount")` and `chi.URLParam(r, "key")` +or `chi.URLParam(r, "name")`, builds an `engine.Request`, and calls +`s.engines.HandleRequest(...)`. 
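Before dispatching, each crypto handler also needs the policy action for its operation, following the mapping in the Authorization section. One way to keep that mapping in a single place is a small lookup function. `policyAction` below is a hypothetical helper, not an existing function in the codebase; the operation and action names are taken from the spec.

```go
package main

import "fmt"

// policyAction maps a transit operation name to the policy action that
// must be granted. The second return value is false for unknown operations.
func policyAction(op string) (string, bool) {
	switch op {
	case "encrypt", "batch-encrypt":
		return "encrypt", true
	case "decrypt", "batch-decrypt":
		return "decrypt", true
	// Rewrap decrypts internally, so it requires decrypt permission.
	case "rewrap", "batch-rewrap":
		return "decrypt", true
	case "sign", "verify", "hmac":
		return op, true
	case "get-key", "list-keys", "get-public-key":
		return "read", true
	case "create-key", "delete-key", "rotate-key", "update-key-config", "trim-key":
		return "write", true
	default:
		return "", false
	}
}

func main() {
	for _, op := range []string{"batch-rewrap", "get-public-key", "trim-key", "export-key"} {
		action, ok := policyAction(op)
		fmt.Printf("%s -> %q (known: %v)\n", op, action, ok)
	}
}
```

Returning `false` for unknown operations lets the caller reject the request before any policy evaluation, rather than falling through to a default action.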
+ +### gRPC Interceptor Maps + +Add to `sealRequiredMethods`, `authRequiredMethods`, and `adminRequiredMethods` +in `internal/grpcserver/server.go`: + +```go +// sealRequiredMethods — all transit RPCs: +"/metacrypt.v2.TransitService/CreateKey": true, +"/metacrypt.v2.TransitService/DeleteKey": true, +"/metacrypt.v2.TransitService/GetKey": true, +"/metacrypt.v2.TransitService/ListKeys": true, +"/metacrypt.v2.TransitService/RotateKey": true, +"/metacrypt.v2.TransitService/UpdateKeyConfig": true, +"/metacrypt.v2.TransitService/TrimKey": true, +"/metacrypt.v2.TransitService/Encrypt": true, +"/metacrypt.v2.TransitService/Decrypt": true, +"/metacrypt.v2.TransitService/Rewrap": true, +"/metacrypt.v2.TransitService/BatchEncrypt": true, +"/metacrypt.v2.TransitService/BatchDecrypt": true, +"/metacrypt.v2.TransitService/BatchRewrap": true, +"/metacrypt.v2.TransitService/Sign": true, +"/metacrypt.v2.TransitService/Verify": true, +"/metacrypt.v2.TransitService/Hmac": true, +"/metacrypt.v2.TransitService/GetPublicKey": true, + +// authRequiredMethods — all transit RPCs: +"/metacrypt.v2.TransitService/CreateKey": true, +"/metacrypt.v2.TransitService/DeleteKey": true, +"/metacrypt.v2.TransitService/GetKey": true, +"/metacrypt.v2.TransitService/ListKeys": true, +"/metacrypt.v2.TransitService/RotateKey": true, +"/metacrypt.v2.TransitService/UpdateKeyConfig": true, +"/metacrypt.v2.TransitService/TrimKey": true, +"/metacrypt.v2.TransitService/Encrypt": true, +"/metacrypt.v2.TransitService/Decrypt": true, +"/metacrypt.v2.TransitService/Rewrap": true, +"/metacrypt.v2.TransitService/BatchEncrypt": true, +"/metacrypt.v2.TransitService/BatchDecrypt": true, +"/metacrypt.v2.TransitService/BatchRewrap": true, +"/metacrypt.v2.TransitService/Sign": true, +"/metacrypt.v2.TransitService/Verify": true, +"/metacrypt.v2.TransitService/Hmac": true, +"/metacrypt.v2.TransitService/GetPublicKey": true, + +// adminRequiredMethods — admin-only transit RPCs: +"/metacrypt.v2.TransitService/CreateKey": true, 
+"/metacrypt.v2.TransitService/DeleteKey": true, +"/metacrypt.v2.TransitService/RotateKey": true, +"/metacrypt.v2.TransitService/UpdateKeyConfig": true, +"/metacrypt.v2.TransitService/TrimKey": true, +``` + +The `adminOnlyOperations` map in `routes.go` already contains transit entries +(qualified as `transit:create-key`, `transit:delete-key`, etc. — keys are +`engineType:operation` to avoid cross-engine name collisions). + ## Web UI Add to `/dashboard` the ability to mount a transit engine. Add a `/transit` page displaying: -- Named key list with metadata (type, version, created, exportable) +- Named key list with metadata (type, version, created, allow_deletion) - Key detail view with version history - Encrypt/decrypt form for interactive testing - Key rotation button (admin) ## Implementation Steps -1. **`internal/engine/transit/`** — Implement `TransitEngine`: +1. **Prerequisite**: `engine.ZeroizeKey` must exist in + `internal/engine/helpers.go` (created as part of the SSH CA engine + implementation — see `engines/sshca.md` step 1). + +2. **`internal/engine/transit/`** — Implement `TransitEngine`: - `types.go` — Config, KeyConfig, key version types. - `transit.go` — Lifecycle (Initialize, Unseal, Seal, HandleRequest). - `encrypt.go` — Encrypt/Decrypt/Rewrap operations. - `sign.go` — Sign/Verify/HMAC operations. - `keys.go` — Key management (create, delete, rotate, list, get). -2. **Register factory** in `cmd/metacrypt/main.go`. -3. **Proto definitions** — `proto/metacrypt/v2/transit.proto`, run `make proto`. -4. **gRPC handlers** — `internal/grpcserver/transit.go`. -5. **REST routes** — Add to `internal/server/routes.go`. -6. **Web UI** — Add template + webserver routes. -7. **Tests** — Unit tests for each operation, key rotation, rewrap correctness. +3. **Register factory** in `cmd/metacrypt/main.go`. +4. **Proto definitions** — `proto/metacrypt/v2/transit.proto`, run `make proto`. +5. **gRPC handlers** — `internal/grpcserver/transit.go`. +6. 
**REST routes** — Add to `internal/server/routes.go`. +7. **Web UI** — Add template + webserver routes. +8. **Tests** — Unit tests for each operation, key rotation, rewrap correctness. ## Dependencies -- `golang.org/x/crypto/chacha20poly1305` (for ChaCha20-Poly1305 key type) +- `golang.org/x/crypto/chacha20poly1305` (for XChaCha20-Poly1305 key type) - Standard library `crypto/aes`, `crypto/cipher`, `crypto/ecdsa`, - `crypto/ed25519`, `crypto/hmac`, `crypto/sha256`, `crypto/sha512` + `crypto/ed25519`, `crypto/hmac`, `crypto/sha256`, `crypto/sha512`, + `crypto/elliptic`, `crypto/x509`, `crypto/rand` ## Security Considerations - All key material encrypted at rest in the barrier, zeroized on seal. - Symmetric keys generated with `crypto/rand`. +- XChaCha20-Poly1305 used instead of ChaCha20-Poly1305 for its 192-bit nonce, + which is safe for random nonce generation at high volume (birthday bound at + 2^96 messages vs 2^48 for 96-bit nonces). +- Nonces are always random (`crypto/rand`), never counter-based, to avoid + nonce-reuse risks from concurrent access or crash recovery. - Ciphertext format includes version to support key rotation without data loss. -- `exportable` flag is immutable after creation — cannot be enabled later. -- `allow_deletion` is immutable after creation. +- Key export is not supported — transit keys never leave the service. +- `allow_deletion` is immutable after creation; `delete-key` returns an error + if `allow_deletion` is `false`. - `max_key_versions` pruning only removes old versions, never the current one. +- `trim-key` only deletes versions below `min_decryption_version`, and + `min_decryption_version` cannot exceed the current version. This guarantees + the current version is never trimmable. - Rewrap operation never exposes plaintext to the caller. - Context (AAD) binding prevents ciphertext from being used in a different context. 
- `min_decryption_version` enforces key rotation completion: once advanced, @@ -466,3 +754,25 @@ Add a `/transit` page displaying: - RSA key types are excluded to avoid padding scheme vulnerabilities (Bleichenbacher attacks on PKCS#1 v1.5). Asymmetric encryption belongs in the user engine; signing uses Ed25519/ECDSA. +- ECDSA signatures use ASN.1 DER encoding (Go's native format), not raw + concatenated (r,s) — this avoids signature malleability issues. +- Ed25519 signs raw messages (no prehash) — this is the standard Ed25519 + mode, not Ed25519ph, avoiding the collision resistance reduction. +- Batch operations enforce a 500-item limit to prevent resource exhaustion. +- Batch operations hold a read lock for the entire batch to ensure all items + use the same key version, preventing TOCTOU between key rotation and + encryption. + +## Implementation References + +These existing code patterns should be followed exactly: + +| Pattern | Reference File | Lines | +|---------|---------------|-------| +| HandleRequest switch dispatch | `internal/engine/ca/ca.go` | 284–317 | +| zeroizeKey helper | `internal/engine/ca/ca.go` | 1481–1498 | +| REST route registration with chi | `internal/server/routes.go` | 38–50 | +| gRPC handler structure | `internal/grpcserver/ca.go` | full file | +| gRPC interceptor maps | `internal/grpcserver/server.go` | 107–205 | +| Engine factory registration | `cmd/metacrypt/server.go` | 76 | +| adminOnlyOperations map | `internal/server/routes.go` | 265–285 | diff --git a/engines/user.md b/engines/user.md index e7ccfa2..bfeda16 100644 --- a/engines/user.md +++ b/engines/user.md @@ -54,6 +54,72 @@ encrypted in the barrier and is only used by the engine on behalf of the owning user (enforced in `HandleRequest`). The public key is available to any authenticated user (needed to encrypt messages to that user). +**Key generation by algorithm:** +- `x25519`: `ecdh.X25519().GenerateKey(rand.Reader)` (Go 1.20+ `crypto/ecdh`). 
+- `ecdh-p256`: `ecdh.P256().GenerateKey(rand.Reader)`. +- `ecdh-p384`: `ecdh.P384().GenerateKey(rand.Reader)`. + +**Key serialization in barrier:** +- Private key: `x509.MarshalPKCS8PrivateKey(privKey)` → PEM block with type + `"PRIVATE KEY"`. Stored at `{mountPath}users/{username}/priv.pem`. +- Public key: `x509.MarshalPKIXPublicKey(pubKey)` → PEM block with type + `"PUBLIC KEY"`. Stored at `{mountPath}users/{username}/pub.pem`. + +Note: `crypto/ecdh` keys implement the interfaces required by +`x509.MarshalPKCS8PrivateKey` and `x509.MarshalPKIXPublicKey` as of Go 1.20. + +### Cryptographic Details + +**ECDH key agreement:** +```go +sharedSecret, err := senderPrivKey.ECDH(recipientPubKey) +``` +The raw shared secret is **never used directly** as a key. It is always fed +through HKDF. + +**HKDF key derivation:** +```go +salt := make([]byte, 32) +rand.Read(salt) +hkdf := hkdf.New(sha256.New, sharedSecret, salt, info) +wrappingKey := make([]byte, 32) // 256-bit AES key +io.ReadFull(hkdf, wrappingKey) +``` + +- **Hash:** SHA-256 (sufficient for 256-bit key derivation). +- **Salt:** 32 bytes of `crypto/rand` randomness, generated fresh per + recipient per encryption. The salt is stored alongside the wrapped DEK in + the envelope (see updated envelope format below). +- **Info:** `"metacrypt-user-v1:" + sender + ":" + recipient` (UTF-8 encoded). + This binds the derived key to the specific sender-recipient pair, preventing + key confusion if the same shared secret were somehow reused. + +**DEK wrapping:** The wrapping key from HKDF encrypts the DEK using AES-256-GCM +(not AES Key Wrap / RFC 3394). AES-GCM is used because: +- It is already a core primitive in the codebase. +- It provides authenticated encryption, same as AES Key Wrap. +- The DEK is 32 bytes — well within GCM's plaintext size limits. 
+ +```go +block, _ := aes.NewCipher(wrappingKey) +gcm, _ := cipher.NewGCM(block) +nonce := make([]byte, gcm.NonceSize()) // 12 bytes +rand.Read(nonce) +wrappedDEK := gcm.Seal(nonce, nonce, dek, nil) // nonce || ciphertext || tag +``` + +**Symmetric encryption (payload):** +```go +block, _ := aes.NewCipher(dek) +gcm, _ := cipher.NewGCM(block) +nonce := make([]byte, gcm.NonceSize()) // 12 bytes +rand.Read(nonce) +ciphertext := gcm.Seal(nonce, nonce, plaintext, aad) +``` +- AAD: if `metadata` is provided, it is used as additional authenticated data. + This means metadata is integrity-protected but not encrypted. +- Nonce: 12 bytes from `crypto/rand`. + ### Encryption Flow (Sender → Recipient) 1. Sender calls `encrypt` with plaintext, recipient username(s), and optional @@ -82,13 +148,24 @@ authenticated user (needed to encrypt messages to that user). "sender": "alice", "sym_algorithm": "aes256-gcm", "ciphertext": "", + "metadata": "", "recipients": { - "bob": "", - "carol": "" + "bob": { + "salt": "", + "wrapped_dek": "" + }, + "carol": { + "salt": "", + "wrapped_dek": "" + } } } ``` +Each recipient entry includes: +- `salt`: the per-recipient random HKDF salt used during key derivation. +- `wrapped_dek`: the AES-256-GCM encryption of the DEK (nonce-prepended). + The envelope is base64-encoded as a single opaque blob for transport. ## Barrier Storage Layout @@ -116,24 +193,43 @@ type userState struct { pubKey crypto.PublicKey // key exchange public key config *UserKeyConfig } + +type UserKeyConfig struct { + Algorithm string `json:"algorithm"` // key exchange algorithm (x25519, ecdh-p256, ecdh-p384) + CreatedAt time.Time `json:"created_at"` + AutoProvisioned bool `json:"auto_provisioned"` // true if created via auto-provisioning +} ``` ## Lifecycle ### Initialize -1. Parse and store config in barrier. -2. No user keys are created at init time (created on demand or via `register`). +1. 
Parse and validate config: ensure `key_algorithm` is one of `x25519`, + `ecdh-p256`, `ecdh-p384`. Ensure `sym_algorithm` is `aes256-gcm`. +2. Store config in barrier as `{mountPath}config.json`: + ```go + configJSON, _ := json.Marshal(config) + barrier.Put(ctx, mountPath+"config.json", configJSON) + ``` +3. No user keys are created at init time (created on demand via `register`, + `provision`, or auto-provisioning). ### Unseal -1. Load config from barrier. -2. Discover and load all user key pairs from barrier. +1. Load config JSON from barrier, unmarshal into `*UserConfig`. +2. List all user directories under `{mountPath}users/`. +3. For each user, load `priv.pem` and `pub.pem`: + - Parse private key PEM: `pem.Decode` → `x509.ParsePKCS8PrivateKey` → + type-assert to `*ecdh.PrivateKey` (X25519), or to `*ecdsa.PrivateKey` + followed by `.ECDH()` (NIST curves, which the parser returns as ECDSA keys). + - Parse public key PEM: `pem.Decode` → `x509.ParsePKIXPublicKey` → + type-assert to `*ecdh.PublicKey` (X25519), or to `*ecdsa.PublicKey` + followed by `.ECDH()` (NIST curves). +4. Populate `users` map with loaded key states. ### Seal -1. Zeroize all private key material. -2. Nil out all maps. +1. Zeroize all private key material using `engine.ZeroizeKey(privKey)`. +2. Nil out `users` map and `config`.
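The key serialization format and the Unseal parsing step above can be sketched as a round trip. This is a minimal sketch, assuming Go 1.20+; `curveFor`, `marshalPrivateKey`, `parsePrivateKey`, and `roundTripKey` are hypothetical names, and barrier storage is left out. One detail worth noting: `x509.ParsePKCS8PrivateKey` returns `*ecdh.PrivateKey` only for X25519, while P-256 and P-384 keys come back as `*ecdsa.PrivateKey` and need an `.ECDH()` conversion.

```go
package main

import (
	"crypto/ecdh"
	"crypto/ecdsa"
	"crypto/rand"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// curveFor maps the spec's algorithm names to crypto/ecdh curves.
func curveFor(alg string) (ecdh.Curve, error) {
	switch alg {
	case "x25519":
		return ecdh.X25519(), nil
	case "ecdh-p256":
		return ecdh.P256(), nil
	case "ecdh-p384":
		return ecdh.P384(), nil
	}
	return nil, fmt.Errorf("user: unsupported key algorithm %q", alg)
}

// marshalPrivateKey encodes an ECDH private key as a PKCS#8 PEM block of
// type "PRIVATE KEY", matching the barrier format described in the spec.
func marshalPrivateKey(priv *ecdh.PrivateKey) ([]byte, error) {
	der, err := x509.MarshalPKCS8PrivateKey(priv)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der}), nil
}

// parsePrivateKey reverses marshalPrivateKey. X25519 keys parse directly to
// *ecdh.PrivateKey; NIST curve keys parse to *ecdsa.PrivateKey and are
// converted with ECDH().
func parsePrivateKey(pemBytes []byte) (*ecdh.PrivateKey, error) {
	block, _ := pem.Decode(pemBytes)
	if block == nil || block.Type != "PRIVATE KEY" {
		return nil, errors.New("user: invalid private key PEM")
	}
	key, err := x509.ParsePKCS8PrivateKey(block.Bytes)
	if err != nil {
		return nil, err
	}
	switch k := key.(type) {
	case *ecdh.PrivateKey:
		return k, nil
	case *ecdsa.PrivateKey:
		return k.ECDH()
	default:
		return nil, errors.New("user: stored key is not a key exchange key")
	}
}

// roundTripKey generates a key for alg, serializes it, parses it back, and
// reports whether the recovered key equals the original.
func roundTripKey(alg string) (bool, error) {
	curve, err := curveFor(alg)
	if err != nil {
		return false, err
	}
	priv, err := curve.GenerateKey(rand.Reader)
	if err != nil {
		return false, err
	}
	pemBytes, err := marshalPrivateKey(priv)
	if err != nil {
		return false, err
	}
	got, err := parsePrivateKey(pemBytes)
	if err != nil {
		return false, err
	}
	return priv.Equal(got), nil
}
```

The public key path would follow the same shape with `x509.MarshalPKIXPublicKey` and `x509.ParsePKIXPublicKey`, switching on `*ecdh.PublicKey` and `*ecdsa.PublicKey`.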
## Operations @@ -145,9 +241,41 @@ type userState struct { | `list-users` | User/Admin | List registered users | | `encrypt` | User+Policy | Encrypt data for one or more recipients | | `decrypt` | User (self) | Decrypt an envelope addressed to the caller | +| `re-encrypt` | User (self) | Re-encrypt an envelope with current key pairs | | `rotate-key` | User (self) | Rotate the caller's key pair | | `delete-user` | Admin | Remove a user's key pair | +### HandleRequest dispatch + +Follow the CA engine's pattern (`internal/engine/ca/ca.go:284-317`): + +```go +func (e *UserEngine) HandleRequest(ctx context.Context, req *engine.Request) (*engine.Response, error) { + switch req.Operation { + case "register": + return e.handleRegister(ctx, req) + case "provision": + return e.handleProvision(ctx, req) + case "get-public-key": + return e.handleGetPublicKey(ctx, req) + case "list-users": + return e.handleListUsers(ctx, req) + case "encrypt": + return e.handleEncrypt(ctx, req) + case "decrypt": + return e.handleDecrypt(ctx, req) + case "re-encrypt": + return e.handleReEncrypt(ctx, req) + case "rotate-key": + return e.handleRotateKey(ctx, req) + case "delete-user": + return e.handleDeleteUser(ctx, req) + default: + return nil, fmt.Errorf("user: unknown operation: %s", req.Operation) + } +} +``` + ### register Creates a key pair for the authenticated caller. No-op if the caller already @@ -177,14 +305,27 @@ Request data: | `metadata` | No | Arbitrary string metadata (authenticated) | Flow: -1. Caller must be provisioned (has a key pair). Auto-provision if not. -2. For each recipient without a keypair: auto-provision them. -3. Load sender's private key and each recipient's public key. -4. Generate random DEK, encrypt plaintext with DEK. -5. For each recipient: ECDH(sender_priv, recipient_pub) → shared_secret, - HKDF(shared_secret, salt, info) → wrapping_key, AES-KeyWrap(wrapping_key, - DEK) → wrapped_dek. -6. Build and return envelope. +1. 
Validate that `len(recipients) <= maxRecipients` (100). Reject with + `400 Bad Request` if exceeded. +2. Caller must be provisioned (has a key pair). If not, auto-provision the + caller (generate keypair, store in barrier). This is safe because the + caller is already authenticated via MCIAS — their identity is verified. +3. For each recipient without a keypair: validate the username exists in MCIAS + via `auth.ValidateUsername(username)`. If the user does not exist, return an + error: `"recipient not found: {username}"`. If the user exists, auto-provision + them. Auto-provisioning only creates a key pair; it does not grant any MCIAS + roles or permissions. The recipient's private key is only accessible when + they authenticate. +4. Load sender's private key and each recipient's public key. +5. Generate random 32-byte DEK (`crypto/rand`). Encrypt plaintext with DEK + using AES-256-GCM (metadata as AAD if present). +6. For each recipient: + - `sharedSecret := senderPrivKey.ECDH(recipientPubKey)` + - Generate 32-byte random salt. + - `wrappingKey := HKDF(sha256, sharedSecret, salt, info)` + - `wrappedDEK := AES-GCM-Encrypt(wrappingKey, DEK)` +7. Build envelope with ciphertext, per-recipient `{salt, wrapped_dek}`. +8. Zeroize DEK, all shared secrets, and all wrapping keys. Authorization: - Admins: grant-all. @@ -201,19 +342,48 @@ Request data: | `envelope` | Yes | Base64-encoded envelope blob | Flow: -1. Parse envelope, find the caller's wrapped DEK entry. -2. Load sender's public key and caller's private key. -3. ECDH(caller_priv, sender_pub) → shared_secret → wrapping_key → DEK. -4. Decrypt ciphertext with DEK. -5. Return plaintext. +1. Parse envelope JSON, find the caller's entry in `recipients`. + If the caller is not a recipient, return an error. +2. Load sender's public key (from `envelope.sender`) and caller's private key. +3. `sharedSecret := callerPrivKey.ECDH(senderPubKey)`. +4. `wrappingKey := HKDF(sha256, sharedSecret, recipient.salt, info)`. +5. 
`dek := AES-GCM-Decrypt(wrappingKey, recipient.wrapped_dek)`. +6. Decrypt ciphertext with DEK (metadata as AAD if present in envelope). +7. Zeroize DEK, shared secret, wrapping key. +8. Return plaintext. A user can only decrypt envelopes addressed to themselves. +### re-encrypt + +Re-encrypts an envelope with the caller's current key pair. This is the safe +way to migrate data before a key rotation. + +Request data: + +| Field | Required | Description | +|------------|----------|--------------------------------| +| `envelope` | Yes | Base64-encoded envelope blob | + +Flow: +1. Decrypt the envelope (same as `decrypt` flow). +2. Re-encrypt the plaintext for the same recipients using fresh DEKs and + current key pairs (same as `encrypt` flow, preserving metadata). +3. Return the new envelope. + +The caller must be a recipient in the original envelope. The new envelope uses +current key pairs for all recipients — if any recipient has rotated their key +since the original encryption, the new envelope uses their new public key. + ### rotate-key Generates a new key pair for the caller. The old private key is zeroized and deleted. Old envelopes encrypted with the previous key cannot be decrypted -after rotation — callers should re-encrypt any stored data before rotating. +after rotation. + +**Recommended workflow**: Before rotating, re-encrypt all stored envelopes +using the `re-encrypt` operation. Then call `rotate-key`. This ensures no +data is lost. 
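The per-recipient wrap and unwrap steps in the `encrypt` and `decrypt` flows above can be sketched end to end. This is a minimal sketch under stated assumptions, not the engine's implementation: all function names are hypothetical, HKDF-SHA256 is written out by hand with `crypto/hmac` for a single 32-byte output block so the example stays stdlib-only (the spec itself uses `golang.org/x/crypto/hkdf`), and zeroization of intermediate secrets is omitted for brevity.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/ecdh"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
)

// deriveWrappingKey is HKDF-SHA256 (RFC 5869) for one 32-byte output block:
// PRK = HMAC(salt, secret), OKM = HMAC(PRK, info || 0x01).
func deriveWrappingKey(sharedSecret, salt []byte, sender, recipient string) []byte {
	extract := hmac.New(sha256.New, salt)
	extract.Write(sharedSecret)
	prk := extract.Sum(nil)

	expand := hmac.New(sha256.New, prk)
	expand.Write([]byte("metacrypt-user-v1:" + sender + ":" + recipient))
	expand.Write([]byte{0x01})
	return expand.Sum(nil)
}

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// wrapDEK performs ECDH, derives a wrapping key with a fresh salt, and
// returns the salt plus nonce || ciphertext || tag.
func wrapDEK(senderPriv *ecdh.PrivateKey, recipientPub *ecdh.PublicKey,
	sender, recipient string, dek []byte) (salt, wrapped []byte, err error) {
	shared, err := senderPriv.ECDH(recipientPub)
	if err != nil {
		return nil, nil, err
	}
	salt = make([]byte, 32)
	if _, err = rand.Read(salt); err != nil {
		return nil, nil, err
	}
	gcm, err := newGCM(deriveWrappingKey(shared, salt, sender, recipient))
	if err != nil {
		return nil, nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err = rand.Read(nonce); err != nil {
		return nil, nil, err
	}
	return salt, gcm.Seal(nonce, nonce, dek, nil), nil
}

// unwrapDEK mirrors wrapDEK from the recipient's side, reusing the salt
// stored in the envelope.
func unwrapDEK(recipientPriv *ecdh.PrivateKey, senderPub *ecdh.PublicKey,
	sender, recipient string, salt, wrapped []byte) ([]byte, error) {
	shared, err := recipientPriv.ECDH(senderPub)
	if err != nil {
		return nil, err
	}
	gcm, err := newGCM(deriveWrappingKey(shared, salt, sender, recipient))
	if err != nil {
		return nil, err
	}
	if len(wrapped) < gcm.NonceSize() {
		return nil, errors.New("user: wrapped DEK too short")
	}
	nonce, ct := wrapped[:gcm.NonceSize()], wrapped[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

// wrapRoundTrip wraps a random DEK from alice to bob and unwraps it again.
func wrapRoundTrip() (bool, error) {
	alice, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return false, err
	}
	bob, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return false, err
	}
	dek := make([]byte, 32)
	if _, err := rand.Read(dek); err != nil {
		return false, err
	}
	salt, wrapped, err := wrapDEK(alice, bob.PublicKey(), "alice", "bob", dek)
	if err != nil {
		return false, err
	}
	got, err := unwrapDEK(bob, alice.PublicKey(), "alice", "bob", salt, wrapped)
	if err != nil {
		return false, err
	}
	return bytes.Equal(dek, got), nil
}
```

Because the sender and recipient names are part of the HKDF info string, unwrapping with a different name pair derives a different key and fails GCM authentication rather than returning a wrong DEK.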
## gRPC Service (proto/metacrypt/v2/user.proto) @@ -225,6 +395,7 @@ service UserService { rpc ListUsers(UserListUsersRequest) returns (UserListUsersResponse); rpc Encrypt(UserEncryptRequest) returns (UserEncryptResponse); rpc Decrypt(UserDecryptRequest) returns (UserDecryptResponse); + rpc ReEncrypt(UserReEncryptRequest) returns (UserReEncryptResponse); rpc RotateKey(UserRotateKeyRequest) returns (UserRotateKeyResponse); rpc DeleteUser(UserDeleteUserRequest) returns (UserDeleteUserResponse); } @@ -243,10 +414,70 @@ All auth required: | DELETE | `/v1/user/{mount}/keys/{username}` | Delete user (admin) | | POST | `/v1/user/{mount}/encrypt` | Encrypt for recipients | | POST | `/v1/user/{mount}/decrypt` | Decrypt envelope | +| POST | `/v1/user/{mount}/re-encrypt` | Re-encrypt envelope | | POST | `/v1/user/{mount}/rotate` | Rotate caller's key | All operations are also accessible via the generic `POST /v1/engine/request`. +### REST Route Registration + +Add to `internal/server/routes.go` in `registerRoutes`, following the CA +engine's pattern with `chi.URLParam`: + +```go +// User engine routes. +r.Post("/v1/user/{mount}/register", s.requireAuth(s.handleUserRegister)) +r.Post("/v1/user/{mount}/provision", s.requireAdmin(s.handleUserProvision)) +r.Get("/v1/user/{mount}/keys", s.requireAuth(s.handleUserListUsers)) +r.Get("/v1/user/{mount}/keys/{username}", s.requireAuth(s.handleUserGetPublicKey)) +r.Delete("/v1/user/{mount}/keys/{username}", s.requireAdmin(s.handleUserDeleteUser)) +r.Post("/v1/user/{mount}/encrypt", s.requireAuth(s.handleUserEncrypt)) +r.Post("/v1/user/{mount}/decrypt", s.requireAuth(s.handleUserDecrypt)) +r.Post("/v1/user/{mount}/re-encrypt", s.requireAuth(s.handleUserReEncrypt)) +r.Post("/v1/user/{mount}/rotate", s.requireAuth(s.handleUserRotateKey)) +``` + +Each handler extracts `chi.URLParam(r, "mount")` and optionally +`chi.URLParam(r, "username")`, builds an `engine.Request`, and calls +`s.engines.HandleRequest(...)`. 
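The handler shape described above (extract URL parameters, build a request, dispatch to the engine) might look roughly like the following. This is a hedged sketch only: `Request`, `dispatcher`, `urlParam`, `userHandler`, and `demoDecrypt` are hypothetical stand-ins for `engine.Request`, `s.engines.HandleRequest`, and `chi.URLParam`, whose real signatures are not shown here, and the error-to-status mapping is an assumption.

```go
package main

import (
	"context"
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Request is a stand-in for engine.Request; its fields are assumptions.
type Request struct {
	Mount     string
	Operation string
	Data      map[string]interface{}
}

// dispatcher stands in for s.engines.HandleRequest.
type dispatcher func(ctx context.Context, req *Request) (map[string]interface{}, error)

// urlParam stands in for chi.URLParam so the sketch needs no external router.
type urlParam func(r *http.Request, name string) string

// userHandler returns a handler for one user-engine operation. It decodes an
// optional JSON body, merges the {username} path parameter into the request
// data, and forwards the request to the dispatcher.
func userHandler(op string, param urlParam, dispatch dispatcher) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		data := map[string]interface{}{}
		if err := json.NewDecoder(r.Body).Decode(&data); err != nil && err != io.EOF {
			http.Error(w, "invalid JSON body", http.StatusBadRequest)
			return
		}
		if username := param(r, "username"); username != "" {
			data["username"] = username
		}
		resp, err := dispatch(r.Context(), &Request{
			Mount:     param(r, "mount"),
			Operation: op,
			Data:      data,
		})
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(resp)
	}
}

// demoDecrypt drives the handler with a fake router lookup and an echoing
// dispatcher, returning the recorded status code and body.
func demoDecrypt() (int, string) {
	param := func(_ *http.Request, name string) string {
		return map[string]string{"mount": "user"}[name]
	}
	echo := func(_ context.Context, req *Request) (map[string]interface{}, error) {
		return map[string]interface{}{"mount": req.Mount, "operation": req.Operation}, nil
	}
	body := strings.NewReader(`{"envelope":"placeholder"}`)
	r := httptest.NewRequest(http.MethodPost, "/v1/user/user/decrypt", body)
	w := httptest.NewRecorder()
	userHandler("decrypt", param, echo)(w, r)
	return w.Code, w.Body.String()
}
```

Passing the parameter lookup in as a function keeps the handler testable without a router; with chi the same closure would simply be `chi.URLParam`.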
+ +### gRPC Interceptor Maps + +Add to `sealRequiredMethods`, `authRequiredMethods`, and `adminRequiredMethods` +in `internal/grpcserver/server.go`: + +```go +// sealRequiredMethods — all user RPCs: +"/metacrypt.v2.UserService/Register": true, +"/metacrypt.v2.UserService/Provision": true, +"/metacrypt.v2.UserService/GetPublicKey": true, +"/metacrypt.v2.UserService/ListUsers": true, +"/metacrypt.v2.UserService/Encrypt": true, +"/metacrypt.v2.UserService/Decrypt": true, +"/metacrypt.v2.UserService/ReEncrypt": true, +"/metacrypt.v2.UserService/RotateKey": true, +"/metacrypt.v2.UserService/DeleteUser": true, + +// authRequiredMethods — all user RPCs: +"/metacrypt.v2.UserService/Register": true, +"/metacrypt.v2.UserService/Provision": true, +"/metacrypt.v2.UserService/GetPublicKey": true, +"/metacrypt.v2.UserService/ListUsers": true, +"/metacrypt.v2.UserService/Encrypt": true, +"/metacrypt.v2.UserService/Decrypt": true, +"/metacrypt.v2.UserService/ReEncrypt": true, +"/metacrypt.v2.UserService/RotateKey": true, +"/metacrypt.v2.UserService/DeleteUser": true, + +// adminRequiredMethods — admin-only user RPCs: +"/metacrypt.v2.UserService/Provision": true, +"/metacrypt.v2.UserService/DeleteUser": true, +``` + +The `adminOnlyOperations` map in `routes.go` already contains user entries +(qualified as `user:provision`, `user:delete-user` — keys are +`engineType:operation` to avoid cross-engine name collisions). + ## Web UI Add to `/dashboard` the ability to mount a user engine. @@ -260,33 +491,47 @@ Add a `/user-crypto` page displaying: ## Implementation Steps -1. **`internal/engine/user/`** — Implement `UserEngine`: +1. **Prerequisite**: `engine.ZeroizeKey` must exist in + `internal/engine/helpers.go` (created as part of the SSH CA engine + implementation — see `engines/sshca.md` step 1). + +2. **`internal/engine/user/`** — Implement `UserEngine`: - `types.go` — Config types, envelope format. - `user.go` — Lifecycle (Initialize, Unseal, Seal, HandleRequest). 
- `crypto.go` — ECDH key agreement, HKDF derivation, DEK wrap/unwrap, symmetric encrypt/decrypt. - `keys.go` — User registration, key rotation, deletion. -2. **Register factory** in `cmd/metacrypt/main.go`. -3. **Proto definitions** — `proto/metacrypt/v2/user.proto`, run `make proto`. -4. **gRPC handlers** — `internal/grpcserver/user.go`. -5. **REST routes** — Add to `internal/server/routes.go`. -6. **Web UI** — Add template + webserver routes. -7. **Tests** — Unit tests: register, encrypt/decrypt roundtrip, multi-recipient, - key rotation invalidates old envelopes, authorization checks. +3. **Register factory** in `cmd/metacrypt/main.go`. +4. **Proto definitions** — `proto/metacrypt/v2/user.proto`, run `make proto`. +5. **gRPC handlers** — `internal/grpcserver/user.go`. +6. **REST routes** — Add to `internal/server/routes.go`. +7. **Web UI** — Add template + webserver routes. +8. **Tests** — Unit tests: register, encrypt/decrypt roundtrip, multi-recipient, + key rotation invalidates old envelopes, re-encrypt roundtrip, authorization + checks. ## Dependencies - `golang.org/x/crypto/hkdf` (for key derivation from ECDH shared secret) - `crypto/ecdh` (Go 1.20+, for X25519 and NIST curve key exchange) -- Standard library `crypto/aes`, `crypto/cipher`, `crypto/rand` +- Standard library `crypto/aes`, `crypto/cipher`, `crypto/rand`, `crypto/sha256`, + `crypto/x509`, `encoding/pem` ## Security Considerations - Private keys encrypted at rest in the barrier, zeroized on seal. -- DEK is random per-encryption; never reused. -- HKDF derivation includes sender and recipient identities in the info string - to prevent key confusion attacks: - `info = "metacrypt-user-v1:" + sender + ":" + recipient`. +- DEK is random 32 bytes per-encryption; never reused. +- HKDF salt is 32 bytes of `crypto/rand` randomness, generated fresh per + recipient per encryption. Stored in the envelope alongside the wrapped DEK. 
+ A random salt ensures that even if the same sender-recipient pair encrypts + multiple messages, the derived wrapping keys are unique. +- HKDF info string includes sender and recipient identities to prevent key + confusion attacks: `info = "metacrypt-user-v1:" + sender + ":" + recipient`. +- DEK wrapping uses AES-256-GCM (not AES Key Wrap / RFC 3394). Both provide + authenticated encryption; AES-GCM is preferred for consistency with the rest + of the codebase and avoids adding a new primitive. +- All intermediate secrets (shared secrets, wrapping keys, DEKs) are zeroized + immediately after use using `crypto.Zeroize`. - Envelope includes sender identity so the recipient can derive the correct shared secret. - Key rotation is destructive — old data cannot be decrypted. The engine should @@ -294,9 +539,34 @@ Add a `/user-crypto` page displaying: - Server-trust model: the server holds all private keys in the barrier. No API surface exports private keys. Access control is application-enforced — the engine only uses a private key on behalf of its owner during encrypt/decrypt. -- Auto-provisioned users have keypairs waiting for them; their private keys are - protected identically to explicitly registered users. +- Auto-provisioning creates key pairs for unregistered recipients. Before + creating a key pair, the engine validates that the recipient username exists + in MCIAS via `auth.ValidateUsername`. This prevents barrier pollution from + non-existent usernames. Auto-provisioning is safe because: (a) the recipient + must be a real MCIAS user, (b) no MCIAS permissions are granted, (c) the + private key is only usable after MCIAS authentication, (d) key pairs are + stored identically to explicitly registered users. Auto-provisioning is only + triggered by authenticated users during `encrypt`. +- Encrypt requests are limited to 100 recipients to prevent resource exhaustion + from ECDH + HKDF computation. 
- Metadata in the envelope is authenticated (included as additional data in AEAD) but not encrypted — it is visible to anyone holding the envelope. - Post-quantum readiness: the `key_algorithm` config supports future hybrid schemes (e.g. X25519 + ML-KEM). The envelope version field enables migration. +- X25519 is the default algorithm because it provides 128-bit security with + the smallest key size and fastest operations. NIST curves are offered for + compliance contexts. + +## Implementation References + +These existing code patterns should be followed exactly: + +| Pattern | Reference File | Lines | +|---------|---------------|-------| +| HandleRequest switch dispatch | `internal/engine/ca/ca.go` | 284–317 | +| zeroizeKey helper | `internal/engine/ca/ca.go` | 1481–1498 | +| REST route registration with chi | `internal/server/routes.go` | 38–50 | +| gRPC handler structure | `internal/grpcserver/ca.go` | full file | +| gRPC interceptor maps | `internal/grpcserver/server.go` | 107–205 | +| Engine factory registration | `cmd/metacrypt/server.go` | 76 | +| adminOnlyOperations map | `internal/server/routes.go` | 265–285 | diff --git a/internal/server/server_test.go b/internal/server/server_test.go index d274d20..46ab2ce 100644 --- a/internal/server/server_test.go +++ b/internal/server/server_test.go @@ -175,6 +175,35 @@ func makeEngineRequest(mount, operation string) string { return `{"mount":"` + mount + `","operation":"` + operation + `","data":{}}` } +// stubEngine is a minimal engine implementation for testing the generic endpoint. 
+type stubEngine struct { + engineType engine.EngineType +} + +func (e *stubEngine) Type() engine.EngineType { return e.engineType } +func (e *stubEngine) Initialize(_ context.Context, _ barrier.Barrier, _ string, _ map[string]interface{}) error { + return nil +} +func (e *stubEngine) Unseal(_ context.Context, _ barrier.Barrier, _ string) error { return nil } +func (e *stubEngine) Seal() error { return nil } +func (e *stubEngine) HandleRequest(_ context.Context, req *engine.Request) (*engine.Response, error) { + return &engine.Response{Data: map[string]interface{}{"ok": true}}, nil +} + +// mountStubEngine registers a factory and mounts a stub engine of the given type. +func mountStubEngine(t *testing.T, srv *Server, name string, engineType engine.EngineType) { + t.Helper() + srv.engines.RegisterFactory(engineType, func() engine.Engine { + return &stubEngine{engineType: engineType} + }) + if err := srv.engines.Mount(context.Background(), name, engineType, nil); err != nil { + // Ignore "already exists" from re-mounting the same name. 
+ if !strings.Contains(err.Error(), "already exists") { + t.Fatalf("mount stub %q as %s: %v", name, engineType, err) + } + } +} + func withTokenInfo(r *http.Request, info *auth.TokenInfo) *http.Request { return r.WithContext(context.WithValue(r.Context(), tokenInfoKey, info)) } @@ -184,6 +213,7 @@ func withTokenInfo(r *http.Request, info *auth.TokenInfo) *http.Request { func TestEngineRequestPolicyDeniesNonAdmin(t *testing.T) { srv, sealMgr, _ := setupTestServer(t) unsealServer(t, sealMgr, nil) + mountStubEngine(t, srv, "pki", engine.EngineTypeCA) body := makeEngineRequest("pki", "list-issuers") req := httptest.NewRequest(http.MethodPost, "/v1/engine/request", strings.NewReader(body)) @@ -200,6 +230,7 @@ func TestEngineRequestPolicyDeniesNonAdmin(t *testing.T) { func TestEngineRequestPolicyAllowsAdmin(t *testing.T) { srv, sealMgr, _ := setupTestServer(t) unsealServer(t, sealMgr, nil) + mountStubEngine(t, srv, "pki", engine.EngineTypeCA) body := makeEngineRequest("pki", "list-issuers") req := httptest.NewRequest(http.MethodPost, "/v1/engine/request", strings.NewReader(body)) @@ -207,7 +238,7 @@ func TestEngineRequestPolicyAllowsAdmin(t *testing.T) { w := httptest.NewRecorder() srv.handleEngineRequest(w, req) - // Admin bypasses policy; will fail with mount-not-found (404), not forbidden (403). + // Admin bypasses policy; stub engine returns 200. 
if w.Code == http.StatusForbidden { t.Errorf("admin should not be forbidden by policy, got 403: %s", w.Body.String()) } @@ -218,6 +249,7 @@ func TestEngineRequestPolicyAllowsAdmin(t *testing.T) { func TestEngineRequestPolicyAllowsWithRule(t *testing.T) { srv, sealMgr, _ := setupTestServer(t) unsealServer(t, sealMgr, nil) + mountStubEngine(t, srv, "pki", engine.EngineTypeCA) ctx := context.Background() _ = srv.policy.CreateRule(ctx, &policy.Rule{ @@ -235,7 +267,7 @@ func TestEngineRequestPolicyAllowsWithRule(t *testing.T) { w := httptest.NewRecorder() srv.handleEngineRequest(w, req) - // Policy allows; will fail with mount-not-found (404), not forbidden (403). + // Policy allows; stub engine returns 200. if w.Code == http.StatusForbidden { t.Errorf("user with allow rule should not be forbidden, got 403: %s", w.Body.String()) } @@ -247,15 +279,32 @@ func TestEngineRequestAdminOnlyBlocksNonAdmin(t *testing.T) { srv, sealMgr, _ := setupTestServer(t) unsealServer(t, sealMgr, nil) - for _, op := range []string{"create-issuer", "delete-cert", "create-key", "rotate-key", "create-profile", "provision"} { - body := makeEngineRequest("test-mount", op) + // Mount stub engines so the admin-only lookup can resolve engine types. 
+ mountStubEngine(t, srv, "ca-mount", engine.EngineTypeCA) + mountStubEngine(t, srv, "transit-mount", engine.EngineTypeTransit) + mountStubEngine(t, srv, "sshca-mount", engine.EngineTypeSSHCA) + mountStubEngine(t, srv, "user-mount", engine.EngineTypeUser) + + cases := []struct { + mount string + op string + }{ + {"ca-mount", "create-issuer"}, + {"ca-mount", "delete-cert"}, + {"transit-mount", "create-key"}, + {"transit-mount", "rotate-key"}, + {"sshca-mount", "create-profile"}, + {"user-mount", "provision"}, + } + for _, tc := range cases { + body := makeEngineRequest(tc.mount, tc.op) req := httptest.NewRequest(http.MethodPost, "/v1/engine/request", strings.NewReader(body)) req = withTokenInfo(req, &auth.TokenInfo{Username: "alice", Roles: []string{"user"}, IsAdmin: false}) w := httptest.NewRecorder() srv.handleEngineRequest(w, req) if w.Code != http.StatusForbidden { - t.Errorf("operation %q: expected 403 for non-admin, got %d", op, w.Code) + t.Errorf("%s/%s: expected 403 for non-admin, got %d", tc.mount, tc.op, w.Code) } } } @@ -266,20 +315,86 @@ func TestEngineRequestAdminOnlyAllowsAdmin(t *testing.T) { srv, sealMgr, _ := setupTestServer(t) unsealServer(t, sealMgr, nil) - for _, op := range []string{"create-issuer", "delete-cert", "create-key", "rotate-key", "create-profile", "provision"} { - body := makeEngineRequest("test-mount", op) + mountStubEngine(t, srv, "ca-mount", engine.EngineTypeCA) + mountStubEngine(t, srv, "transit-mount", engine.EngineTypeTransit) + mountStubEngine(t, srv, "sshca-mount", engine.EngineTypeSSHCA) + mountStubEngine(t, srv, "user-mount", engine.EngineTypeUser) + + cases := []struct { + mount string + op string + }{ + {"ca-mount", "create-issuer"}, + {"ca-mount", "delete-cert"}, + {"transit-mount", "create-key"}, + {"transit-mount", "rotate-key"}, + {"sshca-mount", "create-profile"}, + {"user-mount", "provision"}, + } + for _, tc := range cases { + body := makeEngineRequest(tc.mount, tc.op) req := httptest.NewRequest(http.MethodPost, 
"/v1/engine/request", strings.NewReader(body)) req = withTokenInfo(req, &auth.TokenInfo{Username: "admin", Roles: []string{"admin"}, IsAdmin: true}) w := httptest.NewRecorder() srv.handleEngineRequest(w, req) - // Admin passes the admin check; will get 404 (mount not found) not 403. + // Admin passes the admin check; stub engine returns 200. if w.Code == http.StatusForbidden { - t.Errorf("operation %q: admin should not be forbidden, got 403", op) + t.Errorf("%s/%s: admin should not be forbidden, got 403", tc.mount, tc.op) } } } +// TestEngineRequestUserRotateKeyOnUserMount verifies that a non-admin user +// can call rotate-key on a user engine mount (not blocked by transit's admin gate). +func TestEngineRequestUserRotateKeyOnUserMount(t *testing.T) { + srv, sealMgr, _ := setupTestServer(t) + unsealServer(t, sealMgr, nil) + + mountStubEngine(t, srv, "user-mount", engine.EngineTypeUser) + + // Create a policy rule allowing user operations. + ctx := context.Background() + _ = srv.policy.CreateRule(ctx, &policy.Rule{ + ID: "allow-user-ops", + Priority: 100, + Effect: policy.EffectAllow, + Roles: []string{"user"}, + Resources: []string{"engine/*/*"}, + Actions: []string{"any"}, + }) + + body := makeEngineRequest("user-mount", "rotate-key") + req := httptest.NewRequest(http.MethodPost, "/v1/engine/request", strings.NewReader(body)) + req = withTokenInfo(req, &auth.TokenInfo{Username: "alice", Roles: []string{"user"}, IsAdmin: false}) + w := httptest.NewRecorder() + srv.handleEngineRequest(w, req) + + // rotate-key on a user mount should NOT be blocked as admin-only. + if w.Code == http.StatusForbidden { + t.Errorf("user rotate-key on user mount should not be forbidden, got 403: %s", w.Body.String()) + } +} + +// TestEngineRequestUserRotateKeyOnTransitMount verifies that a non-admin user +// is blocked from calling rotate-key on a transit engine mount. 
+func TestEngineRequestUserRotateKeyOnTransitMount(t *testing.T) { + srv, sealMgr, _ := setupTestServer(t) + unsealServer(t, sealMgr, nil) + + mountStubEngine(t, srv, "transit-mount", engine.EngineTypeTransit) + + body := makeEngineRequest("transit-mount", "rotate-key") + req := httptest.NewRequest(http.MethodPost, "/v1/engine/request", strings.NewReader(body)) + req = withTokenInfo(req, &auth.TokenInfo{Username: "alice", Roles: []string{"user"}, IsAdmin: false}) + w := httptest.NewRecorder() + srv.handleEngineRequest(w, req) + + if w.Code != http.StatusForbidden { + t.Errorf("user rotate-key on transit mount should be 403, got %d", w.Code) + } +} + // TestOperationAction verifies the action classification of operations. func TestOperationAction(t *testing.T) { tests := map[string]string{