Implement Phase 1: core framework, operational tooling, and runbook

Core packages: crypto (Argon2id/AES-256-GCM), config (TOML/viper), db (SQLite/migrations), barrier (encrypted storage), seal (state machine with rate-limited unseal), auth (MCIAS integration with token cache), policy (priority-based ACL engine), engine (interface + registry).

Server: HTTPS with TLS 1.2+, REST API, auth/admin middleware, htmx web UI (init, unseal, login, dashboard pages).

CLI: cobra/viper subcommands (server, init, status, snapshot) with env var override support (METACRYPT_ prefix).

Operational tooling: Dockerfile (multi-stage, non-root), docker-compose, hardened systemd units (service + daily backup timer), install script, backup script with retention pruning, production config examples.

Runbook covering installation, configuration, daily operations, backup/restore, monitoring, troubleshooting, and security procedures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RUNBOOK.md
Normal file
@@ -0,0 +1,415 @@
# Metacrypt Operations Runbook

## Overview

Metacrypt is a cryptographic service for the Metacircular platform. It provides an encrypted storage barrier, engine-based cryptographic operations, and MCIAS-backed authentication. The service uses a seal/unseal model: it starts sealed after every restart and must be unsealed with a password before it can serve requests.

### Service States

| State | Description |
|---|---|
| **Uninitialized** | Fresh install. Must run `metacrypt init` or use the web UI. |
| **Sealed** | Initialized but locked. No cryptographic operations available. |
| **Initializing** | Transient state during first-time setup. |
| **Unsealed** | Fully operational. All APIs and engines available. |

### Architecture

```
Client → HTTPS (:8443) → Metacrypt Server
                           ├── Auth (proxied to MCIAS)
                           ├── Policy Engine (ACL rules in barrier)
                           ├── Engine Registry (mount/unmount crypto engines)
                           └── Encrypted Barrier → SQLite (on disk)
```

Key hierarchy:

```
Seal Password (operator-held, never stored)
  → Argon2id → Key-Wrapping Key (KWK, ephemeral)
    → AES-256-GCM decrypt → Master Encryption Key (MEK)
      → AES-256-GCM → all barrier-stored data
```

---

## Installation

### Binary Install (systemd)

```bash
# Build
make metacrypt

# Install (as root)
sudo deploy/scripts/install.sh ./metacrypt
```

This creates:

| Path | Purpose |
|---|---|
| `/usr/local/bin/metacrypt` | Binary |
| `/etc/metacrypt/metacrypt.toml` | Configuration |
| `/etc/metacrypt/certs/` | TLS certificates |
| `/var/lib/metacrypt/` | Database and backups |

### Docker Install

```bash
# Build image
make docker

# Or use docker compose
docker compose -f deploy/docker/docker-compose.yml up -d
```

The Docker container mounts a single volume at `/data`, which must contain:

| File | Required | Description |
|---|---|---|
| `metacrypt.toml` | Yes | Configuration (use `deploy/examples/metacrypt-docker.toml` as a template) |
| `certs/server.crt` | Yes | TLS certificate |
| `certs/server.key` | Yes | TLS private key |
| `certs/mcias-ca.crt` | If MCIAS uses a private CA | MCIAS CA certificate |
| `metacrypt.db` | No | Created automatically on first run |

To prepare a Docker volume:

```bash
docker volume create metacrypt-data

# Copy files into the volume (the path is quoted so directories with spaces work)
docker run --rm -v metacrypt-data:/data -v "$(pwd)/deploy/examples:/src" alpine \
  sh -c "cp /src/metacrypt-docker.toml /data/metacrypt.toml && mkdir -p /data/certs"

# Then copy your TLS certs into the volume the same way
```

---

## Configuration

Configuration is loaded from TOML. The config file location is determined by (in order):

1. `--config` flag
2. `METACRYPT_CONFIG` environment variable (via viper)
3. `metacrypt.toml` in the current directory
4. `/etc/metacrypt/metacrypt.toml`
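
The lookup order above can be sketched as a small shell helper, for example to log which file a wrapper script expects Metacrypt to load. `resolve_config` is a hypothetical name, not part of the `metacrypt` CLI, and the sketch assumes that a path given by the flag or the environment variable wins even when the file does not exist.

```bash
# resolve_config [FLAG_PATH]: print the config path implied by the documented
# lookup order. Hypothetical helper; the real resolution happens inside metacrypt.
resolve_config() {
    local flag_path="${1:-}"
    if [ -n "$flag_path" ]; then
        printf '%s\n' "$flag_path"                      # 1. --config flag
    elif [ -n "${METACRYPT_CONFIG:-}" ]; then
        printf '%s\n' "$METACRYPT_CONFIG"               # 2. environment variable
    elif [ -f ./metacrypt.toml ]; then
        printf '%s\n' ./metacrypt.toml                  # 3. current directory
    else
        printf '%s\n' /etc/metacrypt/metacrypt.toml     # 4. system-wide default
    fi
}
```

Calling `resolve_config` with no argument from a systemd unit's working directory is a quick way to confirm that a stray `metacrypt.toml` is not shadowing the file in `/etc/metacrypt`.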

All config values can be overridden via environment variables with the `METACRYPT_` prefix (e.g., `METACRYPT_SERVER_LISTEN_ADDR`).

See `deploy/examples/metacrypt.toml` for a fully commented production config.
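
As a rough illustration of the override naming, the helper below derives the variable name from a config key. The rule (uppercase the key, replace dots with underscores, add the prefix) is inferred from the single example `METACRYPT_SERVER_LISTEN_ADDR`; `config_env_name` is a hypothetical name, and the mapping for other keys is an assumption.

```bash
# config_env_name KEY: print the environment variable that would override KEY.
# Inferred rule: uppercase, dots become underscores, METACRYPT_ prefix.
config_env_name() {
    printf 'METACRYPT_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.' '_')"
}

# Example: override the listen address for a single run.
#   export "$(config_env_name server.listen_addr)=:9443"
#   metacrypt server --config /etc/metacrypt/metacrypt.toml
```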

### Required Settings

- `server.listen_addr`: bind address (e.g., `:8443`)
- `server.tls_cert` / `server.tls_key`: TLS certificate and key paths
- `database.path`: SQLite database file path
- `mcias.server_url`: MCIAS authentication server URL

### TLS Certificates

Metacrypt always terminates TLS. Minimum TLS 1.2 is enforced.

To generate a self-signed certificate for testing:

```bash
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-384 \
  -keyout server.key -out server.crt -days 365 -nodes \
  -subj "/CN=metacrypt.local" \
  -addext "subjectAltName=DNS:metacrypt.local,DNS:localhost,IP:127.0.0.1"
```

For production, use certificates from your internal CA or a public CA.

---

## First-Time Setup

### Option A: CLI (recommended for servers)

```bash
metacrypt init --config /etc/metacrypt/metacrypt.toml
```

This prompts for a seal password, generates the master encryption key, and stores the encrypted MEK in the database. The service is left in the unsealed state.

### Option B: Web UI

Start the server and navigate to `https://<host>:8443/`. If the service is uninitialized, you will be redirected to the init page.

### Seal Password Requirements

- The seal password is the root of all security. If lost, the data is unrecoverable.
- Store it in a secure location (password manager, HSM, sealed envelope in a safe).
- The password is never stored; only a salt and the encrypted MEK are persisted.
- Argon2id parameters (time=3, memory=128 MiB, threads=4) are stored in the database at init time.

---

## Daily Operations

### Starting the Service

```bash
# systemd
sudo systemctl start metacrypt

# Docker
docker compose -f deploy/docker/docker-compose.yml up -d

# Manual
metacrypt server --config /etc/metacrypt/metacrypt.toml
```

The service starts **sealed**. It must be unsealed before it can serve requests.

### Unsealing

After every restart, the service must be unsealed:

```bash
# Via API
curl -sk -X POST https://localhost:8443/v1/unseal \
  -H 'Content-Type: application/json' \
  -d '{"password":"<seal-password>"}'

# Via web UI
# Navigate to https://<host>:8443/unseal
```

**Rate limiting**: After 5 failed unseal attempts within one minute, a 60-second lockout is enforced.
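
A script that automates unsealing should respect the lockout rather than hammer the endpoint. The sketch below waits out one lockout and retries once. `unseal_with_backoff` and `LOCKOUT_SECONDS` are hypothetical names, the 429 status comes from the troubleshooting table in this runbook, and treating any 2xx status as success is an assumption.

```bash
LOCKOUT_SECONDS="${LOCKOUT_SECONDS:-60}"

# unseal_with_backoff CMD...: CMD performs one unseal attempt and prints the
# HTTP status code. On 429 the helper sleeps through the lockout and retries once.
unseal_with_backoff() {
    local status
    status="$("$@")" || return 2
    if [ "$status" = "429" ]; then
        echo "rate limited, waiting ${LOCKOUT_SECONDS}s before one retry" >&2
        sleep "$LOCKOUT_SECONDS"
        status="$("$@")" || return 2
    fi
    case "$status" in
        2??) return 0 ;;                                   # assumed success range
        *)   echo "unseal failed with HTTP $status" >&2; return 1 ;;
    esac
}

# Example attempt command (payload.json holds {"password":"..."} with mode 0600):
#   attempt() {
#       curl -sk -o /dev/null -w '%{http_code}' -X POST https://localhost:8443/v1/unseal \
#         -H 'Content-Type: application/json' -d @payload.json
#   }
#   unseal_with_backoff attempt
```

Reading the payload from a file keeps the seal password out of the process list and shell history, which the inline `-d` form does not.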

### Checking Status

```bash
# Remote check
metacrypt status --addr https://metacrypt.example.com:8443 --ca-cert /path/to/ca.crt

# Via API
curl -sk https://localhost:8443/v1/status
# Returns: {"state":"sealed"} or {"state":"unsealed"} etc.
```

### Sealing (Emergency)

An admin user can seal the service at any time, which zeroizes all key material in memory:

```bash
curl -sk -X POST https://localhost:8443/v1/seal \
  -H "Authorization: Bearer <admin-token>"
```

This immediately makes all cryptographic operations unavailable. Use this if you suspect a compromise.

### Authentication

Metacrypt proxies authentication to MCIAS. Users log in with their MCIAS credentials:

```bash
# API login
curl -sk -X POST https://localhost:8443/v1/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"alice","password":"..."}'
# Returns: {"token":"...","expires_at":"..."}

# Use the token for subsequent requests
curl -sk https://localhost:8443/v1/auth/tokeninfo \
  -H "Authorization: Bearer <token>"
```

Users with the MCIAS `admin` role automatically get admin privileges in Metacrypt.

---

## Backup and Restore

### Creating Backups

```bash
# CLI
metacrypt snapshot --config /etc/metacrypt/metacrypt.toml --output /var/lib/metacrypt/backups/metacrypt-$(date +%Y%m%d).db

# Using the backup script (with 30-day retention)
deploy/scripts/backup.sh 30
```

The backup is a consistent SQLite snapshot created with `VACUUM INTO`. The backup file contains the same encrypted data as the live database, so the seal password is still required to access it.

### Automated Backups (systemd)

```bash
sudo systemctl enable --now metacrypt-backup.timer
```

This runs a backup daily at 02:00 with up to 5 minutes of jitter.

### Restoring from Backup

1. Stop the service: `systemctl stop metacrypt`
2. Replace the database: `cp /var/lib/metacrypt/backups/metacrypt-20260314.db /var/lib/metacrypt/metacrypt.db`
3. Fix permissions: `chown metacrypt:metacrypt /var/lib/metacrypt/metacrypt.db && chmod 0600 /var/lib/metacrypt/metacrypt.db`
4. Start the service: `systemctl start metacrypt`
5. Unseal with the original seal password

**The seal password does not change between backups.** A backup restored from any point in time uses the same seal password that was set during `metacrypt init`.
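
Steps 2 and 3 above can be wrapped in a helper that refuses an empty backup and swaps the file in with a single rename. This is a sketch under assumptions: `restore_db` is a hypothetical name, the service is already stopped, and the database runs in WAL mode (as the troubleshooting section expects), so leftover `-wal` and `-shm` files from the old database are removed rather than paired with the restored file. Ownership (`chown`) is left to the operator.

```bash
# restore_db BACKUP TARGET: copy BACKUP over TARGET with mode 0600.
# Run only while metacrypt is stopped; run chown afterwards as in step 3.
restore_db() {
    local backup="$1" target="$2"
    local tmp="${target}.restore.$$"
    [ -s "$backup" ] || { echo "backup missing or empty: $backup" >&2; return 1; }
    # Stage the copy next to the target so the final rename stays on one filesystem.
    if ! { cp "$backup" "$tmp" && chmod 0600 "$tmp"; }; then
        rm -f "$tmp"
        return 1
    fi
    # WAL and shared-memory files belong to the old database, not the restored one.
    rm -f "${target}-wal" "${target}-shm"
    mv -f "$tmp" "$target"
}

# Example:
#   sudo systemctl stop metacrypt
#   restore_db /var/lib/metacrypt/backups/metacrypt-20260314.db /var/lib/metacrypt/metacrypt.db
```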

---

## Monitoring

### Health Check

```bash
curl -sk https://localhost:8443/v1/status
```

Returns HTTP 200 in all states, so a probe must inspect the `state` field rather than the status code:

- `unsealed`: healthy, fully operational
- `sealed`: needs unseal, no crypto operations available
- `initializing`: first-time setup in progress (transient)
- `uninitialized`: needs init
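
Because the status code is always 200, a monitoring probe has to parse the body. The sketch below extracts the `state` value with `sed` and maps it to an exit code. The function names and the exit-code convention (0 healthy, 1 needs attention, 2 critical or unknown) are assumptions of this example, not part of Metacrypt.

```bash
# state_of: read a /v1/status JSON body on stdin and print the "state" value.
state_of() {
    sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# health_exit_code STATE: translate a state into a probe exit code.
health_exit_code() {
    case "$1" in
        unsealed)            return 0 ;;   # healthy
        sealed|initializing) return 1 ;;   # operator action or patience needed
        *)                   return 2 ;;   # uninitialized, empty, or unparsable
    esac
}

# Example probe:
#   health_exit_code "$(curl -sk https://localhost:8443/v1/status | state_of)"
```

An empty result from `state_of` (server unreachable, or a body without a `state` field) falls through to exit code 2, so connection failures are not mistaken for a sealed service.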

### Log Output

Metacrypt logs structured JSON to stdout. When running under systemd, logs go to the journal:

```bash
# Follow logs
journalctl -u metacrypt -f

# Recent errors
journalctl -u metacrypt --priority=err --since="1 hour ago"
```

### Key Metrics to Monitor

| What | How | Alert When |
|---|---|---|
| Service state | `GET /v1/status` | `state != "unsealed"` for more than a few minutes after restart |
| TLS certificate expiry | External cert checker | < 30 days to expiry |
| Database file size | `stat /var/lib/metacrypt/metacrypt.db` | Unexpectedly large growth |
| Backup age | `find /var/lib/metacrypt/backups -name '*.db' -mtime -2` | Command prints nothing (no backup in the last 48 hours) |
| MCIAS connectivity | Login attempt | Auth failures not caused by bad credentials |
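
The backup-age row can be turned into a check that succeeds only when a recent snapshot exists. `backup_is_fresh` is a hypothetical helper; it relies on `find -mmin`, which GNU and BSD `find` support but POSIX does not require.

```bash
# backup_is_fresh DIR [MAX_AGE_HOURS]: succeed if DIR contains a *.db file
# modified within the last MAX_AGE_HOURS hours (default 48).
backup_is_fresh() {
    local dir="$1" hours="${2:-48}"
    # -mmin -N matches files modified less than N minutes ago.
    [ -n "$(find "$dir" -name '*.db' -type f -mmin "-$((hours * 60))" 2>/dev/null | head -n 1)" ]
}

# Example alert hook:
#   backup_is_fresh /var/lib/metacrypt/backups || echo "no metacrypt backup in 48h" >&2
```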

---

## Troubleshooting

### Service won't start

| Symptom | Cause | Fix |
|---|---|---|
| `config: server.tls_cert is required` | Missing or invalid config | Check config file path and contents |
| `db: create file: permission denied` | Wrong permissions on data dir | `chown -R metacrypt:metacrypt /var/lib/metacrypt` |
| `server: tls: failed to find any PEM data` | Bad cert/key files | Verify PEM format: `openssl x509 -in server.crt -text -noout` |

### Unseal fails

| Symptom | Cause | Fix |
|---|---|---|
| `invalid password` (401) | Wrong seal password | Verify password. There is no recovery if the password is lost. |
| `too many attempts` (429) | Rate limited | Wait 60 seconds, then try again |
| `not initialized` (412) | Database is empty/new | Run `metacrypt init` |

### Authentication fails

| Symptom | Cause | Fix |
|---|---|---|
| `invalid credentials` (401) | Bad username/password or MCIAS down | Verify MCIAS is reachable: `curl -sk https://mcias.metacircular.net:8443/v1/health` |
| `sealed` (503) | Service not unsealed | Unseal the service first |
| Connection refused to MCIAS | Network/TLS issue | Check `mcias.server_url` and `mcias.ca_cert` in config |

### Database Issues

```bash
# Check database integrity
sqlite3 /var/lib/metacrypt/metacrypt.db "PRAGMA integrity_check;"

# Check WAL mode
sqlite3 /var/lib/metacrypt/metacrypt.db "PRAGMA journal_mode;"
# Should return: wal

# Check file permissions
ls -la /var/lib/metacrypt/metacrypt.db
# Should be: -rw------- metacrypt metacrypt
```

---

## Security Considerations

### Seal Password

- The seal password is the single point of trust. Protect it accordingly.
- Use a strong, unique password (recommend 20+ characters or a passphrase).
- Store it in at least two independent secure locations.
- Rotate by re-initializing (requires data migration, which is not yet automated).

### Key Material Lifecycle

- **KWK** (Key-Wrapping Key): derived from password, used only during unseal, zeroized immediately after.
- **MEK** (Master Encryption Key): held in memory while unsealed, zeroized on seal.
- **DEKs** (Data Encryption Keys): per-engine, stored encrypted in the barrier, zeroized on seal.

Sealing the service (`POST /v1/seal`) explicitly zeroizes all key material from process memory.

### File Permissions

| Path | Mode | Owner |
|---|---|---|
| `/etc/metacrypt/metacrypt.toml` | 0640 | metacrypt:metacrypt |
| `/etc/metacrypt/certs/server.key` | 0600 | metacrypt:metacrypt |
| `/var/lib/metacrypt/metacrypt.db` | 0600 | metacrypt:metacrypt |
| `/var/lib/metacrypt/backups/` | 0700 | metacrypt:metacrypt |
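
The modes in the table can be audited with plain `find`, which avoids the differing `stat` syntax between GNU and BSD systems. `has_mode` and `audit_permissions` are hypothetical helpers; the sketch checks permission bits only, so ownership still needs a separate check (for example by adding `-user metacrypt -group metacrypt` to the `find` call).

```bash
# has_mode PATH MODE: succeed if PATH exists and its permission bits equal MODE.
has_mode() {
    # -prune keeps find from descending into directories; -perm MODE is an exact match.
    [ -n "$(find "$1" -prune -perm "$2" 2>/dev/null)" ]
}

# audit_permissions: read "PATH MODE" pairs on stdin, report every mismatch,
# and return non-zero if any path is missing or has a different mode.
audit_permissions() {
    local rc=0 path mode
    while read -r path mode; do
        has_mode "$path" "$mode" || { echo "unexpected mode (want $mode): $path" >&2; rc=1; }
    done
    return "$rc"
}

# Example, using the table above:
#   audit_permissions <<'EOF'
#   /etc/metacrypt/metacrypt.toml 0640
#   /etc/metacrypt/certs/server.key 0600
#   /var/lib/metacrypt/metacrypt.db 0600
#   /var/lib/metacrypt/backups 0700
#   EOF
```

Run it as root or as the `metacrypt` user; other users cannot traverse the `0700` directories and would see every path reported as a mismatch.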

### systemd Hardening

The provided service unit applies: `NoNewPrivileges`, `ProtectSystem=strict`, `ProtectHome`, `PrivateTmp`, `PrivateDevices`, `MemoryDenyWriteExecute`, and namespace restrictions. Only `/var/lib/metacrypt` is writable.

### Docker Security

The container runs as a non-root `metacrypt` user. The `/data` volume should be owned by the container's metacrypt UID (determined at build time). Do not run the container with `--privileged`.

---

## Operational Procedures

### Planned Restart

1. Notify users that crypto operations will be briefly unavailable
2. `systemctl restart metacrypt`
3. Unseal the service
4. Verify: `metacrypt status --addr https://localhost:8443`
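
Between steps 2 and 3 the server needs a moment to come up, and step 4 is easier to script with a bounded retry loop. The helper below is a generic sketch: `wait_until` and `POLL_INTERVAL` are hypothetical names, and the example `is_state` check assumes the compact JSON form `{"state":"sealed"}` shown in the status section.

```bash
# wait_until TRIES CMD...: run CMD up to TRIES times, pausing POLL_INTERVAL
# seconds (default 1) between attempts; succeed as soon as CMD succeeds.
wait_until() {
    local tries="$1" i
    shift
    for ((i = 1; i <= tries; i++)); do
        "$@" && return 0
        [ "$i" -lt "$tries" ] && sleep "${POLL_INTERVAL:-1}"
    done
    return 1
}

# Example restart sequence:
#   is_state() { curl -sk https://localhost:8443/v1/status | grep -q "\"state\":\"$1\""; }
#   sudo systemctl restart metacrypt
#   wait_until 30 is_state sealed      # server is up and ready to be unsealed
#   ... unseal ...
#   wait_until 10 is_state unsealed    # step 4: verify
```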

### Password Rotation

There is no online password rotation in Phase 1. To change the seal password:

1. Create a backup: `metacrypt snapshot --output pre-rotation.db`
2. Stop the service
3. Re-initialize with a new database and password
4. Migrate data from the old barrier (requires custom tooling or a future `metacrypt rekey` command)

### Disaster Recovery

If the server is lost but you have a database backup and the seal password:

1. Install Metacrypt on a new server (see Installation)
2. Copy the backup database to `/var/lib/metacrypt/metacrypt.db`
3. Fix ownership and permissions: `chown metacrypt:metacrypt /var/lib/metacrypt/metacrypt.db && chmod 0600 /var/lib/metacrypt/metacrypt.db`
4. Start the service and unseal with the original password

The database backup contains the encrypted MEK and all barrier data. No additional secrets beyond the seal password are needed for recovery.

### Upgrading Metacrypt

1. Build or download the new binary
2. Create a backup: `metacrypt snapshot --output pre-upgrade.db`
3. Replace the binary: `install -m 0755 metacrypt /usr/local/bin/metacrypt`
4. Restart: `systemctl restart metacrypt`
5. Unseal and verify

Database migrations run automatically on startup.