Update PROGRESS_V1.md with deployment status and remaining work

Documents Phase 6 (deployment), bugs fixed during rollout,
remaining work organized by priority (operational, quality,
design, infrastructure), and current platform state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:27:30 -07:00
parent 17ac0f3014
commit ff9bfc5087


@@ -48,4 +48,88 @@
- [ ] **P5.1** Integration test suite
- [ ] **P5.2** Bootstrap procedure test
- [ ] **P5.3** Documentation (CLAUDE.md, README.md, RUNBOOK.md)
- [x] **P5.3** Documentation: CLAUDE.md done; README.md and RUNBOOK.md pending
## Phase 6: Deployment (completed 2026-03-26)
- [x] **P6.1** NixOS config for mcp user (rootless podman, subuid/subgid, systemd service)
- [x] **P6.2** TLS cert provisioned from Metacrypt (DNS + IP SANs)
- [x] **P6.3** MCIAS system account (mcp-agent with admin role)
- [x] **P6.4** Container migration (metacrypt, mc-proxy, mcr, mcns → mcp user)
- [x] **P6.5** MCP bootstrap (adopt, sync, export service definitions)
- [x] **P6.6** Service definitions completed with full container specs
## Deployment Bugs Fixed During Rollout
- podman ps JSON: `Command` field is `[]string` not `string`
- Container name handling: `splitContainerName` naive split broke `mc-proxy`
→ extracted `ContainerNameFor`/`SplitContainerName` with registry-aware lookup
- CLI default config path: `~/.config/mcp/mcp.toml`
- Token file whitespace: trim newlines before sending in gRPC metadata
- NixOS systemd sandbox: `ProtectHome` blocks `/run/user`, `ProtectSystem=strict`
blocks podman runtime dir → relaxed to `ProtectSystem=full`, `ProtectHome=false`
- Agent needs `PATH`, `HOME`, `XDG_RUNTIME_DIR` in systemd environment
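
Two of the fixes above are easy to illustrate in isolation. The following is a minimal sketch, not the agent's actual code: `psEntry`, `parsePS`, and `cleanToken` are hypothetical names, and the sample JSON is made up. It shows `Command` decoded as `[]string` from `podman ps --format json` output, and a token trimmed before it is placed in gRPC metadata.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// psEntry mirrors a few fields of one element of `podman ps --format json`.
// Hypothetical type for illustration; undeclared fields are ignored by the decoder.
type psEntry struct {
	Names   []string `json:"Names"`
	Image   string   `json:"Image"`
	Command []string `json:"Command"` // argv elements, not a single string
	State   string   `json:"State"`
}

// parsePS decodes the JSON array printed by `podman ps --format json`.
func parsePS(data []byte) ([]psEntry, error) {
	var entries []psEntry
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, fmt.Errorf("decode podman ps output: %w", err)
	}
	return entries, nil
}

// cleanToken strips surrounding whitespace (including a trailing newline
// left by an editor or `echo`) so the value is safe to send as metadata.
func cleanToken(raw string) string {
	return strings.TrimSpace(raw)
}

func main() {
	// Made-up sample; real output carries many more fields.
	sample := []byte(`[{"Names":["mc-proxy"],"Image":"example/mc-proxy:latest",` +
		`"Command":["mc-proxy","serve"],"State":"running"}]`)

	entries, err := parsePS(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(entries[0].Names[0], strings.Join(entries[0].Command, " "))
	fmt.Printf("%q\n", cleanToken("example-token\n"))
}
```

Declaring `Command` as `string` makes `json.Unmarshal` fail with a type error on real output, which is presumably how the original bug surfaced.
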
## Remaining Work
### Operational — Next Priority
- [ ] **MCR auth for mcp user** — podman pull from MCR requires OCI token
auth. Currently using image save/load workaround. Need either: OCI token
flow support in the agent, or podman login with service account credentials.
- [ ] **Vade DNS routing** — Tailscale MagicDNS intercepts `*.svc.mcp.metacircular.net`
queries on vade, preventing hostname-based TLS connections. CLI currently
uses IP address directly. Fix: Tailscale DNS configuration or split-horizon
setup on vade.
- [ ] **Service export completeness**: `mcp service export` only captures
name + image from the registry. Should include full spec (network, ports,
volumes, user, restart, cmd). Requires the agent's `ListServices` response
to include full `ComponentSpec` data, not just `ComponentInfo`.
### Quality
- [ ] **P5.1** Integration test suite — end-to-end CLI → agent → podman tests
- [ ] **P5.2** Bootstrap procedure test — documented and verified
- [ ] **README.md** — quick-start guide
- [ ] **RUNBOOK.md** — operational procedures (unseal metacrypt, restart
services, disaster recovery)
### Design
- [ ] **Self-management** — how MCP updates mc-proxy and its own agent without
circular dependency. Likely answer: NixOS manages the agent and mc-proxy
binaries; MCP manages their containers. Or: staged restart with health
checks.
- [ ] **ARCHITECTURE.md proto naming** — update spec to match buf-lint-compliant
message names (StopServiceRequest vs ServiceRequest, AdoptContainers vs
AdoptContainer).
- [ ] **mcdsl DefaultPath helper**: `DefaultPath(name) string` for consistent
config file discovery across all services. Root: /srv, /etc. User: XDG, /srv.
- [ ] **Engineering standards update** — document REST+gRPC parity exception
for infrastructure services (MCP agent).
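
One possible shape for the proposed `DefaultPath` helper is sketched below. Only the signature `DefaultPath(name) string` and the search roots (root: /srv, /etc; user: XDG, /srv) come from the item above. The `<name>/<name>.toml` layout is an assumption modeled on `~/.config/mcp/mcp.toml`, and `candidatePaths` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths lists config locations to probe, in priority order.
// Assumption: each service keeps its config at <dir>/<name>/<name>.toml.
func candidatePaths(name string, isRoot bool, xdgConfigHome, home string) []string {
	file := name + ".toml"
	if isRoot {
		return []string{
			filepath.Join("/srv", name, file),
			filepath.Join("/etc", name, file),
		}
	}
	// The XDG Base Directory spec falls back to $HOME/.config when
	// XDG_CONFIG_HOME is unset or empty.
	if xdgConfigHome == "" {
		xdgConfigHome = filepath.Join(home, ".config")
	}
	return []string{
		filepath.Join(xdgConfigHome, name, file),
		filepath.Join("/srv", name, file),
	}
}

// DefaultPath returns the first candidate that exists on disk, or the
// highest-priority candidate when none exists yet.
func DefaultPath(name string) string {
	home, _ := os.UserHomeDir() // empty on error; candidates stay well-formed
	paths := candidatePaths(name, os.Geteuid() == 0, os.Getenv("XDG_CONFIG_HOME"), home)
	for _, p := range paths {
		if _, err := os.Stat(p); err == nil {
			return p
		}
	}
	return paths[0]
}

func main() {
	fmt.Println(DefaultPath("mcp"))
}
```

Keeping the path list in a pure function separates the lookup order from the filesystem probe, so the ordering can be unit-tested without touching disk.
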
### Infrastructure
- [ ] **Certificate renewal** — MCP-managed cert renewal before expiry.
Agent cert expires 2026-06-24. Need automated renewal via Metacrypt ACME
or REST API.
- [ ] **Monitor alerting** — configure alert_command on rift (ntfy, webhook,
or custom script) for drift/flap notifications.
- [ ] **Backup timer** — install mcp-agent-backup timer via NixOS config.
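
The renewal check itself reduces to date arithmetic. The sketch below is illustrative only: `daysUntilExpiry` and `needsRenewal` are hypothetical names, and the 30-day renewal window is an assumption, not a documented policy. A real agent would read `NotAfter` from the parsed certificate rather than from a date string.

```go
package main

import (
	"fmt"
	"time"
)

const dateLayout = "2006-01-02"

// daysUntilExpiry returns the number of whole days from today to notAfter.
// Both dates are parsed as UTC midnight, so the difference is exact.
func daysUntilExpiry(notAfter, today string) (int, error) {
	exp, err := time.Parse(dateLayout, notAfter)
	if err != nil {
		return 0, fmt.Errorf("parse expiry date: %w", err)
	}
	now, err := time.Parse(dateLayout, today)
	if err != nil {
		return 0, fmt.Errorf("parse current date: %w", err)
	}
	return int(exp.Sub(now).Hours() / 24), nil
}

// needsRenewal reports whether the certificate is within windowDays of expiry.
func needsRenewal(notAfter, today string, windowDays int) (bool, error) {
	days, err := daysUntilExpiry(notAfter, today)
	if err != nil {
		return false, err
	}
	return days <= windowDays, nil
}

func main() {
	// Expiry date from the item above, measured from the deployment date.
	days, err := daysUntilExpiry("2026-06-24", "2026-03-26")
	if err != nil {
		panic(err)
	}
	fmt.Println("days remaining:", days)

	// Assumed 30-day renewal window.
	renew, err := needsRenewal("2026-06-24", "2026-03-26", 30)
	if err != nil {
		panic(err)
	}
	fmt.Println("renew now:", renew)
}
```
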
## Current State (2026-03-26)
MCP is deployed and operational on rift. The agent runs as a systemd service
under the `mcp` user with rootless podman. All platform services (metacrypt,
mc-proxy, mcr, mcns) are managed by MCP with complete service definitions.
```
$ mcp status
SERVICE COMPONENT DESIRED OBSERVED VERSION
mc-proxy mc-proxy running running latest
mcns coredns running running 1.12.1
mcr api running running latest
mcr web running running latest
metacrypt api running running latest
metacrypt web running running latest
```