MCP v1 Progress
Phase 0: Project Scaffolding
- P0.1 Repository and module setup
- P0.2 Proto definitions and code generation
Phase 1: Core Libraries
- P1.1 Registry package (`internal/registry/`)
- P1.2 Runtime package (`internal/runtime/`)
- P1.3 Service definition package (`internal/servicedef/`)
- P1.4 Config package (`internal/config/`)
- P1.5 Auth package (`internal/auth/`)
Phase 2: Agent
- P2.1 Agent skeleton and gRPC server
- P2.2 Deploy handler
- P2.3 Lifecycle handlers (stop, start, restart)
- P2.4 Status handlers (list, live check, get status)
- P2.5 Sync handler
- P2.6 File transfer handlers
- P2.7 Adopt handler
- P2.8 Monitor subsystem
- P2.9 Snapshot command
Phase 3: CLI
- P3.1 CLI skeleton
- P3.2 Login command
- P3.3 Deploy command
- P3.4 Lifecycle commands (stop, start, restart)
- P3.5 Status commands (list, ps, status)
- P3.6 Sync command
- P3.7 Adopt command
- P3.8 Service commands (show, edit, export)
- P3.9 Transfer commands (push, pull)
- P3.10 Node commands
Phase 4: Deployment Artifacts
- P4.1 Systemd units
- P4.2 Example configs
- P4.3 Install script
Phase 5: Integration and Polish
- P5.1 Integration test suite
- P5.2 Bootstrap procedure: documented in `docs/bootstrap.md`
- P5.3 Documentation: CLAUDE.md, README.md, RUNBOOK.md
Phase 6: Deployment (completed 2026-03-26)
- P6.1 NixOS config for mcp user (rootless podman, subuid/subgid, systemd service)
- P6.2 TLS cert provisioned from Metacrypt (DNS + IP SANs)
- P6.3 MCIAS system account (mcp-agent with admin role)
- P6.4 Container migration (metacrypt, mc-proxy, mcr, mcns → mcp user)
- P6.5 MCP bootstrap (adopt, sync, export service definitions)
- P6.6 Service definitions completed with full container specs
Deployment Bugs Fixed During Rollout
- podman ps JSON: `Command` field is `[]string`, not `string`
- Container name handling: the naive split in `splitContainerName` broke on `mc-proxy` → extracted `ContainerNameFor`/`SplitContainerName` with registry-aware lookup
- CLI default config path: `~/.config/mcp/mcp.toml`
- Token file whitespace: trim newlines before sending in gRPC metadata
- NixOS systemd sandbox: `ProtectHome` blocks `/run/user`, `ProtectSystem=strict` blocks podman runtime dir → relaxed to `ProtectSystem=full`, `ProtectHome=false`
- Agent needs `PATH`, `HOME`, `XDG_RUNTIME_DIR` in systemd environment
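The container name fix above can be illustrated with a minimal sketch. It is not the project's actual code: the naming rule (`<service>-<component>`, collapsing to the bare service name when both match) and the function signatures are assumptions inferred from the `mcp status` output, and the list of known services stands in for the registry lookup.

```go
package main

import (
	"fmt"
	"strings"
)

// ContainerNameFor builds a container name from a service and component.
// Assumption: a component named like its service yields the bare service
// name ("mc-proxy"); anything else yields "<service>-<component>".
func ContainerNameFor(service, component string) string {
	if service == component {
		return service
	}
	return service + "-" + component
}

// SplitContainerName recovers (service, component) by matching the name
// against known service names instead of splitting on the first hyphen.
// The services slice is a stand-in for a registry lookup.
func SplitContainerName(name string, services []string) (service, component string) {
	best := ""
	for _, s := range services {
		if name == s || strings.HasPrefix(name, s+"-") {
			// Prefer the longest matching service name.
			if len(s) > len(best) {
				best = s
			}
		}
	}
	if best == "" {
		// Unknown service: fall back to the naive first-hyphen split.
		if i := strings.Index(name, "-"); i >= 0 {
			return name[:i], name[i+1:]
		}
		return name, name
	}
	if name == best {
		return best, best
	}
	return best, strings.TrimPrefix(name, best+"-")
}

func main() {
	services := []string{"metacrypt", "mc-proxy", "mcr", "mcns"}
	for _, n := range []string{"mc-proxy", "metacrypt-api", "mcns-coredns"} {
		s, c := SplitContainerName(n, services)
		fmt.Printf("%s -> service=%s component=%s\n", n, s, c)
	}
}
```

With a first-hyphen split, `mc-proxy` would come back as service `mc` and component `proxy`; matching against known service names avoids that.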
Remaining Work
Operational — Next Priority
- MCR auth for mcp user: podman pull from MCR requires OCI token auth. Currently using image save/load workaround. Need either OCI token flow support in the agent, or podman login with service account credentials.
- Vade DNS routing: Tailscale MagicDNS intercepts `*.svc.mcp.metacircular.net` queries on vade, preventing hostname-based TLS connections. CLI currently uses IP address directly. Fix: Tailscale DNS configuration or split-horizon setup on vade.
- Service export completeness: `mcp service export` only captures name + image from the registry. Should include full spec (network, ports, volumes, user, restart, cmd). Requires the agent's `ListServices` response to include full `ComponentSpec` data, not just `ComponentInfo`.
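To make the service export gap concrete, here is a rough sketch of what an info-only view loses. The struct layouts and field names are hypothetical, built only from the field list in the item above; the real `ComponentSpec` and `ComponentInfo` messages live in the project's proto definitions and may differ.

```go
package main

import "fmt"

// ComponentInfo is a hypothetical stand-in for the summary view that
// export sees today: name and image only.
type ComponentInfo struct {
	Name  string
	Image string
}

// ComponentSpec is a hypothetical full spec. The fields follow the list in
// the text (network, ports, volumes, user, restart, cmd); names are assumed.
type ComponentSpec struct {
	Name    string
	Image   string
	Network string
	Ports   []string
	Volumes []string
	User    string
	Restart string
	Cmd     []string
}

// Info projects a spec down to the summary view, dropping everything else.
func (s ComponentSpec) Info() ComponentInfo {
	return ComponentInfo{Name: s.Name, Image: s.Image}
}

// LostFields names the populated spec fields that an info-only export drops.
func LostFields(s ComponentSpec) []string {
	var lost []string
	if s.Network != "" {
		lost = append(lost, "network")
	}
	if len(s.Ports) > 0 {
		lost = append(lost, "ports")
	}
	if len(s.Volumes) > 0 {
		lost = append(lost, "volumes")
	}
	if s.User != "" {
		lost = append(lost, "user")
	}
	if s.Restart != "" {
		lost = append(lost, "restart")
	}
	if len(s.Cmd) > 0 {
		lost = append(lost, "cmd")
	}
	return lost
}

func main() {
	// Illustrative values only; not taken from a real service definition.
	spec := ComponentSpec{
		Name:    "api",
		Image:   "example/image:latest",
		Network: "examplenet",
		Ports:   []string{"8443:8443"},
		Restart: "unless-stopped",
	}
	fmt.Printf("exported today: %+v\n", spec.Info())
	fmt.Printf("dropped on export: %v\n", LostFields(spec))
}
```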
Quality
- P5.1 Integration test suite — end-to-end CLI → agent → podman tests
- P5.2 Bootstrap procedure test — documented and verified
- README.md — quick-start guide
- RUNBOOK.md — operational procedures (unseal metacrypt, restart services, disaster recovery)
Design
- Self-management: how MCP updates mc-proxy and its own agent without a circular dependency. Likely answer: NixOS manages the agent and mc-proxy binaries; MCP manages their containers. Alternative: staged restart with health checks.
- ARCHITECTURE.md proto naming: update the spec to match buf-lint-compliant message names (StopServiceRequest vs ServiceRequest, AdoptContainers vs AdoptContainer).
- mcdsl DefaultPath helper: `DefaultPath(name) string` for consistent config file discovery across all services. Root: /srv, /etc. User: XDG, /srv.
- Engineering standards update: document the REST+gRPC parity exception for infrastructure services (MCP agent).
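One possible shape for the proposed `DefaultPath` helper is sketched below. Only the signature and the search order (root: /srv, then /etc; user: XDG, then /srv) come from the item above. The `<dir>/<name>/<name>.toml` layout is an assumption extrapolated from the CLI default `~/.config/mcp/mcp.toml`, and `candidatePaths` is a hypothetical helper introduced to keep the logic testable.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths lists config locations for a service in priority order.
// Assumed layout: <dir>/<name>/<name>.toml.
func candidatePaths(name string, isRoot bool, xdgConfigHome, home string) []string {
	file := name + ".toml"
	srv := filepath.Join("/srv", name, file)
	if isRoot {
		// Root: /srv first, then /etc.
		return []string{srv, filepath.Join("/etc", name, file)}
	}
	if xdgConfigHome == "" {
		// XDG default when XDG_CONFIG_HOME is unset.
		xdgConfigHome = filepath.Join(home, ".config")
	}
	// User: XDG config dir first, then /srv.
	return []string{filepath.Join(xdgConfigHome, name, file), srv}
}

// DefaultPath returns the first candidate that exists on disk, or the
// highest-priority candidate when none exists yet.
func DefaultPath(name string) string {
	home, _ := os.UserHomeDir() // an empty home only affects the XDG fallback
	paths := candidatePaths(name, os.Geteuid() == 0, os.Getenv("XDG_CONFIG_HOME"), home)
	for _, p := range paths {
		if _, err := os.Stat(p); err == nil {
			return p
		}
	}
	return paths[0]
}

func main() {
	fmt.Println(DefaultPath("mcp"))
}
```

Keeping the environment lookups in `DefaultPath` and the ordering in a pure function means the search order can be tested without touching the filesystem.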
Infrastructure
- Certificate renewal — MCP-managed cert renewal before expiry. Agent cert expires 2026-06-24. Need automated renewal via Metacrypt ACME or REST API.
- Monitor alerting — configure alert_command on rift (ntfy, webhook, or custom script) for drift/flap notifications.
- Backup timer — install mcp-agent-backup timer via NixOS config.
Current State (2026-03-26)
MCP is deployed and operational on rift. The agent runs as a systemd service
under the mcp user with rootless podman. All platform services (metacrypt,
mc-proxy, mcr, mcns) are managed by MCP with complete service definitions.
    $ mcp status
    SERVICE    COMPONENT  DESIRED  OBSERVED  VERSION
    mc-proxy   mc-proxy   running  running   latest
    mcns       coredns    running  running   1.12.1
    mcr        api        running  running   latest
    mcr        web        running  running   latest
    metacrypt  api        running  running   latest
    metacrypt  web        running  running   latest