mcp/PROGRESS_V1.md
Kyle Isom 6e30cf12f2 Mark Phase B complete in PROGRESS_V1.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 01:36:50 -07:00

MCP v1 Progress

Phase 0: Project Scaffolding

  • P0.1 Repository and module setup
  • P0.2 Proto definitions and code generation

Phase 1: Core Libraries

  • P1.1 Registry package (internal/registry/)
  • P1.2 Runtime package (internal/runtime/)
  • P1.3 Service definition package (internal/servicedef/)
  • P1.4 Config package (internal/config/)
  • P1.5 Auth package (internal/auth/)

Phase 2: Agent

  • P2.1 Agent skeleton and gRPC server
  • P2.2 Deploy handler
  • P2.3 Lifecycle handlers (stop, start, restart)
  • P2.4 Status handlers (list, live check, get status)
  • P2.5 Sync handler
  • P2.6 File transfer handlers
  • P2.7 Adopt handler
  • P2.8 Monitor subsystem
  • P2.9 Snapshot command

Phase 3: CLI

  • P3.1 CLI skeleton
  • P3.2 Login command
  • P3.3 Deploy command
  • P3.4 Lifecycle commands (stop, start, restart)
  • P3.5 Status commands (list, ps, status)
  • P3.6 Sync command
  • P3.7 Adopt command
  • P3.8 Service commands (show, edit, export)
  • P3.9 Transfer commands (push, pull)
  • P3.10 Node commands

Phase 4: Deployment Artifacts

  • P4.1 Systemd units
  • P4.2 Example configs
  • P4.3 Install script

Phase 5: Integration and Polish

  • P5.1 Integration test suite
  • P5.2 Bootstrap procedure — documented in docs/bootstrap.md
  • P5.3 Documentation — CLAUDE.md, README.md, RUNBOOK.md

Phase 6: Deployment (completed 2026-03-26)

  • P6.1 NixOS config for mcp user (rootless podman, subuid/subgid, systemd service)
  • P6.2 TLS cert provisioned from Metacrypt (DNS + IP SANs)
  • P6.3 MCIAS system account (mcp-agent with admin role)
  • P6.4 Container migration (metacrypt, mc-proxy, mcr, mcns → mcp user)
  • P6.5 MCP bootstrap (adopt, sync, export service definitions)
  • P6.6 Service definitions completed with full container specs

Deployment Bugs Fixed During Rollout

  • podman ps JSON: Command field is []string not string
  • Container name handling: splitContainerName naive split broke mc-proxy → extracted ContainerNameFor/SplitContainerName with registry-aware lookup
  • CLI default config path: ~/.config/mcp/mcp.toml
  • Token file whitespace: trim newlines before sending in gRPC metadata
  • NixOS systemd sandbox: ProtectHome blocks /run/user, ProtectSystem=strict blocks podman runtime dir → relaxed to ProtectSystem=full, ProtectHome=false
  • Agent needs PATH, HOME, XDG_RUNTIME_DIR in systemd environment
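
The container-name fix in the list above can be illustrated with a small sketch. This is not the project's code. It assumes container names take the form `<service>-<component>` and collapse to `<service>` when the two parts are identical, and the `knownServices` slice is a hypothetical stand-in for the registry lookup. Only the function names ContainerNameFor and SplitContainerName come from the notes.

```go
package main

import (
	"fmt"
	"strings"
)

// ContainerNameFor builds a container name from a service and a component.
// Assumption: names look like "<service>-<component>" and collapse to
// "<service>" when both parts are identical (as with mc-proxy).
func ContainerNameFor(service, component string) string {
	if service == component {
		return service
	}
	return service + "-" + component
}

// SplitContainerName recovers (service, component) from a container name.
// knownServices is a hypothetical stand-in for the registry lookup: known
// service names are matched first, so a name like "mc-proxy" is not cut
// in half at its hyphen.
func SplitContainerName(name string, knownServices []string) (service, component string) {
	best := ""
	for _, s := range knownServices {
		if name == s || strings.HasPrefix(name, s+"-") {
			if len(s) > len(best) {
				best = s // prefer the longest matching service name
			}
		}
	}
	if best != "" {
		if name == best {
			return best, best
		}
		return best, strings.TrimPrefix(name, best+"-")
	}
	// Fallback for unknown services: the naive split on the first hyphen.
	if i := strings.Index(name, "-"); i >= 0 {
		return name[:i], name[i+1:]
	}
	return name, name
}

func main() {
	services := []string{"mc-proxy", "mcr", "metacrypt", "mcns"}
	for _, n := range []string{"mc-proxy", "mcr-api", "metacrypt-web"} {
		s, c := SplitContainerName(n, services)
		fmt.Printf("%s -> service=%s component=%s\n", n, s, c)
	}
}
```

Without the lookup, the fallback path would split "mc-proxy" into service "mc" and component "proxy", which is the failure mode the bullet describes.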

Platform Evolution (see PLATFORM_EVOLUTION.md)

Phase A — COMPLETE (2026-03-27)

  • Route declarations in service definitions ([[components.routes]])
  • Automatic port allocation by agent (10000-60000, mutex-serialized)
  • $PORT / $PORT_<NAME> env var injection into containers
  • Proto: RouteSpec message, routes + env on ComponentSpec
  • Registry: component_routes table with host_port tracking
  • Backward compatible: old-style ports strings still work
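
A minimal sketch of mutex-serialized port allocation and the matching env var naming, under stated assumptions: the range is treated as inclusive, ports are tracked in memory only (the notes say the real agent records host_port in the component_routes table), and route names are upper-cased to form `PORT_<NAME>`. The type and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

const (
	portMin = 10000 // range taken from the notes above
	portMax = 60000 // treated as inclusive here (an assumption)
)

// PortAllocator hands out host ports to one caller at a time.
type PortAllocator struct {
	mu   sync.Mutex
	next int
	used map[int]bool
}

func NewPortAllocator() *PortAllocator {
	return &PortAllocator{next: portMin, used: make(map[int]bool)}
}

// Allocate returns the next unused port in the range, wrapping around at
// portMax, or an error once every port in the range is taken.
func (a *PortAllocator) Allocate() (int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for i := 0; i <= portMax-portMin; i++ {
		p := a.next
		a.next++
		if a.next > portMax {
			a.next = portMin
		}
		if !a.used[p] {
			a.used[p] = true
			return p, nil
		}
	}
	return 0, errors.New("no free ports in range")
}

// Release makes a port available for allocation again.
func (a *PortAllocator) Release(port int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.used, port)
}

// PortEnv renders the env var for a route: PORT for an unnamed route,
// PORT_<NAME> otherwise. Upper-casing the name is an assumption.
func PortEnv(routeName string, port int) string {
	if routeName == "" {
		return fmt.Sprintf("PORT=%d", port)
	}
	return fmt.Sprintf("PORT_%s=%d", strings.ToUpper(routeName), port)
}

func main() {
	alloc := NewPortAllocator()
	p1, _ := alloc.Allocate()
	p2, _ := alloc.Allocate()
	fmt.Println(PortEnv("", p1))
	fmt.Println(PortEnv("grpc", p2))
}
```

Holding the mutex for the whole search keeps two concurrent deploys from being handed the same port. This sketch does not check whether a port is actually free on the host.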

Phase B — COMPLETE (2026-03-27)

  • Agent connects to mc-proxy via Unix socket on deploy
  • Agent calls AddRoute to register routes with mc-proxy
  • Agent calls RemoveRoute on service stop/teardown
  • Agent config: [mcproxy] socket and cert_dir fields
  • TLS certs: pre-provisioned at convention path (Phase C automates)
  • Nil-safe: if socket not configured, route registration silently skipped
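
The nil-safe behavior described above might look roughly like the following. The RouteClient interface, its method signatures, the Registrar type, and the hostname are all hypothetical: the notes name AddRoute and RemoveRoute but do not give their parameters.

```go
package main

import "fmt"

// RouteClient is a hypothetical view of the mc-proxy client. The real
// AddRoute/RemoveRoute signatures are not given in these notes.
type RouteClient interface {
	AddRoute(hostname string, hostPort int) error
	RemoveRoute(hostname string) error
}

// Registrar wraps an optional client. A nil Registrar or nil client means
// no mc-proxy socket was configured.
type Registrar struct {
	client RouteClient
}

// Register adds a route on deploy, or does nothing when unconfigured.
func (r *Registrar) Register(hostname string, hostPort int) error {
	if r == nil || r.client == nil {
		return nil // socket not configured: skip silently
	}
	return r.client.AddRoute(hostname, hostPort)
}

// Unregister removes a route on stop/teardown, or does nothing when unconfigured.
func (r *Registrar) Unregister(hostname string) error {
	if r == nil || r.client == nil {
		return nil
	}
	return r.client.RemoveRoute(hostname)
}

// fakeClient records routes in memory, for demonstration only.
type fakeClient struct{ routes map[string]int }

func (f *fakeClient) AddRoute(h string, p int) error { f.routes[h] = p; return nil }
func (f *fakeClient) RemoveRoute(h string) error     { delete(f.routes, h); return nil }

func main() {
	var unconfigured *Registrar
	fmt.Println("unconfigured:", unconfigured.Register("app.example.internal", 10000))

	fc := &fakeClient{routes: map[string]int{}}
	r := &Registrar{client: fc}
	_ = r.Register("app.example.internal", 10000)
	fmt.Println("routes after deploy:", fc.routes)
	_ = r.Unregister("app.example.internal")
	fmt.Println("routes after stop:", fc.routes)
}
```

Putting the nil check inside the wrapper means deploy and stop handlers can call it unconditionally, which is one way to keep agents without an `[mcproxy]` section working unchanged.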

Remaining Work

Operational — Next Priority

  • MCR auth for mcp user — podman pull from MCR requires OCI token auth. Currently using image save/load workaround. Need either: OCI token flow support in the agent, or podman login with service account credentials.
  • Vade DNS routing — Tailscale MagicDNS intercepts *.svc.mcp.metacircular.net queries on vade, preventing hostname-based TLS connections. CLI currently uses IP address directly. Fix: Tailscale DNS configuration or split-horizon setup on vade.
  • Service export completeness: mcp service export only captures name + image from the registry. Should include full spec (network, ports, volumes, user, restart, cmd). Requires the agent's ListServices response to include full ComponentSpec data, not just ComponentInfo.

Quality

  • P5.1 Integration test suite — end-to-end CLI → agent → podman tests
  • P5.2 Bootstrap procedure test — documented and verified
  • README.md — quick-start guide
  • RUNBOOK.md — operational procedures (unseal metacrypt, restart services, disaster recovery)

Design

  • Self-management — how MCP updates mc-proxy and its own agent without circular dependency. Likely answer: NixOS manages the agent and mc-proxy binaries; MCP manages their containers. Or: staged restart with health checks.
  • ARCHITECTURE.md proto naming — update spec to match buf-lint-compliant message names (StopServiceRequest vs ServiceRequest, AdoptContainers vs AdoptContainer).
  • mcdsl DefaultPath helper: DefaultPath(name) string for consistent config file discovery across all services. Root: /srv, /etc. User: XDG, /srv.
  • Engineering standards update — document REST+gRPC parity exception for infrastructure services (MCP agent).
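
One possible shape for the DefaultPath helper listed above, offered only as a sketch since the item is still open. The search order (root: /srv then /etc; user: XDG then /srv) follows the bullet. The `<dir>/<name>/<name>.toml` layout is an assumption extrapolated from the CLI default of ~/.config/mcp/mcp.toml, and candidatePaths is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths lists config locations in search order.
// Assumed layout: <dir>/<name>/<name>.toml.
func candidatePaths(name string, isRoot bool, xdgConfigHome, home string) []string {
	file := name + ".toml"
	srv := filepath.Join("/srv", name, file)
	if isRoot {
		return []string{srv, filepath.Join("/etc", name, file)}
	}
	if xdgConfigHome == "" {
		// XDG default when XDG_CONFIG_HOME is unset.
		xdgConfigHome = filepath.Join(home, ".config")
	}
	return []string{filepath.Join(xdgConfigHome, name, file), srv}
}

// DefaultPath returns the first candidate that exists on disk, or the
// first candidate when none exist.
func DefaultPath(name string) string {
	home, _ := os.UserHomeDir()
	paths := candidatePaths(name, os.Geteuid() == 0, os.Getenv("XDG_CONFIG_HOME"), home)
	for _, p := range paths {
		if _, err := os.Stat(p); err == nil {
			return p
		}
	}
	return paths[0]
}

func main() {
	fmt.Println(DefaultPath("mcp"))
}
```

Keeping the candidate list in a pure function separates the search order from filesystem access, so the order can be tested without touching /srv or /etc.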

Infrastructure

  • Certificate renewal — MCP-managed cert renewal before expiry. Agent cert expires 2026-06-24. Need automated renewal via Metacrypt ACME or REST API.
  • Monitor alerting — configure alert_command on rift (ntfy, webhook, or custom script) for drift/flap notifications.
  • Backup timer — install mcp-agent-backup timer via NixOS config.

Current State (2026-03-26)

MCP is deployed and operational on rift. The agent runs as a systemd service under the mcp user with rootless podman. All platform services (metacrypt, mc-proxy, mcr, mcns) are managed by MCP with complete service definitions.

$ mcp status
SERVICE    COMPONENT  DESIRED  OBSERVED  VERSION
mc-proxy   mc-proxy   running  running   latest
mcns       coredns    running  running   1.12.1
mcr        api        running  running   latest
mcr        web        running  running   latest
metacrypt  api        running  running   latest
metacrypt  web        running  running   latest