From 6a90b21a622925513af0574b91a98ba7ce722deb Mon Sep 17 00:00:00 2001
From: Kyle Isom
Date: Thu, 26 Mar 2026 11:08:06 -0700
Subject: [PATCH] Add PROJECT_PLAN_V1.md and PROGRESS_V1.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

30 discrete tasks across 5 phases, with dependency graph and parallelism
analysis. Phase 1 (5 core libraries) is fully parallel. Phases 2+3+4
(agent handlers, CLI commands, deployment artifacts) support up to 8+
concurrent engineers/agents. Critical path is proto → registry + runtime
→ agent deploy → integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 PROGRESS_V1.md     |  51 ++++
 PROJECT_PLAN_V1.md | 741 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 792 insertions(+)
 create mode 100644 PROGRESS_V1.md
 create mode 100644 PROJECT_PLAN_V1.md

diff --git a/PROGRESS_V1.md b/PROGRESS_V1.md
new file mode 100644
index 0000000..0de1993
--- /dev/null
+++ b/PROGRESS_V1.md
@@ -0,0 +1,51 @@
+# MCP v1 Progress
+
+## Phase 0: Project Scaffolding
+
+- [ ] **P0.1** Repository and module setup
+- [ ] **P0.2** Proto definitions and code generation
+
+## Phase 1: Core Libraries
+
+- [ ] **P1.1** Registry package (`internal/registry/`)
+- [ ] **P1.2** Runtime package (`internal/runtime/`)
+- [ ] **P1.3** Service definition package (`internal/servicedef/`)
+- [ ] **P1.4** Config package (`internal/config/`)
+- [ ] **P1.5** Auth package (`internal/auth/`)
+
+## Phase 2: Agent
+
+- [ ] **P2.1** Agent skeleton and gRPC server
+- [ ] **P2.2** Deploy handler
+- [ ] **P2.3** Lifecycle handlers (stop, start, restart)
+- [ ] **P2.4** Status handlers (list, live check, get status)
+- [ ] **P2.5** Sync handler
+- [ ] **P2.6** File transfer handlers
+- [ ] **P2.7** Adopt handler
+- [ ] **P2.8** Monitor subsystem
+- [ ] **P2.9** Snapshot command
+
+## Phase 3: CLI
+
+- [ ] **P3.1** CLI skeleton
+- [ ] **P3.2** Login command
+- [ ] **P3.3** Deploy command
+- [ ] **P3.4** Lifecycle commands (stop, start,
restart)
+- [ ] **P3.5** Status commands (list, ps, status)
+- [ ] **P3.6** Sync command
+- [ ] **P3.7** Adopt command
+- [ ] **P3.8** Service commands (show, edit, export)
+- [ ] **P3.9** Transfer commands (push, pull)
+- [ ] **P3.10** Node commands
+
+## Phase 4: Deployment Artifacts
+
+- [ ] **P4.1** Systemd units
+- [ ] **P4.2** Example configs
+- [ ] **P4.3** Install script
+
+## Phase 5: Integration and Polish
+
+- [ ] **P5.1** Integration test suite
+- [ ] **P5.2** Bootstrap procedure test
+- [ ] **P5.3** Documentation (CLAUDE.md, README.md, RUNBOOK.md)
diff --git a/PROJECT_PLAN_V1.md b/PROJECT_PLAN_V1.md
new file mode 100644
index 0000000..2392170
--- /dev/null
+++ b/PROJECT_PLAN_V1.md
@@ -0,0 +1,741 @@
+# MCP v1 Project Plan
+
+## Overview
+
+This plan breaks MCP v1 into discrete implementation tasks organized into
+phases. Tasks within a phase can often be parallelized. Dependencies
+between tasks are noted explicitly.
+
+The critical path is: proto → registry + runtime → agent deploy handler →
+integration testing. Parallelizable work (CLI commands, monitoring, file
+transfer) can proceed alongside the critical path once the proto and core
+libraries are ready.
+
+## Notation
+
+- **[Pn]** = Phase n
+- **[Pn.m]** = Task m in phase n
+- **depends: [Px.y]** = must wait for that task
+- **parallel: [Px.y, Px.z]** = can run alongside these tasks
+- Each task includes scope, deliverables, and test criteria
+
+---
+
+## Phase 0: Project Scaffolding
+
+One engineer. Serial. Establishes the project skeleton that everything
+else builds on.
+
+### P0.1: Repository and module setup
+
+**Scope:** Initialize the Go module, create the standard directory
+structure, and configure tooling.
+
+**Deliverables:**
+- `go.mod` with module path `git.wntrmute.dev/kyle/mcp`
+- `Makefile` with standard targets (build, test, vet, lint, proto,
+  proto-lint, clean, all)
+- `.golangci.yaml` with platform-standard linter config
+- `.gitignore`
+- `CLAUDE.md` (project-specific AI context)
+- Empty `cmd/mcp/main.go` and `cmd/mcp-agent/main.go` (compile to
+  verify the skeleton works)
+
+**Test criteria:** `make build` succeeds. `make vet` and `make lint` pass
+on the empty project.
+
+### P0.2: Proto definitions and code generation
+
+**Scope:** Write the `mcp.proto` file from the ARCHITECTURE.md spec and
+generate Go code.
+
+**Depends:** P0.1
+
+**Deliverables:**
+- `proto/mcp/v1/mcp.proto` — full service definition from ARCHITECTURE.md
+- `buf.yaml` configuration
+- `gen/mcp/v1/` — generated Go code
+- `make proto` and `make proto-lint` both pass
+
+**Test criteria:** Generated code compiles. `buf lint` passes. All message
+types and RPC methods from the architecture doc are present.
+
+---
+
+## Phase 1: Core Libraries
+
+Five independent packages. **All can be built in parallel** once P0.2 is
+complete. Each package has a well-defined interface, no dependencies on
+other Phase 1 packages, and is fully testable in isolation.
+
+### P1.1: Registry package (`internal/registry/`)
+
+**Scope:** SQLite schema, migrations, and CRUD operations for the
+node-local registry.
+
+**Depends:** P0.1
+
+**Deliverables:**
+- `db.go` — open database, run migrations, close. Schema from
+  ARCHITECTURE.md (services, components, component_ports,
+  component_volumes, component_cmd, events tables).
+- `services.go` — create, get, list, update, delete services.
+- `components.go` — create, get, list (by service), update desired/observed
+  state, update spec, delete. Support filtering by desired_state.
+- `events.go` — insert event, query events by component+service+time
+  range, count events in window (for flap detection), prune old events.
+
+**Test criteria:** Full test coverage using `t.TempDir()` + real SQLite.
+Tests cover:
+- Schema migration is idempotent
+- Service and component CRUD
+- Desired/observed state updates
+- Event insertion, time-range queries, pruning
+- Foreign key cascading (delete service → components deleted)
+- Component composite primary key (service, name) enforced
+
+**parallel:** P1.2, P1.3, P1.4
+
+### P1.2: Runtime package (`internal/runtime/`)
+
+**Scope:** Container runtime abstraction with a podman implementation.
+
+**Depends:** P0.1
+
+**Deliverables:**
+- `runtime.go` — `Runtime` interface:
+  ```go
+  type Runtime interface {
+      Pull(ctx context.Context, image string) error
+      Run(ctx context.Context, spec ContainerSpec) error
+      Stop(ctx context.Context, name string) error
+      Remove(ctx context.Context, name string) error
+      Inspect(ctx context.Context, name string) (ContainerInfo, error)
+      List(ctx context.Context) ([]ContainerInfo, error)
+  }
+  ```
+  Plus `ContainerSpec` and `ContainerInfo` structs.
+- `podman.go` — podman implementation. Builds command-line arguments from
+  `ContainerSpec`, execs `podman` CLI, parses `podman inspect` JSON output.
+
+**Test criteria:**
+- Unit tests for command-line argument building (given a ContainerSpec,
+  verify the constructed podman args are correct). These don't require
+  podman to be installed.
+- `ContainerSpec` → podman flag mapping matches the table in ARCHITECTURE.md.
+- Container naming follows `-` convention.
+- Version extraction from image tag works (e.g., `registry/img:v1.2.0`
+  → `v1.2.0`, `registry/img:latest` → `latest`, `registry/img` → `""`).
+
+**parallel:** P1.1, P1.3, P1.4
+
+### P1.3: Service definition package (`internal/servicedef/`)
+
+**Scope:** Parse, validate, and write TOML service definition files.
+
+**Depends:** P0.1
+
+**Deliverables:**
+- `servicedef.go` — `Load(path) → ServiceDef`, `Write(path, ServiceDef)`,
+  `LoadAll(dir) → []ServiceDef`.
Validation: required fields (name, node,
+  at least one component), component names unique within service. Converts
+  between TOML representation and proto `ServiceSpec`.
+
+**Test criteria:**
+- Round-trip: write a ServiceDef, read it back, verify equality
+- Validation rejects missing name, missing node, empty components,
+  duplicate component names
+- `LoadAll` loads all `.toml` files from a directory
+- `active` field defaults to `true` if omitted
+- Conversion to/from proto `ServiceSpec` is correct
+
+**parallel:** P1.1, P1.2, P1.4
+
+### P1.4: Config package (`internal/config/`)
+
+**Scope:** Load and validate CLI and agent configuration from TOML files.
+
+**Depends:** P0.1
+
+**Deliverables:**
+- `cli.go` — CLI config struct: services dir, MCIAS settings, auth
+  (token path, optional username/password_file), nodes list. Load from
+  TOML with env var overrides (`MCP_*`). Validate required fields.
+- `agent.go` — Agent config struct: server (grpc_addr, tls_cert, tls_key),
+  database path, MCIAS settings, agent (node_name, container_runtime),
+  monitor settings, log level. Load from TOML with env var overrides
+  (`MCP_AGENT_*`). Validate required fields.
+
+**Test criteria:**
+- Load from TOML file, verify all fields populated
+- Required field validation (reject missing grpc_addr, missing tls_cert, etc.)
+- Env var overrides work
+- Nodes list parses correctly from `[[nodes]]`
+
+**parallel:** P1.1, P1.2, P1.3
+
+### P1.5: Auth package (`internal/auth/`)
+
+**Scope:** MCIAS token validation for the agent, and token acquisition for
+the CLI.
+
+**Depends:** P0.1, P0.2 (uses proto-generated types for gRPC interceptor)
+
+**Deliverables:**
+- `auth.go`:
+  - `Interceptor` — gRPC unary server interceptor that extracts bearer
+    tokens, validates against MCIAS (with 30s SHA-256-keyed cache),
+    checks admin role, audit-logs every RPC (method, caller, timestamp).
+    Returns UNAUTHENTICATED or PERMISSION_DENIED on failure.
+  - `Login(url, username, password) → token` — authenticate to MCIAS,
+    return bearer token.
+  - `LoadToken(path) → token` — read cached token from file.
+  - `SaveToken(path, token)` — write token to file with 0600 permissions.
+
+**Test criteria:**
+- Interceptor rejects missing token (UNAUTHENTICATED)
+- Interceptor rejects invalid token (UNAUTHENTICATED)
+- Interceptor rejects non-admin token (PERMISSION_DENIED)
+- Token caching works (same token within 30s returns cached result)
+- Token file read/write with correct permissions
+- Audit log entry emitted on every RPC (check slog output)
+
+**Note:** Full interceptor testing requires an MCIAS mock or test instance.
+Unit tests can mock the MCIAS validation call. Integration tests against a
+real MCIAS instance are a Phase 5 concern.
+
+**parallel:** P1.1, P1.2, P1.3, P1.4 (partially; needs P0.2 for proto types)
+
+---
+
+## Phase 2: Agent
+
+The agent is the core of MCP. Tasks in this phase build on Phase 1
+libraries. Some tasks can be parallelized; dependencies are noted.
+
+### P2.1: Agent skeleton and gRPC server
+
+**Scope:** Wire up the agent binary: config loading, database setup, gRPC
+server with TLS and auth interceptor, graceful shutdown.
+
+**Depends:** P0.2, P1.1, P1.4, P1.5
+
+**Deliverables:**
+- `cmd/mcp-agent/main.go` — cobra root command, `server` subcommand
+- `internal/agent/agent.go` — Agent struct holding registry, runtime,
+  config. Initializes database, starts gRPC server with TLS and auth
+  interceptor, handles SIGINT/SIGTERM for graceful shutdown.
+- Agent starts, listens on configured address, rejects unauthenticated
+  RPCs, shuts down cleanly.
+
+**Test criteria:** Agent starts with a test config, accepts TLS
+connections, rejects RPCs without a valid token. Graceful shutdown
+closes the database and stops the listener.
+
+### P2.2: Deploy handler
+
+**Scope:** Implement the `Deploy` RPC on the agent.
+
+**Depends:** P2.1, P1.2
+
+**Deliverables:**
+- `internal/agent/deploy.go` — handles DeployRequest: records spec in
+  registry, iterates components, calls runtime (pull, stop, remove, run,
+  inspect), updates observed state and version, returns results.
+- Supports single-component deploy (when `component` field is set).
+
+**Test criteria:**
+- Deploy with all components records spec in registry
+- Deploy with single component only touches that component
+- Failed pull returns error for that component, others continue
+- Registry is updated with desired_state=running and observed_state
+- Version is extracted from image tag
+
+### P2.3: Lifecycle handlers (stop, start, restart)
+
+**Scope:** Implement `StopService`, `StartService`, `RestartService` RPCs.
+
+**Depends:** P2.1, P1.2
+
+**parallel:** P2.2
+
+**Deliverables:**
+- `internal/agent/lifecycle.go`
+- Stop: for each component, call runtime stop, update desired_state to
+  `stopped`, update observed_state.
+- Start: for each component, call runtime start (or run if removed),
+  update desired_state to `running`, update observed_state.
+- Restart: stop then start each component.
+
+**Test criteria:**
+- Stop sets desired_state=stopped, calls runtime stop
+- Start sets desired_state=running, calls runtime start
+- Restart cycles each component
+- Returns per-component results
+
+### P2.4: Status handlers (list, live check, get status)
+
+**Scope:** Implement `ListServices`, `LiveCheck`, `GetServiceStatus` RPCs.
+
+**Depends:** P2.1, P1.2
+
+**parallel:** P2.2, P2.3
+
+**Deliverables:**
+- `internal/agent/status.go`
+- `ListServices`: read from registry, no runtime query.
+- `LiveCheck`: query runtime, reconcile registry, return updated state.
+- `GetServiceStatus`: live check + drift detection + recent events.
+
+**Test criteria:**
+- ListServices returns registry contents without touching runtime
+- LiveCheck updates observed_state from runtime
+- GetServiceStatus includes drift info for mismatched desired/observed
+- GetServiceStatus includes recent events
+
+### P2.5: Sync handler
+
+**Scope:** Implement `SyncDesiredState` RPC.
+
+**Depends:** P2.1, P1.2
+
+**parallel:** P2.2, P2.3, P2.4
+
+**Deliverables:**
+- `internal/agent/sync.go`
+- Receives list of ServiceSpecs from CLI.
+- For each service: create or update in registry, set desired_state based
+  on `active` flag (running if active, stopped if not).
+- Runs reconciliation (discover unmanaged containers, set to ignore).
+- Returns per-service summary of what changed.
+
+**Test criteria:**
+- New services are created in registry
+- Existing services have specs updated
+- Active=false sets desired_state=stopped for all components
+- Unmanaged containers discovered and set to ignore
+- Returns accurate change summaries
+
+### P2.6: File transfer handlers
+
+**Scope:** Implement `PushFile` and `PullFile` RPCs.
+
+**Depends:** P2.1
+
+**parallel:** P2.2, P2.3, P2.4, P2.5
+
+**Deliverables:**
+- `internal/agent/files.go`
+- Path validation: resolve `/srv//`, reject `..` traversal,
+  reject symlinks escaping the service directory.
+- Push: atomic write (temp file + rename), create intermediate dirs.
+- Pull: read file, return content and permissions.
+
+**Test criteria:**
+- Push creates file at correct path with correct permissions
+- Push creates intermediate directories
+- Push is atomic (partial write doesn't leave corrupt file)
+- Pull returns file content and mode
+- Path traversal rejected (`../etc/passwd`)
+- Symlink escape rejected
+- Service directory scoping enforced
+
+### P2.7: Adopt handler
+
+**Scope:** Implement `AdoptContainer` RPC.
+
+**Depends:** P2.1, P1.2
+
+**parallel:** P2.2, P2.3, P2.4, P2.5, P2.6
+
+**Deliverables:**
+- `internal/agent/adopt.go`
+- Matches containers by `-*` prefix in runtime.
+- Creates service if needed.
+- Strips prefix to derive component name.
+- Sets desired_state based on current observed_state.
+- Returns per-container results.
+
+**Test criteria:**
+- Matches containers by prefix
+- Creates service when it doesn't exist
+- Derives component names correctly (metacrypt-api → api, metacrypt-web → web)
+- Single-component service (mc-proxy → mc-proxy) works
+- Sets desired_state to running for running containers, stopped for stopped
+- Returns results for each adopted container
+
+### P2.8: Monitor subsystem
+
+**Scope:** Implement the continuous monitoring loop and alerting.
+
+**Depends:** P2.1, P1.1, P1.2
+
+**parallel:** P2.2-P2.7 (can be built alongside other agent handlers)
+
+**Deliverables:**
+- `internal/monitor/monitor.go` — Monitor struct, Start/Stop methods.
+  Runs a goroutine with a ticker at the configured interval. Each tick:
+  queries runtime, reconciles registry, records events, evaluates alerts.
+- `internal/monitor/alerting.go` — Alert evaluation: drift detection
+  (desired != observed for managed components), flap detection (event
+  count in window > threshold), cooldown tracking per component, alert
+  command execution via `exec` (argv array, MCP_* env vars).
+- Event pruning (delete events older than retention period).
+
+**Test criteria:**
+- Monitor detects state transitions and records events
+- Drift alert fires on desired/observed mismatch
+- Drift alert respects cooldown (doesn't fire again within window)
+- Flap alert fires when transitions exceed threshold in window
+- Alert command is exec'd with correct env vars
+- Event pruning removes old events, retains recent ones
+- Monitor can be stopped cleanly (goroutine exits)
+
+### P2.9: Snapshot command
+
+**Scope:** Implement `mcp-agent snapshot` for database backup.
+
+**Depends:** P2.1, P1.1
+
+**parallel:** P2.2-P2.8
+
+**Deliverables:**
+- `cmd/mcp-agent/snapshot.go` — cobra subcommand.
Runs `VACUUM INTO`
+  to create a consistent backup in `/srv/mcp/backups/`.
+
+**Test criteria:**
+- Creates a backup file with timestamp in name
+- Backup is a valid SQLite database
+- Original database is unchanged
+
+---
+
+## Phase 3: CLI
+
+All CLI commands are thin gRPC clients. **Most can be built in parallel**
+once the proto (P0.2) and servicedef/config packages (P1.3, P1.4) are
+ready. CLI commands can be tested against a running agent (integration)
+or with a mock gRPC server (unit).
+
+### P3.1: CLI skeleton
+
+**Scope:** Wire up the CLI binary: config loading, gRPC connection setup,
+cobra command tree.
+
+**Depends:** P0.2, P1.3, P1.4
+
+**Deliverables:**
+- `cmd/mcp/main.go` — cobra root command with `--config` flag.
+  Subcommand stubs for all commands.
+- gRPC dial helper: reads node address from config, establishes TLS
+  connection with CA verification, attaches bearer token to metadata.
+
+**Test criteria:** CLI starts, loads config, `--help` shows all
+subcommands.
+
+### P3.2: Login command
+
+**Scope:** Implement `mcp login`.
+
+**Depends:** P3.1, P1.5
+
+**Deliverables:**
+- `cmd/mcp/login.go` — prompts for username/password (or reads from
+  config for unattended), calls MCIAS, saves token to configured path
+  with 0600 permissions.
+
+**Test criteria:** Token is saved to the correct path with correct
+permissions.
+
+### P3.3: Deploy command
+
+**Scope:** Implement `mcp deploy`.
+
+**Depends:** P3.1, P1.3
+
+**parallel:** P3.4, P3.5, P3.6, P3.7, P3.8, P3.9, P3.10
+
+**Deliverables:**
+- `cmd/mcp/deploy.go`
+- Resolves service spec: file (from `-f` or default path) > agent registry.
+- Parses `/` syntax for single-component deploy.
+- Pushes spec to agent via Deploy RPC.
+- Prints per-component results.
+
+**Test criteria:**
+- Reads service definition from file
+- Falls back to agent registry when no file exists
+- Fails with clear error when neither exists
+- Single-component syntax works
+- Prints results
+
+### P3.4: Lifecycle commands (stop, start, restart)
+
+**Scope:** Implement `mcp stop`, `mcp start`, `mcp restart`.
+
+**Depends:** P3.1, P1.3
+
+**parallel:** P3.3, P3.5, P3.6, P3.7, P3.8, P3.9, P3.10
+
+**Deliverables:**
+- `cmd/mcp/lifecycle.go`
+- Stop: sets `active = false` in service definition file, calls
+  StopService RPC.
+- Start: sets `active = true` in service definition file, calls
+  StartService RPC.
+- Restart: calls RestartService RPC (does not change active flag).
+
+**Test criteria:**
+- Stop updates the service definition file
+- Start updates the service definition file
+- Both call the correct RPC
+- Restart does not modify the file
+
+### P3.5: Status commands (list, ps, status)
+
+**Scope:** Implement `mcp list`, `mcp ps`, `mcp status`.
+
+**Depends:** P3.1
+
+**parallel:** P3.3, P3.4, P3.6, P3.7, P3.8, P3.9, P3.10
+
+**Deliverables:**
+- `cmd/mcp/status.go`
+- List: calls ListServices on all nodes, formats table output.
+- Ps: calls LiveCheck on all nodes, formats with uptime and version.
+- Status: calls GetServiceStatus, shows drift and recent events.
+
+**Test criteria:**
+- Queries all registered nodes
+- Formats output as readable tables
+- Status highlights drift clearly
+
+### P3.6: Sync command
+
+**Scope:** Implement `mcp sync`.
+
+**Depends:** P3.1, P1.3
+
+**parallel:** P3.3, P3.4, P3.5, P3.7, P3.8, P3.9, P3.10
+
+**Deliverables:**
+- `cmd/mcp/sync.go`
+- Loads all service definitions from the services directory.
+- Groups by node.
+- Calls SyncDesiredState on each agent with that node's services.
+- Prints summary of changes.
+
+**Test criteria:**
+- Loads all service definitions
+- Filters by node correctly
+- Pushes to correct agents
+- Prints change summary
+
+### P3.7: Adopt command
+
+**Scope:** Implement `mcp adopt`.
+
+**Depends:** P3.1
+
+**parallel:** P3.3, P3.4, P3.5, P3.6, P3.8, P3.9, P3.10
+
+**Deliverables:**
+- `cmd/mcp/adopt.go`
+- Calls AdoptContainer RPC on the agent.
+- Prints adopted containers and their derived component names.
+
+**Test criteria:**
+- Calls RPC with service name
+- Prints results
+
+### P3.8: Service commands (show, edit, export)
+
+**Scope:** Implement `mcp service show`, `mcp service edit`,
+`mcp service export`.
+
+**Depends:** P3.1, P1.3
+
+**parallel:** P3.3, P3.4, P3.5, P3.6, P3.7, P3.9, P3.10
+
+**Deliverables:**
+- `cmd/mcp/service.go`
+- Show: calls ListServices, filters to named service, prints spec.
+- Edit: if file exists, open in $EDITOR. If not, export from agent
+  first, then open. Save to standard path.
+- Export: calls ListServices, converts to TOML, writes to file (default
+  path or `-f`).
+
+**Test criteria:**
+- Show prints the correct spec
+- Export writes a valid TOML file that can be loaded back
+- Edit opens the correct file (or creates from agent spec)
+
+### P3.9: Transfer commands (push, pull)
+
+**Scope:** Implement `mcp push` and `mcp pull`.
+
+**Depends:** P3.1
+
+**parallel:** P3.3, P3.4, P3.5, P3.6, P3.7, P3.8, P3.10
+
+**Deliverables:**
+- `cmd/mcp/transfer.go`
+- Push: reads local file, determines service and path, calls PushFile RPC.
+  Default relative path = basename of local file.
+- Pull: calls PullFile RPC, writes content to local file.
+
+**Test criteria:**
+- Push reads file and sends correct content
+- Push derives path from basename when omitted
+- Pull writes file locally with correct content
+
+### P3.10: Node commands
+
+**Scope:** Implement `mcp node list`, `mcp node add`, `mcp node remove`.
+
+**Depends:** P3.1, P1.4
+
+**parallel:** P3.3, P3.4, P3.5, P3.6, P3.7, P3.8, P3.9
+
+**Deliverables:**
+- `cmd/mcp/node.go`
+- List: reads nodes from config, prints table.
+- Add: appends a `[[nodes]]` entry to the config file.
+- Remove: removes the named `[[nodes]]` entry from the config file.
+
+**Test criteria:**
+- List shows all configured nodes
+- Add creates a new entry
+- Remove deletes the named entry
+- Config file remains valid TOML after add/remove
+
+---
+
+## Phase 4: Deployment Artifacts
+
+Can be worked on in parallel with Phase 2 and 3.
+
+### P4.1: Systemd units
+
+**Scope:** Write systemd service and timer files.
+
+**Depends:** None (these are static files)
+
+**parallel:** All of Phase 2 and 3
+
+**Deliverables:**
+- `deploy/systemd/mcp-agent.service` — from ARCHITECTURE.md
+- `deploy/systemd/mcp-agent-backup.service` — snapshot oneshot
+- `deploy/systemd/mcp-agent-backup.timer` — daily 02:00 UTC, 5min jitter
+
+**Test criteria:** Files match platform conventions (security hardening,
+correct paths, correct user).
+
+### P4.2: Example configs
+
+**Scope:** Write example configuration files.
+
+**Depends:** None
+
+**parallel:** All of Phase 2 and 3
+
+**Deliverables:**
+- `deploy/examples/mcp.toml` — CLI config with all fields documented
+- `deploy/examples/mcp-agent.toml` — agent config with all fields documented
+
+**Test criteria:** Examples are valid TOML, loadable by the config package.
+
+### P4.3: Install script
+
+**Scope:** Write the agent install script.
+
+**Depends:** None
+
+**parallel:** All of Phase 2 and 3
+
+**Deliverables:**
+- `deploy/scripts/install-agent.sh` — idempotent: create user/group,
+  install binary, create /srv/mcp/, install example config, install
+  systemd units, reload daemon.
+
+**Test criteria:** Script is idempotent (running twice produces the same
+result).
+
+---
+
+## Phase 5: Integration Testing and Polish
+
+Serial. Requires all previous phases to be complete.
+
+### P5.1: Integration test suite
+
+**Scope:** End-to-end tests: CLI → agent → podman → container lifecycle.
+
+**Depends:** All of Phase 2 and 3
+
+**Deliverables:**
+- Test harness that starts an agent with a test config and temp database.
+- Tests cover: deploy, stop, start, restart, sync, adopt, push/pull,
+  list/ps/status.
+- Tests verify registry state, runtime state, and CLI output.
+
+**Test criteria:** All integration tests pass. Coverage of every CLI
+command and agent RPC.
+
+### P5.2: Bootstrap procedure test
+
+**Scope:** Test the full MCP bootstrap on a clean node with existing
+containers.
+
+**Depends:** P5.1
+
+**Deliverables:**
+- Documented test procedure: start agent, sync (discover containers),
+  adopt, export, verify service definitions match running state.
+- Verify the container rename flow (bare names → -).
+
+### P5.3: Documentation
+
+**Scope:** Final docs pass.
+
+**Depends:** P5.1
+
+**Deliverables:**
+- `CLAUDE.md` updated with final project structure and commands
+- `README.md` with quick-start
+- `RUNBOOK.md` with operational procedures
+- Verify ARCHITECTURE.md matches implementation
+
+---
+
+## Parallelism Summary
+
+```
+Phase 0 (serial):     P0.1 → P0.2
+                           │
+                           ▼
+Phase 1 (parallel):   ┌─── P1.1 (registry)
+                      ├─── P1.2 (runtime)
+                      ├─── P1.3 (servicedef)
+                      ├─── P1.4 (config)
+                      └─── P1.5 (auth)
+                           │
+                     ┌─────┴──────┐
+                     ▼            ▼
+Phase 2 (agent):   P2.1 ──┐   Phase 3 (CLI):   P3.1 ──┐
+                     │    │                      │    │
+                     ▼    │                      ▼    │
+                   P2.2  P2.3   Phase 4:       P3.2  P3.3
+                   P2.4  P2.5   P4.1-P4.3      P3.4  P3.5
+                   P2.6  P2.7   (parallel      P3.6  P3.7
+                   P2.8  P2.9    with 2&3)     P3.8  P3.9
+                     │                           │   P3.10
+                     └──────────┬────────────────┘
+                                ▼
+Phase 5 (serial):     P5.1 → P5.2 → P5.3
+```
+
+Maximum parallelism: 5 engineers/agents during Phase 1, up to 8+
+during Phase 2+3+4 combined.
+
+Minimum serial path: P0.1 → P0.2 → P1.1 → P2.1 → P2.2 → P5.1 → P5.3