From 1afbf5e1f613664e0de161d4e04af9fba41916ad Mon Sep 17 00:00:00 2001
From: Kyle Isom
Date: Thu, 26 Mar 2026 22:22:27 -0700
Subject: [PATCH] Add purge design to architecture doc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Purge removes stale registry entries: components that are no longer in
service definitions and have no running container. Designed as an
explicit, safe operation separate from sync: sync is additive (push
desired state), purge is subtractive (remove forgotten entries).

Includes safety rules (refuses to purge running containers), dry-run
mode, agent RPC definition, and rationale for why sync should not be
made destructive.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 ARCHITECTURE.md | 142 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)

diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index 6be3a5e..9529c2d 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -207,6 +207,7 @@
 mcp sync                         Push service definitions to agent (update state without deploying)
 mcp adopt                        Adopt all -* containers into a service
+mcp purge [service[/component]]  Remove stale registry entries (--dry-run to preview)
 mcp service show                 Print current spec from agent registry
 mcp service edit                 Open service definition in $EDITOR
@@ -1195,6 +1196,147 @@ mcp/
 
 ---
 
+## Registry Cleanup: Purge
+
+### Problem
+
+The agent's registry accumulates stale entries over time. A component
+that was replaced (e.g., `mcns/coredns` → `mcns/mcns`) or a service
+that was decommissioned remains in the registry indefinitely with
+`observed=removed` or `observed=unknown`. There is no mechanism to tell
+the agent "this component no longer exists and should not be tracked."
+
+This causes:
+- Perpetual drift alerts for components that will never return.
+- Noise in `mcp status` and `mcp list` output.
+- Confusion about what the agent is actually responsible for.
+
+The existing `mcp sync` compares local service definitions against the
+agent's registry and updates desired state for components that are
+defined. But it does not remove components or services that are *absent*
+from the local definitions: sync is additive, not declarative.
+
+### Design: `mcp purge`
+
+Purge removes registry entries that are both **unwanted** (not in any
+current service definition) and **gone** (no corresponding container in
+the runtime). It is the garbage collector for the registry.
+
+```
+mcp purge [--dry-run]                        Purge all stale entries
+mcp purge <service> [--dry-run]              Purge stale entries for one service
+mcp purge <service>/<component> [--dry-run]  Purge a specific component
+```
+
+#### Semantics
+
+Purge operates on the agent's registry, not on containers. It never
+stops or removes running containers. The rules:
+
+1. **Component purge**: a component is eligible for purge when:
+   - Its observed state is `removed`, `unknown`, or `exited`, AND
+   - It is not present in any current service definition file
+     (i.e., `mcp sync` would not recreate it).
+
+   Purging a component deletes its registry entry (from `components`,
+   `component_ports`, `component_volumes`, `component_cmd`) and its
+   event history.
+
+2. **Service purge**: a service is eligible for purge when all of its
+   components have been purged (or it has no components). Purging a
+   service deletes its `services` row.
+
+3. **Safety**: purge refuses to remove a component whose observed state
+   is `running` or `stopped` (i.e., a container still exists in the
+   runtime). This prevents accidentally losing track of live containers.
+   The operator must `mcp stop` and wait for the container to be removed
+   before purging, or manually remove it via podman.
+
+4. **Dry run**: `--dry-run` lists what would be purged without modifying
+   the registry. This is the safe way to preview the operation.
+
+#### Interaction with Sync
+
+`mcp sync` pushes desired state from service definitions.
+`mcp purge` removes entries that sync would never touch. They are
+complementary:
+
+- `sync` answers: "what should exist?" (additive)
+- `purge` answers: "what should be forgotten?" (subtractive)
+
+A full cleanup is: `mcp sync && mcp purge`.
+
+An alternative design would make `mcp sync` itself remove entries not
+present in service definitions (fully declarative sync). This was
+rejected because:
+
+- Sync currently only operates on services that have local definition
+  files. A service without a local file is left untouched, which is
+  desirable when multiple operators or workstations manage different
+  services.
+- Making sync destructive increases the blast radius of a missing file
+  (accidentally deleting the local `mcr.toml` would cause sync to
+  purge MCR from the registry).
+- Purge as a separate, explicit command with `--dry-run` gives the
+  operator clear control over what gets cleaned up.
+
+#### Agent RPC
+
+```protobuf
+rpc PurgeComponent(PurgeRequest) returns (PurgeResponse);
+
+message PurgeRequest {
+  string service = 1;    // service name (empty = all services)
+  string component = 2;  // component name (empty = all eligible in service)
+  bool dry_run = 3;      // preview only, do not modify registry
+}
+
+message PurgeResponse {
+  repeated PurgeResult results = 1;
+}
+
+message PurgeResult {
+  string service = 1;
+  string component = 2;
+  bool purged = 3;    // true if removed (or would be, in dry-run)
+  string reason = 4;  // why eligible, or why refused
+}
+```
+
+The CLI sends the set of currently defined service/component names
+alongside the purge request so the agent can determine what is "not in
+any current service definition" without needing access to the CLI's
+filesystem.
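
The eligibility rules and the RPC shape above can be sketched as a small planning function. This is an illustrative sketch only, not the agent's implementation: Python is used for brevity, and `plan_purge`, the dictionary-based registry, and the `defined` set are hypothetical names and shapes. Since the `PurgeRequest` message as written has no field for the defined names, the sketch assumes they arrive as a separate argument. The sketch only plans (which matches `dry_run`); deleting rows is left out.

```python
from dataclasses import dataclass

# Observed states for which the design says a container still exists.
LIVE_STATES = {"running", "stopped"}
# Observed states the design lists as eligible for purge.
GONE_STATES = {"removed", "unknown", "exited"}


@dataclass(frozen=True)
class PurgeResult:
    service: str
    component: str
    purged: bool   # True if the entry would be removed
    reason: str    # why eligible, or why refused


def plan_purge(registry, defined, service="", component=""):
    """Decide which registry entries are eligible for purge.

    registry: dict mapping (service, component) -> observed state (assumed shape).
    defined:  set of (service, component) pairs from current service
              definitions, as sent by the CLI (assumed shape).
    Empty service/component act as wildcards, as in PurgeRequest.
    """
    results = []
    for (svc, comp), observed in sorted(registry.items()):
        if service and svc != service:
            continue
        if component and comp != component:
            continue
        if (svc, comp) in defined:
            reason = "still in service definitions"
            results.append(PurgeResult(svc, comp, False, reason))
        elif observed in LIVE_STATES:
            reason = f"observed={observed}, container still exists"
            results.append(PurgeResult(svc, comp, False, reason))
        elif observed in GONE_STATES:
            reason = f"observed={observed}, not in service definitions"
            results.append(PurgeResult(svc, comp, True, reason))
        else:
            reason = f"observed={observed}, unrecognized state"
            results.append(PurgeResult(svc, comp, False, reason))
    return results


if __name__ == "__main__":
    registry = {("mcns", "coredns"): "removed", ("mcns", "mcns"): "running"}
    defined = {("mcns", "mcns")}
    for result in plan_purge(registry, defined):
        print(result)
```

One design choice in this sketch is that every matched entry gets a result with a reason, including refusals; a real CLI might print only the entries that would be purged.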
+
+#### Example
+
+After replacing `mcns/coredns` with `mcns/mcns`:
+
+```
+$ mcp purge --dry-run
+would purge mcns/coredns (observed=removed, not in service definitions)
+
+$ mcp purge
+purged mcns/coredns
+
+$ mcp status
+SERVICE    COMPONENT  DESIRED  OBSERVED  VERSION
+mc-proxy   mc-proxy   running  running   latest
+mcns       mcns       running  running   v1.0.0
+mcr        api        running  running   latest
+mcr        web        running  running   latest
+metacrypt  api        running  running   latest
+metacrypt  web        running  running   latest
+```
+
+#### Interaction with Adopt
+
+Purge also cleans up after the `mcp adopt` workflow. When containers are
+adopted and later removed (replaced by a proper deploy), the adopted
+entries linger. Purge removes them once the containers are gone and the
+service definition no longer references them.
+
+---
+
 ## Future Work (v2+)
 
 These are explicitly out of scope for v1 but inform the design: