Files
metacircular/docs/architecture-v2.md
Kyle Isom 0b11899a9b Distinguish node infrastructure from deployed services
mcp-agent and mc-proxy run on every node as systemd services — they
are not placed by the master and don't belong in the placements table.
Snapshot paths are node-keyed for infrastructure (<service>/<node>/)
to avoid collisions between instances on different nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 13:07:59 -07:00

58 KiB

MCP v2 -- Multi-Node Control Plane

Overview

MCP v2 introduces multi-node orchestration with a master/agent topology. The CLI no longer dials agents directly. A dedicated mcp-master daemon coordinates deployments across nodes, handles cross-node concerns (edge routing, certificate provisioning, DNS), and serves as the single control point for the platform.

Motivation

v1 deployed successfully on a single node (rift) but exposed operational pain points as services needed public-facing routes through svc:

  • Manual edge routing: Exposing mcq.metacircular.net required hand-editing mc-proxy's TOML config on svc, provisioning a TLS cert manually, updating the SQLite database when the config and database diverged, and debugging silent failures. Every redeployment risked breaking the public route.

  • Dynamic port instability: The route system assigns ephemeral host ports that change on every deploy. svc's mc-proxy pointed at a specific port (e.g., 100.95.252.120:48080), which went stale after redeployment. Container ports are also localhost-only under rootless podman, requiring explicit Tailscale IP bindings for external access.

  • $PORT env override conflict: The mcdsl config loader overrides listen_addr from $PORT when routes are present. This meant containers ignored their configured port and listened on the route-allocated one instead, breaking explicit port mappings that expected the config port.

  • Cert chain issues: mc-proxy requires full certificate chains (leaf + intermediates). Certs provisioned outside the standard metacrypt flow were leaf-only and caused silent TLS handshake failures (client_bytes=7 backend_bytes=0 with no error logged).

  • mc-proxy database divergence: mc-proxy persists routes in SQLite. Routes added via the admin API override the TOML config. Editing the TOML alone had no effect until the database was manually updated -- a failure mode that took hours to diagnose.

  • No cross-node coordination: The v1 CLI talks directly to individual agents. There is no mechanism for one agent to tell another "set up a route for this service." Every cross-node operation was manual.

v2 addresses all of these by making the master the single coordination point for deployments, with agents handling local concerns (containers, mc-proxy routes, cert provisioning) on instruction from the master.

What Changes from v1

Concern v1 v2
CLI target CLI dials agents directly CLI dials the master
Node awareness CLI routes by node field in service defs Master owns the node registry
Service placement Explicit node required tier field; master auto-places workers
Edge routing Manual mc-proxy config on svc Master coordinates edge setup
Cert provisioning Agent provisions for local mc-proxy only Edge agent provisions its own public certs
DNS registration Agent registers records on deploy Master coordinates DNS across zones
Auth model Token validation only Per-RPC role-based authorization

What Stays the Same

The agent's core responsibilities are unchanged: it manages containers via podman, stores its local registry in SQLite, monitors for drift, and alerts the operator. The agent gains new RPCs for edge routing and health reporting but does not become aware of other nodes -- the master handles all cross-node coordination. Agents never communicate with each other.


Topology

Operator workstation (vade)
  ┌──────────────────────────┐
  │  mcp (CLI)               │
  │                          │
  │  gRPC ───────────────────┼─── Tailnet ──┐
  └──────────────────────────┘              │
                                            ▼
Master + worker node (rift)
  ┌──────────────────────────────────────────────────────┐
  │  mcp-master                                          │
  │    ├── node registry (agents self-register)          │
  │    ├── service placement (tier-aware)                │
  │    ├── edge routing coordinator                      │
  │    └── SQLite state (edge routes, placements)        │
  │                                                      │
  │  mcp-agent                                           │
  │    ├── mcias container                               │
  │    ├── mcns container                                │
  │    ├── metacrypt container                           │
  │    ├── mcr container                                 │
  │    ├── mcq, mcdoc, exo, sgard, kls ...              │
  │    └── mc-proxy (rift)                               │
  └──────────┬──────────────────┬───────────┬────────────┘
             │                  │           │
          Tailnet           Tailnet      Tailnet
             │                  │           │
             ▼                  ▼           ▼
Worker (orion)                  Edge (svc)
  ┌──────────────────┐           ┌─────────────────────┐
  │  mcp-agent       │           │  mcp-agent          │
  │    ├── services  │           │    ├── mc-proxy     │
  │    └── mc-proxy  │           │    └── (routes only)│
  └──────────────────┘           └─────────────────────┘
  NixOS / amd64                  Debian / amd64

Node Roles

Role Purpose Nodes
master Runs mcp-master + mcp-agent. Hosts core infrastructure. Single coordination point. rift
worker Runs mcp-agent. Hosts application services. orion
edge Runs mcp-agent. Terminates public TLS, forwards to internal services. No application containers. svc

Every node runs an mcp-agent. Rift also runs mcp-master. The master's local agent manages the infrastructure services (MCIAS, mcns, metacrypt, mcr) the same way other agents manage application services.

mc-proxy Mesh

Each node runs its own mc-proxy instance. They form a routing mesh:

mc-proxy (rift)
  ├── :443  L7 routes for internal .svc.mcp hostnames
  ├── :8443 L4 passthrough for API servers (MCIAS, metacrypt, mcr)
  └── :9443 L4 passthrough for gRPC services

mc-proxy (orion)
  ├── :443  L7 routes for services hosted on this node
  └── :8443 L4/L7 routes for internal APIs

mc-proxy (svc)
  └── :443  L7 termination for public hostnames
            → forwards to internal .svc.mcp endpoints over Tailnet

Security Model

Authentication and Authorization

All gRPC channels (CLI↔master, master↔agent, agent→master) use TLS 1.3 with MCIAS bearer tokens. Every entity has a distinct MCIAS identity:

Entity MCIAS Identity Account Type
Operator CLI kyle (or personal account) human
mcp-master mcp-master service
Agent on rift agent-rift service
Agent on orion agent-orion service
Agent on svc agent-svc service

RPCs are authorized by caller role, not just authentication:

RPC Category Allowed Callers Rejected
CLI→master (Deploy, Undeploy, Status, Sync) human accounts, mcp-master (for self-management) agent service accounts
Agent→master (Register, Heartbeat) agent-* service accounts human accounts, mcp-master
Master→agent (Deploy, SetupEdgeRoute, HealthCheck) mcp-master only all others

The auth interceptor on both master and agent validates the bearer token via MCIAS, then checks the caller's account type and service name against the RPC's allowed-caller list. Unauthorized calls return PermissionDenied.

Trust Assumptions

The master is a fully trusted component. A compromised master can control the entire fleet: deploy arbitrary containers, exfiltrate data via snapshots, redirect traffic via edge routes. This is inherent to the master/agent topology and acceptable for a single-operator personal platform. Mitigations: the master runs on the operator's always-on machine (rift) behind Tailscale, authenticates to MCIAS with its own service identity, and all communication is TLS 1.3.

TLS Verification

All gRPC connections verify the peer's TLS certificate against the Metacrypt CA cert. Agents configure the CA cert path in their config:

[tls]
ca_cert = "/srv/mcp/certs/metacircular-ca.pem"

When an agent starts before the master is available (e.g., svc's agent starts before rift's boot sequence completes), the TLS connection fails and the agent retries with exponential backoff. The CA cert itself is pre-provisioned on all nodes — it does not depend on Metacrypt being running.

Registration Security

Agents self-register with the master, but registration is identity-bound:

  1. The master extracts the caller's MCIAS service name from the validated token (e.g., agent-rift).
  2. The expected node name is derived by stripping the agent- prefix.
  3. The RegisterRequest.name must match. agent-rift can only register name = "rift". A rogue agent cannot impersonate another node.
  4. The master maintains an allowlist of permitted agent identities:
[registration]
allowed_agents = ["agent-rift", "agent-svc", "agent-orion"]

Registration from unknown identities is rejected. Re-registration from the same identity updates the entry (handles restarts) and logs a warning with the previous address for audit.

Edge Route Validation

When the master sets up an edge route, it validates both ends:

  • Public hostname: must fall under an allowed domain (metacircular.net, wntrmute.net). Validation uses proper domain label matching — evilmetacircular.net is rejected. Implementation: the hostname must equal the allowed domain or be preceded by a . (e.g., mcq.metacircular.net matches, metacircular.net matches, xmetacircular.net does not).

  • Backend hostname: must end with .svc.mcp.metacircular.net (the internal DNS zone). The edge agent resolves it and verifies the result is a Tailnet IP (100.64.0.0/10). Non-Tailnet backends are rejected.

Certificate Issuance Policies

Per-identity restrictions in Metacrypt limit what each agent can issue:

Agent Allowed SANs Denied SANs
agent-rift, agent-orion *.svc.mcp.metacircular.net public domains
agent-svc *.metacircular.net, *.wntrmute.net .svc.mcp. names

This ensures a compromised edge agent cannot issue certs for internal names, and a compromised worker agent cannot issue certs for public names. The Metacrypt CA is not publicly trusted, which limits blast radius further.

Rate Limiting

The master rate-limits agent RPCs:

  • Register: 1 per minute per identity.
  • Heartbeat: 1 per 10 seconds per identity.
  • Maximum registered nodes: 16 (configurable).

Excess calls return ResourceExhausted.

Tailscale ACLs

Network-level restriction (configured in Tailscale admin, not MCP):

  • rift (master): can reach all agent gRPC ports (9444) on all nodes. The master process needs this to forward deploys and set up edge routes.
  • svc: can reach master gRPC (9555), backend service ports (443, 8443, 9443), and Metacrypt (8443). Blocked from MCIAS management, MCR push, and agent gRPC on other nodes.
  • Workers: can reach master gRPC, MCR (pull), Metacrypt, MCIAS. Blocked from other workers' agent ports and svc's agent port.

Node Infrastructure vs Deployed Services

Two categories of software run on the platform:

Node infrastructure runs on every node by definition, not because the master placed it. Managed by systemd, outside the master's deploy/undeploy/placement model:

Component Present on Managed by
mcp-agent all nodes systemd; upgraded via mcp agent upgrade
mc-proxy all nodes systemd; upgraded via binary replacement

The master interacts with node infrastructure (calls agent RPCs, manipulates mc-proxy routes) but does not deploy or place it. Node infrastructure is not tracked in the placements table.

Deployed services are placed by the master on specific nodes. Tracked in the placements table, managed through mcp deploy/undeploy/ migrate.

Snapshot Paths

Node infrastructure is snapshotted per-node (the same component exists on multiple nodes with different data):

/srv/mcp-master/snapshots/
  mc-proxy/rift/2026-04-02T00:00:00Z.tar.zst
  mc-proxy/svc/2026-04-02T00:00:00Z.tar.zst
  mcp-agent/rift/...
  mcp-agent/svc/...
  mcq/2026-04-02T00:00:00Z.tar.zst       # deployed service — single node
  mcias/2026-04-02T00:00:00Z.tar.zst

Service Placement

Deployed services declare a tier that determines where they run:

  • tier = "core" — scheduled on the master node. Used for platform infrastructure: MCIAS, metacrypt, mcr, mcns.
  • tier = "worker" (default) — auto-placed on a worker node. The master selects the node based on container count and health.

Explicit node pinning is still supported via node = "orion" for cases where a service must run on a specific machine. When node is set, it overrides tier.

Placement Algorithm

Worker placement is deliberately simple:

  1. Filter eligible nodes: healthy workers.
  2. Select the node with the fewest running containers.
  3. Break ties alphabetically by node name (deterministic).

All v2 nodes are amd64, so architecture filtering is not needed. Services do not declare resource requirements for v2. The heartbeat reports available resources (CPU, memory, disk) which the master uses for health assessment, but placement is container-count based. Resource- aware bin-packing is future work.

Service Definition

name   = "mcq"
tier   = "worker"                    # default; placed by master
active = true

[[components]]
name    = "mcq"
image   = "mcr.svc.mcp.metacircular.net:8443/mcq:v0.4.0"
volumes = ["/srv/mcq:/srv/mcq"]
cmd     = ["server", "--config", "/srv/mcq/mcq.toml"]

# Internal route: handled by the local node's mc-proxy.
[[components.routes]]
name     = "internal"
port     = 8443
mode     = "l7"

# Public route: master sets up edge routing on svc.
[[components.routes]]
name     = "public"
port     = 8443
mode     = "l7"
hostname = "mcq.metacircular.net"
public   = true

Core service example:

name   = "mcias"
tier   = "core"                      # always on master node
active = true

[[components]]
name    = "mcias"
image   = "mcr.svc.mcp.metacircular.net:8443/mcias:v1.10.5"
volumes = ["/srv/mcias:/srv/mcias"]
cmd     = ["mciassrv", "-config", "/srv/mcias/mcias.toml"]

v1 Compatibility

Existing v1 service definitions with node = "rift" continue to work (explicit pinning). New v2 fields (tier, public) default to their zero values ("worker", false) when absent. The validation rule changes from "node required" to "either node or tier must be set; tier defaults to worker if both are empty."


Proto Definitions

ServiceSpec and RouteSpec Updates

message ServiceSpec {
  string name    = 1;
  bool   active  = 2;
  repeated ComponentSpec components = 3;  // unchanged from v1
  string tier    = 4;  // "core" or "worker" (default: "worker")
  string node    = 5;  // explicit node pin (overrides tier)
  SnapshotConfig snapshot = 6;  // snapshot method and excludes
}

message SnapshotConfig {
  string method            = 1;  // "grpc", "cli", "exec: <cmd>", "full", or "" (default)
  repeated string excludes = 2;  // paths relative to /srv/<service>/ to skip
}

message RouteSpec {
  string name     = 1;
  int32  port     = 2;
  string mode     = 3;  // "l4" or "l7"
  string hostname = 4;
  bool   public   = 5;  // triggers edge routing
}

McpMasterService

service McpMasterService {
  // CLI operations.
  rpc Deploy(MasterDeployRequest) returns (MasterDeployResponse);
  rpc Undeploy(MasterUndeployRequest) returns (MasterUndeployResponse);
  rpc Status(MasterStatusRequest) returns (MasterStatusResponse);
  rpc Sync(MasterSyncRequest) returns (MasterSyncResponse);
  rpc Migrate(MigrateRequest) returns (MigrateResponse);

  // Fleet management.
  rpc ListNodes(ListNodesRequest) returns (ListNodesResponse);

  // Snapshots (CLI-triggered).
  rpc CreateSnapshot(CreateSnapshotRequest) returns (CreateSnapshotResponse);
  rpc ListSnapshots(ListSnapshotsRequest) returns (ListSnapshotsResponse);

  // Agent registration and health (called by agents).
  rpc Register(RegisterRequest) returns (RegisterResponse);
  rpc Heartbeat(HeartbeatRequest) returns (HeartbeatResponse);
}

message MasterDeployRequest {
  ServiceSpec service = 1;
}

message MasterDeployResponse {
  string node    = 1;  // node the service was placed on
  bool   success = 2;  // true only if ALL steps succeeded
  string error   = 3;
  // Per-step results for operator visibility. Partial failure is
  // possible: deploy succeeds but edge routing fails. The CLI shows
  // exactly what worked and what didn't.
  StepResult deploy_result     = 4;
  StepResult edge_route_result = 5;
  StepResult dns_result        = 6;
}

message StepResult {
  string step    = 1;
  bool   success = 2;
  string error   = 3;
}

message MasterUndeployRequest {
  string service_name = 1;
}

message MasterUndeployResponse {
  bool   success = 1;
  string error   = 2;
}

message MasterStatusRequest {
  string service_name = 1;  // empty = all services
}

message MasterStatusResponse {
  repeated ServiceStatus services = 1;
}

message ServiceStatus {
  string name   = 1;
  string node   = 2;
  string tier   = 3;
  string status = 4;  // "running", "stopped", "unhealthy", "unknown"
  repeated EdgeRouteStatus edge_routes = 5;
}

message EdgeRouteStatus {
  string hostname   = 1;
  string edge_node  = 2;
  string cert_expires = 3;
}

message MasterSyncRequest {
  repeated ServiceSpec services = 1;
}

message MasterSyncResponse {
  repeated StepResult results = 1;
}

message ListNodesRequest {}

message ListNodesResponse {
  repeated NodeInfo nodes = 1;
}

message NodeInfo {
  string name     = 1;
  string role     = 2;
  string address  = 3;
  string arch     = 4;
  string status   = 5;  // "healthy", "unhealthy", "unknown"
  int32  containers = 6;
  string last_heartbeat = 7;  // RFC3339
}

message RegisterRequest {
  string name    = 1;
  string role    = 2;
  string address = 3;
  string arch    = 4;
}

message RegisterResponse {
  bool accepted = 1;
}

message HeartbeatRequest {
  string name           = 1;
  int64  cpu_millicores = 2;
  int64  memory_bytes   = 3;
  int64  disk_bytes     = 4;
  int32  containers     = 5;
}

message HeartbeatResponse {
  bool acknowledged = 1;
}

Agent RPC Additions

// Health probe -- called by master on missed heartbeats.
rpc HealthCheck(HealthCheckRequest) returns (HealthCheckResponse);

// Edge routing -- called by master on edge nodes.
rpc SetupEdgeRoute(SetupEdgeRouteRequest) returns (SetupEdgeRouteResponse);
rpc RemoveEdgeRoute(RemoveEdgeRouteRequest) returns (RemoveEdgeRouteResponse);
rpc ListEdgeRoutes(ListEdgeRoutesRequest) returns (ListEdgeRoutesResponse);

message HealthCheckRequest {}

message HealthCheckResponse {
  string status     = 1;  // "healthy" or "degraded"
  int32  containers = 2;
}

message SetupEdgeRouteRequest {
  string hostname         = 1;  // public hostname
  string backend_hostname = 2;  // internal .svc.mcp hostname
  int32  backend_port     = 3;  // port on worker's mc-proxy
  bool   backend_tls      = 4;  // MUST be true; agent rejects false
}

message SetupEdgeRouteResponse {}

message RemoveEdgeRouteRequest {
  string hostname = 1;
}

message RemoveEdgeRouteResponse {}

message ListEdgeRoutesRequest {}

message ListEdgeRoutesResponse {
  repeated EdgeRoute routes = 1;
}

message EdgeRoute {
  string hostname         = 1;
  string backend_hostname = 2;
  int32  backend_port     = 3;
  string cert_serial      = 4;
  string cert_expires     = 5;
}

Agent Registration and Health

Registration

Agents self-register with the master on startup by calling McpMasterService.Register. The master validates the caller's MCIAS identity (see Security Model) and adds the node to its registry (SQLite).

If the master is unreachable at startup, the agent retries with exponential backoff (1s, 2s, 4s, ... capped at 60s). Running containers are unaffected — registration is a management concern, not a runtime one.

Heartbeats

Agents send heartbeats every 30 seconds via McpMasterService.Heartbeat. Each heartbeat includes resource data (CPU, memory, disk, container count). The master derives the agent's node name from the authenticated MCIAS identity (same as registration) — the name field in the heartbeat is verified against the token, not trusted blindly.

If the master has not received a heartbeat from an agent in 90 seconds (3 missed intervals), it probes the agent with HealthCheck. If the probe fails (5-second timeout), the agent is marked unhealthy. Unhealthy nodes are excluded from placement but their services continue running.

When a previously unhealthy agent sends a heartbeat, the master marks it healthy again.

Node Identity

Each agent authenticates to MCIAS as a distinct service user: agent-rift, agent-svc, agent-orion. Benefits:

  • Audit: logs show which node performed an action.
  • Least privilege: edge agents don't need image pull access.
  • Revocation: a compromised node's credentials can be revoked without affecting the fleet.

mcp-master

Responsibilities

  1. Accept CLI commands via gRPC (deploy, undeploy, status, sync).
  2. Maintain node registry from agent self-registration (SQLite).
  3. Place services on nodes based on tier, explicit node, and container count.
  4. Detect public routes (public = true) and coordinate edge routing.
  5. Validate public hostnames against allowed domain list.
  6. Assign edge nodes for public routes (currently always svc).
  7. Coordinate undeploy across nodes.
  8. Aggregate status from all agents for fleet-wide views.

What the Master Does NOT Do

  • Store container state (agents own their registries).
  • Manage container lifecycle directly (agents do this).
  • Run containers (the co-located agent does).
  • Replace the agent on any node.
  • Talk to agents on behalf of other agents.

Master State (SQLite)

The master maintains a SQLite database at /srv/mcp-master/master.db with three tables:

-- Registered nodes. Populated by agent Register RPCs.
-- Rebuilt from agent re-registration on master restart.
CREATE TABLE nodes (
    name        TEXT PRIMARY KEY,
    role        TEXT NOT NULL,
    address     TEXT NOT NULL,
    arch        TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'unknown',
    containers  INTEGER NOT NULL DEFAULT 0,
    last_heartbeat TEXT
);

-- Service placements. Records which node hosts which service.
-- Populated on deploy, removed on undeploy.
CREATE TABLE placements (
    service_name TEXT PRIMARY KEY,
    node         TEXT NOT NULL REFERENCES nodes(name),
    tier         TEXT NOT NULL,
    deployed_at  TEXT NOT NULL
);

-- Edge routes. Records public routes for undeploy cleanup.
CREATE TABLE edge_routes (
    hostname         TEXT PRIMARY KEY,
    service_name     TEXT NOT NULL REFERENCES placements(service_name),
    edge_node        TEXT NOT NULL REFERENCES nodes(name),
    backend_hostname TEXT NOT NULL,
    backend_port     INTEGER NOT NULL,
    created_at       TEXT NOT NULL
);

-- Snapshot metadata. The archive files live on disk at
-- /srv/mcp-master/snapshots/<service>/<timestamp>.tar.zst.
CREATE TABLE snapshots (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    service_name TEXT    NOT NULL,
    node         TEXT    NOT NULL,
    filename     TEXT    NOT NULL,
    size_bytes   INTEGER NOT NULL,
    created_at   TEXT    NOT NULL
);
CREATE INDEX idx_snapshots_service ON snapshots(service_name, created_at DESC);

On master restart, the node registry is rebuilt as agents re-register (within 30s via heartbeat). Placements and edge routes persist across restarts. The master reconciles placements against actual agent state on startup (see Reconciliation).

Master Configuration

[server]
grpc_addr = "100.x.x.x:9555"     # master listens on Tailnet
tls_cert  = "/srv/mcp-master/certs/cert.pem"
tls_key   = "/srv/mcp-master/certs/key.pem"

[database]
path = "/srv/mcp-master/master.db"

[mcias]
server_url   = "https://mcias.metacircular.net:8443"
service_name = "mcp-master"

[edge]
allowed_domains = ["metacircular.net", "wntrmute.net"]

[registration]
allowed_agents = ["agent-rift", "agent-svc", "agent-orion"]
max_nodes      = 16

[timeouts]
deploy       = "5m"
edge_route   = "30s"
health_check = "5s"
undeploy     = "2m"
snapshot     = "10m"

# Bootstrap: master's own agent (can't self-register before master starts).
[[nodes]]
name    = "rift"
address = "100.95.252.120:9444"
role    = "master"

Boot Sequencing

The master node runs core infrastructure that other services depend on. On boot, these services must start in dependency order. Only the master needs sequencing -- worker and edge nodes start their agent and wait for registration with the master.

The master's agent config declares boot stages:

[[boot.sequence]]
name     = "foundation"
services = ["mcias", "mcns"]
timeout  = "120s"
health   = "tcp"

[[boot.sequence]]
name     = "core"
services = ["metacrypt", "mcr"]
timeout  = "60s"
health   = "tcp"

[[boot.sequence]]
name     = "management"
services = ["mcp-master"]
timeout  = "30s"
health   = "grpc"

Stage 1 -- Foundation: MCIAS and MCNS start first. Every other service needs authentication (MCIAS) and DNS resolution (MCNS).

Stage 2 -- Core: Metacrypt and MCR start once auth and DNS are available. Agents need Metacrypt for cert provisioning and MCR for image pulls.

Stage 3 -- Management: MCP-Master starts last. It requires all infrastructure services to be running before it can coordinate the fleet.

mcp-master runs as a container managed by the agent, just like any other service. This means updates are a normal mcp deploy (or image bump in the bootstrap config), and the agent handles restarts via podman's --restart unless-stopped policy.

Bootstrap (first boot): On initial cluster setup, no images exist in MCR yet. The boot sequence config references container images, but MCR doesn't start until stage 2. Resolution:

  • Stage 1 and 2 images (MCIAS, MCNS, Metacrypt, MCR) must be pre-staged into the local podman image store before first boot (podman load or podman pull from an external source).
  • Once MCR is running (stage 2), stage 3 (mcp-master) can pull its image from MCR normally.
  • Subsequent boots use cached images. Image updates go through the normal mcp deploy flow (which pulls from MCR).

The boot sequence config contains full service definitions (image, volumes, cmd, routes) — not just service names. This is the only place where service definitions live on the agent rather than being pushed from the CLI via the master.

Health check types:

  • tcp — connect to the container's mapped port. Success = connection accepted. Used for most services.
  • grpc — call the gRPC health endpoint. Used for services with gRPC.
  • http — GET a health endpoint. Future option.

Timeout behavior: Depends on the stage:

  • Foundation (MCIAS, MCNS): failure blocks boot. The agent retries indefinitely with backoff and alerts the operator. All downstream services depend on auth and DNS — proceeding is futile.
  • Core and management: failure logs an error and proceeds. The operator can fix the failed service manually. Partial boot is better than no boot for non-foundation services.

The agent treats boot sequencing as a startup concern only. Once all stages complete, normal operations proceed. If a foundation service crashes at runtime, the agent restarts it independently via the --restart unless-stopped podman policy.

Boot config drift: The boot sequence config contains pinned image versions. When the operator updates a service via mcp deploy, the boot config is NOT automatically updated. On reboot, the agent starts the old version; the master then deploys the current version. This is self-correcting for core and management services, but foundation services (MCIAS, MCNS) run before the master exists. When updating foundation service images, also update the boot sequence config.

Reconciliation

On startup, the master actively probes all nodes it knows about from its persisted nodes table — it does not wait for agents to re-register. This means the master has a fleet-wide view within seconds of starting, rather than waiting up to 30s per agent heartbeat cycle.

The initial probe cycle is a warm-up phase: the master builds its fleet view but does not emit health alerts. Once all known nodes have been probed (or the probe timeout expires), the master transitions to ready and begins normal health alerting. This avoids noisy "unhealthy" warnings for agents that simply haven't started yet.

  1. Probe known nodes: For each node in the nodes table, the master calls HealthCheck (5s timeout). Nodes that respond are marked healthy; nodes that don't respond are marked unhealthy. Agent self-registration still runs in the background and updates addresses or adds new nodes, but reconciliation does not depend on it.
  2. Check placements: For each placement in the database, query the hosting agent's Status RPC (bulk — one call per agent, not per service). If the agent reports a service is not running, mark the placement as stale (log warning, do not auto-redeploy).
  3. Detect orphans: For each service running on an agent that has no matching placement record, log it as an orphan. Orphans may result from failed deploys, manual podman run, or v1 leftovers.
  4. Check edge routes: For each edge route in the database, query the edge agent for route status.
  5. Check snapshot freshness: Flag any service whose latest snapshot is older than 2x the snapshot interval (e.g., older than 48 hours with a 24-hour cycle). Stale snapshots are a disaster recovery risk.
  6. Report: All discrepancies (stale placements, orphans, missing edge routes, unhealthy nodes, stale snapshots) are reported via mcp status and structured logs.

Reconciliation is read-only — it detects drift but does not auto-remediate. The operator reviews mcp status output and takes action. Auto-reconciliation is future work.


Edge Routing

The core v2 feature: when a service declares public = true on a route, the master automatically provisions the edge route.

Deploy Flow with Edge Routing

When the master receives Deploy(mcq):

  1. Place service: Master selects the target node based on tier/node/ container count. For mcq (tier=worker), master picks the least-loaded healthy worker.

  2. Deploy to worker: Master sends Deploy RPC to the worker's agent (timeout: 5m). The agent deploys the container, provisions a TLS cert for mcq.svc.mcp.metacircular.net from Metacrypt, and registers the internal mc-proxy route.

  3. Register DNS: Master registers an A record for the internal hostname (mcq.svc.mcp.metacircular.net) pointing to the worker's Tailnet IP via MCNS. This is the backend address that edge and internal clients resolve.

  4. Detect public routes: Master inspects the service spec for routes with public = true.

  5. Validate hostname: Master checks that mcq.metacircular.net falls under an allowed domain using proper domain label matching.

  6. Check public DNS: Master resolves mcq.metacircular.net to verify it points to the edge node's public IP. Public DNS records are pre-provisioned manually at Hurricane Electric. If the hostname does not resolve, the master warns but continues — the operator may be setting up DNS in parallel.

  7. Validate backend hostname: Master verifies the internal hostname (mcq.svc.mcp.metacircular.net) ends with .svc.mcp.metacircular.net. The internal hostname is derived from the service and component name using the convention <component>.svc.mcp.metacircular.net.

  8. Assign edge node: Master selects an edge node (currently svc).

  9. Set up edge route: Master sends SetupEdgeRoute RPC to svc's agent (timeout: 30s):

    SetupEdgeRoute(
      hostname:         "mcq.metacircular.net"
      backend_hostname: "mcq.svc.mcp.metacircular.net"
      backend_port:     8443
      backend_tls:      true
    )
    
  10. Svc agent provisions: On receiving SetupEdgeRoute, svc's agent: a. Validates that backend_hostname ends with .svc.mcp.metacircular.net. b. Resolves backend_hostname — verifies result is a Tailnet IP (100.64.0.0/10). c. Provisions a TLS certificate from Metacrypt for the public hostname mcq.metacircular.net only. Internal names never appear on edge certs. d. Registers an L7 route in its local mc-proxy: mcq.metacircular.net:443 → <worker-tailnet-ip>:8443 with backend_tls = true.

  11. Master records the edge route in its SQLite database.

  12. Master returns structured result to CLI with per-step status.

Failure handling: If any step fails, the master returns the error to the CLI with the step that failed. If the deploy succeeded but edge routing failed, the service is running internally but not publicly reachable. The operator can retry with mcp deploy (idempotent) or fix the issue and run mcp sync.

If cert provisioning fails during deploy (step 2 or 8), the deploy fails — the agent does not register an mc-proxy route pointing to a nonexistent cert. This prevents the silent TLS failure from v1.

Undeploy Flow

  1. Undeploy on worker first: Master sends Undeploy RPC to the worker agent (timeout: 2m). The agent tears down the container, routes, DNS, and certs. This stops the backend, ensuring no traffic is served during edge cleanup.
  2. Remove edge route: Master sends RemoveEdgeRoute to svc's agent. Svc removes the mc-proxy route and cleans up the cert.
  3. Master removes records from placements and edge_routes tables.

Ordering rationale: undeploy the backend first so that if edge cleanup fails, the service is already stopped and the edge route returns a 502 rather than serving stale content.

Certificate Model

Two separate certs per public service — internal names never appear on edge certs:

Cert Provisioned by SAN Used on
Internal Worker agent → Metacrypt mcq.svc.mcp.metacircular.net Worker's mc-proxy
Public Edge agent → Metacrypt mcq.metacircular.net Edge's mc-proxy

Edge cert renewal is the edge agent's responsibility. The agent runs the same renewWindow check as worker agents, renewing certs before they expire (90-day TTL, renew at 30 days remaining).


Snapshots

The master maintains periodic snapshots of every service's data. Snapshots are the foundation for both migration and disaster recovery — if a node dies, the master can restore a service to a new node from its latest snapshot without the source node being alive.

All nodes have LUKS-encrypted disks. Snapshots are stored on the master's encrypted disk, so service data is encrypted at rest at both source and destination. An existing backup service on rift replicates to external storage, covering the case where rift itself is lost.

Snapshot Mechanism

The service definition declares how the agent should trigger a consistent snapshot via the method field:

[snapshot]
method  = "grpc"                       # preferred for Metacircular services
exclude = ["layers/", "uploads/"]      # paths to skip (optional)

Methods:

Method How it works Best for
grpc Agent calls the standard Snapshot gRPC RPC on the service's gRPC port. The service vacuums databases and confirms. Agent then tars. Metacircular services with gRPC servers
cli Agent runs podman exec <container> <service> snapshot (the engineering standard's snapshot CLI command). Agent then tars. Metacircular services without gRPC
exec: <cmd> Agent runs podman exec <container> <cmd>. Agent then tars. Non-standard services with custom backup scripts
full Agent tars the entire /srv/<service>/ directory, auto-vacuuming any .db files found. Services that need everything backed up
(omitted) Agent collects only *.toml, *.db, and *.pem files from /srv/<service>/ — config, database, and certs. .db files are auto-vacuumed. Default — covers the essentials without configuration

The default (no [snapshot] section) captures the minimum needed to restore a service: config, database, and TLS certs. This keeps snapshot sizes small and predictable. Services that need more data (e.g., file uploads, state directories) opt into full or specify paths explicitly.

exclude works with any method. MCR uses exclude to skip layer blobs (which can be rebuilt from git) while still capturing its database and config.

Database consistency: For grpc and cli methods, the service owns its own vacuum logic. For full and the default, the agent detects .db files and runs VACUUM INTO to a temp copy before including them in the tar. WAL and SHM files are excluded (the vacuumed copy is self-contained).

Standard Snapshot gRPC Service (mcdsl)

The grpc snapshot method uses a standard RPC that Metacircular services implement via the mcdsl/snapshot package — same pattern as mcdsl/health:

service SnapshotService {
  rpc Snapshot(SnapshotRequest) returns (SnapshotResponse);
}

message SnapshotRequest {}

message SnapshotResponse {
  bool   success = 1;
  string error   = 2;
  string path    = 3;  // path to the vacuumed backup (e.g. /srv/mcq/backups/...)
}

Services register the SnapshotService on their gRPC server. The mcdsl/snapshot package provides a default implementation that reads the database path from the service's config, runs VACUUM INTO, and returns the backup path. Services with custom snapshot needs can override the handler.

Service Definition Examples

Metacircular service with gRPC (preferred):

[snapshot]
method = "grpc"

MCR (skip layer blobs):

[snapshot]
method = "grpc"
exclude = ["layers/", "uploads/"]

Non-Metacircular service with custom backup:

[snapshot]
method = "exec: /usr/local/bin/backup.sh"

Service with no snapshot config (default — captures *.toml, *.db, *.pem):

# No [snapshot] section needed

Snapshot Storage

Snapshots are stored as flat files on the master node:

/srv/mcp-master/snapshots/
  mcq/2026-04-01T00:00:00Z.tar.zst            # deployed service
  mcias/2026-04-01T00:00:00Z.tar.zst
  mc-proxy/rift/2026-04-01T00:00:00Z.tar.zst   # node infra — per-node
  mc-proxy/svc/2026-04-01T00:00:00Z.tar.zst

Deployed services use <service>/<timestamp>.tar.zst. Node infrastructure uses <service>/<node>/<timestamp>.tar.zst to avoid collisions between instances on different nodes (see "Node Infrastructure vs Deployed Services").

Format: tar.zst (tar archive with zstandard compression). One file per snapshot, named by UTC timestamp.

Snapshot Scheduling

The master runs a scheduled job that snapshots all services every 24 hours. The master iterates over all placements and for each one:

  1. Acquires a per-service lock (skips if deploy/migrate/undeploy is in progress).
  2. Sends ExportServiceData(service_name) to the hosting agent (timeout: 10m).
  3. The agent runs the snapshot command (if configured), creates a tar.zst archive of /srv/<service>/ (respecting excludes), and streams it back.
  4. The master writes the archive to the snapshots directory.
  5. The master prunes old snapshots (keep last N, configurable).

Scheduled snapshots are live — the service keeps running. Database consistency is ensured by the vacuum step, not by stopping the container. Migration snapshots use a different flow (stop first, then tar) for perfect consistency.

Agent fallback rule: If ExportServiceData is called and the container is not running (migration case), the agent skips the configured snapshot method (grpc/cli/exec) and falls back to a direct tar with auto-vacuum of .db files. This is correct because the container already vacuumed on shutdown (SIGTERM handler).

For v2, the master always requests a full snapshot — no change detection. Intelligence about dirty vs. clean services is future optimization.

Concurrency

The master holds a per-service lock for all operations that touch a service (deploy, undeploy, migrate, snapshot). If a scheduled snapshot overlaps with a deploy or migration, the snapshot waits. This prevents capturing partial state during multi-step operations.

Snapshot RPCs

// Service data export -- called by master on any agent.
// Authorization: mcp-master only.
rpc ExportServiceData(ExportServiceDataRequest)
    returns (stream DataChunk);

// Service data import -- called by master on any agent.
// Authorization: mcp-master only.
rpc ImportServiceData(stream ImportServiceDataChunk)
    returns (ImportServiceDataResponse);

message ExportServiceDataRequest {
  string service_name = 1;
  // Snapshot config is stored in the agent's registry at deploy time.
  // The agent uses its persisted config to determine the snapshot method
  // (grpc, cli, exec, full, default) and exclude patterns.
}

message DataChunk {
  bytes data = 1;
}

message ImportServiceDataChunk {
  // First message sets the service name; subsequent messages carry data.
  string service_name = 1;
  bytes  data         = 2;
  bool   force        = 3;  // overwrite existing /srv/<service>/ (first message only)
}

message ImportServiceDataResponse {
  int64 bytes_written = 1;
}

Note: ExportServiceData/ImportServiceData transfer full directory archives. The existing PushFile/PullFile RPCs transfer individual files and serve a different purpose (config distribution, cert provisioning).

Master Snapshot Config

[snapshots]
dir      = "/srv/mcp-master/snapshots"
interval = "24h"
retain   = 7    # keep last 7 snapshots per service

Service Migration

Services can be migrated between nodes with mcp migrate. This is essential for moving workloads off rift (which starts as both master and worker) onto dedicated workers like orion as they come online.

Migration uses snapshots for data transfer. This means migration works even if the source node is down (disaster recovery).

Constraints

  • Core services cannot be migrated. tier = "core" services are bound to the master node. Moving core services means designating a new master — a manual, deliberate operation outside the scope of mcp migrate.
  • Edge nodes are not migration targets. Edge nodes run mc-proxy only, not application containers.

Migration Flow

mcp migrate mcq --to orion

When the master receives Migrate(mcq, orion):

  1. Validate: Master verifies orion is a healthy worker. Rejects migration of tier = "core" services and migration to edge nodes.

  2. Stop on source (if source is alive): Master sends Stop RPC to the source agent. The agent gracefully stops the container (SIGTERM). The service runs its shutdown handler, which vacuums databases per the engineering standard. If the source is down, skip this step.

  3. Snapshot (if source is alive): Agent tars /srv/<service>/ (now consistent — the service vacuumed on shutdown) and streams it to the master. If the source is down, the master uses the most recent stored snapshot.

  4. Push snapshot to destination: Master streams the snapshot to the destination agent via ImportServiceData. The agent creates /srv/<service>/ (with correct permissions) and extracts the archive.

  5. Deploy on destination: Master sends Deploy RPC to the destination agent (orion). The agent deploys the container using the restored data. Provisions internal TLS cert and registers mc-proxy route on the new node.

  6. Update DNS: Master updates the internal A record (mcq.svc.mcp.metacircular.net) to point to orion's Tailnet IP.

  7. Update edge route (if public): Master sends SetupEdgeRoute to svc's agent with the updated backend. The edge agent updates the mc-proxy route. No new cert needed — the public hostname hasn't changed.

  8. Clean up source (if source is alive): Master sends Undeploy to the source agent to remove the stopped container, old routes, old certs, and old DNS records.

  9. Update placement: Master updates the placements table to reflect the new node. This step runs regardless of whether source cleanup succeeded.

Disaster Recovery

If a node dies, the operator migrates its services to another node:

mcp migrate mcq --to orion          # source is down, uses latest snapshot
mcp migrate --all --from rift --to orion  # evacuate all services

The master detects the source is unreachable (unhealthy in node registry), skips the stop and cleanup steps, and restores from the stored snapshot. Data loss is bounded by the snapshot interval (24 hours).

Batch Migration

Full node evacuation for decommissioning or disaster recovery:

mcp migrate --all --from rift --to orion

The master migrates each service sequentially. Core services are skipped (they cannot be migrated). The operator sees per-service progress. If any migration fails, the master stops and reports which service failed — the operator can fix the issue and resume with --all (already-migrated services are skipped since they no longer have placements on the source node).

Migration Safety

  • The source data is not deleted until step 8 (cleanup). If migration fails mid-transfer, the source still has the complete data and the operator can retry or roll back.
  • The master rejects migration if the destination already has a /srv/<service>/ directory (prevents accidental overwrite). Use --force to override.
  • Downtime window: from stop (step 2) to the new container starting (step 5). For a personal platform this is acceptable.
  • Migration snapshots use stop-then-tar for perfect consistency. Scheduled daily snapshots use live vacuum (no downtime).

Migration Proto

rpc Migrate(MigrateRequest) returns (MigrateResponse);

message MigrateRequest {
  string service_name = 1;
  string target_node  = 2;
  bool   force        = 3;  // overwrite existing /srv/<service>/ on target
  bool   all          = 4;  // migrate all services from source
  string source_node  = 5;  // required when all=true
  // Validation: reject if all=true AND service_name is set (ambiguous).
}

message MigrateResponse {
  repeated StepResult results = 1;
}

// Note: CreateSnapshot/ListSnapshots are master CLI commands.
// The mcdsl SnapshotService.Snapshot RPC is a separate, service-level
// RPC called by agents on individual services.
rpc CreateSnapshot(CreateSnapshotRequest) returns (CreateSnapshotResponse);
rpc ListSnapshots(ListSnapshotsRequest) returns (ListSnapshotsResponse);

message CreateSnapshotRequest {
  string service_name = 1;
}

message CreateSnapshotResponse {
  string filename   = 1;
  int64  size_bytes = 2;
}

message ListSnapshotsRequest {
  string service_name = 1;
}

message ListSnapshotsResponse {
  repeated SnapshotInfo snapshots = 1;
}

message SnapshotInfo {
  string service_name = 1;
  string node         = 2;  // node the snapshot was taken from
  string filename     = 3;
  int64  size_bytes   = 4;
  string created_at   = 5;  // RFC3339
}

CLI

mcp migrate <service> --to <node>              # migrate single service
mcp migrate <service> --to <node> --force      # overwrite existing data
mcp migrate --all --from <node> --to <node>    # evacuate all services
mcp snapshot <service>                         # take an on-demand snapshot
mcp snapshot list <service>                    # list available snapshots

Agent Changes for v2

New RPCs

See Proto Definitions section above for full message definitions.

  • HealthCheck — called by master on missed heartbeats.
  • SetupEdgeRoute — called by master on edge nodes.
  • RemoveEdgeRoute — called by master on edge nodes.
  • ListEdgeRoutes — called by master on edge nodes.

All new RPCs require the caller to be mcp-master (authorization check).

Cert Provisioning on All Agents

All agents need Metacrypt configuration:

[metacrypt]
server_url = "https://metacrypt.svc.mcp.metacircular.net:8443"
ca_cert    = "/srv/mcp/certs/metacircular-ca.pem"
mount      = "pki"
issuer     = "infra"
token_path = "/srv/mcp/metacrypt-token"

Worker agents provision certs for internal hostnames. Edge agents provision certs for public hostnames. Both use the same Metacrypt API but with different identity-scoped policies.

mc-proxy Management

The agent is the sole manager of mc-proxy routes via the gRPC admin API. TOML config is not used for route management — this avoids the database/config divergence problem from v1. mc-proxy's TOML config only sets listener addresses and TLS defaults.

On mc-proxy restart, routes survive in mc-proxy's own SQLite database. If mc-proxy's database is lost, the agent detects missing routes during its monitoring cycle and re-registers them.

Deploy Failure on Cert Error

If cert provisioning fails during deploy, the agent must fail the deploy — do not register an mc-proxy route pointing to a nonexistent cert. Return an error to the master, which reports it to the CLI. The current v1 behavior (log warning, continue) is a bug.


CLI Changes for v2

The CLI gains a [master] section and retains [[nodes]] for direct access:

[master]
address = "100.x.x.x:9555"

# Retained for --direct mode (bypass master when it's down).
[[nodes]]
name    = "rift"
address = "100.95.252.120:9444"

[[nodes]]
name    = "svc"
address = "100.x.x.x:9444"

[mcias]
server_url   = "https://mcias.metacircular.net:8443"
service_name = "mcp"

[auth]
token_path = "/home/kyle/.config/mcp/token"

[services]
dir = "/home/kyle/.config/mcp/services"

By default, all commands go through the master. The --direct flag bypasses the master and dials agents directly (v1 behavior):

mcp deploy mcq              # → master
mcp deploy mcq --direct -n rift  # → agent on rift (v1 mode)
mcp ps                      # → master aggregates all agents
mcp ps --direct             # → each agent individually (v1 mode)

--direct is the escape hatch when the master is down. In direct mode, deploy requires an explicit --node flag (the CLI cannot auto-place without the master).

Sync Semantics

mcp sync is declarative: the service definitions on the operator's workstation are the source of truth. The master converges the fleet:

  • New definitions → deploy.
  • Changed definitions → redeploy.
  • Definitions present in the master's placement table but absent from the sync request → undeploy.

This makes the services directory a complete, auditable declaration of what should be running. Use mcp sync --dry-run to preview what sync would do without executing.

Direct Mode Caveat

Services deployed via --direct (bypassing the master) are invisible to the master — no placement record exists. Reconciliation detects them as orphans. To bring a directly-deployed service under master management, redeploy it through the master.

New Commands

mcp edge list               # list all public edge routes
mcp edge status             # health of edge routes (cert expiry, backend reachable)
mcp node list               # fleet status from master

Service definition files remain on the operator's workstation. The CLI pushes them to the master on mcp deploy and mcp sync.


Agent Upgrades

The fleet is heterogeneous (NixOS + Debian, amd64 + arm64). NixOS flake inputs don't work as a universal update mechanism.

MCP owns the binary at /srv/mcp/mcp-agent on all nodes.

mcp agent upgrade [node]    # cross-compile, SCP, restart via SSH
  • CLI cross-compiles for the target's GOARCH.
  • Copies via SCP to /srv/mcp/mcp-agent.new.
  • Restarts via SSH. The restart command is OS-aware: doas on NixOS (rift, orion), sudo on Debian (svc). Configurable per node.
  • Running containers survive the restart — rootless podman containers are independent of the agent process. --restart unless-stopped means podman handles liveness.
  • The upgrade window (agent down for ~2s) only affects management operations. The master marks the agent as temporarily unhealthy until the next heartbeat.

All nodes: binary at /srv/mcp/mcp-agent, systemd unit mcp-agent.service.


Migration Plan

Phase 1: Agent on svc

Deploy mcp-agent to svc (Debian):

  • Create mcp user, install binary via SCP, configure systemd.
  • Configure with Metacrypt access and mc-proxy gRPC socket access.
  • Migrate existing mc-proxy TOML routes to agent-managed routes: export current routes from mc-proxy SQLite, import via agent AddProxyRoute RPCs.
  • Verify with mcp node list (svc shows up).

Phase 2: Edge routing RPCs

Implement SetupEdgeRoute, RemoveEdgeRoute, ListEdgeRoutes on the agent. Test by calling directly from the CLI (temporary mcp edge setup scaffolding command, removed after phase 3).

Phase 3: Build mcp-master

Core coordination loop. Uses bootstrap [[nodes]] config for agent addresses (dynamic registration comes in phase 4):

  1. gRPC server with McpMasterService.
  2. SQLite database for placements and edge routes.
  3. Accept Deploy / Undeploy from CLI.
  4. Place service on a node (tier / container-count).
  5. Forward deploy to the correct agent.
  6. Register DNS via MCNS.
  7. Detect public = true routes, validate, call SetupEdgeRoute.
  8. Return structured per-step results to CLI.

Phase 4: Agent registration and health

  • Agents self-register on startup (identity-bound).
  • Heartbeat loop (30s interval, resource data).
  • Master probe on missed heartbeats (90s threshold, 5s timeout).
  • Fleet status aggregation for mcp ps and mcp node list.
  • Reconciliation on master startup.
  • Master transitions from bootstrap [[nodes]] to dynamic registry.

Phase 5: Snapshots and migration

  • Implement ExportServiceData / ImportServiceData on agents.
  • Implement mcdsl/snapshot standard gRPC service.
  • Add snapshot scheduling to master (24h cycle, retention pruning).
  • Implement CreateSnapshot, ListSnapshots, Migrate on master.
  • Add mcp snapshot, mcp snapshot list, mcp migrate CLI commands.
  • Test migration between rift and orion.

Phase 6: Cut over

  • Update CLI config to add [master] section.
  • Update service definitions with tier and public fields.
  • Deploy agent to orion.
  • Verify all services via mcp ps and public endpoint tests.
  • Keep [[nodes]] config and --direct flag as escape hatch.

Hostname Convention for Public Services

Services with public routes have two hostnames:

Hostname Purpose Example
<svc>.metacircular.net Public — browser access, SSO login mcq.metacircular.net
<svc>.svc.mcp.metacircular.net Internal — API clients, service-to-service mcq.svc.mcp.metacircular.net

SSO always uses the public hostname. The service's [sso].redirect_uri and the MCIAS SSO client registration both point to the public hostname (e.g., https://mcq.metacircular.net/sso/callback). SSO state cookies are bound to the domain they are set on, so the entire browser-based login flow must stay on a single hostname.

API clients use the internal hostname. Service-to-service calls, CLI tools, and MCP server communication authenticate with bearer tokens (not SSO) and use the internal .svc.mcp. hostname. These do not involve browser cookies and are unaffected by the SSO hostname constraint.

This means:

  • Human users bookmark mcq.metacircular.net, not the .svc.mcp. URL.
  • The web UI's SSO "Sign in" button always initiates the flow on the public hostname.
  • API endpoints on both hostnames accept the same bearer tokens — the hostname distinction is a routing and cookie concern, not an auth concern.

Superseded Documents

docs/edge-routing-design.md is superseded by this document. It used agent-to-agent communication, a single shared cert, private key transmission over gRPC, and an edge field instead of public. None of these design choices carried forward to v2.


Open Questions

  1. Master HA: mcp-master is a single point of failure. For v2, this is acceptable — the operator can use --direct to bypass the master. Future work could add master replication.

  2. Auto-reconciliation: The master detects drift but does not auto-remediate. Future work could add automatic redeploy on drift.

v2 Scope

v2 targets amd64 nodes only: rift (master+worker), orion (worker), svc (edge). All images are single-arch amd64.

Fast-Follow: arm64 Support

Immediate follow-up after v2 to onboard Raspberry Pi workers (hyperborea and others):

  1. MCR manifest list support: Accept and serve OCI image indexes (application/vnd.oci.image.index.v1+json) so a single tag references both amd64 and arm64 variants.
  2. mcp build multi-arch: Build linux/amd64 + linux/arm64 images and push manifest lists to MCR.
  3. Onboard RPi workers: Deploy agents, add to registration allowlist. Placement remains arch-agnostic — podman pulls the correct variant automatically.

What v2 Does NOT Include

These remain future work beyond the arm64 fast-follow:

  • Auto-reconciliation (master-driven redeploy on drift)
  • Zero-downtime live migration (v2 migration stops the service)
  • Web UI for fleet management
  • Observability / log aggregation
  • Object store
  • Multiple edge nodes with load-based assignment
  • Master replication / HA
  • Resource-aware bin-packing (requires resource declarations in service defs)