metacircular/docs/packaging-and-deployment.md
Kyle Isom 979a64a854 Update packaging guide for multi-node fleet topology
Reflect that the platform now spans multiple nodes (rift for compute,
svc for public edge routing, orion provisioned but offline). Add Fleet
Topology section, update deploy steps to include TLS cert provisioning
from Metacrypt CA, DNS registration in MCNS, and gRPC-based mc-proxy
route registration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:14:23 -07:00


Packaging and Deploying to the Metacircular Platform

This guide provides everything needed to build, package, and deploy a service to the Metacircular platform. It assumes no prior knowledge of the platform's internals.


Platform Overview

Metacircular is a multi-service infrastructure platform. Services are Go binaries running as containers across a fleet of Linux nodes, managed by these core components:

| Component | Role |
|-----------|------|
| MCP (Control Plane) | Deploys, monitors, and manages container lifecycle via rootless Podman |
| MCR (Container Registry) | OCI container registry at `mcr.svc.mcp.metacircular.net:8443` |
| mc-proxy (TLS Proxy) | Routes traffic to services via L4 (SNI passthrough) or L7 (TLS termination) |
| MCIAS (Identity Service) | Central SSO/IAM — all services authenticate through it |
| MCNS (DNS) | Authoritative DNS for `*.svc.mcp.metacircular.net` |

The operator workflow is: build image → push to MCR → write service definition → deploy via MCP. MCP handles port assignment, TLS cert provisioning, route registration, DNS registration, and container lifecycle.

Fleet Topology

The platform runs across multiple nodes connected via Tailnet:

| Node | Role | OS | Arch | Purpose |
|------|------|----|----- |---------|
| rift | Compute + core infra | NixOS | amd64 | Runs most services (Metacrypt, MCR, MCNS, etc.) |
| svc | Edge | Debian | amd64 | Public-facing mc-proxy, routes traffic over Tailnet to compute nodes |
| orion | Compute | NixOS | amd64 | Provisioned, currently offline |

Node roles:

  • Compute nodes (rift, orion, future RPis) run the full container lifecycle via rootless Podman.
  • Edge nodes (svc) run mc-proxy for public traffic routing only. The MCP agent on edge nodes manages mc-proxy routes but does not run application containers.

Prerequisites

| Requirement | Details |
|-------------|---------|
| Go | 1.25+ |
| Container engine | Docker or Podman (for building images) |
| `mcp` CLI | Installed on the operator workstation |
| MCR access | Credentials to push images to `mcr.svc.mcp.metacircular.net:8443` |
| MCP agent | Running on the target node (rift for services, svc for edge routing) |
| MCIAS account | For `mcp` CLI authentication to the agent |

1. Build the Container Image

Dockerfile Pattern

All services use a two-stage Alpine build. This is the standard template:

FROM golang:1.25-alpine AS builder

RUN apk add --no-cache git
WORKDIR /build
COPY go.mod go.sum ./
RUN go mod download
COPY . .

ARG VERSION=dev
RUN CGO_ENABLED=0 go build -trimpath \
    -ldflags="-s -w -X main.version=${VERSION}" \
    -o /<binary> ./cmd/<binary>

FROM alpine:3.21

RUN apk add --no-cache ca-certificates tzdata
COPY --from=builder /<binary> /usr/local/bin/<binary>

WORKDIR /srv/<service>
EXPOSE <ports>

ENTRYPOINT ["<binary>"]
CMD ["server", "--config", "/srv/<service>/<service>.toml"]

Dockerfile Rules

  • CGO_ENABLED=0 — all builds are statically linked. No CGo in production.
  • ca-certificates and tzdata — required in the runtime image for TLS verification and timezone-aware logging.
  • No USER directive — containers run as --user 0:0 under MCP's rootless Podman. UID 0 inside the container maps to the unprivileged mcp host user. A non-root USER directive creates a subordinate UID that cannot access host-mounted volumes.
  • No VOLUME directive — causes layer unpacking failures under rootless Podman. The host volume mount is declared in the service definition, not the image.
  • No adduser/addgroup — unnecessary given the rootless Podman model.
  • WORKDIR /srv/<service> — so relative paths resolve correctly against the mounted data directory.
  • Version injection — pass the git tag via --build-arg VERSION=... so the binary can report its version.
  • Stripped binaries — -trimpath and -ldflags="-s -w" remove debug symbols and build paths.

Split Binaries

If the service has separate API and web UI binaries, create separate Dockerfiles:

  • Dockerfile.api — builds the API/gRPC server
  • Dockerfile.web — builds the web UI server

Both follow the same template. The web binary communicates with the API server via gRPC (no direct database access).

Makefile Target

Every service includes a make docker target:

docker:
	docker build --build-arg VERSION=$(shell git describe --tags --always --dirty) \
	    -t <service> -f Dockerfile.api .

2. Write a Service Definition

Service definitions are TOML files that tell MCP what to deploy. They live at ~/.config/mcp/services/<service>.toml on the operator workstation.

Minimal Example (Single Component, L7)

name = "myservice"
node = "rift"

[build.images]
myservice = "Dockerfile"

[[components]]
name = "web"
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"

[[components.routes]]
port = 8443
mode = "l7"

API Service Example (L4, Multiple Routes)

name = "myservice"
node = "rift"

[build.images]
myservice = "Dockerfile"

[[components]]
name = "api"
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"
volumes = ["/srv/myservice:/srv/myservice"]
cmd = ["server", "--config", "/srv/myservice/myservice.toml"]

[[components.routes]]
name = "rest"
port = 8443
mode = "l4"

[[components.routes]]
name = "grpc"
port = 9443
mode = "l4"

Full Example (API + Web)

name = "myservice"
node = "rift"

[build.images]
myservice = "Dockerfile.api"
myservice-web = "Dockerfile.web"

[[components]]
name = "api"
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"
volumes = ["/srv/myservice:/srv/myservice"]
cmd = ["server", "--config", "/srv/myservice/myservice.toml"]

[[components.routes]]
name = "rest"
port = 8443
mode = "l4"

[[components.routes]]
name = "grpc"
port = 9443
mode = "l4"

[[components]]
name = "web"
image = "mcr.svc.mcp.metacircular.net:8443/myservice-web:v1.0.0"
volumes = ["/srv/myservice:/srv/myservice"]
cmd = ["server", "--config", "/srv/myservice/myservice.toml"]

[[components.routes]]
port = 443
mode = "l7"

Conventions

A few fields are derived by the agent at deploy time:

| Field | Default | Override when... |
|-------|---------|------------------|
| Source path | `<service>` relative to workspace root | Directory name differs from service name (use `path`) |
| Hostname | `<service>.svc.mcp.metacircular.net` | Service needs a public hostname (use route `hostname`) |

All other fields must be explicit in the service definition.

Service Definition Reference

Top-level fields:

| Field | Required | Purpose |
|-------|----------|---------|
| `name` | Yes | Service name (matches project name) |
| `node` | Yes | Target node to deploy to |
| `active` | No | Whether MCP keeps this running (default: true) |
| `path` | No | Source directory relative to workspace (default: `name`) |

Build fields:

| Field | Purpose |
|-------|---------|
| `build.images.<name>` | Maps build image name to Dockerfile path. The `<name>` must match the repository name in a component's `image` field (the part after the last `/`, before the `:` tag). |

Component fields:

| Field | Required | Purpose |
|-------|----------|---------|
| `name` | Yes | Component name (e.g. `api`, `web`) |
| `image` | Yes | Full image reference (e.g. `mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0`) |
| `volumes` | No | Volume mounts (list of `host:container` strings) |
| `cmd` | No | Command override (list of strings) |
| `env` | No | Extra environment variables (list of `KEY=VALUE` strings) |
| `network` | No | Container network (default: none) |
| `user` | No | Container user (e.g. `0:0`) |
| `restart` | No | Restart policy (e.g. `unless-stopped`) |

Route fields (under [[components.routes]]):

| Field | Purpose |
|-------|---------|
| `name` | Route name — determines the `$PORT_<NAME>` env var |
| `port` | External port on mc-proxy (e.g. 8443, 9443, 443) |
| `mode` | `l4` (TLS passthrough) or `l7` (TLS termination by mc-proxy) |
| `hostname` | Public hostname override |

Routing Modes

| Mode | TLS handled by | Use when... |
|------|----------------|-------------|
| `l4` | The service itself | Service manages its own TLS (API servers, gRPC) |
| `l7` | mc-proxy | mc-proxy terminates TLS and proxies HTTP to the service (web UIs) |

Version Pinning

Component image fields must pin an explicit semver tag (e.g. mcr.svc.mcp.metacircular.net:8443/myservice:v1.1.0). Never use :latest. This ensures deployments are reproducible and mcp status shows the actual running version. The version is extracted from the image tag.


3. Build, Push, and Deploy

Tag the Release

git tag -a v1.0.0 -m "v1.0.0"
git push origin v1.0.0

Build and Push Images

mcp build <service>

This reads the [build.images] section of the service definition, builds each Dockerfile, tags the images with the version from the definition, and pushes them to MCR.

The workspace root is configured in ~/.config/mcp/mcp.toml:

[build]
workspace = "~/src/metacircular"

Each service's source is at <workspace>/<path> (where path defaults to the service name).

Sync and Deploy

# Push all service definitions to agents, auto-build missing images
mcp sync

# Deploy (or redeploy) a specific service
mcp deploy <service>

mcp sync checks whether each component's image tag exists in MCR. If missing and the source tree is available, it builds and pushes automatically.

mcp deploy pulls the image on the target node and creates or recreates the containers.

What Happens During Deploy

  1. Agent assigns a free host port (10000–60000) for each declared route.
  2. For L7 routes, agent provisions a TLS certificate from Metacrypt CA (via POST /v1/engine/request). Certs are written to /srv/mc-proxy/certs/<service>.pem and .key. Existing valid certs (more than 30 days from expiry) are reused.
  3. Agent starts containers with $PORT / $PORT_<NAME> environment variables set to the assigned ports.
  4. Agent registers routes with mc-proxy via gRPC (hostname → <node-address>:<port>, mode, TLS cert paths).
  5. Agent registers DNS entries in MCNS for <service>.svc.mcp.metacircular.net.
  6. Agent records the full state in its SQLite registry.

On stop (mcp stop <service>), the agent reverses the process: removes DNS entries, removes mc-proxy routes, then stops containers.


4. Data Directory Convention

All runtime data lives in /srv/<service>/ on the host. This directory is bind-mounted into the container.

/srv/<service>/
├── <service>.toml        # Configuration file
├── <service>.db          # SQLite database (created on first run)
├── certs/                # TLS certificates
│   ├── cert.pem
│   └── key.pem
└── backups/              # Database snapshots

This directory must exist on the target node before the first deploy, owned by the mcp user (which runs rootless Podman). Create it with:

sudo mkdir -p /srv/<service>/certs
sudo chown -R mcp:mcp /srv/<service>

Place the service's TOML configuration and TLS certificates here before deploying.


5. Configuration

Services use TOML configuration with environment variable overrides.

Standard Config Sections

[server]
listen_addr = ":8443"
grpc_addr   = ":9443"
tls_cert    = "/srv/<service>/certs/cert.pem"
tls_key     = "/srv/<service>/certs/key.pem"

[database]
path = "/srv/<service>/<service>.db"

[mcias]
server_url   = "https://mcias.metacircular.net:8443"
ca_cert      = ""
service_name = "<service>"
tags         = []

[log]
level = "info"

For services with SSO-enabled web UIs, add:

[sso]
redirect_uri = "https://<service>.svc.mcp.metacircular.net/sso/callback"

For services with a separate web UI binary, add:

[web]
listen_addr  = "127.0.0.1:8080"
vault_grpc   = "127.0.0.1:9443"
vault_ca_cert = ""

$PORT Convention

When deployed via MCP, the agent assigns host ports and passes them as environment variables. Applications should not hardcode listen addresses — they will be overridden at deploy time.

| Env var | When set |
|---------|----------|
| `$PORT` | Component has a single unnamed route |
| `$PORT_<NAME>` | Component has named routes |

Route names are uppercased: name = "rest" → $PORT_REST, name = "grpc" → $PORT_GRPC.

Container listen address: Services must bind to 0.0.0.0:$PORT (or :$PORT), not localhost:$PORT. Podman port-forwards go through the container's network namespace — binding to localhost inside the container makes the port unreachable from outside.

Services built with mcdsl v1.1.0+ handle this automatically — config.Load checks $PORT → overrides Server.ListenAddr, and $PORT_GRPC → overrides Server.GRPCAddr. These take precedence over TOML values.

Services not using mcdsl must check these environment variables in their own config loading.

Environment Variable Overrides

Beyond $PORT, services support $SERVICENAME_SECTION_KEY overrides. For example, $MCR_SERVER_LISTEN_ADDR=:9999 overrides [server] listen_addr in MCR's config. $PORT takes precedence over these.


6. Authentication (MCIAS Integration)

Every service delegates authentication to MCIAS. No service maintains its own user database. Services support two login modes: SSO redirect (recommended for web UIs) and direct credentials (fallback / API clients).

SSO Login (Web UIs)

SSO is the preferred login method for web UIs. The flow is an OAuth 2.0-style authorization code exchange:

  1. User visits the service and is redirected to /login.
  2. Login page shows a "Sign in with MCIAS" button.
  3. Click redirects to MCIAS (/sso/authorize), which authenticates the user.
  4. MCIAS redirects back to the service's /sso/callback with an authorization code.
  5. The service exchanges the code for a JWT via a server-to-server call to MCIAS POST /v1/sso/token.
  6. The JWT is stored in a session cookie.

SSO is enabled by adding an [sso] section to the service config and registering the service as an SSO client in MCIAS.

Service config:

[sso]
redirect_uri = "https://<service>.svc.mcp.metacircular.net/sso/callback"

MCIAS config (add to the [[sso_clients]] list):

[[sso_clients]]
client_id    = "<service>"
redirect_uri = "https://<service>.svc.mcp.metacircular.net/sso/callback"
service_name = "<service>"

The redirect_uri must match exactly between the service config and the MCIAS client registration.

When [sso].redirect_uri is empty or absent, the service falls back to the direct credentials form.

Implementation: Services use mcdsl/sso (v1.7.0+) which handles state management, CSRF-safe cookies, and the code exchange. The web server registers three routes:

| Route | Purpose |
|-------|---------|
| `GET /login` | Renders landing page with "Sign in with MCIAS" button |
| `GET /sso/redirect` | Sets state cookies, redirects to MCIAS |
| `GET /sso/callback` | Validates state, exchanges code for JWT, sets session |
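mcdsl/sso implements all of this; purely for illustration, the CSRF-safe state check in the callback might look like the following sketch (hypothetical names, not mcdsl's API):

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newState generates the random state value that /sso/redirect sets as
// a cookie and MCIAS echoes back on /sso/callback.
func newState() string {
	b := make([]byte, 16)
	_, _ = rand.Read(b)
	return base64.RawURLEncoding.EncodeToString(b)
}

// stateMatches compares the cookie state and the callback state in
// constant time, rejecting forged or replayed callbacks.
func stateMatches(cookie, callback string) bool {
	return subtle.ConstantTimeCompare([]byte(cookie), []byte(callback)) == 1
}

func main() {
	s := newState()
	fmt.Println(stateMatches(s, s), stateMatches(s, "forged")) // true false
}
```

Only after the state check passes would the handler exchange the authorization code for a JWT via the server-to-server call to MCIAS.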

Direct Credentials (API / Fallback)

  1. Client sends credentials to the service's POST /v1/auth/login.
  2. Service forwards them to MCIAS via mcdsl/auth.Authenticator.Login().
  3. MCIAS validates and returns a bearer token.
  4. Subsequent requests include Authorization: Bearer <token>.
  5. Service validates tokens via ValidateToken(), cached for 30s (keyed by SHA-256 of the token).

Web UIs use this mode when SSO is not configured, presenting a username/password/TOTP form instead of the SSO button.

Roles

| Role | Access |
|------|--------|
| admin | Full access, policy bypass |
| user | Access governed by policy rules, default deny |
| guest | Service-dependent restrictions, default deny |

Admin detection comes solely from the MCIAS admin role. Services never promote users locally.


7. Networking

Hostnames

Every service gets <service>.svc.mcp.metacircular.net automatically. Public-facing services can declare additional hostnames:

[[components.routes]]
port = 443
mode = "l7"
hostname = "docs.metacircular.net"

TLS

  • Minimum TLS 1.3. No exceptions.
  • L4 services manage their own TLS — certificates go in /srv/<service>/certs/.
  • L7 services have TLS terminated by mc-proxy — certs are stored at /srv/mc-proxy/certs/<service>.pem.
  • Certificate and key paths are required config — the service refuses to start without them.

Container Networking

Containers join the mcpnet Podman network by default. Services communicate with each other over this network or via loopback (when co-located on the same node).


8. Command Reference

| Command | Purpose |
|---------|---------|
| `mcp build <service>` | Build and push images to MCR |
| `mcp sync` | Push all service definitions to agents; auto-build missing images |
| `mcp deploy <service>` | Pull image, (re)create containers, register routes |
| `mcp undeploy <service>` | Full teardown: remove routes, DNS, certs, and containers |
| `mcp stop <service>` | Remove routes, stop containers |
| `mcp start <service>` | Start previously stopped containers |
| `mcp restart <service>` | Restart containers in place |
| `mcp ps` | List all managed containers and status |
| `mcp status [service]` | Detailed status for a specific service |
| `mcp logs <service>` | Stream container logs |
| `mcp edit <service>` | Edit service definition |

9. Complete Walkthrough

Deploying a new service called myservice from scratch:

# 1. Prepare the target node
ssh rift
sudo mkdir -p /srv/myservice/certs
sudo chown -R mcp:mcp /srv/myservice
# Place myservice.toml and TLS certs in /srv/myservice/
exit

# 2. Tag the release
cd ~/src/metacircular/myservice
git tag -a v1.0.0 -m "v1.0.0"
git push origin v1.0.0

# 3. Write the service definition
cat > ~/.config/mcp/services/myservice.toml << 'EOF'
name = "myservice"
node = "rift"

[build.images]
myservice = "Dockerfile.api"

[[components]]
name = "api"
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"
volumes = ["/srv/myservice:/srv/myservice"]

[[components.routes]]
name = "rest"
port = 8443
mode = "l4"

[[components.routes]]
name = "grpc"
port = 9443
mode = "l4"
EOF

# 4. Build and push the image
mcp build myservice

# 5. Deploy
mcp deploy myservice

# 6. Verify
mcp status myservice
mcp ps

The service is now running, with mc-proxy routing myservice.svc.mcp.metacircular.net traffic to the agent-assigned ports.


Appendix: Repository Layout

Services follow a standard directory structure:

.
├── cmd/<service>/          CLI entry point (server, subcommands)
├── cmd/<service>-web/      Web UI entry point (if separate)
├── internal/               All service logic (not importable externally)
│   ├── auth/               MCIAS integration
│   ├── config/             TOML config loading
│   ├── db/                 Database setup, migrations
│   ├── server/             REST API server
│   ├── grpcserver/         gRPC server
│   └── webserver/          Web UI server (if applicable)
├── proto/<service>/v1/     Protobuf definitions
├── gen/<service>/v1/       Generated gRPC code
├── web/                    Templates and static assets (embedded)
├── deploy/
│   ├── <service>-rift.toml Reference MCP service definition
│   ├── docker/             Docker Compose files
│   ├── examples/           Example config files
│   └── systemd/            systemd units
├── Dockerfile.api          API server container
├── Dockerfile.web          Web UI container (if applicable)
├── Makefile                Standard build targets
└── <service>.toml.example  Example configuration

Standard Makefile Targets

| Target | Purpose |
|--------|---------|
| `make all` | vet → lint → test → build (the CI pipeline) |
| `make build` | `go build ./...` |
| `make test` | `go test ./...` |
| `make vet` | `go vet ./...` |
| `make lint` | `golangci-lint run ./...` |
| `make docker` | Build the container image |
| `make proto` | Regenerate gRPC code from .proto files |
| `make devserver` | Build and run locally against srv/ config |

10. Agent Management

MCP manages a fleet of nodes with heterogeneous operating systems and architectures. The agent binary lives at /srv/mcp/mcp-agent on every node — this is a mutable path that MCP controls, regardless of whether the node runs NixOS or Debian.

Node Configuration

Each node in ~/.config/mcp/mcp.toml includes SSH and architecture info for agent management:

[[nodes]]
name = "rift"
address = "100.95.252.120:9444"
ssh = "rift"
arch = "amd64"

[[nodes]]
name = "hyperborea"
address = "100.x.x.x:9444"
ssh = "hyperborea"
arch = "arm64"

Upgrading Agents

After tagging a new MCP release:

# Upgrade all nodes (recommended — prevents version skew)
mcp agent upgrade

# Upgrade a single node
mcp agent upgrade rift

# Check versions across the fleet
mcp agent status

mcp agent upgrade cross-compiles the agent binary for each target architecture, SSHs to each node, atomically replaces the binary, and restarts the systemd service. All nodes should be upgraded together because new CLI versions often depend on new agent RPCs.

Provisioning New Nodes

One-time setup for a new Debian node:

# 1. Provision the node (creates user, dirs, systemd unit, installs binary)
mcp node provision <name>

# 2. Register the node
mcp node add <name> <address>

# 3. Deploy services
mcp deploy <service>

For NixOS nodes, provisioning is handled by the NixOS configuration. The NixOS config creates the mcp user, systemd unit, and directories. The ExecStart path points to /srv/mcp/mcp-agent so that mcp agent upgrade works the same as on Debian nodes.


Appendix: Currently Deployed Services

For reference, these services are operational on the platform:

| Service | Version | Node | Purpose |
|---------|---------|------|---------|
| MCIAS | v1.9.0 | (separate) | Identity and access |
| Metacrypt | v1.4.1 | rift | Cryptographic service, PKI/CA |
| MC-Proxy | v1.2.1 | rift, svc | TLS proxy and router (svc handles public edge) |
| MCR | v1.2.1 | rift | Container registry |
| MCNS | v1.1.1 | rift | Authoritative DNS |
| MCDoc | v0.1.0 | rift | Documentation server |
| MCQ | v0.4.0 | rift | Document review queue |
| MCP | v0.7.6 | rift, svc | Control plane agent |