Document system account auth model in ARCHITECTURE.md

Replaces the "admin required for all operations" model with the new
three-tier identity model: human operators for CLI, mcp-agent system
account for infrastructure automation, admin reserved for MCIAS-level
administration. Documents agent-to-service token paths and per-service
authorization policies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 16:11:08 -07:00
parent 86d516acf6
commit 18365cc0a8


@@ -121,9 +121,26 @@ option for future security hardening.
 ## Authentication and Authorization
 
 MCP follows the platform authentication model: all auth is delegated to
-MCIAS.
+MCIAS. The auth model separates three concerns: operator intent (CLI to
+agent), infrastructure automation (agent to platform services), and
+access control (who can do what).
 
-### Agent Authentication
+### Identity Model
+
+| Identity | Type | Purpose |
+|----------|------|---------|
+| Human operator (e.g., `kyle`) | human | CLI operations: deploy, stop, start, build |
+| `mcp-agent` | system | Agent-to-service automation: certs, DNS, routes, image pull |
+| Per-service accounts (e.g., `mcq`) | system | Scoped self-management (own DNS records only) |
+| `admin` role | role | MCIAS account management, policy changes, zone creation |
+| `guest` role | role | Explicitly rejected by the agent |
+
+The `admin` role is reserved for MCIAS-level administrative operations
+(account creation, policy management, zone mutations). Routine MCP
+operations (deploy, stop, start, build) do not require admin — any
+authenticated non-guest user or system account is accepted.
+
+### Agent Authentication (CLI → Agent)
 
 The agent is a gRPC server with a unary interceptor that enforces
 authentication on every RPC:
@@ -132,10 +149,34 @@ authentication on every RPC:
    (`authorization: Bearer <token>`).
 2. Agent extracts the token and validates it against MCIAS (cached 30s by
    SHA-256 of the token, per platform convention).
-3. Agent checks that the caller has the `admin` role. All MCP operations
-   require admin -- there is no unprivileged MCP access.
+3. Agent rejects guests (`guest` role → `PERMISSION_DENIED`). All other
+   authenticated users and system accounts are accepted.
 4. If validation fails, the RPC returns `UNAUTHENTICATED` (invalid/expired
-   token) or `PERMISSION_DENIED` (valid token, not admin).
+   token) or `PERMISSION_DENIED` (guest).
+
+### Agent Service Authentication (Agent → Platform Services)
+
+The agent authenticates to platform services using a long-lived system
+account token (`mcp-agent`). Each service has its own token file:
+
+| Service | Token Path | Operations |
+|---------|------------|------------|
+| Metacrypt | `/srv/mcp/metacrypt-token` | TLS cert provisioning (PKI issue) |
+| MCNS | `/srv/mcp/mcns-token` | DNS record create/delete (any name) |
+| mc-proxy | Unix socket (no auth) | Route registration/removal |
+| MCR | podman auth store | Image pull (JWT-as-password) |
+
+These tokens are issued by MCIAS for the `mcp-agent` system account.
+They carry no roles — authorization is handled by each service's policy
+engine:
+
+- **Metacrypt:** Policy rule grants `mcp-agent` write access to
+  `engine/pki/issue`.
+- **MCNS:** Code-level authorization: system account `mcp-agent` can
+  manage any record; other system accounts can only manage records
+  matching their username.
+- **MCR:** Default policy allows all authenticated users to push/pull.
+  MCR accepts MCIAS JWTs as passwords at the `/v2/token` endpoint.
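The MCNS rule above is small enough to express as a single predicate. The sketch below is illustrative only: the diff does not say how a record "matches" a username, so it assumes the match is on the record's first DNS label, compared case-insensitively. `canManageRecord` and the example names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// agentAccount is the system account allowed to manage any record.
const agentAccount = "mcp-agent"

// canManageRecord reports whether a system account may create or delete
// the given DNS record. ASSUMPTION: "matching their username" is read as
// "the first label of the record name equals the account name".
func canManageRecord(account, recordName string) bool {
	if account == "" {
		return false
	}
	if account == agentAccount {
		return true
	}
	// Drop a trailing root dot, then take the first label.
	label, _, _ := strings.Cut(strings.TrimSuffix(recordName, "."), ".")
	return strings.EqualFold(label, account)
}

func main() {
	// Hypothetical record names, for illustration only.
	fmt.Println(canManageRecord("mcp-agent", "anything.svc.example.test"))
	fmt.Println(canManageRecord("mcq", "mcq.svc.example.test"))
	fmt.Println(canManageRecord("mcq", "metacrypt.svc.example.test"))
}
```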
 
 ### CLI Authentication
 
@@ -148,6 +189,15 @@ obtained by:
 
 The stored token is used for all subsequent agent RPCs until it expires.
 
+### MCR Registry Authentication
+
+`mcp build` auto-authenticates to MCR before pushing images. It reads
+the CLI's stored MCIAS token and uses it as the password for `podman
+login`. MCR's token endpoint accepts MCIAS JWTs as passwords (the
+personal-access-token pattern), so both human and system account tokens
+work. This eliminates the need for a separate interactive `podman login`
+step.
+
 ---
 
 ## Services and Components
@@ -224,6 +274,9 @@ mcp pull <service> <path> [local-file] Copy a file from /srv/<service>/<path> to
 mcp node list List registered nodes
 mcp node add <name> <address> Register a node
 mcp node remove <name> Deregister a node
+
+mcp agent upgrade [node] Build, push, and restart agent on all (or one) node(s)
+mcp agent status Show agent version on each node
 ```
 
 ### Service Definition Files
@@ -1144,20 +1197,84 @@ The agent's data directory follows the platform convention:
 
 ### Agent Deployment (on nodes)
 
-The agent is deployed like any other Metacircular service:
+#### Provisioning (one-time per node)
 
-1. Provision the `mcp` system user via NixOS config (with podman access
-   and subuid/subgid ranges for rootless containers).
+Each node needs a one-time setup before the agent can run. The steps are
+the same regardless of OS, but the mechanism differs:
+
+1. Create `mcp` system user with podman access and subuid/subgid ranges.
 2. Set `/srv/` ownership to the `mcp` user (the agent creates and manages
    `/srv/<service>/` directories for all services).
 3. Create `/srv/mcp/` directory and config file.
 4. Provision TLS certificate from Metacrypt.
 5. Create an MCIAS system account for the agent (`mcp-agent`).
-6. Install the `mcp-agent` binary.
-7. Start via systemd unit.
+6. Install the initial `mcp-agent` binary to `/srv/mcp/mcp-agent`.
+7. Install and start the systemd unit.
 
-The agent runs as a systemd service. Container-first deployment is a v2
-concern -- MCP needs to be running before it can manage its own agent.
+On **NixOS** (rift), provisioning is declarative via the NixOS config.
+The NixOS config owns the infrastructure (user, systemd unit, podman,
+directories, permissions) but **not** the binary. `ExecStart` points to
+`/srv/mcp/mcp-agent`, a mutable path that MCP manages. NixOS may
+bootstrap the initial binary there, but subsequent updates come from MCP.
+
+On **Debian** (hyperborea, svc), provisioning is done via a setup script
+or ansible playbook that creates the same layout.
+
+#### Binary Location
+
+The agent binary lives at `/srv/mcp/mcp-agent` on **all** nodes,
+regardless of OS. This unifies the update mechanism across the fleet.
+
+#### Agent Upgrades
+
+After initial provisioning, the agent binary is updated via
+`mcp agent upgrade`. The CLI:
+
+1. Cross-compiles the agent for each target architecture
+   (`GOARCH=amd64` for rift/svc, `GOARCH=arm64` for hyperborea).
+2. SSHs to each node, pushes the binary to `/srv/mcp/mcp-agent.new`.
+3. Atomically swaps the binary (`mv mcp-agent.new mcp-agent`).
+4. Restarts the systemd service (`systemctl restart mcp-agent`).
+
+SSH is used instead of gRPC because:
+
+- It works even when the agent is broken or has an incompatible version.
+- The binary is ~17MB, which exceeds gRPC default message limits.
+- No self-restart coordination needed.
+
+The CLI uses `golang.org/x/crypto/ssh` for native SSH, keeping the
+entire workflow in a single binary with no external tool dependencies.
+
+#### Node Configuration
+
+Node config includes SSH and architecture info for agent management:
+
+```toml
+[[nodes]]
+name = "rift"
+address = "100.95.252.120:9444"
+ssh = "rift" # SSH host (from ~/.ssh/config or hostname)
+arch = "amd64" # GOARCH for cross-compilation
+[[nodes]]
+name = "hyperborea"
+address = "100.x.x.x:9444"
+ssh = "hyperborea"
+arch = "arm64"
+```
+
+#### Coordinated Upgrades
+
+New MCP releases often add new RPCs. A CLI at v0.6.0 calling an agent
+at v0.5.0 fails with `Unimplemented`. Therefore agent upgrades must be
+coordinated: `mcp agent upgrade` (with no node argument) upgrades all
+nodes before the CLI is used for other operations.
+
+If a node fails to upgrade, it is reported but the others still proceed.
+The operator can retry or investigate via SSH.
+
+#### Systemd Unit
+
+The systemd unit is the same on all nodes:
 
 ```ini
 [Unit]
@@ -1167,7 +1284,7 @@ Wants=network-online.target
 
 [Service]
 Type=simple
-ExecStart=/usr/local/bin/mcp-agent server --config /srv/mcp/mcp-agent.toml
+ExecStart=/srv/mcp/mcp-agent server --config /srv/mcp/mcp-agent.toml
 Restart=on-failure
 RestartSec=5
 
@@ -1175,17 +1292,14 @@ User=mcp
 Group=mcp
 
 NoNewPrivileges=true
-ProtectSystem=strict
-ProtectHome=true
+ProtectSystem=full
+ProtectHome=false
 PrivateTmp=true
 PrivateDevices=true
 ProtectKernelTunables=true
 ProtectKernelModules=true
-ProtectControlGroups=true
 RestrictSUIDSGID=true
-RestrictNamespaces=true
 LockPersonality=true
-MemoryDenyWriteExecute=true
 RestrictRealtime=true
 ReadWritePaths=/srv
 
@@ -1195,6 +1309,7 @@ WantedBy=multi-user.target
 
 Note: `ReadWritePaths=/srv` (not `/srv/mcp`) because the agent writes
 files to any service's `/srv/<service>/` directory on behalf of the CLI.
+`ProtectHome=false` because the `mcp` user's home is `/srv/mcp`.
 
 ### CLI Installation (on operator workstation)
 