Compare commits
7 Commits
4386fb0896
...
95bec6a095
| Author | SHA1 | Date | |
|---|---|---|---|
| | 95bec6a095 | | |
| | faf58ceb72 | | |
| | bce32654e1 | | |
| | 0123e6e29a | | |
| | 86bbfa640f | | |
| | cadbb3f234 | | |
| | a777c3ff8b | | |
@@ -5,7 +5,7 @@ from its current manually-wired state to fully declarative deployment.
|
||||
It is a living design document — not a spec, not a commitment, but a
|
||||
record of where we are, where we want to be, and what's between.
|
||||
|
||||
Last updated: 2026-03-27 (Phases A + B complete)
|
||||
Last updated: 2026-03-28 (Phases A + B + C + D complete)
|
||||
|
||||
---
|
||||
|
||||
@@ -181,9 +181,9 @@ about one node, one mc-proxy, or loopback-only backends.
|
||||
#### 1. mcdsl: Proper Module Versioning — DONE
|
||||
|
||||
mcdsl is already properly versioned and released:
|
||||
- Tagged releases: `v0.1.0`, `v1.0.0`, `v1.0.1`
|
||||
- Tagged releases: `v0.1.0`, `v1.0.0`, `v1.0.1`, `v1.1.0`, `v1.2.0`
|
||||
- All consuming services import by URL with pinned versions
|
||||
(mcr, mcat, mcns, mc-proxy → `v1.0.0`; metacrypt → `v1.0.1`)
|
||||
(all consuming services on `v1.2.0`)
|
||||
- No `replace` directives anywhere
|
||||
- Docker builds use standard `go mod download`
|
||||
- `uses_mcdsl` eliminated from service definitions and docs
|
||||
@@ -215,18 +215,14 @@ routes during deploy and stop:
|
||||
- L4 routes: TLS passthrough, backend handles its own TLS
|
||||
- Hostnames default to `<service>.svc.mcp.metacircular.net`
|
||||
|
||||
#### 4. MCP Agent: TLS Cert Provisioning
|
||||
#### 4. MCP Agent: TLS Cert Provisioning — DONE
|
||||
|
||||
**Gap**: certs are manually provisioned and placed on disk. There is no
|
||||
automated issuance flow.
|
||||
|
||||
**Work**:
|
||||
- Agent requests certs from Metacrypt CA via its API.
|
||||
- Certs are stored in a standard location
|
||||
(`/srv/mc-proxy/certs/<service>.pem`).
|
||||
- Cert renewal is handled automatically before expiry.
|
||||
|
||||
**Depends on**: Metacrypt cert issuance policy (#7).
|
||||
Agent provisions TLS certificates from Metacrypt CA automatically during
deploy for L7 routes:
- ACME client library requests certs from Metacrypt CA via its API
- Certs stored in `/srv/mc-proxy/certs/<service>.pem`
- Provisioning happens during deploy before mc-proxy route registration
- L7 routes get agent-provisioned certs; L4 routes use service-managed TLS
|
||||
|
||||
#### 5. mc-proxy: Route Persistence — DONE
|
||||
|
||||
@@ -243,57 +239,49 @@ mc-proxy routes are fully persisted in SQLite and survive restarts:
|
||||
bootstrap before MCP is operational. The gRPC API and mcproxyctl
|
||||
are the primary route management interfaces going forward.
|
||||
|
||||
#### 6. MCP Agent: DNS Registration
|
||||
#### 6. MCP Agent: DNS Registration — DONE
|
||||
|
||||
**Gap**: DNS records are manually configured in MCNS zone files.
|
||||
Agent automatically manages DNS records during deploy and stop:
- Deploy: calls MCNS API to create/update A records for
  `<service>.svc.mcp.metacircular.net` pointing to the node's address.
- Stop/undeploy: removes DNS records before stopping containers.
- Config: `[mcns]` section in agent config with server URL, CA cert,
  token path, zone, and node address.
- Nil-safe: if MCNS is not configured, DNS registration is silently
  skipped (backward compatible).
- Authorization: mcp-agent system account can manage any record name.
|
||||
|
||||
**Work**:
|
||||
- Agent creates/updates A records in MCNS for
|
||||
`<service>.svc.mcp.metacircular.net`.
|
||||
- Agent removes records on service teardown.
|
||||
#### 7. Metacrypt: Automated Cert Issuance Policy — DONE
|
||||
|
||||
**Depends on**: MCNS record management API (#8).
|
||||
MCP agent has MCIAS credentials and Metacrypt policy for automated cert
issuance:
- MCP agent authenticates to Metacrypt with MCIAS service credentials
- Metacrypt policy allows cert issuance for
  `*.svc.mcp.metacircular.net`
- One cert per hostname per service; no wildcard certs
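
The scope rule above (names under `svc.mcp.metacircular.net`, one hostname per cert, no wildcards) can be sketched as a small check. This is a minimal illustration, not Metacrypt's policy engine: the function name `issuable` and the single-label restriction are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedSuffix is the zone named in the text; everything else here is hypothetical.
const allowedSuffix = ".svc.mcp.metacircular.net"

// issuable reports whether a cert request for one hostname fits the stated
// scope: a concrete name directly under the zone, never a wildcard.
func issuable(hostname string) bool {
	h := strings.ToLower(strings.TrimSuffix(hostname, "."))
	if strings.Contains(h, "*") {
		return false // wildcard certs are never issued
	}
	if !strings.HasSuffix(h, allowedSuffix) {
		return false // outside the allowed zone
	}
	label := strings.TrimSuffix(h, allowedSuffix)
	// Assumption: exactly one label in front of the zone suffix.
	return label != "" && !strings.Contains(label, ".")
}

func main() {
	for _, h := range []string{
		"mcq.svc.mcp.metacircular.net",
		"*.svc.mcp.metacircular.net",
		"example.com",
	} {
		fmt.Println(h, issuable(h))
	}
}
```

The policy pattern `*.svc.mcp.metacircular.net` describes which names may be requested; the check rejects a literal `*` in the requested hostname so the pattern itself is never issued as a certificate.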
|
||||
|
||||
#### 7. Metacrypt: Automated Cert Issuance Policy
|
||||
#### 8. MCNS: Record Management API — DONE
|
||||
|
||||
**Gap**: no policy exists for automated cert issuance. The MCP agent
|
||||
doesn't have a Metacrypt identity or permissions.
|
||||
|
||||
**Work**:
|
||||
- MCP agent gets an MCIAS service account.
|
||||
- Metacrypt policy allows this account to issue certs scoped to
|
||||
`*.svc.mcp.metacircular.net` (and explicitly listed public
|
||||
hostnames).
|
||||
- No wildcard certs — one cert per hostname per service.
|
||||
|
||||
**Depends on**: MCIAS service account provisioning (exists today, just
|
||||
needs the account created).
|
||||
|
||||
#### 8. MCNS: Record Management API
|
||||
|
||||
**Gap**: MCNS v1.0.0 has REST + gRPC APIs and SQLite storage, but
|
||||
records are currently seeded from migrations (static). The API supports
|
||||
CRUD operations but MCP does not yet call it for dynamic registration.
|
||||
|
||||
**Work**:
|
||||
- MCP agent calls MCNS API to create/update/delete records on
|
||||
deploy/stop.
|
||||
- MCIAS auth scoping to allow MCP agent to manage
|
||||
`*.svc.mcp.metacircular.net` records.
|
||||
|
||||
**Depends on**: MCNS API exists. Remaining work is MCP integration
|
||||
and auth scoping.
|
||||
MCNS provides full CRUD for DNS records via REST and gRPC:
- REST: POST/GET/PUT/DELETE on `/v1/zones/{zone}/records`
- gRPC: RecordService with ListRecords, CreateRecord, GetRecord,
  UpdateRecord, DeleteRecord RPCs
- SQLite-backed with transactional writes, CNAME exclusivity enforcement,
  and automatic SOA serial bumping on mutations
- Authorization: admin can manage any record, mcp-agent system account
  can manage any record name, other system accounts scoped to own name
- MCP agent uses the REST API to register/deregister records on
  deploy/stop
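
As a rough sketch of what a record-creation call against the REST path above might look like from Go: only the `/v1/zones/{zone}/records` path and the POST method come from the text. The JSON field names, the bearer-token header, the base URL, and the helper `newCreateRecordRequest` are all assumptions for illustration; the real MCNS payload may differ.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// record is a hypothetical payload shape; the MCNS schema is not shown in the text.
type record struct {
	Name  string `json:"name"`
	Type  string `json:"type"`
	Value string `json:"value"`
}

// newCreateRecordRequest builds (but does not send) a POST to
// /v1/zones/{zone}/records with a JSON body and a bearer token.
func newCreateRecordRequest(baseURL, zone, token string, rec record) (*http.Request, error) {
	body, err := json.Marshal(rec)
	if err != nil {
		return nil, err
	}
	endpoint := baseURL + "/v1/zones/" + url.PathEscape(zone) + "/records"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+token) // assumed auth scheme
	return req, nil
}

func main() {
	rec := record{Name: "myservice", Type: "A", Value: "192.0.2.10"}
	req, err := newCreateRecordRequest("https://mcns.example", "svc.mcp.metacircular.net", "TOKEN", rec)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```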
|
||||
|
||||
#### 9. Application $PORT Convention — DONE
|
||||
|
||||
mcdsl v1.1.0 adds `$PORT` and `$PORT_GRPC` env var support:
|
||||
mcdsl v1.2.0 added `$PORT` and `$PORT_GRPC` env var support:
|
||||
- `config.Load` checks `$PORT` → overrides `Server.ListenAddr`
|
||||
- `config.Load` checks `$PORT_GRPC` → overrides `Server.GRPCAddr`
|
||||
- Takes precedence over TOML and generic env overrides
|
||||
(`$MCR_SERVER_LISTEN_ADDR`) — agent-assigned ports are authoritative
|
||||
- Handles both `config.Base` embedding (MCR, MCNS, MCAT) and direct
|
||||
`ServerConfig` embedding (Metacrypt) via struct tree walking
|
||||
- MCR, Metacrypt, MCNS upgraded to mcdsl v1.1.0
|
||||
- All consuming services on mcdsl v1.4.0
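
The precedence described in this section (`$PORT` over the generic env override over TOML) can be sketched as follows. This is not the mcdsl implementation: `resolveListenAddr` is a hypothetical helper, and keeping the configured host while swapping only the port is an assumption about how the override is applied.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// resolveListenAddr applies the precedence from the text: an agent-assigned
// port wins over a generic env override, which wins over the TOML value.
func resolveListenAddr(tomlAddr, envAddr, port string) string {
	addr := tomlAddr
	if envAddr != "" {
		addr = envAddr // generic override, e.g. $MCR_SERVER_LISTEN_ADDR
	}
	if port == "" {
		return addr // no agent-assigned port: keep what we have
	}
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		host = "" // unparseable address: fall back to all interfaces
	}
	return net.JoinHostPort(host, port)
}

func main() {
	addr := resolveListenAddr(":8443", os.Getenv("MCR_SERVER_LISTEN_ADDR"), os.Getenv("PORT"))
	fmt.Println("listening on", addr)
}
```

The same shape would apply to `$PORT_GRPC` and `Server.GRPCAddr`.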
|
||||
|
||||
---
|
||||
|
||||
@@ -311,33 +299,90 @@ Phase A — Independent groundwork: ✓ COMPLETE
|
||||
Phase B — MCP route registration: ✓ COMPLETE
|
||||
#3 Agent registers routes with mc-proxy ✓ DONE
|
||||
|
||||
Phase C — Automated TLS:
|
||||
#7 Metacrypt cert issuance policy
|
||||
#4 Agent provisions certs
|
||||
Phase C — Automated TLS: ✓ COMPLETE
|
||||
#7 Metacrypt cert issuance policy ✓ DONE
|
||||
#4 Agent provisions certs ✓ DONE
|
||||
(depends on #7)
|
||||
|
||||
Phase D — DNS:
|
||||
#8 MCNS record management API
|
||||
#6 Agent registers DNS
|
||||
Phase D — DNS: ✓ COMPLETE
|
||||
#8 MCNS record management API ✓ DONE
|
||||
#6 Agent registers DNS ✓ DONE
|
||||
(depends on #8)
|
||||
|
||||
Phase E — Multi-node agent management:
|
||||
#10 Agent binary at /srv/mcp/mcp-agent on all nodes
|
||||
#11 mcp agent upgrade (SSH-based cross-compiled push)
|
||||
#12 Node provisioning tooling (Debian + NixOS)
|
||||
(depends on #10)
|
||||
```
|
||||
|
||||
**Phases A and B are complete.** Services can be deployed with
|
||||
agent-assigned ports, `$PORT` env vars, and automatic mc-proxy route
|
||||
registration. No more manual port picking, mcproxyctl, or TOML editing.
|
||||
|
||||
The remaining manual steps are TLS cert provisioning (Phase C) and
|
||||
DNS registration (Phase D).
|
||||
**Phases A, B, C, and D are complete.** Services can be deployed with
|
||||
agent-assigned ports, `$PORT` env vars, automatic mc-proxy route
|
||||
registration, automated TLS cert provisioning from Metacrypt CA, and
|
||||
automatic DNS registration in MCNS. No more manual port picking,
|
||||
mcproxyctl, TOML editing, cert generation, or DNS zone editing.
|
||||
|
||||
### Immediate Next Steps
|
||||
|
||||
1. **Phase C: Automated TLS** — Metacrypt cert issuance policy for MCP
|
||||
agent, then agent provisions certs automatically during deploy.
|
||||
2. **Phase D: DNS** — MCNS record management API integration, then
|
||||
agent registers DNS records during deploy.
|
||||
3. **mcdoc implementation** — fully designed, no platform evolution
|
||||
1. **Phase E: Multi-node agent management** — see below.
|
||||
2. **mcdoc implementation** — fully designed, no platform evolution
|
||||
dependency. Deployable now with the new route system.
|
||||
|
||||
#### 10. Agent Binary Location Convention

**Gap**: The agent binary is currently NixOS-managed on rift (lives in
`/nix/store/`, systemd `ExecStart` points there). This doesn't work for
Debian nodes and requires a full `nixos-rebuild` for every MCP release.

**Work**:
- Standardize agent binary at `/srv/mcp/mcp-agent` on all nodes.
- NixOS config: change `ExecStart` from nix store path to
  `/srv/mcp/mcp-agent`. NixOS still owns user, systemd unit, podman,
  directories, just not the binary version.
- Debian nodes: same layout, provisioned by setup script.
|
||||
|
||||
#### 11. Agent Upgrade via SSH Push

**Gap**: Updating the agent requires manual, OS-specific steps. On
NixOS: update flake lock, commit, push, rebuild. On Debian: build, scp,
restart. With multiple nodes and architectures (amd64 + arm64), this
doesn't scale.

**Work**:
- `mcp agent upgrade [node]` CLI command.
- Cross-compiles agent for each target arch (`GOARCH` from node config).
- Uses `golang.org/x/crypto/ssh` to push the binary and restart the
  service. No external tool dependencies.
- Node config gains `ssh` (hostname) and `arch` (GOARCH) fields.
- Upgrades all nodes by default to prevent version skew. New RPCs cause
  `Unimplemented` errors if agent and CLI are out of sync.

**Depends on**: #10 (binary location convention).
|
||||
|
||||
#### 12. Node Provisioning Tooling

**Gap**: Setting up a new node requires manual steps: create user,
create directories, install podman, write config, create systemd unit.
The steps differ between NixOS and Debian.

**Work**:
- Go-based provisioning tool (part of MCP CLI) or standalone script.
- `mcp node provision <name>` SSHs to the node and runs setup:
  create `mcp` user with podman access, create `/srv/mcp/`, write
  systemd unit, install initial binary, start service.
- For NixOS, provisioning remains in the NixOS config (declarative).
  The provisioning tool targets Debian/generic Linux.

**Depends on**: #10 (binary location convention), #11 (SSH infra).

**Current fleet**:

| Node | OS | Arch | Status |
|------|----|------|--------|
| rift | NixOS | amd64 | Operational, single MCP agent |
| hyperborea | Debian (RPi) | arm64 | Online, needs agent provisioning |
| svc | Debian | amd64 | Runs MCIAS, needs agent for public edge services |
|
||||
|
||||
---
|
||||
|
||||
## Open Questions
|
||||
|
||||
80
STATUS.md
@@ -1,6 +1,6 @@
|
||||
# Metacircular Platform Status
|
||||
|
||||
Last updated: 2026-03-26
|
||||
Last updated: 2026-03-28
|
||||
|
||||
## Platform Overview
|
||||
|
||||
@@ -8,27 +8,30 @@ One node operational (**rift**), running core infrastructure services as
|
||||
containers fronted by MC-Proxy. MCIAS runs separately (not on rift).
|
||||
Bootstrap phases 0–4 complete (MCIAS, Metacrypt, MC-Proxy, MCR all
|
||||
operational). MCP is deployed and managing all platform containers. MCNS is
|
||||
deployed on rift, serving authoritative DNS.
|
||||
deployed on rift, serving authoritative DNS. Platform evolution Phases A–D
|
||||
complete (automated port assignment, route registration, TLS cert
|
||||
provisioning, and DNS registration). Multi-node deployment is being planned
|
||||
(Phase E).
|
||||
|
||||
## Service Status
|
||||
|
||||
| Service | Version | SDLC Phase | Deployed | Node |
|
||||
|---------|---------|------------|----------|------|
|
||||
| MCIAS | v1.8.0 | Maintenance | Yes | (separate) |
|
||||
| Metacrypt | v1.1.0 | Production | Yes | rift |
|
||||
| MC-Proxy | v1.1.0 | Maintenance | Yes | rift |
|
||||
| MCR | v1.2.0 | Production | Yes | rift |
|
||||
| MCAT | v1.1.0 | Complete | Unknown | — |
|
||||
| MCDSL | v1.2.0 | Stable | N/A (library) | — |
|
||||
| MCNS | v1.1.0 | Production | Yes | rift |
|
||||
| MCP | v0.3.0 | Production | Yes | rift |
|
||||
| MCDeploy | v0.2.0 | Active dev | N/A (CLI tool) | — |
|
||||
| MCIAS | v1.9.0 | Maintenance | Yes | (separate) |
|
||||
| Metacrypt | v1.3.1 | Production | Yes | rift |
|
||||
| MC-Proxy | v1.2.1 | Maintenance | Yes | rift |
|
||||
| MCR | v1.2.1 | Production | Yes | rift |
|
||||
| MCAT | v1.1.1 | Complete | Unknown | — |
|
||||
| MCDSL | v1.4.0 | Stable | N/A (library) | — |
|
||||
| MCNS | v1.1.1 | Production | Yes | rift |
|
||||
| MCP | v0.7.6 | Production | Yes | rift |
|
||||
| MCDoc | v0.1.0 | Active dev | No | — |
|
||||
|
||||
## Service Details
|
||||
|
||||
### MCIAS — Identity and Access Service
|
||||
|
||||
- **Version:** v1.8.0 (client library: clients/go/v0.2.0)
|
||||
- **Version:** v1.9.0 (client library: clients/go/v0.2.0)
|
||||
- **Phase:** Maintenance. Phases 0-14 complete. Feature-complete with active
|
||||
refinement.
|
||||
- **Deployment:** Running in production. All other services authenticate
|
||||
@@ -40,7 +43,7 @@ deployed on rift, serving authoritative DNS.
|
||||
|
||||
### Metacrypt — Cryptographic Service Engine
|
||||
|
||||
- **Version:** v1.1.0.
|
||||
- **Version:** v1.3.1.
|
||||
- **Phase:** Production. All four engine types implemented (CA, SSH CA, transit,
|
||||
user-to-user). Active work on integration test coverage.
|
||||
- **Deployment:** Running on rift as a container, fronted by MC-Proxy on
|
||||
@@ -52,10 +55,11 @@ deployed on rift, serving authoritative DNS.
|
||||
|
||||
### MC-Proxy — TLS Proxy and Router
|
||||
|
||||
- **Version:** v1.1.0. Phases 1-8 complete.
|
||||
- **Version:** v1.2.1.
|
||||
- **Phase:** Maintenance. Stable and actively routing traffic on rift.
|
||||
- **Deployment:** Running on rift. Fronts Metacrypt, MCR, and sgard on ports
|
||||
443, 8443, and 9443. Prometheus metrics on 127.0.0.1:9091.
|
||||
443, 8443, and 9443. Prometheus metrics on 127.0.0.1:9091. Routes persisted
|
||||
in SQLite and managed via gRPC API.
|
||||
- **Recent work:** MCR route additions, Nix flake, L7 backend cert handling,
|
||||
Prometheus metrics, L7 policies.
|
||||
- **Artifacts:** systemd units (service + backup timer), Docker Compose
|
||||
@@ -63,7 +67,7 @@ deployed on rift, serving authoritative DNS.
|
||||
|
||||
### MCR — Container Registry
|
||||
|
||||
- **Version:** v1.2.0. All implementation phases complete.
|
||||
- **Version:** v1.2.1. All implementation phases complete.
|
||||
- **Phase:** Production. Deployed on rift, serving container images.
|
||||
- **Deployment:** Running on rift as two containers (mcr API + mcr-web),
|
||||
fronted by MC-Proxy on ports 443 (web, L7), 8443 (API, L4), and
|
||||
@@ -76,7 +80,7 @@ deployed on rift, serving authoritative DNS.
|
||||
|
||||
### MCAT — Login Policy Tester
|
||||
|
||||
- **Version:** v1.1.0.
|
||||
- **Version:** v1.1.1.
|
||||
- **Phase:** Complete. Diagnostic tool, not core infrastructure.
|
||||
- **Deployment:** Available for ad-hoc use. Lightweight tool for testing
|
||||
MCIAS login policy rules.
|
||||
@@ -85,20 +89,21 @@ deployed on rift, serving authoritative DNS.
|
||||
|
||||
### MCDSL — Standard Library
|
||||
|
||||
- **Version:** v1.2.0.
|
||||
- **Version:** v1.4.0.
|
||||
- **Phase:** Stable. All 9 packages implemented and tested. Being adopted
|
||||
across the platform.
|
||||
- **Deployment:** N/A (Go library, imported by other services).
|
||||
- **Packages:** auth, db, config, httpserver, grpcserver, csrf, web, health,
|
||||
archive.
|
||||
- **Adoption:** All services except mcias on v1.2.0. mcias pending.
|
||||
- **Adoption:** All services except mcias on v1.4.0. mcias pending.
|
||||
|
||||
### MCNS — Networking Service
|
||||
|
||||
- **Version:** v1.1.0.
|
||||
- **Version:** v1.1.1.
|
||||
- **Phase:** Production. Custom Go DNS server replacing CoreDNS precursor.
|
||||
- **Deployment:** Running on rift as a container managed by MCP. Serves two
|
||||
authoritative zones plus upstream forwarding.
|
||||
authoritative zones plus upstream forwarding. REST + gRPC APIs with MCIAS
|
||||
auth and name-scoped system account authorization.
|
||||
- **Recent work:** v1.0.0 implementation (custom Go DNS server), engineering
|
||||
review, deployed to rift replacing CoreDNS.
|
||||
- **Artifacts:** Dockerfile, Docker Compose (rift), MCP service definition,
|
||||
@@ -106,29 +111,28 @@ deployed on rift, serving authoritative DNS.
|
||||
|
||||
### MCP — Control Plane
|
||||
|
||||
- **Version:** v0.3.0.
|
||||
- **Phase:** Production. Phases 0-4 complete. Deployed to rift, managing all
|
||||
platform containers.
|
||||
- **Version:** v0.7.6.
|
||||
- **Phase:** Production. Phases A–D complete (automated port assignment, route
|
||||
registration, TLS cert provisioning, DNS registration).
|
||||
- **Deployment:** Running on rift. Agent as systemd service under `mcp` user
|
||||
with rootless podman. Manages metacrypt, mc-proxy, mcr, and mcns containers.
|
||||
- **Architecture:** Two components — `mcp` CLI (thin client on vade) and
|
||||
`mcp-agent` (per-node daemon with SQLite registry, podman management,
|
||||
monitoring with drift/flap detection). gRPC-only (no REST).
|
||||
- **Recent work:** Full v1 implementation (12 RPCs, 15 CLI commands),
|
||||
deployment to rift, container migration from kyle→mcp user, service
|
||||
definition authoring.
|
||||
monitoring with drift/flap detection). gRPC-only (no REST). 15 RPCs, 17+
|
||||
CLI commands.
|
||||
- **Recent work:** Phase C (automated TLS cert provisioning via Metacrypt CA),
|
||||
Phase D (automated DNS registration via MCNS), undeploy command, logs
|
||||
command, edit command, auto-login to MCR, system account auth model.
|
||||
- **Artifacts:** systemd service (NixOS), TLS cert from Metacrypt, service
|
||||
definition files, design docs.
|
||||
|
||||
### MCDeploy — Deployment CLI
|
||||
### MCDoc — Documentation Server
|
||||
|
||||
- **Version:** v0.2.0.
|
||||
- **Phase:** Active development. Tactical bridge tool for deploying services
|
||||
while MCP is being built.
|
||||
- **Deployment:** N/A (local CLI tool, not a server).
|
||||
- **Recent work:** Initial implementation, Nix flake.
|
||||
- **Description:** Single-binary CLI that shells out to podman/ssh/scp/git
|
||||
for build, push, deploy, cert renewal, and status. TOML-configured.
|
||||
- **Version:** v0.1.0.
|
||||
- **Phase:** Active development.
|
||||
- **Deployment:** Not yet deployed.
|
||||
- **Description:** Documentation server — fetches markdown from Gitea, renders
|
||||
HTML, serves public docs via mc-proxy. No MCIAS auth required.
|
||||
|
||||
## Node Inventory
|
||||
|
||||
@@ -138,6 +142,10 @@ deployed on rift, serving authoritative DNS.
|
||||
|
||||
## Rift Port Map
|
||||
|
||||
Note: Services deployed via MCP receive dynamically assigned host ports
(10000–60000). The ports below are for infrastructure services with static
assignments.
|
||||
|
||||
| Port | Protocol | Services |
|
||||
|------|----------|----------|
|
||||
| 53 | DNS (LAN + Tailscale) | mcns |
|
||||
|
||||
@@ -183,14 +183,19 @@ delegates authentication to it; no service maintains its own user database.
|
||||
Services validate tokens by calling back to MCIAS (cached 30s by SHA-256 of
|
||||
the token).
|
||||
|
||||
- **Role-based access.** Three roles — `admin` (full access, policy bypass),
|
||||
`user` (policy-governed), `guest` (service-dependent restrictions). Admin
|
||||
detection comes solely from the MCIAS `admin` role; services never promote
|
||||
users locally.
|
||||
- **Role-based access.** Three roles — `admin` (MCIAS account management,
|
||||
policy changes, zone mutations — reserved for human operators), `user`
|
||||
(policy-governed), `guest` (service-dependent restrictions, rejected by MCP
|
||||
agent). Admin detection comes solely from the MCIAS `admin` role; services
|
||||
never promote users locally. Routine operations (deploy, push, DNS updates)
|
||||
do not require admin.
|
||||
|
||||
- **Account types.** Human accounts (interactive users) and system accounts
|
||||
(service-to-service). Both authenticate the same way; system accounts enable
|
||||
automated workflows.
|
||||
(service-to-service). Both produce standard JWTs validated the same way.
|
||||
System accounts carry no roles — their authorization is handled by each
|
||||
service's policy engine (Metacrypt policies, MCNS name-scoped access, MCR
|
||||
default policies). System account tokens are long-lived (365-day default)
|
||||
and do not require passwords for issuance.
|
||||
|
||||
- **Login policy.** Priority-based ACL rules that control who can log into
|
||||
which services. Rules can target roles, account types, service names, and
|
||||
@@ -208,7 +213,7 @@ MCIAS evaluates login policy against the service context, verifies credentials,
|
||||
and returns a bearer token. The MCIAS Go client library
|
||||
(`git.wntrmute.dev/mc/mcias/clients/go`) handles this flow.
|
||||
|
||||
**Status:** Implemented. v1.8.0. Feature-complete with active refinement
|
||||
**Status:** Implemented. v1.9.0. Feature-complete with active refinement
|
||||
(WebAuthn/FIDO2 passkeys, TOTP 2FA, service-context login policies).
|
||||
|
||||
---
|
||||
@@ -259,7 +264,7 @@ core.
|
||||
operations on which engine mounts. Priority-based evaluation, default deny,
|
||||
admin bypass. See Metacrypt's `POLICY.md` for the full model.
|
||||
|
||||
**Status:** Implemented. v1.1.0. All four engine types complete — CA (with ACME
|
||||
**Status:** Implemented. v1.3.1. All four engine types complete — CA (with ACME
|
||||
support), SSH CA, transit encryption, and user-to-user encryption.
|
||||
|
||||
---
|
||||
@@ -278,7 +283,9 @@ serves the container images that MCP deploys across the platform.
|
||||
- **Authenticated access.** No anonymous access. MCR uses the OCI token
|
||||
authentication flow: clients hit `/v2/`, receive a 401 with a token
|
||||
endpoint, authenticate via MCIAS, and use the returned JWT for subsequent
|
||||
requests.
|
||||
requests. The token endpoint accepts both username/password (standard
|
||||
login) and pre-existing MCIAS JWTs as passwords (personal-access-token
|
||||
pattern), enabling non-interactive push/pull for system accounts and CI.
|
||||
|
||||
- **Policy-controlled push/pull.** Fine-grained ACL rules govern who can push
|
||||
to or pull from which repositories. Integrated with MCIAS roles.
|
||||
@@ -290,7 +297,7 @@ serves the container images that MCP deploys across the platform.
|
||||
is scheduled, MCP tells the node's agent which image to pull and where to get
|
||||
it. MCR sits behind an MC-Proxy instance for TLS routing.
|
||||
|
||||
**Status:** Implemented. v1.2.0. All implementation phases complete.
|
||||
**Status:** Implemented. v1.2.1. All implementation phases complete.
|
||||
|
||||
---
|
||||
|
||||
@@ -371,9 +378,13 @@ into DNS records.
|
||||
using internal DNS names automatically resolve to the right place without
|
||||
config changes.
|
||||
|
||||
- **Record management API.** Authenticated via MCIAS. MCP is the primary
|
||||
consumer for dynamic updates. Operators can also manage records directly
|
||||
for static entries (node addresses, aliases).
|
||||
- **Record management API.** Authenticated via MCIAS with name-scoped
  authorization. Admin can manage all records and zones. The `mcp-agent`
  system account can create and delete any record. Other system accounts
  can only manage records matching their own name (e.g., system account
  `mcq` can manage `mcq.svc.mcp.metacircular.net` but not other records).
  Human users have read-only access to records. Zone mutations (create,
  update, delete zones) remain admin-only.
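
The name-scoped rules in this bullet can be sketched as a single decision function. The `account` type and `canMutateRecord` are hypothetical names, and comparing against a fully qualified record name is an assumption; MCNS may represent record names differently.

```go
package main

import (
	"fmt"
	"strings"
)

// account is a hypothetical view of an authenticated MCIAS identity.
type account struct {
	Name   string
	System bool // system (service-to-service) account
	Admin  bool // holds the MCIAS admin role
}

// canMutateRecord applies the record rules from the text: admin may change
// anything, mcp-agent may change any record, other system accounts only their
// own name, and non-admin human users are read-only.
func canMutateRecord(a account, recordName, zone string) bool {
	switch {
	case a.Admin:
		return true
	case !a.System:
		return false // human users have read-only access
	case a.Name == "mcp-agent":
		return true
	default:
		own := a.Name + "." + zone
		return strings.EqualFold(strings.TrimSuffix(recordName, "."), own)
	}
}

func main() {
	zone := "svc.mcp.metacircular.net"
	mcq := account{Name: "mcq", System: true}
	fmt.Println(canMutateRecord(mcq, "mcq."+zone, zone))
	fmt.Println(canMutateRecord(mcq, "mcr."+zone, zone))
}
```

Zone mutations are deliberately left out of the sketch; per the text they stay admin-only.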
|
||||
|
||||
**How it fits in:** MCNS answers "what is the address of X?" MCP answers "where
|
||||
is service α running?" and pushes the answer to MCNS. This separation means
|
||||
@@ -381,10 +392,11 @@ services can use stable DNS names in their configs (e.g.,
|
||||
`mcias.svc.mcp.metacircular.net` in `[mcias] server_url`) that survive
|
||||
migration without config changes.
|
||||
|
||||
**Status:** Implemented. v1.1.0. Custom Go DNS server deployed on rift,
|
||||
**Status:** Implemented. v1.1.1. Custom Go DNS server deployed on rift,
|
||||
serving two authoritative zones (`svc.mcp.metacircular.net` and
|
||||
`mcp.metacircular.net`) plus upstream forwarding. REST + gRPC APIs with
|
||||
MCIAS auth. Records stored in SQLite.
|
||||
MCIAS auth and name-scoped system account authorization. Records stored
|
||||
in SQLite.
|
||||
|
||||
---
|
||||
|
||||
@@ -409,6 +421,10 @@ each managed node.
|
||||
the initial config, pulls the image from MCR, starts the container, and
|
||||
pushes a DNS update to MCNS (`α.svc.mcp.metacircular.net` → node address).
|
||||
|
||||
- **Undeploy.** Full teardown of a service. Stops the container, removes
  MC-Proxy routes, deletes DNS records from MCNS, and cleans up the service
  registry entry. The inverse of deploy.
|
||||
|
||||
- **Migrate.** Move a service from one node to another. MCP snapshots the
|
||||
service's `/srv/<service>/` directory on the source node (as a tar.zst
|
||||
image), transfers it to the destination, extracts it, starts the service,
|
||||
@@ -435,9 +451,17 @@ each managed node.
|
||||
- **Master/agent architecture.** MCP Master runs on the operator's machine.
|
||||
Agents run on every managed node, receiving C2 (command and control) from
|
||||
Master, reporting node status, and managing local workloads. The C2 channel
|
||||
is authenticated via MCIAS. The master does not need to be always-on —
|
||||
agents keep running their workloads independently; the master is needed only
|
||||
to issue new commands.
|
||||
is authenticated via MCIAS — any authenticated non-guest user or system
|
||||
account is accepted (admin role is not required for deploy operations).
|
||||
The master does not need to be always-on — agents keep running their
|
||||
workloads independently; the master is needed only to issue new commands.
|
||||
|
||||
- **System account automation.** The agent uses an `mcp-agent` system account
  for all service-to-service communication: TLS cert provisioning (Metacrypt),
  DNS record management (MCNS), and container image pulls (MCR). Each service
  authorizes the agent through its own policy engine. Per-service system
  accounts (e.g., `mcq`) can be created for scoped self-management: a service
  account can only manage its own DNS records, not other services'.
|
||||
|
||||
- **Node management.** Track which nodes are in the platform, their health,
|
||||
available resources, and running workloads.
|
||||
@@ -458,14 +482,15 @@ services it depends on.
|
||||
can deploy them. The systemd unit files exist as a fallback and for bootstrap —
|
||||
the long-term deployment model is MCP-managed containers.
|
||||
|
||||
**Status:** Implemented. v0.4.0. Deployed on rift managing all platform
|
||||
**Status:** Implemented. v0.7.6. Deployed on rift managing all platform
|
||||
containers. Route declarations with automatic port allocation (`$PORT` /
|
||||
`$PORT_<NAME>` env vars passed to containers). MC-Proxy route registration
|
||||
during deploy and stop. Automated TLS cert provisioning for L7 routes via
|
||||
Metacrypt CA (Phase C). Two components — `mcp` CLI (operator workstation) and
|
||||
Metacrypt CA (Phase C). Automated DNS registration in MCNS during deploy
|
||||
and stop (Phase D). Two components — `mcp` CLI (operator workstation) and
|
||||
`mcp-agent` (per-node daemon with SQLite registry, rootless Podman,
|
||||
monitoring with drift/flap detection). gRPC-only (no REST). 12+ RPCs,
|
||||
15+ CLI commands.
|
||||
monitoring with drift/flap detection). gRPC-only (no REST). 15 RPCs,
|
||||
17+ CLI commands.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -123,18 +123,38 @@ Service definitions are TOML files that tell MCP what to deploy. They
|
||||
live at `~/.config/mcp/services/<service>.toml` on the operator
|
||||
workstation.
|
||||
|
||||
### Minimal Example (Single Component)
|
||||
### Minimal Example (Single Component, L7)
|
||||
|
||||
```toml
name = "myservice"
node = "rift"

[build.images]
myservice = "Dockerfile"

[[components]]
name = "web"
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"

[[components.routes]]
port = 8443
mode = "l7"
```
|
||||
|
||||
### API Service Example (L4, Multiple Routes)
|
||||
|
||||
```toml
|
||||
name = "myservice"
|
||||
node = "rift"
|
||||
version = "v1.0.0"
|
||||
|
||||
[build.images]
|
||||
myservice = "Dockerfile"
|
||||
|
||||
[[components]]
|
||||
name = "api"
|
||||
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"
|
||||
volumes = ["/srv/myservice:/srv/myservice"]
|
||||
cmd = ["server", "--config", "/srv/myservice/myservice.toml"]
|
||||
|
||||
[[components.routes]]
|
||||
name = "rest"
|
||||
@@ -152,7 +172,6 @@ mode = "l4"
|
||||
```toml
|
||||
name = "myservice"
|
||||
node = "rift"
|
||||
version = "v1.0.0"
|
||||
|
||||
[build.images]
|
||||
myservice = "Dockerfile.api"
|
||||
@@ -160,6 +179,7 @@ myservice-web = "Dockerfile.web"
|
||||
|
||||
[[components]]
|
||||
name = "api"
|
||||
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"
|
||||
volumes = ["/srv/myservice:/srv/myservice"]
|
||||
cmd = ["server", "--config", "/srv/myservice/myservice.toml"]
|
||||
|
||||
@@ -175,6 +195,7 @@ mode = "l4"
|
||||
|
||||
[[components]]
|
||||
name = "web"
|
||||
image = "mcr.svc.mcp.metacircular.net:8443/myservice-web:v1.0.0"
|
||||
volumes = ["/srv/myservice:/srv/myservice"]
|
||||
cmd = ["server", "--config", "/srv/myservice/myservice.toml"]
|
||||
|
||||
@@ -183,21 +204,16 @@ port = 443
|
||||
mode = "l7"
|
||||
```
|
||||
|
||||
### Convention-Derived Defaults
|
||||
### Conventions
|
||||
|
||||
Most fields are optional — MCP derives them from conventions:
|
||||
A few fields are derived by the agent at deploy time:
|
||||
|
||||
| Field | Default | Override when... |
|
||||
|-------|---------|------------------|
|
||||
| Image name | `<service>` (api), `<service>-<component>` (others) | Image name differs from convention |
|
||||
| Image registry | `mcr.svc.mcp.metacircular.net:8443` (from global MCP config) | Never — always use MCR |
|
||||
| Version | Service-level `version` field | A component needs a different version |
|
||||
| Volumes | `/srv/<service>:/srv/<service>` | Additional mounts are needed |
|
||||
| Network | `mcpnet` | Service needs host networking or a different network |
|
||||
| User | `0:0` | Never change this for standard services |
|
||||
| Restart | `unless-stopped` | Service should not auto-restart |
|
||||
| Source path | `<service>` relative to workspace root | Directory name differs from service name |
|
||||
| Hostname | `<service>.svc.mcp.metacircular.net` | Service needs a public hostname |
|
||||
| Source path | `<service>` relative to workspace root | Directory name differs from service name (use `path`) |
|
||||
| Hostname | `<service>.svc.mcp.metacircular.net` | Service needs a public hostname (use route `hostname`) |
|
||||
|
||||
All other fields must be explicit in the service definition.
|
||||
|
||||
### Service Definition Reference
|
||||
|
||||
@@ -207,7 +223,6 @@ Most fields are optional — MCP derives them from conventions:
|
||||
|-------|----------|---------|
|
||||
| `name` | Yes | Service name (matches project name) |
|
||||
| `node` | Yes | Target node to deploy to |
|
||||
| `version` | Yes | Image version tag (semver, e.g. `v1.0.0`) |
|
||||
| `active` | No | Whether MCP keeps this running (default: `true`) |
|
||||
| `path` | No | Source directory relative to workspace (default: `name`) |
|
||||
|
||||
@@ -215,20 +230,20 @@ Most fields are optional — MCP derives them from conventions:
|
||||
|
||||
| Field | Purpose |
|
||||
|-------|---------|
|
||||
| `build.images.<name>` | Maps image name to Dockerfile path |
|
||||
| `build.images.<name>` | Maps build image name to Dockerfile path. The `<name>` must match the repository name in a component's `image` field (the part after the last `/`, before the `:` tag). |
|
||||
|
||||
**Component fields:**
|
||||
|
||||
| Field | Purpose |
|
||||
|-------|---------|
|
||||
| `name` | Component name (e.g. `api`, `web`) |
|
||||
| `image` | Full image reference override |
|
||||
| `version` | Version override for this component |
|
||||
| `volumes` | Volume mounts (list of `host:container` strings) |
|
||||
| `cmd` | Command override (list of strings) |
|
||||
| `network` | Container network override |
|
||||
| `user` | Container user override |
|
||||
| `restart` | Restart policy override |
|
||||
| Field | Required | Purpose |
|
||||
|-------|----------|---------|
|
||||
| `name` | Yes | Component name (e.g. `api`, `web`) |
|
||||
| `image` | Yes | Full image reference (e.g. `mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0`) |
|
||||
| `volumes` | No | Volume mounts (list of `host:container` strings) |
|
||||
| `cmd` | No | Command override (list of strings) |
|
||||
| `env` | No | Extra environment variables (list of `KEY=VALUE` strings) |
|
||||
| `network` | No | Container network (default: none) |
|
||||
| `user` | No | Container user (e.g. `0:0`) |
|
||||
| `restart` | No | Restart policy (e.g. `unless-stopped`) |
|
||||
|
||||
**Route fields (under `[[components.routes]]`):**
|
||||
|
||||
@@ -248,9 +263,11 @@ Most fields are optional — MCP derives them from conventions:
|
||||
|
||||
### Version Pinning
|
||||
|
||||
Service definitions **must** pin an explicit semver tag (e.g. `v1.1.0`).
|
||||
Never use `:latest`. This ensures deployments are reproducible and
|
||||
`mcp status` shows the actual running version.
|
||||
Component `image` fields **must** pin an explicit semver tag (e.g.
|
||||
`mcr.svc.mcp.metacircular.net:8443/myservice:v1.1.0`). Never use
|
||||
`:latest`. This ensures deployments are reproducible and `mcp status`
|
||||
shows the actual running version. The version is extracted from the
|
||||
image tag.
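
The two rules above (the `build.images` key matches the repository name, and the version comes from the image tag) can be sketched in Go. This is an illustrative sketch only: `splitImageRef` is a hypothetical helper, not a function from MCP, and it assumes references of the form `registry[:port]/name:tag` without digests.

```go
package main

import (
	"fmt"
	"strings"
)

// splitImageRef (hypothetical helper) returns the repository name (the part
// after the last "/") and the tag (the part after the ":" that follows it).
// The registry host may carry its own ":port", so the tag is only searched
// for after the last "/".
func splitImageRef(ref string) (string, string, error) {
	// LastIndex returns -1 when there is no "/", so this keeps the whole ref.
	last := ref[strings.LastIndex(ref, "/")+1:]
	repo, tag, found := strings.Cut(last, ":")
	if !found || repo == "" || tag == "" {
		return "", "", fmt.Errorf("image %q has no explicit tag", ref)
	}
	if tag == "latest" {
		return "", "", fmt.Errorf("image %q uses :latest; pin a semver tag", ref)
	}
	return repo, tag, nil
}

func main() {
	repo, tag, err := splitImageRef("mcr.svc.mcp.metacircular.net:8443/myservice-web:v1.1.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(repo, tag)
}
```

Searching for the tag only after the last `/` matters here because the registry address itself contains `:8443`.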
|
||||
|
||||
---
|
||||
|
||||
@@ -385,12 +402,17 @@ addresses** — they will be overridden at deploy time.
|
||||
|
||||
| Env var | When set |
|
||||
|---------|----------|
|
||||
| `$PORT` | Component has a single route |
|
||||
| `$PORT_<NAME>` | Component has multiple named routes |
|
||||
| `$PORT` | Component has a single unnamed route |
|
||||
| `$PORT_<NAME>` | Component has named routes |
|
||||
|
||||
Route names are uppercased: `name = "rest"` → `$PORT_REST`,
|
||||
`name = "grpc"` → `$PORT_GRPC`.
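
The naming rule can be written down as a small Go function. This is a sketch under assumptions: `portEnvName` is a hypothetical name, and only the uppercasing described above is modeled (how the agent treats characters such as `-` is not covered here).

```go
package main

import (
	"fmt"
	"strings"
)

// portEnvName (hypothetical helper) maps a route name to the environment
// variable that carries its port: an unnamed route uses PORT, a named
// route uses PORT_<NAME> with the name uppercased.
func portEnvName(routeName string) string {
	if routeName == "" {
		return "PORT"
	}
	return "PORT_" + strings.ToUpper(routeName)
}

func main() {
	for _, name := range []string{"", "rest", "grpc"} {
		fmt.Printf("%q -> $%s\n", name, portEnvName(name))
	}
}
```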
|
||||
|
||||
**Container listen address:** Services must bind to `0.0.0.0:$PORT`
|
||||
(or `:$PORT`), not `localhost:$PORT`. Podman port-forwards go through
|
||||
the container's network namespace — binding to `localhost` inside the
|
||||
container makes the port unreachable from outside.
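
For a service that does not use mcdsl, the bind rule might look like the following Go sketch. `listenAddr` and the fallback port are assumptions made for illustration; the lookup function is passed in so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// listenAddr (hypothetical helper) builds the bind address from $PORT.
// It always binds to 0.0.0.0 so that forwarded traffic can reach the
// listener; binding to localhost inside the container would not work.
func listenAddr(getenv func(string) string, fallbackPort string) string {
	port := getenv("PORT")
	if port == "" {
		port = fallbackPort // assumed default for local runs
	}
	return net.JoinHostPort("0.0.0.0", port)
}

func main() {
	addr := listenAddr(os.Getenv, "8080")
	fmt.Println("would listen on", addr)
	// A real service would pass addr to net.Listen or http.ListenAndServe.
}
```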
|
||||
|
||||
Services built with **mcdsl v1.1.0+** handle this automatically —
|
||||
`config.Load` checks `$PORT` → overrides `Server.ListenAddr`, and
|
||||
`$PORT_GRPC` → overrides `Server.GRPCAddr`. These take precedence over
|
||||
@@ -475,11 +497,14 @@ co-located on the same node).
|
||||
| `mcp build <service>` | Build and push images to MCR |
|
||||
| `mcp sync` | Push all service definitions to agents; auto-build missing images |
|
||||
| `mcp deploy <service>` | Pull image, (re)create containers, register routes |
|
||||
| `mcp undeploy <service>` | Full teardown: remove routes, DNS, certs, and containers |
|
||||
| `mcp stop <service>` | Remove routes, stop containers |
|
||||
| `mcp start <service>` | Start previously stopped containers |
|
||||
| `mcp restart <service>` | Restart containers in place |
|
||||
| `mcp ps` | List all managed containers and status |
|
||||
| `mcp status [service]` | Detailed status for a specific service |
|
||||
| `mcp logs <service>` | Stream container logs |
|
||||
| `mcp edit <service>` | Edit service definition |
|
||||
|
||||
---
|
||||
|
||||
@@ -504,13 +529,14 @@ git push origin v1.0.0
|
||||
cat > ~/.config/mcp/services/myservice.toml << 'EOF'
|
||||
name = "myservice"
|
||||
node = "rift"
|
||||
version = "v1.0.0"
|
||||
|
||||
[build.images]
|
||||
myservice = "Dockerfile.api"
|
||||
|
||||
[[components]]
|
||||
name = "api"
|
||||
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"
|
||||
volumes = ["/srv/myservice:/srv/myservice"]
|
||||
|
||||
[[components.routes]]
|
||||
name = "rest"
|
||||
@@ -584,15 +610,84 @@ Services follow a standard directory structure:
|
||||
|
||||
---

## 10. Agent Management

MCP manages a fleet of nodes with heterogeneous operating systems and
architectures. The agent binary lives at `/srv/mcp/mcp-agent` on every
node. This is a mutable path that MCP controls, regardless of whether
the node runs NixOS or Debian.

### Node Configuration

Each node in `~/.config/mcp/mcp.toml` includes SSH and architecture
info for agent management:

```toml
[[nodes]]
name = "rift"
address = "100.95.252.120:9444"
ssh = "rift"
arch = "amd64"

[[nodes]]
name = "hyperborea"
address = "100.x.x.x:9444"
ssh = "hyperborea"
arch = "arm64"
```

### Upgrading Agents

After tagging a new MCP release:

```bash
# Upgrade all nodes (recommended; prevents version skew)
mcp agent upgrade

# Upgrade a single node
mcp agent upgrade rift

# Check versions across the fleet
mcp agent status
```

`mcp agent upgrade` cross-compiles the agent binary for each target
architecture, SSHs to each node, atomically replaces the binary, and
restarts the systemd service. All nodes should be upgraded together
because new CLI versions often depend on new agent RPCs.
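
One common way to replace a file atomically on a POSIX system is to write the new contents to a temporary file in the same directory and rename it over the target. The Go sketch below shows that pattern; it is not taken from the MCP source, and `replaceAtomically` and `demo` are hypothetical names. Whether `mcp agent upgrade` uses exactly this approach is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// replaceAtomically writes data to a temp file next to path and renames it
// over path. A rename within one filesystem is atomic on POSIX, so readers
// see either the old file or the new one, never a partial write.
func replaceAtomically(path string, data []byte, mode os.FileMode) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".upgrade-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after a successful rename

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Chmod(mode); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush to disk before the rename
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo replaces a stand-in binary in a scratch directory and returns the
// contents found at the target path afterwards.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "agent-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "mcp-agent")
	if err := os.WriteFile(target, []byte("old"), 0o755); err != nil {
		return "", err
	}
	if err := replaceAtomically(target, []byte("new"), 0o755); err != nil {
		return "", err
	}
	got, err := os.ReadFile(target)
	return string(got), err
}

func main() {
	out, err := demo()
	fmt.Println(out, err)
}
```

Creating the temp file in the target's own directory keeps the rename on a single filesystem, which is what makes it atomic.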

### Provisioning New Nodes

One-time setup for a new Debian node:

```bash
# 1. Provision the node (creates user, dirs, systemd unit, installs binary)
mcp node provision <name>

# 2. Register the node
mcp node add <name> <address>

# 3. Deploy services
mcp deploy <service>
```

For NixOS nodes, provisioning is handled by the NixOS configuration.
The NixOS config creates the `mcp` user, systemd unit, and directories.
The `ExecStart` path points to `/srv/mcp/mcp-agent` so that `mcp agent
upgrade` works the same as on Debian nodes.

---
|
||||
|
||||
## Appendix: Currently Deployed Services
|
||||
|
||||
For reference, these services are operational on the platform:
|
||||
|
||||
| Service | Version | Node | Purpose |
|
||||
|---------|---------|------|---------|
|
||||
| MCIAS | v1.8.0 | (separate) | Identity and access |
|
||||
| Metacrypt | v1.1.0 | rift | Cryptographic service, PKI/CA |
|
||||
| MC-Proxy | v1.1.0 | rift | TLS proxy and router |
|
||||
| MCR | v1.2.0 | rift | Container registry |
|
||||
| MCNS | v1.1.0 | rift | Authoritative DNS |
|
||||
| MCP | v0.3.0 | rift | Control plane agent |
|
||||
| MCIAS | v1.9.0 | (separate) | Identity and access |
|
||||
| Metacrypt | v1.3.1 | rift | Cryptographic service, PKI/CA |
|
||||
| MC-Proxy | v1.2.1 | rift | TLS proxy and router |
|
||||
| MCR | v1.2.1 | rift | Container registry |
|
||||
| MCNS | v1.1.1 | rift | Authoritative DNS |
|
||||
| MCDoc | v0.1.0 | rift | Documentation server |
|
||||
| MCP | v0.7.6 | rift | Control plane agent |
|
||||
|
||||
@@ -1018,6 +1018,13 @@ Write these before writing code. They are the blueprint, not the afterthought.
|
||||
- **Never log secrets.** Keys, passwords, tokens, and plaintext must never
|
||||
appear in log output.

### CLI Security

- **Never echo passwords.** Interactive password prompts must suppress
  terminal echo. Use `mcdsl/terminal.ReadPassword`, which wraps
  `golang.org/x/term.ReadPassword` with proper prompt and newline handling.
  Never read passwords with `bufio.Scanner` or `fmt.Scanln`.

### Web Security
|
||||
|
||||
- CSRF tokens on all mutating requests.
|
||||
|
||||