Refine platform evolution and engineering standards

PLATFORM_EVOLUTION.md: rewrite with convention-driven service
definitions (derived image names, service-level version, agent
defaults), mcdsl as a proper Go module (gap #1), multi-node design
considerations, and service discovery open question.

engineering-standards.md: add shared libraries section establishing
mcdsl as a normally-versioned Go module (no replace directives in
committed code), update service definition example to
convention-driven minimal format with route declarations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@@ -28,9 +28,12 @@ But the wiring between services is manual:
placed in `/srv/mc-proxy/certs/`, and referenced by path in the
mc-proxy config.
- **DNS**: records are manually configured in MCNS zone files.
- **Container networking**: operators specify `network`, `user`, and
`restart` policy per component, even though these are almost always
the same values.
- **Container config boilerplate**: operators specify `network`, `user`,
`restart`, full image URLs, and port mappings per component, even
though these are almost always the same values.
- **mcdsl build wiring**: the shared library requires `replace`
directives or sibling directory tricks in Docker builds. It should
be a normally-versioned Go module fetched by the toolchain.

Each new service requires touching 4-5 files across 3-4 repos. The
process works but doesn't scale and is error-prone.
@@ -43,11 +46,7 @@ want, not **how** to wire it:
```toml
name = "metacrypt"
node = "rift"
active = true
path = "metacrypt"

[build]
uses_mcdsl = false
version = "v1.0.0"

[build.images]
metacrypt = "Dockerfile.api"
@@ -55,9 +54,6 @@ metacrypt-web = "Dockerfile.web"

[[components]]
name = "api"
image = "mcr.svc.mcp.metacircular.net:8443/metacrypt:v1.0.0"
volumes = ["/srv/metacrypt:/srv/metacrypt"]
cmd = ["server", "--config", "/srv/metacrypt/metacrypt.toml"]

[[components.routes]]
name = "rest"
@@ -71,20 +67,30 @@ mode = "l4"

[[components]]
name = "web"
image = "mcr.svc.mcp.metacircular.net:8443/metacrypt-web:v1.0.0"
volumes = ["/srv/metacrypt:/srv/metacrypt"]
cmd = ["server", "--config", "/srv/metacrypt/metacrypt.toml"]

[[components.routes]]
name = "web"
port = 443
mode = "l7"
```

Everything else is derived from conventions:

- **Image name**: `<service>` for the first/api component,
`<service>-<component>` for others. Resolved against the registry
URL from global MCP config (`~/.config/mcp/mcp.toml`).
- **Version**: the service-level `version` field applies to all
components. Can be overridden per-component when needed.
- **Volumes**: `/srv/<service>:/srv/<service>` is the agent default.
Only declare additional mounts.
- **Network, user, restart**: agent defaults (`mcpnet`, `0:0`,
`unless-stopped`). Override only when needed.
- **Source path**: defaults to `<service>` relative to the workspace
root. Override with `path` if different.
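
The image, version, and volume conventions listed above could be derived roughly as follows. This is a minimal sketch, not MCP's actual code: the `Service` and `Component` types and the `imageRef` and `defaultVolume` helpers are hypothetical names, and it assumes "first/api component" means either index 0 or a component named `api`.

```go
package main

import "fmt"

// Component is a hypothetical, trimmed-down view of a [[components]] entry.
type Component struct {
	Name    string
	Version string // optional per-component override; empty means inherit
}

// Service is a hypothetical view of a convention-driven service definition.
type Service struct {
	Name       string
	Version    string
	Components []Component
}

// imageRef derives the full image reference for component i.
// Assumption: the first component (or one named "api") uses the bare
// service name; every other component uses <service>-<component>.
func imageRef(registry string, svc Service, i int) string {
	c := svc.Components[i]
	name := svc.Name
	if i != 0 && c.Name != "api" {
		name = svc.Name + "-" + c.Name
	}
	version := svc.Version
	if c.Version != "" {
		version = c.Version // per-component override wins
	}
	return fmt.Sprintf("%s/%s:%s", registry, name, version)
}

// defaultVolume returns the agent-default mount for a service.
func defaultVolume(service string) string {
	return fmt.Sprintf("/srv/%s:/srv/%s", service, service)
}

func main() {
	svc := Service{
		Name:       "metacrypt",
		Version:    "v1.0.0",
		Components: []Component{{Name: "api"}, {Name: "web"}},
	}
	// In practice the registry URL would come from the global MCP config.
	registry := "mcr.svc.mcp.metacircular.net:8443"
	for i := range svc.Components {
		fmt.Println(imageRef(registry, svc, i))
	}
	fmt.Println(defaultVolume(svc.Name))
}
```

With the `metacrypt` definition, this reproduces the two image URLs that the old format spelled out by hand.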

`mcp deploy metacrypt` does the rest:

1. Agent assigns a free host port per route (random, check availability,
retry on collision).
1. Agent assigns a free host port per route (random, check
availability, retry on collision).
2. Agent requests TLS certs from Metacrypt CA for
`metacrypt.svc.mcp.metacircular.net`.
3. Agent registers routes with mc-proxy via gRPC (mc-proxy persists
@@ -128,16 +134,27 @@ hostname = "docs.metacircular.net" # optional, public DNS
If `hostname` is omitted, the route uses the default
`<service>.svc.mcp.metacircular.net`.

### Fields Removed from Service Definitions
### Multi-Node Considerations

These become agent-level defaults or are derived automatically:
This design targets single-node (rift) but should not prevent
multi-node operation. Key design decisions that keep the door open:

| Field | Current | Target |
|-------|---------|--------|
| `ports` | Manual port mapping | Agent-assigned via routes |
| `network` | Per-component | Agent default (`mcpnet`) |
| `user` | Per-component | Agent default (`0:0`) |
| `restart` | Per-component | Agent default (`unless-stopped`) |
- **Port assignment is per-agent.** Each node's agent manages its own
port space. No cross-node coordination needed.
- **Route registration uses the node's address, not `127.0.0.1`.**
When mc-proxy and the service are on the same host, the backend is
loopback. When they're on different hosts, the backend is the node's
network address. The agent registers the appropriate address for its
node. The mc-proxy route API already accepts arbitrary backend
addresses.
- **DNS can have multiple A records.** MCNS can return multiple records
for the same hostname (one per node) for simple load distribution.
- **The CLI routes to the correct agent via the `node` field.** Adding
a second node is `mcp node add orion <address>` and then services
can target `node = "orion"`.

Nothing in the single-node implementation should hardcode assumptions
about one node, one mc-proxy, or loopback-only backends.
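
The backend-address rule from the list above (loopback when co-located, node address otherwise) is small enough to sketch directly. The `backendAddr` helper, its parameters, and the `10.0.0.2` address are illustrative assumptions, not part of the MCP agent.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// backendAddr returns the address mc-proxy should dial for a route.
// serviceNode and proxyNode are node names (e.g. "rift", "orion");
// nodeAddr is the service node's network address.
func backendAddr(serviceNode, proxyNode, nodeAddr string, port int) string {
	host := nodeAddr
	if serviceNode == proxyNode {
		// Same host as mc-proxy: the backend is loopback.
		host = "127.0.0.1"
	}
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	// Service and mc-proxy both on rift.
	fmt.Println(backendAddr("rift", "rift", "10.0.0.1", 9217))
	// Service on orion, mc-proxy on rift (10.0.0.2 is a made-up address).
	fmt.Println(backendAddr("orion", "rift", "10.0.0.2", 9217))
}
```

Keeping this decision in one function means the single-node case never bakes `127.0.0.1` into the route registration path.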

---

@@ -157,11 +174,30 @@ These become agent-level defaults or are derived automatically:
| MCNS DNS serving | Working |
| MCR container registry | Working |
| Service definitions in ~/.config/mcp/services/ | Working |
| Image build pipeline (mcdeploy.toml, being folded into MCP) | Working |
| Image build pipeline (being folded into MCP) | Working |

### What needs to change

#### 1. MCP Agent: Port Assignment
#### 1. mcdsl: Proper Module Versioning

**Gap**: mcdsl is used via `replace` directives and sibling directory
hacks. Docker builds require the source tree to be adjacent. This is
fragile and violates normal Go module conventions.

**Work**:
- Tag mcdsl releases with semver (e.g., `v1.0.0`, `v1.1.0`).
- Remove all `replace` directives from consuming services' `go.mod`
files. Services import mcdsl by URL and version like any other
dependency.
- Docker builds fetch mcdsl via the Go module proxy / Gitea — no local
source tree required.
- `uses_mcdsl` is eliminated from service definitions and build config.

**Depends on**: Gitea module hosting working correctly for
`git.wntrmute.dev/kyle/mcdsl` (it should already — Go modules over
git are standard).

#### 2. MCP Agent: Port Assignment

**Gap**: agent doesn't manage host ports. Service definitions specify
them manually.
@@ -177,19 +213,20 @@ them manually.

**Depends on**: nothing (can be developed standalone).

#### 2. MCP Agent: mc-proxy Route Registration
#### 3. MCP Agent: mc-proxy Route Registration

**Gap**: mc-proxy routes are static TOML. The gRPC admin API exists but
MCP doesn't use it.

**Work**:
- Agent calls mc-proxy gRPC API to register/remove routes on deploy/stop.
- Route registration includes: hostname, host port (agent-assigned),
mode (l4/l7), TLS cert paths.
- Agent calls mc-proxy gRPC API to register/remove routes on
deploy/stop.
- Route registration includes: hostname, backend address (node address
+ assigned port), mode (l4/l7), TLS cert paths.

**Depends on**: port assignment (#1), mc-proxy route persistence (#4).
**Depends on**: port assignment (#2), mc-proxy route persistence (#5).
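
As a rough illustration of what the agent might assemble before calling mc-proxy, the sketch below builds a route record with the fields listed above and applies the default-hostname rule. The `Route` type and `newRoute` function are hypothetical and do not reflect mc-proxy's actual gRPC messages; the cert path follows the `/srv/mc-proxy/certs/<service>.pem` convention used elsewhere in this document.

```go
package main

import (
	"errors"
	"fmt"
)

// Route is a hypothetical stand-in for a route registration request.
type Route struct {
	Hostname string
	Backend  string // node address + assigned port
	Mode     string // "l4" or "l7"
	CertPath string
}

// defaultHostname is used when a route declares no hostname.
func defaultHostname(service string) string {
	return service + ".svc.mcp.metacircular.net"
}

// newRoute validates the inputs and fills in convention-derived fields.
func newRoute(service, hostname, backend, mode string) (Route, error) {
	if mode != "l4" && mode != "l7" {
		return Route{}, fmt.Errorf("unknown route mode %q", mode)
	}
	if backend == "" {
		return Route{}, errors.New("backend address is required")
	}
	if hostname == "" {
		hostname = defaultHostname(service)
	}
	return Route{
		Hostname: hostname,
		Backend:  backend,
		Mode:     mode,
		CertPath: "/srv/mc-proxy/certs/" + service + ".pem",
	}, nil
}

func main() {
	r, err := newRoute("metacrypt", "", "127.0.0.1:9217", "l7")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", r)
}
```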

#### 3. MCP Agent: TLS Cert Provisioning
#### 4. MCP Agent: TLS Cert Provisioning

**Gap**: certs are manually provisioned and placed on disk. There is no
automated issuance flow.
@@ -200,9 +237,9 @@ automated issuance flow.
(`/srv/mc-proxy/certs/<service>.pem`).
- Cert renewal is handled automatically before expiry.

**Depends on**: Metacrypt cert issuance policy (#6).
**Depends on**: Metacrypt cert issuance policy (#7).

#### 4. mc-proxy: Route Persistence
#### 5. mc-proxy: Route Persistence

**Gap**: mc-proxy loads routes from TOML on startup. Routes added via
gRPC are lost on restart.
@@ -210,13 +247,13 @@ gRPC are lost on restart.
**Work**:
- mc-proxy persists gRPC-managed routes in its SQLite database.
- On startup, mc-proxy loads routes from the database.
- TOML route config is deprecated (kept for bootstrapping only, e.g.,
mc-proxy's own routes before MCP is fully operational).
- mcproxyctl becomes the primary route management interface.
- TOML route config is vestigial — kept only for mc-proxy's own
bootstrap before MCP is operational. The gRPC API and mcproxyctl
are the primary route management interfaces going forward.

**Depends on**: nothing (mc-proxy already has SQLite and gRPC API).

#### 5. MCP Agent: DNS Registration
#### 6. MCP Agent: DNS Registration

**Gap**: DNS records are manually configured in MCNS zone files.

@@ -225,9 +262,9 @@ gRPC are lost on restart.
`<service>.svc.mcp.metacircular.net`.
- Agent removes records on service teardown.

**Depends on**: MCNS record management API (#7).
**Depends on**: MCNS record management API (#8).

#### 6. Metacrypt: Automated Cert Issuance Policy
#### 7. Metacrypt: Automated Cert Issuance Policy

**Gap**: no policy exists for automated cert issuance. The MCP agent
doesn't have a Metacrypt identity or permissions.
@@ -235,13 +272,14 @@ doesn't have a Metacrypt identity or permissions.
**Work**:
- MCP agent gets an MCIAS service account.
- Metacrypt policy allows this account to issue certs scoped to
`*.svc.mcp.metacircular.net` (and explicitly listed public hostnames).
`*.svc.mcp.metacircular.net` (and explicitly listed public
hostnames).
- No wildcard certs — one cert per hostname per service.

**Depends on**: MCIAS service account provisioning (exists today, just
needs the account created).
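
The issuance scope described above can be expressed as a small hostname check. This is a sketch of the rule only, not Metacrypt's policy engine: `mayIssue` is a hypothetical helper, and it assumes the `*.svc.mcp.metacircular.net` scope covers exactly one label in front of the suffix.

```go
package main

import (
	"fmt"
	"strings"
)

const internalSuffix = ".svc.mcp.metacircular.net"

// mayIssue reports whether the agent's account may request a cert for
// hostname. publicAllowed is the explicit list of public hostnames.
func mayIssue(hostname string, publicAllowed []string) bool {
	if strings.Contains(hostname, "*") {
		return false // no wildcard certs
	}
	if strings.HasSuffix(hostname, internalSuffix) {
		label := strings.TrimSuffix(hostname, internalSuffix)
		// Assumption: exactly one non-empty label before the suffix.
		return label != "" && !strings.Contains(label, ".")
	}
	for _, h := range publicAllowed {
		if h == hostname {
			return true
		}
	}
	return false
}

func main() {
	public := []string{"docs.metacircular.net"}
	for _, h := range []string{
		"metacrypt.svc.mcp.metacircular.net",
		"*.svc.mcp.metacircular.net",
		"docs.metacircular.net",
		"evil.example.com",
	} {
		fmt.Printf("%-40s %v\n", h, mayIssue(h, public))
	}
}
```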

#### 7. MCNS: Record Management API
#### 8. MCNS: Record Management API

**Gap**: MCNS is a CoreDNS precursor serving static zone files. There
is no API for dynamic record management.
@@ -257,7 +295,7 @@ is no API for dynamic record management.
wrapper, not a full service. This may be the right time to build the
real MCNS.

#### 8. Application $PORT Convention
#### 9. Application $PORT Convention

**Gap**: applications read listen addresses from their config files.
They don't check `$PORT` env vars.
@@ -279,32 +317,38 @@ The dependencies form a rough order:

```
Phase A — Independent groundwork (parallel):
#1 MCP agent port assignment
#4 mc-proxy route persistence
#8 $PORT convention in applications
#1 mcdsl proper module versioning
#2 MCP agent port assignment
#5 mc-proxy route persistence
#9 $PORT convention in applications

Phase B — MCP route registration:
#2 Agent registers routes with mc-proxy
(depends on #1 + #4)
#3 Agent registers routes with mc-proxy
(depends on #2 + #5)

Phase C — Automated TLS:
#6 Metacrypt cert issuance policy
#3 Agent provisions certs
(depends on #6)
#7 Metacrypt cert issuance policy
#4 Agent provisions certs
(depends on #7)

Phase D — DNS:
#7 MCNS record management API
#5 Agent registers DNS
(depends on #7)
#8 MCNS record management API
#6 Agent registers DNS
(depends on #8)
```

After Phase B, the manual steps are: cert provisioning and DNS. After
Phase C, only DNS remains manual. After Phase D, `mcp deploy` is fully
declarative.
After Phase A, mcdsl builds are clean and services can be deployed
with agent-assigned ports (manually registered in mc-proxy).

Each phase is independently useful. Phase A + B alone eliminates the
most common source of manual wiring errors (port assignment and mc-proxy
config).
After Phase B, the manual steps are: cert provisioning and DNS. This
is the biggest quality-of-life improvement — no more manual port
picking or mc-proxy TOML editing.

After Phase C, only DNS remains manual.

After Phase D, `mcp deploy` is fully declarative.

Each phase is independently useful and deployable.

---

@@ -317,14 +361,16 @@ config).
in addition to the `.svc.mcp.metacircular.net` name. Public DNS is
managed outside MCNS (Cloudflare? registrar?). How does the agent
handle the split between internal and external DNS?
- **mc-proxy bootstrap**: mc-proxy itself needs routes to be reachable.
If routes are in SQLite, how does mc-proxy start before MCP configures
it? A small set of static bootstrap routes (or self-configuration) may
be needed.
- **Multi-node**: this design assumes single-node (rift). When a second
node is added, port assignment is still per-agent, but mc-proxy
routing, cert provisioning, and DNS need to account for multiple
backends. Not a v1 concern, but worth keeping in mind.
- **mc-proxy bootstrap**: mc-proxy itself is a service that needs to be
running before other services can be routed. Its own routes (if any)
may need to be self-configured or seeded from a minimal static config
at first start. Once operational, all route management goes through
the gRPC API.
- **Rollback**: if cert provisioning fails mid-deploy, does the agent
roll back the port assignment and mc-proxy route? What's the failure
mode — partial deploy, full rollback, or best-effort?
- **Service discovery between components**: currently, components find
each other via config (e.g., mcr-web knows mcr-api's gRPC address).
With agent-assigned ports, components within a service need to
discover each other's ports. The agent could set additional env vars
(`$PEER_API_GRPC=127.0.0.1:9217`) or services could query the agent.
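
If the env-var option for service discovery were chosen, the agent could generate entries like the one in the last bullet from component and route names. The `PEER_<COMPONENT>_<ROUTE>` naming scheme is inferred from the single `$PEER_API_GRPC` example and is an assumption, as are the `peerEnv` and `envToken` helpers.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// envToken upper-cases a name and replaces dashes so it is usable in an
// environment variable name.
func envToken(s string) string {
	return strings.ToUpper(strings.ReplaceAll(s, "-", "_"))
}

// peerEnv builds a KEY=VALUE entry telling a component where a sibling
// component's route is listening.
func peerEnv(component, route, host string, port int) string {
	name := "PEER_" + envToken(component) + "_" + envToken(route)
	return name + "=" + net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(peerEnv("api", "grpc", "127.0.0.1", 9217))
	// prints PEER_API_GRPC=127.0.0.1:9217
}
```

Entries in this form can be appended directly to the environment the agent passes to each container.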