70 Commits

Author SHA1 Message Date
5da307cab5 Add Dockerfile and docker-master build target
Two-stage build: golang:1.25-alpine builder, alpine:3.21 runtime.
Produces a minimal container image for mcp-master.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:52:10 -07:00
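The two-stage build described above could look roughly like this (a hypothetical sketch: the entry-point path comes from the cmd/mcp-master commit below, but the actual Dockerfile's paths and build flags are not shown in this log):

```dockerfile
# Stage 1: build the static binary (sketch; flags are assumptions)
FROM golang:1.25-alpine AS builder
WORKDIR /src
COPY . .
RUN go build -o /out/mcp-master ./cmd/mcp-master

# Stage 2: minimal runtime image
FROM alpine:3.21
COPY --from=builder /out/mcp-master /usr/local/bin/mcp-master
ENTRYPOINT ["/usr/local/bin/mcp-master"]
```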
22a836812f Add public, tier, node fields to ServiceDef
RouteDef gains Public field (bool) for edge routing. ServiceDef gains
Tier field. Node validation relaxed: defaults to tier=worker when both
node and tier are empty (v2 compatibility).

ToProto/FromProto updated to round-trip all new fields. Without this,
public=true in TOML was silently dropped and edge routing never triggered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:42:00 -07:00
9918859705 Resolve node hostname to IP for DNS registration
Node addresses may be Tailscale DNS names (e.g., rift.scylla-hammerhead.ts.net:9444)
but MCNS needs an IPv4 address for A records. The master now resolves
the hostname via net.LookupHost before passing it to the DNS client.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 20:58:21 -07:00
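The IPv4-selection step above can be sketched as a small helper (hypothetical name and signature; in the master the address list would come from net.LookupHost):

```go
package main

import (
	"fmt"
	"net"
)

// firstIPv4 returns the first IPv4 address from a list of resolved
// addresses. MCNS A records need an IPv4 address, not a hostname or
// an IPv6 address. (Illustrative helper, not repo code.)
func firstIPv4(addrs []string) (string, bool) {
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip != nil && ip.To4() != nil {
			return ip.To4().String(), true
		}
	}
	return "", false
}

func main() {
	// In the master this list would come from net.LookupHost(host).
	addrs := []string{"fd7a::1234", "100.64.12.34"}
	if ip, ok := firstIPv4(addrs); ok {
		fmt.Println(ip) // 100.64.12.34
	}
}
```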
da59d60c2d Add master integration to CLI deploy and undeploy
- CLIConfig gains optional [master] section with address field
- dialMaster() creates McpMasterServiceClient (same TLS/token pattern)
- deploy: routes through master when [master] configured, --direct
  flag bypasses master for v1-style agent deployment
- undeploy: same master/direct routing pattern
- Master responses show per-step results (deploy, dns, edge)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:43:51 -07:00
598ea44e0b Add mcp-master binary and build target
New cmd/mcp-master/ entry point following the agent pattern:
cobra CLI with --config, version, and server commands.

Makefile: add mcp-master target, update all and clean targets.
Example config: deploy/examples/mcp-master.toml with all sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:41:43 -07:00
6fd81cacf2 Add master core: deploy, undeploy, status, placement, DNS
Master struct with Run() lifecycle following the agent pattern exactly:
open DB → bootstrap nodes → create agent pool → DNS client → TLS →
auth interceptor → gRPC server → signal handler.

RPC handlers:
- Deploy: place service (tier-aware), forward to agent, register DNS
  with Tailnet IP, detect public routes, validate against allowed
  domains, coordinate edge routing via SetupEdgeRoute, record placement
  and edge routes in master DB, return structured per-step results.
- Undeploy: undeploy on worker first, then remove edge routes, DNS,
  and DB records. Best-effort cleanup on failure.
- Status: query agents for service status, aggregate with placements
  and edge route info from master DB.
- ListNodes: return all nodes with placement counts.

Placement algorithm: fewest services, ties broken alphabetically.
DNS client: extracted from agent's DNSRegistrar with explicit nodeAddr
parameter (master registers for different nodes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:39:46 -07:00
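The placement rule (fewest services, alphabetical tie-break) is simple enough to sketch directly; types and names here are assumptions, not the repo's:

```go
package main

import (
	"fmt"
	"sort"
)

// pickNode chooses the node with the fewest placed services, breaking
// ties alphabetically by node name. Sorting the names first makes the
// tie-break fall out of a plain "strictly fewer" comparison.
func pickNode(serviceCount map[string]int) string {
	names := make([]string, 0, len(serviceCount))
	for n := range serviceCount {
		names = append(names, n)
	}
	sort.Strings(names)
	best := ""
	for _, n := range names {
		if best == "" || serviceCount[n] < serviceCount[best] {
			best = n
		}
	}
	return best
}

func main() {
	// straylight and svc tie at 1; alphabetical order wins.
	fmt.Println(pickNode(map[string]int{"rift": 3, "svc": 1, "straylight": 1}))
}
```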
20735e4b41 Add agent client and connection pool for master
AgentClient wraps a gRPC connection to a single agent with typed
forwarding methods (Deploy, UndeployService, SetupEdgeRoute, etc.).
AgentPool manages connections to multiple agents keyed by node name.

Follows the same TLS 1.3 + token interceptor pattern as cmd/mcp/dial.go
but runs server-side with the master's own MCIAS service token.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:35:16 -07:00
3c0b55f9f8 Add master database with nodes, placements, and edge_routes
New internal/masterdb/ package for mcp-master cluster state. Separate
from the agent's registry because the schemas are fundamentally
different (cluster-wide placement vs node-local containers).

Tables: nodes, placements, edge_routes. Full CRUD with tests.
Follows the same Open/migrate pattern as internal/registry/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:26:04 -07:00
78890ed76a Add master config loader
MasterConfig with TOML loading, env overrides (MCP_MASTER_*), defaults,
and validation. Follows the exact pattern of AgentConfig. Includes:
server, database, MCIAS, edge (allowed_domains), registration
(allowed_agents, max_nodes), timeouts, MCNS, bootstrap [[nodes]], and
master service token path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:23:19 -07:00
c5ff5bb63c Add McpMasterService proto and v2 ServiceSpec fields
- New proto/mcp/v1/master.proto: McpMasterService with Deploy, Undeploy,
  Status, ListNodes RPCs and all message types per architecture v2 spec.
- ServiceSpec gains tier (field 5), node (field 6), snapshot (field 7).
- RouteSpec gains public (field 5) for edge routing.
- New SnapshotConfig message (method + excludes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:22:04 -07:00
ddd6f123ab Merge pull request 'Document v2 multi-node architecture in CLAUDE.md' (#3) from claude/update-claude-md-v2-multinode into master 2026-04-02 22:20:19 +00:00
90445507a3 Document v2 multi-node architecture in CLAUDE.md
Add v2 Development section covering multi-node fleet design (master,
agent self-registration, tier-based placement, edge routing). Update
project structure to reflect new agent subsystems (edge_rpc, dns,
proxy, certs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:14:20 -07:00
68d670b3ed Add edge CLI scaffolding for Phase 2 testing
Temporary CLI commands for testing edge routing RPCs directly
(before the master exists):

  mcp edge list -n svc
  mcp edge setup <hostname> -n svc --backend-hostname ... --backend-port ...
  mcp edge remove <hostname> -n svc

Verified end-to-end on svc: setup provisions route in mc-proxy and
persists in agent registry, remove cleans up both, list shows routes
with cert metadata.

Finding: MCNS registers LAN IPs for .svc.mcp. hostnames, not Tailnet
IPs. The v2 master needs to register Tailnet IPs in deploy flow step 3.

These commands will be removed or replaced when the master is built
(Phase 3).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:04:12 -07:00
714320c018 Add edge routing and health check RPCs (Phase 2)
New agent RPCs for v2 multi-node orchestration:

- SetupEdgeRoute: provisions TLS cert from Metacrypt, resolves backend
  hostname to Tailnet IP, validates it's in 100.64.0.0/10, registers
  L7 route in mc-proxy. Rejects backend_tls=false.
- RemoveEdgeRoute: removes mc-proxy route, cleans up TLS cert, removes
  registry entry.
- ListEdgeRoutes: returns all edge routes with cert serial/expiry.
- HealthCheck: returns agent health and container count.

New database table (migration 4): edge_routes stores hostname, backend
info, and cert paths for persistence across agent restarts.

ProxyRouter gains CertPath/KeyPath helpers for consistent cert path
construction.

Security:
- Backend hostname must resolve to a Tailnet IP (100.64.0.0/10)
- backend_tls=false is rejected (no cleartext to backends)
- Cert provisioning failure fails the setup (no route to missing cert)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 13:13:10 -07:00
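The Tailnet membership check under Security maps cleanly onto net/netip (a sketch of the validation, assuming the agent checks a resolved address string; the actual code may differ):

```go
package main

import (
	"fmt"
	"net/netip"
)

// tailnet is the CGNAT range Tailscale assigns addresses from.
// Backends outside it are rejected by SetupEdgeRoute.
var tailnet = netip.MustParsePrefix("100.64.0.0/10")

func isTailnetIP(s string) bool {
	addr, err := netip.ParseAddr(s)
	return err == nil && addr.Is4() && tailnet.Contains(addr)
}

func main() {
	fmt.Println(isTailnetIP("100.64.12.34"), isTailnetIP("192.168.1.10"))
}
```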
fa8ba6fac1 Move ARCHITECTURE_V2.md to metacircular docs
The v2 architecture doc is platform-wide (covers master, agents,
edge routing, snapshots, migration across all nodes). Moved to
docs/architecture-v2.md in the metacircular workspace repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 11:09:09 -07:00
f66758b92b Hardcode version in flake.nix
The git fetcher doesn't provide gitDescribe, so the Nix build was
falling through to shortRev and producing commit-hash versions instead
of tag-based ones.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 17:46:12 -07:00
09d0d197c3 Add component-level targeting to start, stop, and restart
Allow start/stop/restart to target a single component via
<service>/<component> syntax, matching deploy/logs/purge. When a
component is specified, start/stop skip toggling the service-level
active flag. Agent-side filtering returns NotFound for unknown
components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 17:26:05 -07:00
52914d50b0 Pass mode, backend-tls, and tls cert/key through route add
The --mode flag was defined but never wired through to the RPC.
Add tls_cert and tls_key fields to AddProxyRouteRequest so L7
routes can be created via mcp route add.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 20:44:44 -07:00
bb4bee51ba Add mono-repo consideration to ARCHITECTURE_V2.md open questions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 20:40:32 -07:00
4ac8a6d60b Add ARCHITECTURE_V2.md for multi-node master/agent topology
Documents the planned v2 architecture: mcp-master on straylight
coordinates deployments across worker (rift) and edge (svc) nodes.
Includes edge routing flow, agent RPCs, migration plan, and
operational issues from v1 that motivate the redesign.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 20:37:24 -07:00
d8f45ca520 Merge explicit ports with route-allocated ports during deploy
Previously, explicit port mappings from the service definition were
ignored when routes were present. Now both are included, allowing
services to have stable external port bindings alongside dynamic
route-allocated ports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:28:40 -07:00
95f86157b4 Add mcp route command for managing mc-proxy routes
New top-level command with list, add, remove subcommands. Supports
-n/--node to target a specific node. Adds AddProxyRoute and
RemoveProxyRoute RPCs to the agent. Moves route listing from
mcp node routes to mcp route list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 19:10:11 -07:00
93e26d3789 Add mcp dns and mcp node routes commands
mcp dns queries MCNS via an agent to list all zones and DNS records.
mcp node routes queries mc-proxy on each node for listener/route status,
matching the mcproxyctl status output format.

New agent RPCs: ListDNSRecords, ListProxyRoutes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 18:51:53 -07:00
3d2edb7c26 Fall back to podman logs when journalctl is inaccessible
Probe journalctl with -n 0 before committing to it. When the journal
is not readable (e.g. rootless podman without user journal storage),
fall back to podman logs instead of streaming the permission error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:54:14 -07:00
bf02935716 Add agent version to mcp node list
Thread the linker-injected version string into the Agent struct and
return it in the NodeStatus RPC. The CLI now dials each node and
displays the agent version alongside name and address.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:49:49 -07:00
c4f0d7be8e Fix mcp logs permission error for rootless podman journald driver
Rootless podman writes container logs to the user journal, but
journalctl without --user only reads the system journal. Add --user
when the agent is running as a non-root user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 16:46:01 -07:00
4d900eafd1 Derive flake version from git rev instead of hardcoding
Eliminates the manual version bump in flake.nix on each release.
Uses self.shortRev (or dirtyShortRev) since self.gitDescribe is not
yet available in this Nix version. Makefile builds still get the full
git describe output via ldflags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:48:44 -07:00
38f9070c24 Add top-level mcp edit command
Shortcut for mcp service edit — opens the service definition in $EDITOR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 22:44:19 -07:00
67d0ab1d9d Bump flake.nix version to 0.7.5
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 19:40:29 -07:00
7383b370f0 Fix mcp ps showing registry version instead of runtime, error on unknown component
mcp ps now uses the actual container image and version from the runtime
instead of the registry, which could be stale after a failed deploy.

Deploy now returns an error when the component filter matches nothing
instead of silently succeeding with zero results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 19:13:02 -07:00
4c847e6de9 Fix extraneous blank lines in mcp logs output
Skip empty lines from the scanner that result from double newlines
(application slog trailing newline + container runtime newline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:22:38 -07:00
14b978861f Add mcp logs command for streaming container logs
New server-streaming Logs RPC streams container output to the CLI.
Supports --tail/-n, --follow/-f, --timestamps/-t, --since.

Detects journald log driver and falls back to journalctl (podman logs
can't read journald outside the originating user session). New containers
default to k8s-file via mcp user's containers.conf.

Also adds stream auth interceptor for the agent gRPC server (required
for streaming RPCs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 17:54:48 -07:00
18365cc0a8 Document system account auth model in ARCHITECTURE.md
Replaces the "admin required for all operations" model with the new
three-tier identity model: human operators for CLI, mcp-agent system
account for infrastructure automation, admin reserved for MCIAS-level
administration. Documents agent-to-service token paths and per-service
authorization policies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 16:11:08 -07:00
86d516acf6 Drop admin requirement from agent interceptor, reject guests
The agent now accepts any authenticated user or system account, except
those with the guest role. Admin is reserved for MCIAS account management
and policy changes, not routine deploy/stop/start operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 16:07:17 -07:00
dd167b8e0b Auto-login to MCR before image push using CLI token
mcp build and mcp deploy (auto-build path) now authenticate to the
container registry using the CLI's stored MCIAS token before pushing.
MCR accepts JWTs as passwords, so this works with both human and
service account tokens. Falls back silently to existing podman auth.

Eliminates the need for a separate interactive `podman login` step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:13:35 -07:00
41437e3730 Use mcdsl/terminal.ReadPassword for secure password input
Replaces raw bufio.Scanner password reading (which echoed to terminal)
with the new mcdsl terminal package that suppresses echo via x/term.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:11:35 -07:00
cedba9bf83 Fix mcp ps uptime: parse StartedAt from podman ps JSON
List() was not extracting the StartedAt field from podman's JSON
output, so LiveCheck always returned zero timestamps and the CLI
showed "-" for every container's uptime.

podman ps --format json includes StartedAt as a Unix timestamp
(int64). Parse it into ContainerInfo.Started so the existing
LiveCheck → CLI uptime display chain works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 22:56:39 -07:00
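The missing parsing step can be sketched like this (psEntry and startedTimes are hypothetical names; only the StartedAt-as-int64 detail comes from the commit):

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// psEntry models only the fields this sketch needs from
// `podman ps --format json`. StartedAt is a Unix timestamp (int64),
// not a formatted string.
type psEntry struct {
	Names     []string `json:"Names"`
	StartedAt int64    `json:"StartedAt"`
}

// startedTimes parses the JSON array and converts each StartedAt into
// a time.Time — the extraction List() was skipping.
func startedTimes(raw []byte) ([]time.Time, error) {
	var entries []psEntry
	if err := json.Unmarshal(raw, &entries); err != nil {
		return nil, err
	}
	ts := make([]time.Time, len(entries))
	for i, e := range entries {
		ts[i] = time.Unix(e.StartedAt, 0)
	}
	return ts, nil
}

func main() {
	raw := []byte(`[{"Names":["mcdoc"],"StartedAt":1764300000}]`)
	ts, err := startedTimes(raw)
	if err != nil {
		panic(err)
	}
	// With a real timestamp, time.Since(ts[0]) is the uptime mcp ps shows.
	fmt.Println(ts[0].UTC().Format(time.RFC3339))
}
```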
f06ab9aeb6 Bump flake version to 0.6.0
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:46:23 -07:00
f932dd64cc Add undeploy command: full inverse of deploy
Implements `mcp undeploy <service>` which tears down all infrastructure
for a service: removes mc-proxy routes, DNS records, TLS certificates,
stops and removes containers, releases allocated ports, and marks the
service inactive.

This fills the gap between `stop` (temporary pause) and `purge` (registry
cleanup). Undeploy is the complete teardown that returns the node to the
state before the service was deployed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:45:42 -07:00
b2eaa69619 flake: install shell completions for mcp
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 21:41:25 -07:00
43789dd6be Fix route-based port mapping: use hostPort as container port
allocateRoutePorts() was using the route's port field (the mc-proxy
listener port, e.g. 443) as the container internal port in the podman
port mapping. For L7 routes, apps don't listen on the mc-proxy port —
they read $PORT (set to the assigned host port) and listen on that.

The mapping host:53204 → container:443 fails because nothing listens
on 443 inside the container. Fix: use hostPort as both the host and
container port, so $PORT = host port = container port.

Broke mcdoc in production (manually fixed, now permanently fixed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:50:48 -07:00
2dd0ea93fc Fix ImageExists to use skopeo instead of podman manifest inspect
podman manifest inspect only works for multi-arch manifest lists,
returning exit code 125 for regular single-arch images. Switch to
skopeo inspect which works for both.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:49:48 -07:00
169b3a0d4a Fix EnsureRecord to check all existing records before updating
When multiple A records exist for a service (e.g., LAN and Tailscale
IPs), check all of them for the correct value before attempting an
update. Previously only checked the first record, which could trigger
a 409 conflict if another record already had the target value.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 15:17:19 -07:00
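The check-all-records-first logic can be sketched as follows (record fields and the helper name are assumptions based on the commit description):

```go
package main

import "fmt"

// record mirrors the rough shape of an MCNS A record.
type record struct {
	ID    int
	Name  string
	Value string
}

// hasValue reports whether any existing record already carries the
// target value. Checking only the first record could trigger an update
// that 409s because a later record already matched.
func hasValue(records []record, target string) bool {
	for _, r := range records {
		if r.Value == target {
			return true
		}
	}
	return false
}

func main() {
	recs := []record{
		{1, "mcdoc.svc.mcp", "192.168.1.50"},  // LAN record, checked first
		{2, "mcdoc.svc.mcp", "100.64.12.34"},  // Tailscale record, already correct
	}
	fmt.Println(hasValue(recs, "100.64.12.34")) // no update needed
}
```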
2bda7fc138 Fix DNS record JSON parsing for MCNS response format
MCNS returns records wrapped in {"records": [...]} envelope with
uppercase field names (ID, Name, Type, Value), not bare arrays
with lowercase fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 15:12:43 -07:00
76247978c2 Fix protoToComponent to include routes in synced components
Routes from the proto ComponentSpec were dropped during sync, causing
the deploy flow to see empty regRoutes and skip cert provisioning,
route registration, and DNS registration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:39:26 -07:00
ca3bc736f6 Bump version to v0.5.0 for Phase D release
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:33:54 -07:00
9d9ad6588e Phase D: Automated DNS registration via MCNS
Add DNSRegistrar that creates/updates/deletes A records in MCNS
during deploy and stop. When a service has routes, the agent ensures
an A record exists in the configured zone pointing to the node's
address. On stop, the record is removed.

- Add MCNSConfig to agent config (server_url, ca_cert, token_path,
  zone, node_addr) with defaults and env overrides
- Add DNSRegistrar (internal/agent/dns.go): REST client for MCNS
  record CRUD, nil-receiver safe
- Wire into deploy flow (EnsureRecord after route registration)
- Wire into stop flow (RemoveRecord before container stop)
- 7 new tests, make all passes with 0 issues

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:33:41 -07:00
e4d131021e Bump version to v0.4.0 for Phase C release
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:47:02 -07:00
8d6c060483 Update mc-proxy dependency to v1.2.0, drop replace directive
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:39:41 -07:00
c7e1232f98 Phase C: Automated TLS cert provisioning for L7 routes
Add CertProvisioner that requests TLS certificates from Metacrypt's CA
API during deploy. When a service has L7 routes, the agent checks for
an existing cert, re-issues if missing or within 30 days of expiry,
and writes chain+key to mc-proxy's cert directory before registering
routes.

- Add MetacryptConfig to agent config (server_url, ca_cert, mount,
  issuer, token_path) with defaults and env overrides
- Add CertProvisioner (internal/agent/certs.go): REST client for
  Metacrypt IssueCert, atomic file writes, cert expiry checking
- Wire into Agent struct and deploy flow (before route registration)
- Add hasL7Routes/l7Hostnames helpers in deploy.go
- Fix pre-existing lint issues: unreachable code in portalloc.go,
  gofmt in servicedef.go, gosec suppressions, golangci v2 config
- Update vendored mc-proxy to fix protobuf init panic
- 10 new tests, make all passes with 0 issues

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:31:11 -07:00
572d2fb196 Regenerate proto files for mc/ module path
Raw descriptor bytes in .pb.go files were corrupted by the sed-based
module path rename (string length changed, breaking protobuf binary
encoding). Regenerated with protoc to fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 02:54:40 -07:00
c6a84a1b80 Bump flake.nix version to match latest tag
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 02:16:45 -07:00
08b3e2a472 Migrate module path from kyle/ to mc/ org
All import paths updated to git.wntrmute.dev/mc/. Bumps mcdsl to v1.2.0,
mc-proxy to v1.1.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 02:07:42 -07:00
6e30cf12f2 Mark Phase B complete in PROGRESS_V1.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 01:36:50 -07:00
c28562dbcf Merge pull request 'Phase B: Agent registers routes with mc-proxy on deploy' (#2) from phase-b-route-registration into master 2026-03-27 08:36:25 +00:00
84c487e7f8 Phase B: Agent registers routes with mc-proxy on deploy
The agent connects to mc-proxy via Unix socket and automatically
registers/removes routes during deploy and stop. This eliminates
manual mcproxyctl usage or TOML editing.

- New ProxyRouter abstraction wraps mc-proxy client library
- Deploy: after container starts, registers routes with mc-proxy
  using host ports from the registry
- Stop: removes routes from mc-proxy before stopping container
- Config: [mcproxy] section with socket path and cert_dir
- Nil-safe: if mc-proxy socket not configured, route registration
  is silently skipped (backward compatible)
- L7 routes use certs from convention path (<cert_dir>/<service>.pem)
- L4 routes use TLS passthrough (backend_tls=true)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 01:35:06 -07:00
8b1c89fdc9 Add mcp build command and deploy auto-build
Extends MCP to own the full build-push-deploy lifecycle. When deploying,
the CLI checks whether each component's image tag exists in the registry
and builds/pushes automatically if missing and build config is present.

- Add Build, Push, ImageExists to runtime.Runtime interface (podman impl)
- Add mcp build <service>[/<image>] command
- Add [build] section to CLI config (workspace path)
- Add path and [build.images] to service definitions
- Wire auto-build into mcp deploy before agent RPC
- Update ARCHITECTURE.md with runtime interface and deploy auto-build docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 01:34:25 -07:00
d7f18a5d90 Add Platform Evolution tracking to PROGRESS_V1.md
Phase A complete: route declarations, port allocation, $PORT env vars.
Phase B in progress: agent mc-proxy route registration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 01:25:26 -07:00
5a802bceb6 Merge pull request 'Add route declarations and automatic port allocation' (#1) from mcp-routes-port-allocation into master 2026-03-27 08:16:20 +00:00
777ba8a0e1 Add route declarations and automatic port allocation to MCP agent
Service definitions can now declare routes per component instead of
manual port mappings:

  [[components.routes]]
  name = "rest"
  port = 8443
  mode = "l4"

The agent allocates free host ports at deploy time and injects
$PORT/$PORT_<NAME> env vars into containers. Backward compatible:
components with old-style ports= work unchanged.

Changes:
- Proto: RouteSpec message, routes + env fields on ComponentSpec
- Servicedef: RouteDef parsing and validation from TOML
- Registry: component_routes table with host_port tracking
- Runtime: Env field on ContainerSpec, -e flag in BuildRunArgs
- Agent: PortAllocator (random 10000-60000, availability check),
  deploy wiring for route→port mapping and env injection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 01:04:47 -07:00
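The PortAllocator behavior (random port in 10000-60000, availability check) can be sketched like this; the retry count and bind address are assumptions:

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
)

// allocatePort picks random ports in 10000-60000 and returns the first
// one that actually binds, confirming availability at allocation time.
func allocatePort() (int, error) {
	for i := 0; i < 50; i++ {
		port := 10000 + rand.Intn(50001) // 10000..60000 inclusive
		ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
		if err != nil {
			continue // in use; try another
		}
		ln.Close()
		return port, nil
	}
	return 0, fmt.Errorf("no free port found")
}

func main() {
	port, err := allocatePort()
	if err != nil {
		panic(err)
	}
	// The agent would inject this as $PORT (and $PORT_<NAME>) into the container.
	fmt.Printf("PORT=%d\n", port)
}
```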
503c52dc26 Update service definition example for convention-driven format
Drop uses_mcdsl, full image URLs, ports, network, user, restart.
Add route declarations and service-level version. Image names and
most config are now derived from conventions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 00:19:12 -07:00
6465da3547 Add build and release lifecycle to ARCHITECTURE.md
Service definitions now include [build] config (path, uses_mcdsl,
images) so MCP owns the full build-push-deploy lifecycle, replacing
mcdeploy.toml. Documents mcp build, mcp sync auto-build, image
versioning policy (explicit tags, never :latest), and workspace
convention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 23:31:05 -07:00
e18a3647bf Add Nix flake for mcp and mcp-agent
Exposes two packages:
- default (mcp CLI) for operator workstations
- mcp-agent for managed nodes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:46:36 -07:00
1e58dcce27 Implement mcp purge command for registry cleanup
Add PurgeComponent RPC to the agent service that removes stale registry
entries for components that are both gone (observed state is removed,
unknown, or exited) and unwanted (not in any current service definition).
Refuses to purge components with running or stopped containers. When all
components of a service are purged, the service row is deleted too.
Supports --dry-run to preview without modifying the database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:30:45 -07:00
1afbf5e1f6 Add purge design to architecture doc
Purge removes stale registry entries — components that are no longer
in service definitions and have no running container. Designed as an
explicit, safe operation separate from sync: sync is additive (push
desired state), purge is subtractive (remove forgotten entries).

Includes safety rules (refuses to purge running containers), dry-run
mode, agent RPC definition, and rationale for why sync should not be
made destructive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:22:27 -07:00
ea8a42a696 P5.2 + P5.3: Bootstrap docs, README, and RUNBOOK
- docs/bootstrap.md: step-by-step bootstrap procedure with lessons
  learned from the first deployment (NixOS sandbox issues, podman
  rootless setup, container naming, MCR auth workaround)
- README.md: quick-start guide, command reference, doc links
- RUNBOOK.md: operational procedures for operators (health checks,
  common operations, unsealing metacrypt, cert renewal, incident
  response, disaster recovery, file locations)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:32:22 -07:00
ff9bfc5087 Update PROGRESS_V1.md with deployment status and remaining work
Documents Phase 6 (deployment), bugs fixed during rollout,
remaining work organized by priority (operational, quality,
design, infrastructure), and current platform state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:27:30 -07:00
17ac0f3014 Trim whitespace from token file in CLI
Token files with trailing newlines caused gRPC "non-printable ASCII
characters" errors in the authorization header.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:19:27 -07:00
7133871be2 Default CLI config path to ~/.config/mcp/mcp.toml
Eliminates the need to pass --config on every command.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:16:34 -07:00
efa32a7712 Fix container name handling for hyphenated service names
Extract ContainerNameFor and SplitContainerName into names.go.
ContainerNameFor handles single-component services where service
name equals component name (e.g., mc-proxy → "mc-proxy" not
"mc-proxy-mc-proxy"). SplitContainerName checks known services
from the registry before falling back to naive split on "-", fixing
mc-proxy being misidentified as service "mc" component "proxy".

Also fixes podman ps JSON parsing (Command field is []string not
string) found during deployment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 15:13:20 -07:00
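The two naming rules above can be sketched as follows (ContainerNameFor and SplitContainerName are the names from the commit; the signatures, and taking known services as a plain slice rather than a registry lookup, are assumptions):

```go
package main

import (
	"fmt"
	"strings"
)

// ContainerNameFor avoids doubling the name for single-component
// services where service name equals component name.
func ContainerNameFor(service, component string) string {
	if service == component {
		return service // mc-proxy, not mc-proxy-mc-proxy
	}
	return service + "-" + component
}

// SplitContainerName checks known service names before falling back to
// a naive split on "-", so "mc-proxy" is not misread as service "mc",
// component "proxy".
func SplitContainerName(name string, knownServices []string) (service, component string) {
	for _, svc := range knownServices {
		if name == svc {
			return svc, svc
		}
		if strings.HasPrefix(name, svc+"-") {
			return svc, strings.TrimPrefix(name, svc+"-")
		}
	}
	parts := strings.SplitN(name, "-", 2)
	if len(parts) == 2 {
		return parts[0], parts[1]
	}
	return name, name
}

func main() {
	fmt.Println(ContainerNameFor("mc-proxy", "mc-proxy"))
	fmt.Println(SplitContainerName("mc-proxy", []string{"mc-proxy"}))
}
```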
2369 changed files with 6802850 additions and 444 deletions

View File

@@ -5,6 +5,24 @@ run:
   tests: true
 linters:
+  exclusions:
+    paths:
+      - vendor
+    rules:
+      # In test files, suppress gosec rules that are false positives:
+      # G101: hardcoded test credentials
+      # G304: file paths from variables (t.TempDir paths)
+      # G306: WriteFile with 0644 (cert files need to be readable)
+      # G404: weak RNG (not security-relevant in tests)
+      - path: "_test\\.go"
+        linters:
+          - gosec
+        text: "G101|G304|G306|G404"
+      # Nil context is acceptable in tests for nil-receiver safety checks.
+      - path: "_test\\.go"
+        linters:
+          - staticcheck
+        text: "SA1012"
   default: none
   enable:
     - errcheck
@@ -69,12 +87,3 @@ formatters:
 issues:
   max-issues-per-linter: 0
   max-same-issues: 0
-  exclusions:
-    paths:
-      - vendor
-    rules:
-      - path: "_test\\.go"
-        linters:
-          - gosec
-        text: "G101"
View File

@@ -121,9 +121,26 @@ option for future security hardening.
## Authentication and Authorization ## Authentication and Authorization
MCP follows the platform authentication model: all auth is delegated to MCP follows the platform authentication model: all auth is delegated to
MCIAS. MCIAS. The auth model separates three concerns: operator intent (CLI to
agent), infrastructure automation (agent to platform services), and
access control (who can do what).
### Agent Authentication ### Identity Model
| Identity | Type | Purpose |
|----------|------|---------|
| Human operator (e.g., `kyle`) | human | CLI operations: deploy, stop, start, build |
| `mcp-agent` | system | Agent-to-service automation: certs, DNS, routes, image pull |
| Per-service accounts (e.g., `mcq`) | system | Scoped self-management (own DNS records only) |
| `admin` role | role | MCIAS account management, policy changes, zone creation |
| `guest` role | role | Explicitly rejected by the agent |
The `admin` role is reserved for MCIAS-level administrative operations
(account creation, policy management, zone mutations). Routine MCP
operations (deploy, stop, start, build) do not require admin — any
authenticated non-guest user or system account is accepted.
### Agent Authentication (CLI → Agent)
The agent is a gRPC server with a unary interceptor that enforces The agent is a gRPC server with a unary interceptor that enforces
authentication on every RPC: authentication on every RPC:
@@ -132,10 +149,34 @@ authentication on every RPC:
   (`authorization: Bearer <token>`).
2. Agent extracts the token and validates it against MCIAS (cached 30s by
   SHA-256 of the token, per platform convention).
3. Agent rejects guests (`guest` role → `PERMISSION_DENIED`). All other
   authenticated users and system accounts are accepted.
4. If validation fails, the RPC returns `UNAUTHENTICATED` (invalid/expired
   token) or `PERMISSION_DENIED` (guest).
### Agent Service Authentication (Agent → Platform Services)
The agent authenticates to platform services using a long-lived system
account token (`mcp-agent`). Each service has its own token file:
| Service | Token Path | Operations |
|---------|------------|------------|
| Metacrypt | `/srv/mcp/metacrypt-token` | TLS cert provisioning (PKI issue) |
| MCNS | `/srv/mcp/mcns-token` | DNS record create/delete (any name) |
| mc-proxy | Unix socket (no auth) | Route registration/removal |
| MCR | podman auth store | Image pull (JWT-as-password) |
These tokens are issued by MCIAS for the `mcp-agent` system account.
They carry no roles — authorization is handled by each service's policy
engine:
- **Metacrypt:** Policy rule grants `mcp-agent` write access to
`engine/pki/issue`.
- **MCNS:** Code-level authorization: system account `mcp-agent` can
manage any record; other system accounts can only manage records
matching their username.
- **MCR:** Default policy allows all authenticated users to push/pull.
MCR accepts MCIAS JWTs as passwords at the `/v2/token` endpoint.
### CLI Authentication
@@ -148,6 +189,15 @@ obtained by:
The stored token is used for all subsequent agent RPCs until it expires.
### MCR Registry Authentication
`mcp build` auto-authenticates to MCR before pushing images. It reads
the CLI's stored MCIAS token and uses it as the password for `podman
login`. MCR's token endpoint accepts MCIAS JWTs as passwords (the
personal-access-token pattern), so both human and system account tokens
work. This eliminates the need for a separate interactive `podman login`
step.
---

## Services and Components
@@ -192,9 +242,13 @@ for a service by prefix and derive component names automatically
```
mcp login                          Authenticate to MCIAS, store token
mcp build <service>                Build and push images for a service
mcp build <service>/<image>        Build and push a single image
mcp deploy <service>               Deploy all components from service definition
mcp deploy <service>/<component>   Deploy a single component
mcp deploy <service> -f <file>     Deploy from explicit file
mcp undeploy <service>             Full teardown: remove routes, DNS, certs, containers
mcp stop <service>                 Stop all components, set active=false
mcp start <service>                Start all components, set active=true
mcp restart <service>              Restart all components
@@ -203,10 +257,11 @@ mcp list List services from all agents (registry,
mcp ps                             Live check: query runtime on all agents, show running
                                   containers with uptime and version
mcp status [service]               Full picture: live query + drift + recent events
mcp sync                           Push service definitions to agent; build missing
                                   images if source tree is available
mcp adopt <service>                Adopt all <service>-* containers into a service
mcp purge [service[/component]]    Remove stale registry entries (--dry-run to preview)

mcp service show <service>         Print current spec from agent registry
mcp service edit <service>         Open service definition in $EDITOR
@@ -219,6 +274,9 @@ mcp pull <service> <path> [local-file] Copy a file from /srv/<service>/<path> to
mcp node list                      List registered nodes
mcp node add <name> <address>      Register a node
mcp node remove <name>             Deregister a node

mcp agent upgrade [node]           Build, push, and restart agent on all (or one) node(s)
mcp agent status                   Show agent version on each node
```
### Service Definition Files
@@ -234,25 +292,34 @@ Example: `~/.config/mcp/services/metacrypt.toml`
name = "metacrypt"
node = "rift"
active = true
version = "v1.0.0"

[build.images]
metacrypt = "Dockerfile.api"
metacrypt-web = "Dockerfile.web"

[[components]]
name = "api"
image = "mcr.svc.mcp.metacircular.net:8443/metacrypt:latest"
network = "docker_default"
user = "0:0"
restart = "unless-stopped"
ports = ["127.0.0.1:18443:8443", "127.0.0.1:19443:9443"]
volumes = ["/srv/metacrypt:/srv/metacrypt"]

[[components.routes]]
name = "rest"
port = 8443
mode = "l4"

[[components.routes]]
name = "grpc"
port = 9443
mode = "l4"

[[components]]
name = "web"
image = "mcr.svc.mcp.metacircular.net:8443/metacrypt-web:latest"
network = "docker_default"
user = "0:0"
restart = "unless-stopped"
ports = ["127.0.0.1:18080:8080"]
volumes = ["/srv/metacrypt:/srv/metacrypt"]
cmd = ["server", "--config", "/srv/metacrypt/metacrypt.toml"]

[[components.routes]]
port = 443
mode = "l7"
```
### Active State
@@ -286,6 +353,12 @@ chain:
If neither exists (first deploy, no file), the deploy fails with an error
telling the operator to create a service definition.
Before pushing to the agent, the CLI checks that each component's image
tag exists in the registry. If a tag is missing and a `[build]` section
is configured, the CLI builds and pushes the image automatically (same
logic as `mcp sync` auto-build, described below). This makes `mcp deploy`
a single command for the bump-build-push-deploy workflow.
The CLI pushes the resolved spec to the agent. The agent records it in its
registry and executes the deploy. The service definition file on disk is
**not** modified -- it represents the operator's declared intent, not the
@@ -333,6 +406,83 @@ Service definition files can be:
- **Generated by converting from mcdeploy.toml** during initial MCP
  migration (one-time).
### Build Configuration
Service definitions include a `[build]` section that tells MCP how to
build container images from source. This replaces the standalone
`mcdeploy.toml` -- MCP owns the full build-push-deploy lifecycle.
Top-level build fields:
| Field | Purpose |
|-------|---------|
| `path` | Source directory relative to the workspace root |
| `build.uses_mcdsl` | Whether the mcdsl module is needed at build time |
| `build.images.<name>` | Maps each image name to its Dockerfile path |
The workspace root is configured in `~/.config/mcp/mcp.toml`:
```toml
[build]
workspace = "~/src/metacircular"
```
A service with `path = "mcr"` resolves to `~/src/metacircular/mcr`. The
convention assumes `~/src/metacircular/<path>` on operator workstations
(vade, orion). The workspace path can be overridden but the convention
should hold for all standard machines.
### Build and Release Workflow
The standard release workflow for a service:
1. **Tag** the release in git (`git tag -a v1.1.0`).
2. **Build** the images: `mcp build <service>` reads the service
definition, locates the source tree via `path`, and runs `docker
build` using each Dockerfile in `[build.images]`. Images are tagged
with the version from the component `image` field and pushed to MCR.
3. **Update** the service definition: bump the version tag in each
component's `image` field.
4. **Deploy**: `mcp sync` or `mcp deploy <service>`.
#### `mcp build` Resolution
`mcp build <service>` does the following:
1. Read the service definition to find `[build.images]` and `path`.
2. Resolve the source tree: `<workspace>/<path>`.
3. For each image in `[build.images]`:
a. Build with the Dockerfile at `<source>/<dockerfile>`.
b. If `uses_mcdsl = true`, include the mcdsl directory in the build
context (or use a multi-module build strategy).
c. Tag as `<registry>/<image>:<version>` (version extracted from the
matching component's `image` field).
d. Push to MCR.
#### `mcp sync` Auto-Build
`mcp sync` pushes service definitions to agents. Before deploying, it
checks that each component's image tag exists in the registry:
- **Tag exists** → proceed with deploy.
- **Tag missing, source tree available** → build and push automatically,
then deploy.
- **Tag missing, no source tree** → fail with error:
`"mcr:v1.1.0 not found in registry and no source tree at ~/src/metacircular/mcr"`.
This ensures `mcp sync` is a single command for the common case (tag,
update version, sync) while failing clearly when the build environment
is not available.
#### Image Versioning
Service definitions MUST pin explicit version tags (e.g., `v1.1.0`),
never `:latest`. This ensures:
- `mcp status` shows the actual running version.
- Deployments are reproducible.
- Rollbacks are explicit (change the tag back to the previous version).
---

## Agent
@@ -357,6 +507,7 @@ import "google/protobuf/timestamp.proto";
service McpAgent {
  // Service lifecycle
  rpc Deploy(DeployRequest) returns (DeployResponse);
  rpc UndeployService(UndeployRequest) returns (UndeployResponse);
  rpc StopService(ServiceRequest) returns (ServiceResponse);
  rpc StartService(ServiceRequest) returns (ServiceResponse);
  rpc RestartService(ServiceRequest) returns (ServiceResponse);
@@ -566,6 +717,29 @@ The agent runs as a dedicated `mcp` system user. Podman runs rootless under
this user. All containers are owned by `mcp`. The NixOS configuration
provisions the `mcp` user with podman access.
#### Runtime Interface
The `runtime.Runtime` interface abstracts the container runtime. The agent
(and the CLI, for build operations) use it for all container operations.
| Method | Used by | Purpose |
|--------|---------|---------|
| `Pull(image)` | Agent | `podman pull <image>` |
| `Run(spec)` | Agent | `podman run -d ...` |
| `Stop(name)` | Agent | `podman stop <name>` |
| `Remove(name)` | Agent | `podman rm <name>` |
| `Inspect(name)` | Agent | `podman inspect <name>` |
| `List()` | Agent | `podman ps -a` |
| `Build(image, contextDir, dockerfile)` | CLI | `podman build -t <image> -f <dockerfile> <contextDir>` |
| `Push(image)` | CLI | `podman push <image>` |
| `ImageExists(image)` | CLI | `podman manifest inspect docker://<image>` (checks remote registry) |
The first six methods are used by the agent during deploy and monitoring.
The last three are used by the CLI during `mcp build` and `mcp deploy`
auto-build. They are on the same interface because the CLI uses the local
podman installation directly -- no gRPC RPC needed, since builds happen
on the operator's workstation, not on the deployment node.
#### Deploy Flow

When the agent receives a `Deploy` RPC:
@@ -595,6 +769,40 @@ The flags passed to `podman run` are derived from the `ComponentSpec`:
| `volumes` | `-v <mapping>` (repeated) |
| `cmd` | appended after the image name |
#### Undeploy Flow
`mcp undeploy <service>` is the full inverse of deploy. It tears down all
infrastructure associated with a service. When the agent receives an
`UndeployService` RPC:
1. For each component:
a. Remove mc-proxy routes (traffic stops flowing).
b. Remove DNS A records from MCNS.
c. Remove TLS certificate and key files from the mc-proxy cert
directory (for L7 routes).
d. Stop and remove the container.
e. Release allocated host ports back to the port allocator.
f. Update component state to `removed` in the registry.
2. Mark the service as inactive.
3. Return success/failure per component.
The CLI also sets `active = false` in the local service definition file
to keep it in sync with the operator's intent.
Undeploy differs from `stop` in four ways:
| Aspect | `stop` | `undeploy` |
|--------|--------|-----------|
| Container | Stopped (still exists) | Stopped and removed |
| TLS certs | Kept | Removed |
| Ports | Kept allocated | Released |
| Service active | Unchanged | Set to inactive |
After undeploy, the service can be redeployed with `mcp deploy`. The
registry entries are preserved (desired state `removed`) so `mcp status`
and `mcp list` still show the service existed. Use `mcp purge` to clean
up the registry entries if desired.
### File Transfer

The agent supports single-file push and pull, scoped to a specific
@@ -989,20 +1197,84 @@ The agent's data directory follows the platform convention:
### Agent Deployment (on nodes) ### Agent Deployment (on nodes)
#### Provisioning (one-time per node)

Each node needs a one-time setup before the agent can run. The steps are
the same regardless of OS, but the mechanism differs:

1. Create `mcp` system user with podman access and subuid/subgid ranges.
2. Set `/srv/` ownership to the `mcp` user (the agent creates and manages
   `/srv/<service>/` directories for all services).
3. Create `/srv/mcp/` directory and config file.
4. Provision TLS certificate from Metacrypt.
5. Create an MCIAS system account for the agent (`mcp-agent`).
6. Install the initial `mcp-agent` binary to `/srv/mcp/mcp-agent`.
7. Install and start the systemd unit.

On **NixOS** (rift), provisioning is declarative via the NixOS config.
The NixOS config owns the infrastructure (user, systemd unit, podman,
directories, permissions) but **not** the binary. `ExecStart` points to
`/srv/mcp/mcp-agent`, a mutable path that MCP manages. NixOS may
bootstrap the initial binary there, but subsequent updates come from MCP.

On **Debian** (hyperborea, svc), provisioning is done via a setup script
or ansible playbook that creates the same layout.
#### Binary Location
The agent binary lives at `/srv/mcp/mcp-agent` on **all** nodes,
regardless of OS. This unifies the update mechanism across the fleet.
#### Agent Upgrades
After initial provisioning, the agent binary is updated via
`mcp agent upgrade`. The CLI:
1. Cross-compiles the agent for each target architecture
(`GOARCH=amd64` for rift/svc, `GOARCH=arm64` for hyperborea).
2. SSHs to each node, pushes the binary to `/srv/mcp/mcp-agent.new`.
3. Atomically swaps the binary (`mv mcp-agent.new mcp-agent`).
4. Restarts the systemd service (`systemctl restart mcp-agent`).
SSH is used instead of gRPC because:
- It works even when the agent is broken or has an incompatible version.
- The binary is ~17MB, which exceeds gRPC default message limits.
- No self-restart coordination needed.
The CLI uses `golang.org/x/crypto/ssh` for native SSH, keeping the
entire workflow in a single binary with no external tool dependencies.
#### Node Configuration
Node config includes SSH and architecture info for agent management:
```toml
[[nodes]]
name = "rift"
address = "100.95.252.120:9444"
ssh = "rift" # SSH host (from ~/.ssh/config or hostname)
arch = "amd64" # GOARCH for cross-compilation
[[nodes]]
name = "hyperborea"
address = "100.x.x.x:9444"
ssh = "hyperborea"
arch = "arm64"
```
#### Coordinated Upgrades
New MCP releases often add new RPCs. A CLI at v0.6.0 calling an agent
at v0.5.0 fails with `Unimplemented`. Therefore agent upgrades must be
coordinated: `mcp agent upgrade` (with no node argument) upgrades all
nodes before the CLI is used for other operations.
If a node fails to upgrade, it is reported but the others still proceed.
The operator can retry or investigate via SSH.
#### Systemd Unit
The systemd unit is the same on all nodes:
```ini
[Unit]
@@ -1012,7 +1284,7 @@ Wants=network-online.target
[Service]
Type=simple
ExecStart=/srv/mcp/mcp-agent server --config /srv/mcp/mcp-agent.toml
Restart=on-failure
RestartSec=5
@@ -1020,17 +1292,14 @@ User=mcp
Group=mcp
NoNewPrivileges=true
ProtectSystem=full
ProtectHome=false
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
RestrictSUIDSGID=true
LockPersonality=true
RestrictRealtime=true
ReadWritePaths=/srv
@@ -1040,6 +1309,7 @@ WantedBy=multi-user.target
Note: `ReadWritePaths=/srv` (not `/srv/mcp`) because the agent writes
files to any service's `/srv/<service>/` directory on behalf of the CLI.
`ProtectHome=false` because the `mcp` user's home is `/srv/mcp`.
### CLI Installation (on operator workstation) ### CLI Installation (on operator workstation)
@@ -1084,6 +1354,102 @@ container, the effective host UID depends on the mapping. Files in
configuration should provision appropriate subuid/subgid ranges when
creating the `mcp` user.
**Dockerfile convention**: Do not use `USER`, `VOLUME`, or `adduser`
directives in production Dockerfiles. The `user` field in the service
definition (typically `"0:0"`) controls the runtime user, and host
volumes provide the data directories. A non-root `USER` in the
Dockerfile maps to a subordinate UID under rootless podman that cannot
access files owned by the `mcp` user on the host.
#### Infrastructure Boot Order and Circular Dependencies
MCR (container registry) and MCNS (DNS) are both deployed as containers
via MCP, but MCP itself depends on them:
- **MCR** is reachable through mc-proxy (L4 passthrough on `:8443`).
The agent pulls images from MCR during `mcp deploy`.
- **MCNS** serves DNS for internal zones. Tailscale and the overlay
network depend on DNS resolution.
This creates circular dependencies during cold-start or recovery:
```
mcp deploy → agent pulls image → needs MCR → needs mc-proxy
mcp deploy → agent dials MCR → DNS resolves hostname → needs MCNS
```
**Cold-start procedure** (no containers running):
1. **Build images on the operator workstation** for mc-proxy, MCR, and
MCNS. Transfer to rift via `podman save` / `scp` / `podman load`
since the registry is not yet available:
```
docker save <image> -o /tmp/image.tar
scp /tmp/image.tar <rift-lan-ip>:/tmp/
# on rift, as mcp user:
podman load -i /tmp/image.tar
```
Use the LAN IP for scp, not a DNS name (DNS is not running yet).
2. **Start MCNS first** (DNS must come up before anything that resolves
hostnames). Run directly with podman since the MCP agent cannot reach
the registry yet:
```
podman run -d --name mcns --restart unless-stopped \
--sysctl net.ipv4.ip_unprivileged_port_start=53 \
-p <lan-ip>:53:53/tcp -p <lan-ip>:53:53/udp \
-p <overlay-ip>:53:53/tcp -p <overlay-ip>:53:53/udp \
-v /srv/mcns:/srv/mcns \
<mcns-image> server --config /srv/mcns/mcns.toml
```
3. **Start mc-proxy** (registry traffic routes through it):
```
podman run -d --name mc-proxy --network host \
--restart unless-stopped \
-v /srv/mc-proxy:/srv/mc-proxy \
<mc-proxy-image> server --config /srv/mc-proxy/mc-proxy.toml
```
4. **Start MCR** (API server, then web UI):
```
podman run -d --name mcr-api --network mcpnet \
--restart unless-stopped \
-p 127.0.0.1:28443:8443 -p 127.0.0.1:29443:9443 \
-v /srv/mcr:/srv/mcr \
<mcr-image> server --config /srv/mcr/mcr.toml
```
5. **Push images to MCR** from the operator workstation now that the
registry is reachable:
```
docker push <registry>/<image>:<tag>
```
6. **Start the MCP agent** (systemd service). It can now reach MCR for
image pulls.
7. **`mcp adopt`** the manually-started containers to bring them under
MCP management. Then `mcp service export` to generate service
definition files.
From this point, `mcp deploy` works normally. The manually-started
containers are replaced by MCP-managed ones on the next deploy.
**Recovery procedure** (mc-proxy or MCNS crashed):
If mc-proxy or MCNS goes down, the agent cannot pull images (registry
unreachable or DNS broken). Recovery:
1. Check if the required image is cached locally:
`podman images | grep <service>`
2. If cached, start the container directly with `podman run` (same
flags as the cold-start procedure above).
3. If not cached, transfer the image from the operator workstation via
`podman save` / `scp` / `podman load` using the LAN IP.
4. Once the infrastructure service is running, `mcp deploy` resumes
normal operation for other services.
---

## Security Model
@@ -1133,6 +1499,7 @@ mcp/
│   ├── mcp/                 CLI
│   │   ├── main.go
│   │   ├── login.go
│   │   ├── build.go         build and push images
│   │   ├── deploy.go
│   │   ├── lifecycle.go     stop, start, restart
│   │   ├── status.go        list, ps, status
@@ -1195,6 +1562,147 @@ mcp/
---
## Registry Cleanup: Purge
### Problem
The agent's registry accumulates stale entries over time. A component
that was replaced (e.g., `mcns/coredns` → `mcns/mcns`) or a service
that was decommissioned remains in the registry indefinitely with
`observed=removed` or `observed=unknown`. There is no mechanism to tell
the agent "this component no longer exists and should not be tracked."
This causes:
- Perpetual drift alerts for components that will never return.
- Noise in `mcp status` and `mcp list` output.
- Confusion about what the agent is actually responsible for.
The existing `mcp sync` compares local service definitions against the
agent's registry and updates desired state for components that are
defined. But it does not remove components or services that are *absent*
from the local definitions — sync is additive, not declarative.
### Design: `mcp purge`
Purge removes registry entries that are both **unwanted** (not in any
current service definition) and **gone** (no corresponding container in
the runtime). It is the garbage collector for the registry.
```
mcp purge [--dry-run] Purge all stale entries
mcp purge <service> [--dry-run] Purge stale entries for one service
mcp purge <service>/<component> [--dry-run] Purge a specific component
```
#### Semantics
Purge operates on the agent's registry, not on containers. It never
stops or removes running containers. The rules:
1. **Component purge**: a component is eligible for purge when:
- Its observed state is `removed`, `unknown`, or `exited`, AND
- It is not present in any current service definition file
(i.e., `mcp sync` would not recreate it).
Purging a component deletes its registry entry (from `components`,
`component_ports`, `component_volumes`, `component_cmd`) and its
event history.
2. **Service purge**: a service is eligible for purge when all of its
components have been purged (or it has no components). Purging a
service deletes its `services` row.
3. **Safety**: purge refuses to remove a component whose observed state
is `running` or `stopped` (i.e., a container still exists in the
runtime). This prevents accidentally losing track of live containers.
The operator must `mcp stop` and wait for the container to be removed
before purging, or manually remove it via podman.
4. **Dry run**: `--dry-run` lists what would be purged without modifying
the registry. This is the default-safe way to preview the operation.
#### Interaction with Sync
`mcp sync` pushes desired state from service definitions. `mcp purge`
removes entries that sync would never touch. They are complementary:
- `sync` answers: "what should exist?" (additive)
- `purge` answers: "what should be forgotten?" (subtractive)
A full cleanup is: `mcp sync && mcp purge`.
An alternative design would make `mcp sync` itself remove entries not
present in service definitions (fully declarative sync). This was
rejected because:
- Sync currently only operates on services that have local definition
files. A service without a local file is left untouched — this is
desirable when multiple operators or workstations manage different
services.
- Making sync destructive increases the blast radius of a missing file
(accidentally deleting the local `mcr.toml` would cause sync to
purge MCR from the registry).
- Purge as a separate, explicit command with `--dry-run` gives the
operator clear control over what gets cleaned up.
#### Agent RPC
```protobuf
rpc PurgeComponent(PurgeRequest) returns (PurgeResponse);
message PurgeRequest {
string service = 1; // service name (empty = all services)
string component = 2; // component name (empty = all eligible in service)
bool dry_run = 3; // preview only, do not modify registry
}
message PurgeResponse {
repeated PurgeResult results = 1;
}
message PurgeResult {
string service = 1;
string component = 2;
bool purged = 3; // true if removed (or would be, in dry-run)
string reason = 4; // why eligible, or why refused
}
```
The CLI sends the set of currently-defined service/component names
alongside the purge request so the agent can determine what is "not in
any current service definition" without needing access to the CLI's
filesystem.
#### Example
After replacing `mcns/coredns` with `mcns/mcns`:
```
$ mcp purge --dry-run
would purge mcns/coredns (observed=removed, not in service definitions)
$ mcp purge
purged mcns/coredns
$ mcp status
SERVICE COMPONENT DESIRED OBSERVED VERSION
mc-proxy mc-proxy running running latest
mcns mcns running running v1.0.0
mcr api running running latest
mcr web running running latest
metacrypt api running running latest
metacrypt web running running latest
```
#### Interaction with Adopt
Purge also cleans up after the `mcp adopt` workflow. When containers are
adopted and later removed (replaced by a proper deploy), the adopted
entries linger. Purge removes them once the containers are gone and the
service definition no longer references them.
---
## Future Work (v2+)

These are explicitly out of scope for v1 but inform the design:


@@ -12,6 +12,21 @@ MCP has two components:
Services have one or more components (containers). Container naming: `<service>-<component>`.
## v2 Development (Multi-Node)
MCP v2 extends the single-node agent model to a multi-node fleet with a central master process. See the root repo's `docs/phase-e-plan.md` and `docs/architecture-v2.md` for the full design.
**Current state:**
- **svc** is operational as an edge node (manages mc-proxy routing only, no containers)
- **rift** runs the agent with full container management
- **orion** is provisioned but offline for maintenance
**Key v2 concepts (in development):**
- **mcp-master** — central orchestrator on rift. Accepts CLI commands, dispatches to agents, maintains node registry, coordinates edge routing.
- **Agent self-registration** — agents register with the master on startup (name, role, address, arch). No static node config required after bootstrap.
- **Tier-based placement** — `tier = "core"` runs on the master node, `tier = "worker"` (default) is auto-placed on a worker with capacity, `node = "<name>"` overrides for pinned services.
- **Edge routing** — `public = true` on routes declares intent; the master assigns the route to an edge node (currently svc).
## Build Commands

```bash
@@ -33,7 +48,7 @@ Run a single test: `go test ./internal/registry/ -run TestComponentCRUD`
- `cmd/mcp/` — CLI entry point
- `cmd/mcp-agent/` — Agent entry point
- `internal/agent/` — Agent core (deploy, lifecycle, sync, adopt, status, files, edge_rpc, dns, proxy, certs)
- `internal/runtime/` — Container runtime abstraction (podman)
- `internal/registry/` — SQLite registry (services, components, events)
- `internal/monitor/` — Monitoring subsystem (watch loop, alerting)
@@ -55,4 +70,4 @@ Run a single test: `go test ./internal/registry/ -run TestComponentCRUD`
## Module Path

`git.wntrmute.dev/mc/mcp`

Dockerfile.master Normal file

@@ -0,0 +1,22 @@
FROM golang:1.25-alpine AS builder
ARG VERSION=dev
WORKDIR /build
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
-o /mcp-master ./cmd/mcp-master
FROM alpine:3.21
RUN apk add --no-cache ca-certificates tzdata
COPY --from=builder /mcp-master /usr/local/bin/mcp-master
WORKDIR /srv/mcp-master
EXPOSE 9555
ENTRYPOINT ["mcp-master"]
CMD ["server", "--config", "/srv/mcp-master/mcp-master.toml"]


@@ -8,6 +8,9 @@ mcp:
mcp-agent:
	CGO_ENABLED=0 go build $(LDFLAGS) -o mcp-agent ./cmd/mcp-agent

mcp-master:
	CGO_ENABLED=0 go build $(LDFLAGS) -o mcp-master ./cmd/mcp-master

build:
	go build ./...
@@ -21,15 +24,20 @@ lint:
	golangci-lint run ./...

proto:
	protoc --go_out=. --go_opt=module=git.wntrmute.dev/mc/mcp \
		--go-grpc_out=. --go-grpc_opt=module=git.wntrmute.dev/mc/mcp \
		proto/mcp/v1/*.proto

proto-lint:
	buf lint
	buf breaking --against '.git#branch=master,subdir=proto'

docker-master:
	podman build -f Dockerfile.master \
		--build-arg VERSION=$(shell git describe --tags --always --dirty) \
		-t mcr.svc.mcp.metacircular.net:8443/mcp-master:$(shell git describe --tags --always --dirty) .

clean:
	rm -f mcp mcp-agent mcp-master

all: vet lint test mcp mcp-agent mcp-master


@@ -47,5 +47,109 @@
## Phase 5: Integration and Polish
- [ ] **P5.1** Integration test suite
- [x] **P5.2** Bootstrap procedure — documented in `docs/bootstrap.md`
- [x] **P5.3** Documentation — CLAUDE.md, README.md, RUNBOOK.md
## Phase 6: Deployment (completed 2026-03-26)
- [x] **P6.1** NixOS config for mcp user (rootless podman, subuid/subgid, systemd service)
- [x] **P6.2** TLS cert provisioned from Metacrypt (DNS + IP SANs)
- [x] **P6.3** MCIAS system account (mcp-agent with admin role)
- [x] **P6.4** Container migration (metacrypt, mc-proxy, mcr, mcns → mcp user)
- [x] **P6.5** MCP bootstrap (adopt, sync, export service definitions)
- [x] **P6.6** Service definitions completed with full container specs
## Deployment Bugs Fixed During Rollout
- podman ps JSON: `Command` field is `[]string` not `string`
- Container name handling: `splitContainerName` naive split broke `mc-proxy`
→ extracted `ContainerNameFor`/`SplitContainerName` with registry-aware lookup
- CLI default config path: `~/.config/mcp/mcp.toml`
- Token file whitespace: trim newlines before sending in gRPC metadata
- NixOS systemd sandbox: `ProtectHome` blocks `/run/user`, `ProtectSystem=strict`
blocks podman runtime dir → relaxed to `ProtectSystem=full`, `ProtectHome=false`
- Agent needs `PATH`, `HOME`, `XDG_RUNTIME_DIR` in systemd environment
## Platform Evolution (see PLATFORM_EVOLUTION.md)
### Phase A — COMPLETE (2026-03-27)
- [x] Route declarations in service definitions (`[[components.routes]]`)
- [x] Automatic port allocation by agent (10000-60000, mutex-serialized)
- [x] `$PORT` / `$PORT_<NAME>` env var injection into containers
- [x] Proto: `RouteSpec` message, `routes` + `env` on `ComponentSpec`
- [x] Registry: `component_routes` table with `host_port` tracking
- [x] Backward compatible: old-style `ports` strings still work
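The mutex-serialized port allocation above can be sketched roughly as follows (a minimal in-memory sketch under stated assumptions — the real agent persists allocations in the registry, and these type and function names are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

// allocator hands out host ports in [10000, 60000). The mutex
// serializes allocations so concurrent deploys never race on the
// same port; used tracks ports already handed out.
type allocator struct {
	mu   sync.Mutex
	next int
	used map[int]bool
}

func newAllocator() *allocator {
	return &allocator{next: 10000, used: make(map[int]bool)}
}

// Allocate returns the next free port, or an error when the range
// is exhausted.
func (a *allocator) Allocate() (int, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for p := a.next; p < 60000; p++ {
		if !a.used[p] {
			a.used[p] = true
			a.next = p + 1
			return p, nil
		}
	}
	return 0, fmt.Errorf("port range exhausted")
}

func main() {
	a := newAllocator()
	p1, _ := a.Allocate()
	p2, _ := a.Allocate()
	fmt.Println(p1, p2) // → 10000 10001
}
```

The allocated port is what gets injected into the container as `$PORT` / `$PORT_<NAME>`.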
### Phase B — COMPLETE (2026-03-27)
- [x] Agent connects to mc-proxy via Unix socket on deploy
- [x] Agent calls `AddRoute` to register routes with mc-proxy
- [x] Agent calls `RemoveRoute` on service stop/teardown
- [x] Agent config: `[mcproxy] socket` and `cert_dir` fields
- [x] TLS certs: pre-provisioned at convention path (Phase C automates)
- [x] Nil-safe: if socket not configured, route registration silently skipped
## Remaining Work
### Operational — Next Priority
- [ ] **MCR auth for mcp user** — podman pull from MCR requires OCI token
auth. Currently using image save/load workaround. Need either: OCI token
flow support in the agent, or podman login with service account credentials.
- [ ] **Vade DNS routing** — Tailscale MagicDNS intercepts `*.svc.mcp.metacircular.net`
queries on vade, preventing hostname-based TLS connections. CLI currently
uses IP address directly. Fix: Tailscale DNS configuration or split-horizon
setup on vade.
- [ ] **Service export completeness** — `mcp service export` only captures
name + image from the registry. Should include full spec (network, ports,
volumes, user, restart, cmd). Requires the agent's `ListServices` response
to include full `ComponentSpec` data, not just `ComponentInfo`.
### Quality
- [ ] **P5.1** Integration test suite — end-to-end CLI → agent → podman tests
- [ ] **P5.2** Bootstrap procedure test — documented and verified
- [ ] **README.md** — quick-start guide
- [ ] **RUNBOOK.md** — operational procedures (unseal metacrypt, restart
services, disaster recovery)
### Design
- [ ] **Self-management** — how MCP updates mc-proxy and its own agent without
circular dependency. Likely answer: NixOS manages the agent and mc-proxy
binaries; MCP manages their containers. Or: staged restart with health
checks.
- [ ] **ARCHITECTURE.md proto naming** — update spec to match buf-lint-compliant
message names (StopServiceRequest vs ServiceRequest, AdoptContainers vs
AdoptContainer).
- [ ] **mcdsl DefaultPath helper** — `DefaultPath(name) string` for consistent
config file discovery across all services. Root: /srv, /etc. User: XDG, /srv.
- [ ] **Engineering standards update** — document REST+gRPC parity exception
for infrastructure services (MCP agent).
### Infrastructure
- [ ] **Certificate renewal** — MCP-managed cert renewal before expiry.
Agent cert expires 2026-06-24. Need automated renewal via Metacrypt ACME
or REST API.
- [ ] **Monitor alerting** — configure alert_command on rift (ntfy, webhook,
or custom script) for drift/flap notifications.
- [ ] **Backup timer** — install mcp-agent-backup timer via NixOS config.
## Current State (2026-03-26)
MCP is deployed and operational on rift. The agent runs as a systemd service
under the `mcp` user with rootless podman. All platform services (metacrypt,
mc-proxy, mcr, mcns) are managed by MCP with complete service definitions.
```
$ mcp status
SERVICE COMPONENT DESIRED OBSERVED VERSION
mc-proxy mc-proxy running running latest
mcns coredns running running 1.12.1
mcr api running running latest
mcr web running running latest
metacrypt api running running latest
metacrypt web running running latest
```


@@ -32,7 +32,7 @@ else builds on.
structure, and configure tooling.
**Deliverables:**
- `go.mod` with module path `git.wntrmute.dev/mc/mcp`
- `Makefile` with standard targets (build, test, vet, lint, proto,
proto-lint, clean, all)
- `.golangci.yaml` with platform-standard linter config

README.md (new file)

@@ -0,0 +1,119 @@
# MCP — Metacircular Control Plane
MCP is the orchestrator for the [Metacircular](https://metacircular.net)
platform. It manages container lifecycle, tracks what services run where,
and transfers files between the operator's workstation and managed nodes.
## Architecture
**CLI** (`mcp`) — thin client on the operator's workstation. Reads local
service definition files, pushes intent to agents, queries status.
**Agent** (`mcp-agent`) — per-node daemon. Manages containers via rootless
podman, stores a SQLite registry of desired/observed state, monitors for
drift, and alerts the operator.
## Quick Start
### Build
```bash
make all # vet, lint, test, build
make mcp # CLI only
make mcp-agent # agent only
```
### Install the CLI
```bash
cp mcp ~/.local/bin/
mkdir -p ~/.config/mcp/services
```
Create `~/.config/mcp/mcp.toml`:
```toml
[services]
dir = "/home/<user>/.config/mcp/services"
[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcp"
[auth]
token_path = "/home/<user>/.config/mcp/token"
[[nodes]]
name = "rift"
address = "100.95.252.120:9444"
```
### Authenticate
```bash
mcp login
```
### Check status
```bash
mcp status # full picture: services, drift, events
mcp ps # live container check with uptime
mcp list # quick registry query
```
### Deploy a service
Write a service definition in `~/.config/mcp/services/<name>.toml`:
```toml
name = "myservice"
node = "rift"
active = true
[[components]]
name = "api"
image = "mcr.svc.mcp.metacircular.net:8443/myservice:v1.0.0"
network = "mcpnet"
user = "0:0"
restart = "unless-stopped"
ports = ["127.0.0.1:8443:8443"]
volumes = ["/srv/myservice:/srv/myservice"]
cmd = ["server", "--config", "/srv/myservice/myservice.toml"]
```
Then deploy:
```bash
mcp deploy myservice
```
## Commands
| Command | Description |
|---------|-------------|
| `mcp login` | Authenticate to MCIAS |
| `mcp deploy <service>[/<component>]` | Deploy from service definition |
| `mcp stop <service>` | Stop all components |
| `mcp start <service>` | Start all components |
| `mcp restart <service>` | Restart all components |
| `mcp list` | List services (registry) |
| `mcp ps` | Live container check |
| `mcp status [service]` | Full status with drift and events |
| `mcp sync` | Push all service definitions |
| `mcp adopt <service>` | Adopt running containers |
| `mcp service show <service>` | Print spec from agent |
| `mcp service edit <service>` | Edit definition in $EDITOR |
| `mcp service export <service>` | Export agent spec to file |
| `mcp push <file> <service> [path]` | Push file to node |
| `mcp pull <service> <path> [file]` | Pull file from node |
| `mcp node list` | List nodes |
| `mcp node add <name> <addr>` | Add a node |
| `mcp node remove <name>` | Remove a node |
## Documentation
- [ARCHITECTURE.md](ARCHITECTURE.md) — design specification
- [RUNBOOK.md](RUNBOOK.md) — operational procedures
- [PROJECT_PLAN_V1.md](PROJECT_PLAN_V1.md) — implementation plan
- [PROGRESS_V1.md](PROGRESS_V1.md) — progress and remaining work

RUNBOOK.md (new file)

@@ -0,0 +1,305 @@
# MCP Runbook
Operational procedures for the Metacircular Control Plane. Written for
operators at 3 AM.
## Service Overview
MCP manages container lifecycle on Metacircular nodes. Two components:
- **mcp-agent** — systemd service on each node (rift). Manages containers
via rootless podman, stores registry in SQLite, monitors for drift.
- **mcp** — CLI on the operator's workstation (vade). Pushes desired state,
queries status.
## Health Checks
### Quick status
```bash
mcp status
```
Shows all services, desired vs observed state, drift, and recent events.
No drift = healthy.
### Agent process
```bash
ssh rift "doas systemctl status mcp-agent"
ssh rift "doas journalctl -u mcp-agent --since '10 min ago' --no-pager"
```
### Individual service
```bash
mcp status metacrypt
```
## Common Operations
### Check what's running
```bash
mcp ps # live check with uptime
mcp list # from registry (no runtime query)
mcp status # full picture with drift and events
```
### Restart a service
```bash
mcp restart metacrypt
```
Restarts all components. Does not change the `active` flag. Metacrypt
will need to be unsealed after restart.
### Stop a service
```bash
mcp stop metacrypt
```
Sets `active = false` in the service definition file and stops all
containers. The agent will not restart them.
### Start a stopped service
```bash
mcp start metacrypt
```
Sets `active = true` and starts all containers.
### Deploy an update
Edit the service definition to update the image tag, then deploy:
```bash
mcp service edit metacrypt # opens in $EDITOR
mcp deploy metacrypt # deploys all components
mcp deploy metacrypt/web # deploy just the web component
```
### Push a config file to a node
```bash
mcp push metacrypt.toml metacrypt # → /srv/metacrypt/metacrypt.toml
mcp push cert.pem metacrypt certs/cert.pem # → /srv/metacrypt/certs/cert.pem
```
### Pull a file from a node
```bash
mcp pull metacrypt metacrypt.toml ./local-copy.toml
```
### Sync desired state
Push all service definitions to the agent without deploying:
```bash
mcp sync
```
### View service definition
```bash
mcp service show metacrypt # from agent registry
cat ~/.config/mcp/services/metacrypt.toml # local file
```
### Export service definition from agent
```bash
mcp service export metacrypt
```
Writes the agent's current spec to the local service definition file.
## Unsealing Metacrypt
Metacrypt starts sealed after any restart. Unseal via the API:
```bash
curl -sk -X POST https://metacrypt.svc.mcp.metacircular.net:8443/v1/unseal \
-H "Content-Type: application/json" \
-d '{"password":"<unseal-password>"}'
```
Or via the web UI at `https://metacrypt.svc.mcp.metacircular.net`.
**Important:** Restarting metacrypt-api requires unsealing. To avoid this
when updating just the UI, deploy only the web component:
```bash
mcp deploy metacrypt/web
```
## Agent Management
### Restart the agent
```bash
ssh rift "doas systemctl restart mcp-agent"
```
Containers keep running — the agent is stateless w.r.t. container
lifecycle. Podman's restart policy keeps containers up.
### View agent logs
```bash
ssh rift "doas journalctl -u mcp-agent -f" # follow
ssh rift "doas journalctl -u mcp-agent --since today" # today's logs
```
### Agent database backup
```bash
ssh rift "doas -u mcp /usr/local/bin/mcp-agent snapshot --config /srv/mcp/mcp-agent.toml"
```
Backups go to `/srv/mcp/backups/`.
### Update the agent binary
```bash
# On vade, in the mcp repo:
make clean && make mcp-agent
scp mcp-agent rift:/tmp/
ssh rift "doas systemctl stop mcp-agent && \
doas cp /tmp/mcp-agent /usr/local/bin/mcp-agent && \
doas systemctl start mcp-agent"
```
### Update the CLI binary
```bash
make clean && make mcp
cp mcp ~/.local/bin/
```
## Node Management
### List nodes
```bash
mcp node list
```
### Add a node
```bash
mcp node add <name> <address:port>
```
### Remove a node
```bash
mcp node remove <name>
```
## TLS Certificate Renewal
The agent's TLS cert is at `/srv/mcp/certs/cert.pem`. Check expiry:
```bash
ssh rift "openssl x509 -in /srv/mcp/certs/cert.pem -noout -enddate"
```
To renew (requires a Metacrypt token):
```bash
export METACRYPT_TOKEN="<token>"
ssh rift "curl -sk -X POST https://127.0.0.1:18443/v1/engine/request \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer $METACRYPT_TOKEN' \
-d '{
\"mount\": \"pki\",
\"operation\": \"issue\",
\"path\": \"web\",
\"data\": {
\"issuer\": \"web\",
\"common_name\": \"mcp-agent.svc.mcp.metacircular.net\",
\"profile\": \"server\",
\"dns_names\": [\"mcp-agent.svc.mcp.metacircular.net\"],
\"ip_addresses\": [\"100.95.252.120\", \"192.168.88.181\"],
\"ttl\": \"2160h\"
}
}'" > /tmp/cert-response.json
# Extract and install cert+key from the JSON response, then:
ssh rift "doas systemctl restart mcp-agent"
```
## Incident Procedures
### Service not running (drift detected)
1. `mcp status` — identify which service/component drifted.
2. Check agent logs: `ssh rift "doas journalctl -u mcp-agent --since '10 min ago'"`
3. Check container logs: `ssh rift "doas -u mcp podman logs <container-name>"`
4. Restart: `mcp restart <service>`
5. If metacrypt: unseal after restart.
### Agent unreachable
1. Check if the agent process is running: `ssh rift "doas systemctl status mcp-agent"`
2. If stopped: `ssh rift "doas systemctl start mcp-agent"`
3. Check logs for crash reason: `ssh rift "doas journalctl -u mcp-agent -n 50"`
4. Containers keep running independently — podman's restart policy handles them.
### Token expired
MCP CLI shows `UNAUTHENTICATED` or `PERMISSION_DENIED`:
1. Check token: the mcp-agent service account token is at `~/.config/mcp/token`
2. Validate: `curl -sk -X POST -H "Authorization: Bearer $(cat ~/.config/mcp/token)" https://mcias.metacircular.net:8443/v1/token/validate`
3. If expired: generate a new service account token from MCIAS admin dashboard.
### Database corruption
The agent's SQLite database is at `/srv/mcp/mcp.db`:
1. Stop the agent: `ssh rift "doas systemctl stop mcp-agent"`
2. Restore from backup: `ssh rift "doas -u mcp cp /srv/mcp/backups/<latest>.db /srv/mcp/mcp.db"`
3. Start the agent: `ssh rift "doas systemctl start mcp-agent"`
4. Run `mcp sync` to re-push desired state.
If no backup exists, delete the database and re-bootstrap:
1. `ssh rift "doas -u mcp rm /srv/mcp/mcp.db"`
2. `ssh rift "doas systemctl start mcp-agent"` (creates fresh database)
3. `mcp sync` (pushes all service definitions)
### Disaster recovery (rift lost)
1. Provision new machine, connect to overlay network.
2. Apply NixOS config (creates mcp user, installs agent).
3. Install mcp-agent binary.
4. Restore `/srv/` from backups (each service's backup timer creates daily snapshots).
5. Provision TLS cert from Metacrypt.
6. Start agent: `doas systemctl start mcp-agent`
7. `mcp sync` from vade to push service definitions.
8. Unseal Metacrypt.
## File Locations
### On rift (agent)
| Path | Purpose |
|------|---------|
| `/srv/mcp/mcp-agent.toml` | Agent config |
| `/srv/mcp/mcp.db` | Registry database |
| `/srv/mcp/certs/` | Agent TLS cert and key |
| `/srv/mcp/backups/` | Database snapshots |
| `/srv/<service>/` | Service data directories |
### On vade (CLI)
| Path | Purpose |
|------|---------|
| `~/.config/mcp/mcp.toml` | CLI config |
| `~/.config/mcp/token` | MCIAS bearer token |
| `~/.config/mcp/services/` | Service definition files |


@@ -5,8 +5,8 @@ import (
"log" "log"
"os" "os"
"git.wntrmute.dev/kyle/mcp/internal/agent" "git.wntrmute.dev/mc/mcp/internal/agent"
"git.wntrmute.dev/kyle/mcp/internal/config" "git.wntrmute.dev/mc/mcp/internal/config"
"github.com/spf13/cobra" "github.com/spf13/cobra"
) )
@@ -38,7 +38,7 @@ func main() {
if err != nil { if err != nil {
return fmt.Errorf("load config: %w", err) return fmt.Errorf("load config: %w", err)
} }
return agent.Run(cfg) return agent.Run(cfg, version)
}, },
}) })


@@ -7,7 +7,7 @@ import (
"path/filepath" "path/filepath"
"time" "time"
"git.wntrmute.dev/kyle/mcp/internal/config" "git.wntrmute.dev/mc/mcp/internal/config"
"github.com/spf13/cobra" "github.com/spf13/cobra"
_ "modernc.org/sqlite" _ "modernc.org/sqlite"
) )

cmd/mcp-master/main.go (new file)

@@ -0,0 +1,49 @@
package main
import (
"fmt"
"log"
"os"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/master"
"github.com/spf13/cobra"
)
var (
version = "dev"
cfgPath string
)
func main() {
root := &cobra.Command{
Use: "mcp-master",
Short: "Metacircular Control Plane master",
}
root.PersistentFlags().StringVarP(&cfgPath, "config", "c", "", "config file path")
root.AddCommand(&cobra.Command{
Use: "version",
Short: "Print version",
Run: func(cmd *cobra.Command, args []string) {
fmt.Println(version)
},
})
root.AddCommand(&cobra.Command{
Use: "server",
Short: "Start the master server",
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadMasterConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
return master.Run(cfg, version)
},
})
if err := root.Execute(); err != nil {
log.Println(err)
os.Exit(1)
}
}


@@ -4,8 +4,8 @@ import (
"context" "context"
"fmt" "fmt"
mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1" mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/kyle/mcp/internal/config" "git.wntrmute.dev/mc/mcp/internal/config"
"github.com/spf13/cobra" "github.com/spf13/cobra"
) )

cmd/mcp/build.go (new file)

@@ -0,0 +1,206 @@
package main
import (
"context"
"fmt"
"path/filepath"
"strings"
"github.com/spf13/cobra"
"git.wntrmute.dev/mc/mcp/internal/auth"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/runtime"
"git.wntrmute.dev/mc/mcp/internal/servicedef"
)
func buildCmd() *cobra.Command {
return &cobra.Command{
Use: "build <service>[/<image>]",
Short: "Build and push images for a service",
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
serviceName, imageFilter := parseServiceArg(args[0])
def, err := loadServiceDef(cmd, cfg, serviceName)
if err != nil {
return err
}
rt := &runtime.Podman{}
return buildServiceImages(cmd.Context(), cfg, def, rt, imageFilter)
},
}
}
// buildServiceImages builds and pushes images for a service definition.
// If imageFilter is non-empty, only the matching image is built.
func buildServiceImages(ctx context.Context, cfg *config.CLIConfig, def *servicedef.ServiceDef, rt *runtime.Podman, imageFilter string) error {
if def.Build == nil || len(def.Build.Images) == 0 {
return fmt.Errorf("service %q has no [build.images] configuration", def.Name)
}
if def.Path == "" {
return fmt.Errorf("service %q has no path configured", def.Name)
}
if cfg.Build.Workspace == "" {
return fmt.Errorf("build.workspace is not configured in %s", cfgPath)
}
sourceDir := filepath.Join(cfg.Build.Workspace, def.Path)
// Auto-login to the registry using the CLI's stored MCIAS token.
// MCR accepts JWTs as passwords, so this works for both human and
// service account tokens. Failures are non-fatal — existing podman
// auth may suffice.
if token, err := auth.LoadToken(cfg.Auth.TokenPath); err == nil && token != "" {
registry := extractRegistry(def)
if registry != "" {
_ = rt.Login(ctx, registry, "mcp", token)
}
}
for imageName, dockerfile := range def.Build.Images {
if imageFilter != "" && imageName != imageFilter {
continue
}
imageRef := findImageRef(def, imageName)
if imageRef == "" {
return fmt.Errorf("no component references image %q in service %q", imageName, def.Name)
}
fmt.Printf("building %s from %s\n", imageRef, dockerfile)
if err := rt.Build(ctx, imageRef, sourceDir, dockerfile); err != nil {
return fmt.Errorf("build %s: %w", imageRef, err)
}
fmt.Printf("pushing %s\n", imageRef)
if err := rt.Push(ctx, imageRef); err != nil {
return fmt.Errorf("push %s: %w", imageRef, err)
}
}
if imageFilter != "" {
if _, ok := def.Build.Images[imageFilter]; !ok {
return fmt.Errorf("image %q not found in [build.images] for service %q", imageFilter, def.Name)
}
}
return nil
}
// findImageRef finds the full image reference for a build image name by
// matching it against component image fields. The image name from
// [build.images] matches the repository name in the component's image
// reference (the path segment after the last slash, before the tag).
func findImageRef(def *servicedef.ServiceDef, imageName string) string {
for _, c := range def.Components {
repoName := extractRepoName(c.Image)
if repoName == imageName {
return c.Image
}
}
return ""
}
// extractRegistry returns the registry host from the first component's
// image reference (e.g., "mcr.svc.mcp.metacircular.net:8443" from
// "mcr.svc.mcp.metacircular.net:8443/mcq:v0.1.1"). Returns empty
// string if no slash is found.
func extractRegistry(def *servicedef.ServiceDef) string {
for _, c := range def.Components {
if i := strings.LastIndex(c.Image, "/"); i > 0 {
return c.Image[:i]
}
}
return ""
}
// extractRepoName returns the repository name from an image reference.
// Examples:
//
// "mcr.svc.mcp.metacircular.net:8443/mcr:v1.1.0" -> "mcr"
// "mcr.svc.mcp.metacircular.net:8443/mcr-web:v1.2.0" -> "mcr-web"
// "mcr-web:v1.2.0" -> "mcr-web"
// "mcr-web" -> "mcr-web"
func extractRepoName(image string) string {
// Strip registry prefix (everything up to and including the last slash).
name := image
if i := strings.LastIndex(image, "/"); i >= 0 {
name = image[i+1:]
}
// Strip tag.
if i := strings.LastIndex(name, ":"); i >= 0 {
name = name[:i]
}
return name
}
// ensureImages checks that all component images exist in the registry.
// If an image is missing and the service has build configuration, it
// builds and pushes the image. Returns nil if all images are available.
func ensureImages(ctx context.Context, cfg *config.CLIConfig, def *servicedef.ServiceDef, rt *runtime.Podman, component string) error {
if def.Build == nil || len(def.Build.Images) == 0 {
return nil // no build config, skip auto-build
}
registryLoginDone := false
for _, c := range def.Components {
if component != "" && c.Name != component {
continue
}
repoName := extractRepoName(c.Image)
dockerfile, ok := def.Build.Images[repoName]
if !ok {
continue // no Dockerfile for this image, skip
}
exists, err := rt.ImageExists(ctx, c.Image)
if err != nil {
return fmt.Errorf("check image %s: %w", c.Image, err)
}
if exists {
continue
}
// Image missing — build and push.
if def.Path == "" {
return fmt.Errorf("image %s not found in registry and service %q has no path configured", c.Image, def.Name)
}
if cfg.Build.Workspace == "" {
return fmt.Errorf("image %s not found in registry and build.workspace is not configured", c.Image)
}
sourceDir := filepath.Join(cfg.Build.Workspace, def.Path)
// Auto-login to registry before first push.
if !registryLoginDone {
if token, err := auth.LoadToken(cfg.Auth.TokenPath); err == nil && token != "" {
registry := extractRegistry(def)
if registry != "" {
_ = rt.Login(ctx, registry, "mcp", token)
}
}
registryLoginDone = true
}
fmt.Printf("image %s not found, building from %s\n", c.Image, dockerfile)
if err := rt.Build(ctx, c.Image, sourceDir, dockerfile); err != nil {
return fmt.Errorf("auto-build %s: %w", c.Image, err)
}
fmt.Printf("pushing %s\n", c.Image)
if err := rt.Push(ctx, c.Image); err != nil {
return fmt.Errorf("auto-push %s: %w", c.Image, err)
}
}
return nil
}


@@ -8,12 +8,15 @@ import (
"github.com/spf13/cobra" "github.com/spf13/cobra"
mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1" mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/kyle/mcp/internal/config" "git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/kyle/mcp/internal/servicedef" "git.wntrmute.dev/mc/mcp/internal/runtime"
"git.wntrmute.dev/mc/mcp/internal/servicedef"
) )
func deployCmd() *cobra.Command { func deployCmd() *cobra.Command {
var direct bool
cmd := &cobra.Command{ cmd := &cobra.Command{
Use: "deploy <service>[/<component>]", Use: "deploy <service>[/<component>]",
Short: "Deploy service from service definition", Short: "Deploy service from service definition",
@@ -31,8 +34,20 @@ func deployCmd() *cobra.Command {
return err return err
} }
// Auto-build missing images if the service has build config.
rt := &runtime.Podman{}
if err := ensureImages(cmd.Context(), cfg, def, rt, component); err != nil {
return err
}
spec := servicedef.ToProto(def) spec := servicedef.ToProto(def)
// Route through master if configured and not in direct mode.
if cfg.Master != nil && cfg.Master.Address != "" && !direct {
return deployViaMaster(cfg, spec)
}
// Direct mode: deploy to agent.
address, err := findNodeAddress(cfg, def.Node) address, err := findNodeAddress(cfg, def.Node)
if err != nil { if err != nil {
return err return err
@@ -57,9 +72,48 @@ func deployCmd() *cobra.Command {
}, },
} }
cmd.Flags().StringP("file", "f", "", "service definition file") cmd.Flags().StringP("file", "f", "", "service definition file")
cmd.Flags().BoolVar(&direct, "direct", false, "bypass master, deploy directly to agent (v1 mode)")
return cmd return cmd
} }
func deployViaMaster(cfg *config.CLIConfig, spec *mcpv1.ServiceSpec) error {
client, conn, err := dialMaster(cfg.Master.Address, cfg)
if err != nil {
return fmt.Errorf("dial master: %w", err)
}
defer func() { _ = conn.Close() }()
resp, err := client.Deploy(context.Background(), &mcpv1.MasterDeployRequest{
Service: spec,
})
if err != nil {
return fmt.Errorf("master deploy: %w", err)
}
fmt.Printf(" %s: placed on %s\n", spec.GetName(), resp.GetNode())
if r := resp.GetDeployResult(); r != nil {
printStepResult("deploy", r)
}
if r := resp.GetDnsResult(); r != nil {
printStepResult("dns", r)
}
if r := resp.GetEdgeRouteResult(); r != nil {
printStepResult("edge", r)
}
if !resp.GetSuccess() {
return fmt.Errorf("deploy failed: %s", resp.GetError())
}
return nil
}
func printStepResult(name string, r *mcpv1.StepResult) {
if r.GetSuccess() {
fmt.Printf(" %s: ok\n", name)
} else {
fmt.Printf(" %s: FAILED — %s\n", name, r.GetError())
}
}
// parseServiceArg splits a "service/component" argument into its parts.
func parseServiceArg(arg string) (service, component string) {
parts := strings.SplitN(arg, "/", 2)
@@ -120,6 +174,7 @@ func serviceSpecFromInfo(info *mcpv1.ServiceInfo) *mcpv1.ServiceSpec {
spec := &mcpv1.ServiceSpec{
Name: info.GetName(),
Active: info.GetActive(),
Comment: info.GetComment(),
}
for _, c := range info.GetComponents() {
spec.Components = append(spec.Components, &mcpv1.ComponentSpec{


@@ -6,9 +6,10 @@ import (
"crypto/x509" "crypto/x509"
"fmt" "fmt"
"os" "os"
"strings"
mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1" mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/kyle/mcp/internal/config" "git.wntrmute.dev/mc/mcp/internal/config"
"google.golang.org/grpc" "google.golang.org/grpc"
"google.golang.org/grpc/credentials" "google.golang.org/grpc/credentials"
"google.golang.org/grpc/metadata" "google.golang.org/grpc/metadata"
@@ -42,6 +43,7 @@ func dialAgent(address string, cfg *config.CLIConfig) (mcpv1.McpAgentServiceClie
address, address,
grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)), grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)),
grpc.WithUnaryInterceptor(tokenInterceptor(token)), grpc.WithUnaryInterceptor(tokenInterceptor(token)),
grpc.WithStreamInterceptor(streamTokenInterceptor(token)),
) )
if err != nil { if err != nil {
return nil, nil, fmt.Errorf("dial %q: %w", address, err) return nil, nil, fmt.Errorf("dial %q: %w", address, err)
@@ -50,6 +52,43 @@ func dialAgent(address string, cfg *config.CLIConfig) (mcpv1.McpAgentServiceClie
return mcpv1.NewMcpAgentServiceClient(conn), conn, nil return mcpv1.NewMcpAgentServiceClient(conn), conn, nil
} }
// dialMaster connects to the master at the given address and returns a gRPC
// client for the McpMasterService.
func dialMaster(address string, cfg *config.CLIConfig) (mcpv1.McpMasterServiceClient, *grpc.ClientConn, error) {
tlsConfig := &tls.Config{
MinVersion: tls.VersionTLS13,
}
if cfg.MCIAS.CACert != "" {
caCert, err := os.ReadFile(cfg.MCIAS.CACert) //nolint:gosec // trusted config path
if err != nil {
return nil, nil, fmt.Errorf("read CA cert %q: %w", cfg.MCIAS.CACert, err)
}
pool := x509.NewCertPool()
if !pool.AppendCertsFromPEM(caCert) {
return nil, nil, fmt.Errorf("invalid CA cert %q", cfg.MCIAS.CACert)
}
tlsConfig.RootCAs = pool
}
token, err := loadBearerToken(cfg)
if err != nil {
return nil, nil, fmt.Errorf("load token: %w", err)
}
conn, err := grpc.NewClient(
address,
grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)),
grpc.WithUnaryInterceptor(tokenInterceptor(token)),
grpc.WithStreamInterceptor(streamTokenInterceptor(token)),
)
if err != nil {
return nil, nil, fmt.Errorf("dial master %q: %w", address, err)
}
return mcpv1.NewMcpMasterServiceClient(conn), conn, nil
}
// tokenInterceptor returns a gRPC client interceptor that attaches the
// bearer token to outgoing RPC metadata.
func tokenInterceptor(token string) grpc.UnaryClientInterceptor {
@@ -59,6 +98,15 @@ func tokenInterceptor(token string) grpc.UnaryClientInterceptor {
}
}
// streamTokenInterceptor returns a gRPC client stream interceptor that
// attaches the bearer token to outgoing stream metadata.
func streamTokenInterceptor(token string) grpc.StreamClientInterceptor {
return func(ctx context.Context, desc *grpc.StreamDesc, cc *grpc.ClientConn, method string, streamer grpc.Streamer, opts ...grpc.CallOption) (grpc.ClientStream, error) {
ctx = metadata.AppendToOutgoingContext(ctx, "authorization", "Bearer "+token)
return streamer(ctx, desc, cc, method, opts...)
}
}
// loadBearerToken reads the token from file or env var.
func loadBearerToken(cfg *config.CLIConfig) (string, error) {
if token := os.Getenv("MCP_TOKEN"); token != "" {
@@ -68,5 +116,5 @@ func loadBearerToken(cfg *config.CLIConfig) (string, error) {
if err != nil {
return "", fmt.Errorf("read token from %q: %w (run 'mcp login' first)", cfg.Auth.TokenPath, err)
}
return strings.TrimSpace(string(token)), nil
}

cmd/mcp/dns.go (new file)

@@ -0,0 +1,87 @@
package main
import (
"context"
"fmt"
"os"
"time"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"github.com/spf13/cobra"
)
func dnsCmd() *cobra.Command {
return &cobra.Command{
Use: "dns",
Short: "List all DNS zones and records from MCNS",
RunE: runDNS,
}
}
func runDNS(_ *cobra.Command, _ []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
// DNS is centralized — query the first reachable agent.
resp, nodeName, err := queryDNS(cfg)
if err != nil {
return err
}
if len(resp.GetZones()) == 0 {
fmt.Println("no DNS zones configured")
return nil
}
_ = nodeName
for i, zone := range resp.GetZones() {
if i > 0 {
fmt.Println()
}
fmt.Printf("ZONE: %s\n", zone.GetName())
if len(zone.GetRecords()) == 0 {
fmt.Println(" (no records)")
continue
}
w := newTable()
_, _ = fmt.Fprintln(w, " NAME\tTYPE\tVALUE\tTTL")
for _, r := range zone.GetRecords() {
_, _ = fmt.Fprintf(w, " %s\t%s\t%s\t%d\n",
r.GetName(), r.GetType(), r.GetValue(), r.GetTtl())
}
_ = w.Flush()
}
return nil
}
// queryDNS tries each configured agent and returns the first successful
// DNS listing. DNS is centralized so any agent with MCNS configured works.
func queryDNS(cfg *config.CLIConfig) (*mcpv1.ListDNSRecordsResponse, string, error) {
for _, node := range cfg.Nodes {
client, conn, err := dialAgent(node.Address, cfg)
if err != nil {
_, _ = fmt.Fprintf(os.Stderr, "warning: %s: %v\n", node.Name, err)
continue
}
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
resp, err := client.ListDNSRecords(ctx, &mcpv1.ListDNSRecordsRequest{})
cancel()
_ = conn.Close()
if err != nil {
_, _ = fmt.Fprintf(os.Stderr, "warning: %s: list DNS: %v\n", node.Name, err)
continue
}
return resp, node.Name, nil
}
return nil, "", fmt.Errorf("no reachable agent with DNS configured")
}

cmd/mcp/edge.go (new file)

@@ -0,0 +1,180 @@
package main
import (
"context"
"fmt"
"time"
"github.com/spf13/cobra"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
)
func edgeCmd() *cobra.Command {
var nodeName string
cmd := &cobra.Command{
Use: "edge",
Short: "Manage edge routes (scaffolding — will be replaced by master)",
}
list := &cobra.Command{
Use: "list",
Short: "List edge routes on a node",
RunE: func(_ *cobra.Command, _ []string) error {
if nodeName == "" {
return fmt.Errorf("--node is required")
}
return runEdgeList(nodeName)
},
}
var (
backendHostname string
backendPort int
)
setup := &cobra.Command{
Use: "setup <hostname>",
Short: "Set up an edge route (provisions cert, registers mc-proxy route)",
Args: cobra.ExactArgs(1),
RunE: func(_ *cobra.Command, args []string) error {
if nodeName == "" {
return fmt.Errorf("--node is required")
}
if backendHostname == "" {
return fmt.Errorf("--backend-hostname is required")
}
if backendPort == 0 {
return fmt.Errorf("--backend-port is required")
}
return runEdgeSetup(nodeName, args[0], backendHostname, backendPort)
},
}
setup.Flags().StringVar(&backendHostname, "backend-hostname", "", "internal .svc.mcp hostname")
setup.Flags().IntVar(&backendPort, "backend-port", 0, "port on worker's mc-proxy")
remove := &cobra.Command{
Use: "remove <hostname>",
Short: "Remove an edge route",
Args: cobra.ExactArgs(1),
RunE: func(_ *cobra.Command, args []string) error {
if nodeName == "" {
return fmt.Errorf("--node is required")
}
return runEdgeRemove(nodeName, args[0])
},
}
cmd.PersistentFlags().StringVarP(&nodeName, "node", "n", "", "target node (required)")
cmd.AddCommand(list, setup, remove)
return cmd
}
func runEdgeList(nodeName string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
address, err := findNodeAddress(cfg, nodeName)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
resp, err := client.ListEdgeRoutes(ctx, &mcpv1.ListEdgeRoutesRequest{})
if err != nil {
return fmt.Errorf("list edge routes: %w", err)
}
if len(resp.GetRoutes()) == 0 {
fmt.Printf("No edge routes on %s\n", nodeName)
return nil
}
fmt.Printf("Edge routes on %s:\n", nodeName)
for _, r := range resp.GetRoutes() {
expires := r.GetCertExpires()
if expires == "" {
expires = "unknown"
}
fmt.Printf(" %s → %s:%d cert_expires=%s\n",
r.GetHostname(), r.GetBackendHostname(), r.GetBackendPort(), expires)
}
return nil
}
func runEdgeSetup(nodeName, hostname, backendHostname string, backendPort int) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
address, err := findNodeAddress(cfg, nodeName)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
_, err = client.SetupEdgeRoute(ctx, &mcpv1.SetupEdgeRouteRequest{
Hostname: hostname,
BackendHostname: backendHostname,
BackendPort: int32(backendPort), //nolint:gosec // port is a small positive integer
BackendTls: true,
})
if err != nil {
return fmt.Errorf("setup edge route: %w", err)
}
fmt.Printf("edge route established: %s → %s:%d on %s\n", hostname, backendHostname, backendPort, nodeName)
return nil
}
func runEdgeRemove(nodeName, hostname string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
address, err := findNodeAddress(cfg, nodeName)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
_, err = client.RemoveEdgeRoute(ctx, &mcpv1.RemoveEdgeRouteRequest{
Hostname: hostname,
})
if err != nil {
return fmt.Errorf("remove edge route: %w", err)
}
fmt.Printf("edge route removed: %s on %s\n", hostname, nodeName)
return nil
}

cmd/mcp/edit.go (new file)

@@ -0,0 +1,12 @@
package main
import "github.com/spf13/cobra"
func editCmd() *cobra.Command {
return &cobra.Command{
Use: "edit <service>",
Short: "Open service definition in $EDITOR",
Args: cobra.ExactArgs(1),
RunE: runServiceEdit,
}
}

(modified file)

@@ -4,8 +4,8 @@ import (
"fmt"
"os"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
)
// findNodeAddress looks up a node by name in the CLI config and returns

(modified file)

@@ -7,15 +7,15 @@ import (
"github.com/spf13/cobra"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/servicedef"
)
func stopCmd() *cobra.Command {
return &cobra.Command{
Use: "stop <service>[/<component>]",
Short: "Stop components (or all), set active=false",
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
@@ -23,7 +23,7 @@ func stopCmd() *cobra.Command {
return fmt.Errorf("load config: %w", err)
}
serviceName, component := parseServiceArg(args[0])
defPath := filepath.Join(cfg.Services.Dir, serviceName+".toml")
def, err := servicedef.Load(defPath)
@@ -31,11 +31,14 @@ func stopCmd() *cobra.Command {
return fmt.Errorf("load service def: %w", err)
}
// Only flip active=false when stopping the whole service.
if component == "" {
active := false
def.Active = &active
if err := servicedef.Write(defPath, def); err != nil {
return fmt.Errorf("write service def: %w", err)
}
}
address, err := findNodeAddress(cfg, def.Node)
if err != nil {
@@ -50,6 +53,7 @@ func stopCmd() *cobra.Command {
resp, err := client.StopService(context.Background(), &mcpv1.StopServiceRequest{
Name: serviceName,
Component: component,
})
if err != nil {
return fmt.Errorf("stop service: %w", err)
@@ -63,8 +67,8 @@ func stopCmd() *cobra.Command {
func startCmd() *cobra.Command {
return &cobra.Command{
Use: "start <service>[/<component>]",
Short: "Start components (or all), set active=true",
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
@@ -72,7 +76,7 @@ func startCmd() *cobra.Command {
return fmt.Errorf("load config: %w", err)
}
serviceName, component := parseServiceArg(args[0])
defPath := filepath.Join(cfg.Services.Dir, serviceName+".toml")
def, err := servicedef.Load(defPath)
@@ -80,11 +84,14 @@ func startCmd() *cobra.Command {
return fmt.Errorf("load service def: %w", err)
}
// Only flip active=true when starting the whole service.
if component == "" {
active := true
def.Active = &active
if err := servicedef.Write(defPath, def); err != nil {
return fmt.Errorf("write service def: %w", err)
}
}
address, err := findNodeAddress(cfg, def.Node)
if err != nil {
@@ -99,6 +106,7 @@ func startCmd() *cobra.Command {
resp, err := client.StartService(context.Background(), &mcpv1.StartServiceRequest{
Name: serviceName,
Component: component,
})
if err != nil {
return fmt.Errorf("start service: %w", err)
@@ -112,8 +120,8 @@ func startCmd() *cobra.Command {
func restartCmd() *cobra.Command {
return &cobra.Command{
Use: "restart <service>[/<component>]",
Short: "Restart components (or all)",
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
@@ -121,7 +129,7 @@ func restartCmd() *cobra.Command {
return fmt.Errorf("load config: %w", err)
}
serviceName, component := parseServiceArg(args[0])
defPath := filepath.Join(cfg.Services.Dir, serviceName+".toml")
def, err := servicedef.Load(defPath)
@@ -142,6 +150,7 @@ func restartCmd() *cobra.Command {
resp, err := client.RestartService(context.Background(), &mcpv1.RestartServiceRequest{
Name: serviceName,
Component: component,
})
if err != nil {
return fmt.Errorf("restart service: %w", err)
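The stop, start, and restart commands all lean on a `parseServiceArg` helper that is not shown in this diff. A minimal sketch of what it presumably does, assuming the `service[/component]` convention from the `Use` strings (hypothetical reconstruction, not the repo's actual helper):

```go
package main

import (
	"fmt"
	"strings"
)

// parseServiceArg splits "service" or "service/component" on the first
// slash. Hypothetical sketch: the real helper lives elsewhere in cmd/mcp.
func parseServiceArg(arg string) (service, component string) {
	if i := strings.Index(arg, "/"); i >= 0 {
		return arg[:i], arg[i+1:]
	}
	return arg, ""
}

func main() {
	s, c := parseServiceArg("mcq/web")
	fmt.Println(s, c) // mcq web
	s, c = parseServiceArg("mcq")
	fmt.Println(s, c == "") // mcq true
}
```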

(modified file)

@@ -8,8 +8,9 @@ import (
"github.com/spf13/cobra"
"git.wntrmute.dev/mc/mcdsl/terminal"
"git.wntrmute.dev/mc/mcp/internal/auth"
"git.wntrmute.dev/mc/mcp/internal/config"
)
func loginCmd() *cobra.Command {
@@ -33,14 +34,11 @@ func loginCmd() *cobra.Command {
}
username := strings.TrimSpace(scanner.Text())
password, err := terminal.ReadPassword("Password: ")
if err != nil {
return fmt.Errorf("read password: %w", err)
}
password = strings.TrimSpace(password)
token, err := auth.Login(cfg.MCIAS.ServerURL, cfg.MCIAS.CACert, username, password)
if err != nil {

cmd/mcp/logs.go (new file)

@@ -0,0 +1,81 @@
package main
import (
"fmt"
"io"
"os"
"github.com/spf13/cobra"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
)
func logsCmd() *cobra.Command {
var (
tail int
follow bool
timestamps bool
since string
)
cmd := &cobra.Command{
Use: "logs <service>[/<component>]",
Short: "Show container logs",
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
serviceName, component := parseServiceArg(args[0])
def, err := loadServiceDef(cmd, cfg, serviceName)
if err != nil {
return err
}
address, err := findNodeAddress(cfg, def.Node)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
stream, err := client.Logs(cmd.Context(), &mcpv1.LogsRequest{
Service: serviceName,
Component: component,
Tail: int32(tail),
Follow: follow,
Timestamps: timestamps,
Since: since,
})
if err != nil {
return fmt.Errorf("logs: %w", err)
}
for {
resp, err := stream.Recv()
if err == io.EOF {
return nil
}
if err != nil {
return fmt.Errorf("recv: %w", err)
}
_, _ = os.Stdout.Write(resp.Data)
}
},
}
cmd.Flags().IntVarP(&tail, "tail", "n", 0, "number of lines from end (0 = all)")
cmd.Flags().BoolVarP(&follow, "follow", "f", false, "follow log output")
cmd.Flags().BoolVarP(&timestamps, "timestamps", "t", false, "show timestamps")
cmd.Flags().StringVar(&since, "since", "", "show logs since (e.g., 2h, 2026-03-28T00:00:00Z)")
return cmd
}

(modified file)

@@ -4,6 +4,7 @@ import (
"fmt"
"log"
"os"
"path/filepath"
"github.com/spf13/cobra"
)
@@ -18,7 +19,11 @@ func main() {
Use: "mcp",
Short: "Metacircular Control Plane CLI",
}
defaultCfg := ""
if home, err := os.UserHomeDir(); err == nil {
defaultCfg = filepath.Join(home, ".config", "mcp", "mcp.toml")
}
root.PersistentFlags().StringVarP(&cfgPath, "config", "c", defaultCfg, "config file path")
root.AddCommand(&cobra.Command{
Use: "version",
@@ -29,7 +34,9 @@ func main() {
})
root.AddCommand(loginCmd())
root.AddCommand(buildCmd())
root.AddCommand(deployCmd())
root.AddCommand(undeployCmd())
root.AddCommand(stopCmd())
root.AddCommand(startCmd())
root.AddCommand(restartCmd())
@@ -42,6 +49,12 @@ func main() {
root.AddCommand(pushCmd())
root.AddCommand(pullCmd())
root.AddCommand(nodeCmd())
root.AddCommand(purgeCmd())
root.AddCommand(logsCmd())
root.AddCommand(editCmd())
root.AddCommand(dnsCmd())
root.AddCommand(routeCmd())
root.AddCommand(edgeCmd())
if err := root.Execute(); err != nil {
log.Fatal(err)

(modified file)

@@ -1,13 +1,16 @@
package main
import (
"context"
"fmt"
"os"
"text/tabwriter"
"time"
toml "github.com/pelletier/go-toml/v2"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"github.com/spf13/cobra"
)
@@ -48,13 +51,35 @@ func runNodeList(_ *cobra.Command, _ []string) error {
}
w := tabwriter.NewWriter(os.Stdout, 0, 4, 2, ' ', 0)
_, _ = fmt.Fprintln(w, "NAME\tADDRESS\tVERSION")
for _, n := range cfg.Nodes {
ver := queryAgentVersion(cfg, n.Address)
_, _ = fmt.Fprintf(w, "%s\t%s\t%s\n", n.Name, n.Address, ver)
}
return w.Flush()
}
// queryAgentVersion dials the agent and returns its version, or an error indicator.
func queryAgentVersion(cfg *config.CLIConfig, address string) string {
client, conn, err := dialAgent(address, cfg)
if err != nil {
return "error"
}
defer func() { _ = conn.Close() }()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
resp, err := client.NodeStatus(ctx, &mcpv1.NodeStatusRequest{})
if err != nil {
return "error"
}
if resp.AgentVersion == "" {
return "unknown"
}
return resp.AgentVersion
}
func runNodeAdd(_ *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {

cmd/mcp/purge.go (new file)

@@ -0,0 +1,119 @@
package main
import (
"context"
"fmt"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/servicedef"
"github.com/spf13/cobra"
)
func purgeCmd() *cobra.Command {
cmd := &cobra.Command{
Use: "purge [service[/component]]",
Short: "Remove stale registry entries for gone, undefined components",
Long: `Purge removes registry entries that are both unwanted (not in any
current service definition) and gone (no corresponding container in the
runtime). It never stops or removes running containers.
Use --dry-run to preview what would be purged.`,
Args: cobra.MaximumNArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
dryRun, _ := cmd.Flags().GetBool("dry-run")
var service, component string
if len(args) == 1 {
service, component = parseServiceArg(args[0])
}
// Load all local service definitions to build the set of
// currently-defined service/component pairs.
definedComponents := buildDefinedComponents(cfg)
// If a specific service was given and we can find its node,
// only talk to that node. Otherwise, talk to all nodes.
targetNodes := cfg.Nodes
if service != "" {
if nodeName, addr, err := findServiceNode(cfg, service); err == nil {
targetNodes = []config.NodeConfig{{Name: nodeName, Address: addr}}
}
}
anyResults := false
for _, node := range targetNodes {
client, conn, err := dialAgent(node.Address, cfg)
if err != nil {
return fmt.Errorf("dial %s: %w", node.Name, err)
}
resp, err := client.PurgeComponent(context.Background(), &mcpv1.PurgeRequest{
Service: service,
Component: component,
DryRun: dryRun,
DefinedComponents: definedComponents,
})
// Close now rather than defer: a deferred close inside the loop would
// hold every connection open until the command returns.
_ = conn.Close()
if err != nil {
return fmt.Errorf("purge on %s: %w", node.Name, err)
}
for _, r := range resp.GetResults() {
anyResults = true
if r.GetPurged() {
if dryRun {
fmt.Printf("would purge %s/%s (%s)\n", r.GetService(), r.GetComponent(), r.GetReason())
} else {
fmt.Printf("purged %s/%s (%s)\n", r.GetService(), r.GetComponent(), r.GetReason())
}
} else {
fmt.Printf("skipped %s/%s (%s)\n", r.GetService(), r.GetComponent(), r.GetReason())
}
}
}
if !anyResults {
fmt.Println("nothing to purge")
}
return nil
},
}
cmd.Flags().Bool("dry-run", false, "preview what would be purged without modifying the registry")
return cmd
}
// buildDefinedComponents reads all local service definition files and returns
// a list of "service/component" strings for every defined component.
func buildDefinedComponents(cfg *config.CLIConfig) []string {
defs, err := servicedef.LoadAll(cfg.Services.Dir)
if err != nil {
// If we can't read service definitions, return an empty list.
// The agent will then treat every component as undefined, so any
// entry whose container is already gone becomes eligible for purging.
return nil
}
var defined []string
for _, def := range defs {
for _, comp := range def.Components {
defined = append(defined, def.Name+"/"+comp.Name)
}
}
return defined
}
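The purge help text above defines eligibility as "both unwanted (not in any current service definition) and gone (no corresponding container)". That agent-side rule reduces to a membership test against the `DefinedComponents` list; a sketch under that assumption (the agent's real implementation is not shown here):

```go
package main

import "fmt"

// eligibleForPurge mirrors the rule from the purge help text: an entry is
// purged only when it is undefined (absent from every service definition)
// AND its container is gone. Sketch of the agent-side check, not real code.
func eligibleForPurge(key string, defined []string, containerExists bool) bool {
	for _, d := range defined {
		if d == key {
			return false // still defined: never purge
		}
	}
	return !containerExists // undefined: purge only if the container is gone
}

func main() {
	defined := []string{"mcq/web", "mcq/db"}
	fmt.Println(eligibleForPurge("mcq/web", defined, true)) // false
	fmt.Println(eligibleForPurge("old/x", defined, false))  // true
	fmt.Println(eligibleForPurge("old/x", defined, true))   // false (running: never touched)
}
```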

cmd/mcp/route.go (new file)

@@ -0,0 +1,225 @@
package main
import (
"context"
"fmt"
"os"
"time"
"github.com/spf13/cobra"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
)
func routeCmd() *cobra.Command {
var nodeName string
cmd := &cobra.Command{
Use: "route",
Short: "Manage mc-proxy routes",
}
list := &cobra.Command{
Use: "list",
Short: "List mc-proxy routes",
RunE: func(_ *cobra.Command, _ []string) error {
return runRouteList(nodeName)
},
}
var (
routeMode string
backendTLS bool
tlsCert string
tlsKey string
)
add := &cobra.Command{
Use: "add <listener> <hostname> <backend>",
Short: "Add a route to mc-proxy",
Long: "Add a route. Example: mcp route add -n rift :443 mcq.svc.mcp.metacircular.net 127.0.0.1:48080 --mode l7 --tls-cert /srv/mc-proxy/certs/mcq.pem --tls-key /srv/mc-proxy/certs/mcq.key",
Args: cobra.ExactArgs(3),
RunE: func(_ *cobra.Command, args []string) error {
return runRouteAdd(nodeName, args, routeMode, backendTLS, tlsCert, tlsKey)
},
}
add.Flags().StringVar(&routeMode, "mode", "l4", "route mode (l4 or l7)")
add.Flags().BoolVar(&backendTLS, "backend-tls", false, "re-encrypt traffic to backend")
add.Flags().StringVar(&tlsCert, "tls-cert", "", "path to TLS cert on the node (required for l7)")
add.Flags().StringVar(&tlsKey, "tls-key", "", "path to TLS key on the node (required for l7)")
remove := &cobra.Command{
Use: "remove <listener> <hostname>",
Short: "Remove a route from mc-proxy",
Long: "Remove a route. Example: mcp route remove -n rift :443 mcq.metacircular.net",
Args: cobra.ExactArgs(2),
RunE: func(_ *cobra.Command, args []string) error {
return runRouteRemove(nodeName, args)
},
}
cmd.PersistentFlags().StringVarP(&nodeName, "node", "n", "", "target node (required)")
cmd.AddCommand(list, add, remove)
return cmd
}
func runRouteList(nodeName string) error {
if nodeName == "" {
return runRouteListAll()
}
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
address, err := findNodeAddress(cfg, nodeName)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
resp, err := client.ListProxyRoutes(ctx, &mcpv1.ListProxyRoutesRequest{})
if err != nil {
return fmt.Errorf("list routes: %w", err)
}
printRoutes(nodeName, resp)
return nil
}
func runRouteListAll() error {
first := true
return forEachNode(func(node config.NodeConfig, client mcpv1.McpAgentServiceClient) error {
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
resp, err := client.ListProxyRoutes(ctx, &mcpv1.ListProxyRoutesRequest{})
if err != nil {
_, _ = fmt.Fprintf(os.Stderr, "warning: %s: list routes: %v\n", node.Name, err)
return nil
}
if !first {
fmt.Println()
}
first = false
printRoutes(node.Name, resp)
return nil
})
}
func printRoutes(nodeName string, resp *mcpv1.ListProxyRoutesResponse) {
fmt.Printf("NODE: %s\n", nodeName)
fmt.Printf("mc-proxy %s\n", resp.GetVersion())
if resp.GetStartedAt() != nil {
uptime := time.Since(resp.GetStartedAt().AsTime()).Truncate(time.Second)
fmt.Printf("uptime: %s\n", uptime)
}
fmt.Printf("connections: %d\n", resp.GetTotalConnections())
fmt.Println()
for _, ls := range resp.GetListeners() {
fmt.Printf(" %s routes=%d active=%d\n",
ls.GetAddr(), ls.GetRouteCount(), ls.GetActiveConnections())
for _, r := range ls.GetRoutes() {
mode := r.GetMode()
if mode == "" {
mode = "l4"
}
extra := ""
if r.GetBackendTls() {
extra = " (re-encrypt)"
}
fmt.Printf(" %s %s → %s%s\n", mode, r.GetHostname(), r.GetBackend(), extra)
}
}
}
func runRouteAdd(nodeName string, args []string, mode string, backendTLS bool, tlsCert, tlsKey string) error {
if nodeName == "" {
return fmt.Errorf("--node is required")
}
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
address, err := findNodeAddress(cfg, nodeName)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_, err = client.AddProxyRoute(ctx, &mcpv1.AddProxyRouteRequest{
ListenerAddr: args[0],
Hostname: args[1],
Backend: args[2],
Mode: mode,
BackendTls: backendTLS,
TlsCert: tlsCert,
TlsKey: tlsKey,
})
if err != nil {
return fmt.Errorf("add route: %w", err)
}
fmt.Printf("Added route: %s %s → %s on %s (%s)\n", mode, args[1], args[2], args[0], nodeName)
return nil
}
func runRouteRemove(nodeName string, args []string) error {
if nodeName == "" {
return fmt.Errorf("--node is required")
}
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
address, err := findNodeAddress(cfg, nodeName)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_, err = client.RemoveProxyRoute(ctx, &mcpv1.RemoveProxyRouteRequest{
ListenerAddr: args[0],
Hostname: args[1],
})
if err != nil {
return fmt.Errorf("remove route: %w", err)
}
fmt.Printf("Removed route: %s from %s (%s)\n", args[1], args[0], nodeName)
return nil
}

(modified file)

@@ -7,9 +7,9 @@ import (
"os/exec"
"path/filepath"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/servicedef"
toml "github.com/pelletier/go-toml/v2"
"github.com/spf13/cobra"
"google.golang.org/grpc"

(modified file)

@@ -7,8 +7,8 @@ import (
"text/tabwriter"
"time"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"github.com/spf13/cobra"
)

(modified file)

@@ -4,9 +4,9 @@ import (
"context"
"fmt"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/servicedef"
"github.com/spf13/cobra"
)

(modified file)

@@ -7,8 +7,8 @@ import (
"os"
"path/filepath"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"github.com/spf13/cobra"
)

cmd/mcp/undeploy.go (new file)

@@ -0,0 +1,94 @@
package main
import (
"context"
"fmt"
"path/filepath"
"github.com/spf13/cobra"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/servicedef"
)
func undeployCmd() *cobra.Command {
var direct bool
cmd := &cobra.Command{
Use: "undeploy <service>",
Short: "Fully undeploy a service: remove routes, DNS, certs, and containers",
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg, err := config.LoadCLIConfig(cfgPath)
if err != nil {
return fmt.Errorf("load config: %w", err)
}
serviceName := args[0]
defPath := filepath.Join(cfg.Services.Dir, serviceName+".toml")
def, err := servicedef.Load(defPath)
if err != nil {
return fmt.Errorf("load service def: %w", err)
}
// Set active=false in the local file.
active := false
def.Active = &active
if err := servicedef.Write(defPath, def); err != nil {
return fmt.Errorf("write service def: %w", err)
}
// Route through master if configured and not in direct mode.
if cfg.Master != nil && cfg.Master.Address != "" && !direct {
return undeployViaMaster(cfg, serviceName)
}
address, err := findNodeAddress(cfg, def.Node)
if err != nil {
return err
}
client, conn, err := dialAgent(address, cfg)
if err != nil {
return fmt.Errorf("dial agent: %w", err)
}
defer func() { _ = conn.Close() }()
resp, err := client.UndeployService(context.Background(), &mcpv1.UndeployServiceRequest{
Name: serviceName,
})
if err != nil {
return fmt.Errorf("undeploy service: %w", err)
}
printComponentResults(resp.GetResults())
return nil
},
}
cmd.Flags().BoolVar(&direct, "direct", false, "bypass master, undeploy directly via agent")
return cmd
}
func undeployViaMaster(cfg *config.CLIConfig, serviceName string) error {
client, conn, err := dialMaster(cfg.Master.Address, cfg)
if err != nil {
return fmt.Errorf("dial master: %w", err)
}
defer func() { _ = conn.Close() }()
resp, err := client.Undeploy(context.Background(), &mcpv1.MasterUndeployRequest{
ServiceName: serviceName,
})
if err != nil {
return fmt.Errorf("master undeploy: %w", err)
}
if resp.GetSuccess() {
fmt.Printf(" %s: undeployed\n", serviceName)
} else {
return fmt.Errorf("undeploy failed: %s", resp.GetError())
}
return nil
}

deploy/examples/mcp-master.toml (new file)

@@ -0,0 +1,94 @@
# MCP Master configuration
#
# Default location: /srv/mcp-master/mcp-master.toml
# Override with: mcp-master server --config /path/to/mcp-master.toml
# ------------------------------------------------------------------
# gRPC server
# ------------------------------------------------------------------
[server]
# Listen address for the gRPC server. Bind to the Tailnet interface.
grpc_addr = "100.95.252.120:9555"
tls_cert = "/srv/mcp-master/certs/cert.pem"
tls_key = "/srv/mcp-master/certs/key.pem"
# ------------------------------------------------------------------
# Database
# ------------------------------------------------------------------
[database]
path = "/srv/mcp-master/master.db"
# ------------------------------------------------------------------
# MCIAS (for validating inbound CLI/agent tokens)
# ------------------------------------------------------------------
[mcias]
server_url = "https://mcias.metacircular.net:8443"
ca_cert = "/srv/mcp-master/certs/ca.pem"
service_name = "mcp-master"
# ------------------------------------------------------------------
# Master identity (for dialing agents)
# ------------------------------------------------------------------
[master]
# Path to the MCIAS service token file used by the master to
# authenticate to agents when forwarding deploys and edge routes.
service_token_path = "/srv/mcp-master/mcias-token"
# CA cert for verifying agent TLS certificates.
ca_cert = "/srv/mcp-master/certs/ca.pem"
# ------------------------------------------------------------------
# Edge routing
# ------------------------------------------------------------------
[edge]
# Public hostnames in service definitions must fall under one of these
# domains. Validation uses proper domain label matching.
allowed_domains = ["metacircular.net", "wntrmute.net"]
# ------------------------------------------------------------------
# Agent registration
# ------------------------------------------------------------------
[registration]
# MCIAS service identities permitted to register.
allowed_agents = ["agent-rift", "agent-svc", "agent-orion"]
# Maximum registered nodes.
max_nodes = 16
# ------------------------------------------------------------------
# Timeouts
# ------------------------------------------------------------------
[timeouts]
deploy = "5m"
edge_route = "30s"
health_check = "5s"
undeploy = "2m"
snapshot = "10m"
# ------------------------------------------------------------------
# DNS (MCNS)
# ------------------------------------------------------------------
[mcns]
server_url = "https://mcns.svc.mcp.metacircular.net:8443"
ca_cert = "/srv/mcp-master/certs/ca.pem"
token_path = "/srv/mcp-master/mcns-token"
zone = "svc.mcp.metacircular.net"
# ------------------------------------------------------------------
# Logging
# ------------------------------------------------------------------
[log]
level = "info"
# ------------------------------------------------------------------
# Bootstrap nodes
# ------------------------------------------------------------------
[[nodes]]
name = "rift"
address = "100.95.252.120:9444"
role = "master"
[[nodes]]
name = "svc"
address = "100.106.232.4:9555"
role = "edge"


@@ -11,6 +11,8 @@ RestartSec=5
User=mcp
Group=mcp
Environment=HOME=/srv/mcp
Environment=XDG_RUNTIME_DIR=/run/user/%U
NoNewPrivileges=true
ProtectSystem=strict

docs/bootstrap.md

@@ -0,0 +1,198 @@
# MCP Bootstrap Procedure
How to bring MCP up on a node for the first time, including migrating
existing containers from another user's podman instance.
## Prerequisites
- NixOS configuration applied with `configs/mcp.nix` (creates `mcp` user
with rootless podman, subuid/subgid, systemd service)
- MCIAS system account with `admin` role (for token validation and cert
provisioning)
- Metacrypt running (for TLS certificate issuance)
## Step 1: Provision TLS Certificate
Issue a cert from Metacrypt with DNS and IP SANs:
```bash
export METACRYPT_TOKEN="<admin-token>"
# From a machine that can reach Metacrypt (e.g., via loopback on rift):
curl -sk -X POST https://127.0.0.1:18443/v1/engine/request \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $METACRYPT_TOKEN" \
-d '{
"mount": "pki",
"operation": "issue",
"path": "web",
"data": {
"issuer": "web",
"common_name": "mcp-agent.svc.mcp.metacircular.net",
"profile": "server",
"dns_names": ["mcp-agent.svc.mcp.metacircular.net"],
"ip_addresses": ["<tailscale-ip>", "<lan-ip>"],
"ttl": "2160h"
}
}' > cert-response.json
# Extract cert and key from the JSON response and install:
doas cp cert.pem /srv/mcp/certs/cert.pem
doas cp key.pem /srv/mcp/certs/key.pem
doas chown mcp:mcp /srv/mcp/certs/cert.pem /srv/mcp/certs/key.pem
doas chmod 600 /srv/mcp/certs/cert.pem /srv/mcp/certs/key.pem
```
## Step 2: Add DNS Record
Add an A record for `mcp-agent.svc.mcp.metacircular.net` pointing to the
node's IP in the MCNS zone file, bump the serial, restart CoreDNS.
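Assuming a standard BIND-style zone file, the change is one A record plus a serial bump (illustrative; the actual zone layout may differ):

```
; svc.mcp.metacircular.net zone (illustrative)
; bump the SOA serial after editing, then restart CoreDNS
mcp-agent    IN  A    <tailscale-ip>
```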
## Step 3: Write Agent Config
Create `/srv/mcp/mcp-agent.toml`:
```toml
[server]
grpc_addr = "<tailscale-ip>:9444"
tls_cert = "/srv/mcp/certs/cert.pem"
tls_key = "/srv/mcp/certs/key.pem"

[database]
path = "/srv/mcp/mcp.db"

[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcp-agent"

[agent]
node_name = "<node-name>"
container_runtime = "podman"

[monitor]
interval = "60s"
alert_command = []
cooldown = "15m"
flap_threshold = 3
flap_window = "10m"
retention = "30d"

[log]
level = "info"
```
## Step 4: Install Agent Binary
```bash
scp mcp-agent <node>:/tmp/
ssh <node> "doas cp /tmp/mcp-agent /usr/local/bin/mcp-agent"
```
## Step 5: Start the Agent
```bash
ssh <node> "doas systemctl start mcp-agent"
ssh <node> "doas systemctl status mcp-agent"
```
## Step 6: Configure CLI
On the operator's workstation, create `~/.config/mcp/mcp.toml` and save
the MCIAS admin service account token to `~/.config/mcp/token`.
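A minimal sketch of the workstation config, assuming the CLI's `[master]` section and node list mirror the agent examples above (exact field names may differ from the real CLI config):

```toml
# ~/.config/mcp/mcp.toml (illustrative)
[master]
# When set, deploy/undeploy route through the master; --direct bypasses it.
address = "100.95.252.120:9555"

[[nodes]]
name = "rift"
address = "100.95.252.120:9444"
```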
## Step 7: Migrate Containers (if existing)
If containers are running under another user (e.g., `kyle`), migrate them
to the `mcp` user's podman. Process each service in dependency order:
**Dependency order:** Metacrypt → MC-Proxy → MCR → MCNS
For each service:
```bash
# 1. Stop containers under the old user
ssh <node> "podman stop <container> && podman rm <container>"
# 2. Transfer ownership of data directory
ssh <node> "doas chown -R mcp:mcp /srv/<service>"
# 3. Transfer images to mcp's podman
ssh <node> "podman save <image> -o /tmp/<service>.tar"
ssh <node> "doas su -l -s /bin/sh mcp -c 'XDG_RUNTIME_DIR=/run/user/<uid> podman load -i /tmp/<service>.tar'"
# 4. Start containers under mcp (with new naming convention)
ssh <node> "doas su -l -s /bin/sh mcp -c 'XDG_RUNTIME_DIR=/run/user/<uid> podman run -d \
--name <service>-<component> \
--network mcpnet \
--restart unless-stopped \
--user 0:0 \
-p <ports> \
-v /srv/<service>:/srv/<service> \
<image> <cmd>'"
```
**Container naming convention:** `<service>-<component>` (e.g.,
`metacrypt-api`, `metacrypt-web`, `mc-proxy`).
**Network:** Services whose components need to communicate (metacrypt
api↔web, mcr api↔web) must be on the same podman network with DNS
enabled. Create with `podman network create mcpnet`.
**Config updates:** If service configs reference container names for
inter-component communication (e.g., `vault_grpc = "metacrypt:9443"`),
update them to use the new names (e.g., `vault_grpc = "metacrypt-api:9443"`).
**Unseal Metacrypt** after migration — it starts sealed.
## Step 8: Adopt Containers
```bash
mcp adopt metacrypt
mcp adopt mc-proxy
mcp adopt mcr
mcp adopt mcns
```
## Step 9: Export and Complete Service Definitions
```bash
mcp service export metacrypt
mcp service export mc-proxy
mcp service export mcr
mcp service export mcns
```
The exported files will have name + image only. Edit each file to add the
full container spec: network, ports, volumes, user, restart, cmd.
Then sync to push the complete specs:
```bash
mcp sync
```
## Step 10: Verify
```bash
mcp status
```
All services should show `desired: running`, `observed: running`, no drift.
## Lessons Learned (from first deployment, 2026-03-26)
- **NixOS systemd sandbox**: `ProtectHome=true` blocks `/run/user` which
rootless podman needs. Use `ProtectHome=false`. `ProtectSystem=strict`
also blocks it; use `full` instead.
- **PATH**: the agent's systemd unit needs `PATH=/run/current-system/sw/bin`
to find podman.
- **XDG_RUNTIME_DIR**: must be set to `/run/user/<uid>` for rootless podman.
Pin the UID in NixOS config to avoid drift.
- **Podman ps JSON**: the `Command` field is `[]string`, not `string`.
- **Container naming**: `mc-proxy` (service with hyphen) breaks naive split
on `-`. The agent uses registry-aware splitting.
- **Token whitespace**: token files with trailing newlines cause gRPC header
errors. The CLI trims whitespace.
- **MCR auth**: rootless podman under a new user can't pull from MCR without
OCI token auth. Workaround: `podman save` + `podman load` to transfer
images.

flake.lock (generated)

@@ -0,0 +1,27 @@
{
  "nodes": {
    "nixpkgs": {
      "locked": {
        "lastModified": 1774388614,
        "narHash": "sha256-tFwzTI0DdDzovdE9+Ras6CUss0yn8P9XV4Ja6RjA+nU=",
        "owner": "NixOS",
        "repo": "nixpkgs",
        "rev": "1073dad219cb244572b74da2b20c7fe39cb3fa9e",
        "type": "github"
      },
      "original": {
        "owner": "NixOS",
        "ref": "nixos-25.11",
        "repo": "nixpkgs",
        "type": "github"
      }
    },
    "root": {
      "inputs": {
        "nixpkgs": "nixpkgs"
      }
    }
  },
  "root": "root",
  "version": 7
}

flake.nix

@@ -0,0 +1,56 @@
{
  description = "mcp - Metacircular Control Plane";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.11";
  };

  outputs =
    { self, nixpkgs }:
    let
      system = "x86_64-linux";
      pkgs = nixpkgs.legacyPackages.${system};
      version = "0.8.3";
    in
    {
      packages.${system} = {
        default = pkgs.buildGoModule {
          pname = "mcp";
          inherit version;
          src = ./.;
          vendorHash = null;
          subPackages = [
            "cmd/mcp"
          ];
          ldflags = [
            "-s"
            "-w"
            "-X main.version=${version}"
          ];
          postInstall = ''
            mkdir -p $out/share/zsh/site-functions
            mkdir -p $out/share/bash-completion/completions
            mkdir -p $out/share/fish/vendor_completions.d
            $out/bin/mcp completion zsh > $out/share/zsh/site-functions/_mcp
            $out/bin/mcp completion bash > $out/share/bash-completion/completions/mcp
            $out/bin/mcp completion fish > $out/share/fish/vendor_completions.d/mcp.fish
          '';
        };
        mcp-agent = pkgs.buildGoModule {
          pname = "mcp-agent";
          inherit version;
          src = ./.;
          vendorHash = null;
          subPackages = [
            "cmd/mcp-agent"
          ];
          ldflags = [
            "-s"
            "-w"
            "-X main.version=${version}"
          ];
        };
      };
    };
}

gen/mcp/v1/master.pb.go

@@ -0,0 +1,849 @@
// McpMasterService: Multi-node orchestration for the Metacircular platform.
// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
// protoc-gen-go v1.36.11
// protoc v6.32.1
// source: proto/mcp/v1/master.proto
package mcpv1
import (
protoreflect "google.golang.org/protobuf/reflect/protoreflect"
protoimpl "google.golang.org/protobuf/runtime/protoimpl"
reflect "reflect"
sync "sync"
unsafe "unsafe"
)
const (
// Verify that this generated code is sufficiently up-to-date.
_ = protoimpl.EnforceVersion(20 - protoimpl.MinVersion)
// Verify that runtime/protoimpl is sufficiently up-to-date.
_ = protoimpl.EnforceVersion(protoimpl.MaxVersion - 20)
)
type MasterDeployRequest struct {
state protoimpl.MessageState `protogen:"open.v1"`
Service *ServiceSpec `protobuf:"bytes,1,opt,name=service,proto3" json:"service,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *MasterDeployRequest) Reset() {
*x = MasterDeployRequest{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[0]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *MasterDeployRequest) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*MasterDeployRequest) ProtoMessage() {}
func (x *MasterDeployRequest) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[0]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use MasterDeployRequest.ProtoReflect.Descriptor instead.
func (*MasterDeployRequest) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{0}
}
func (x *MasterDeployRequest) GetService() *ServiceSpec {
if x != nil {
return x.Service
}
return nil
}
type MasterDeployResponse struct {
state protoimpl.MessageState `protogen:"open.v1"`
Node string `protobuf:"bytes,1,opt,name=node,proto3" json:"node,omitempty"` // node the service was placed on
Success bool `protobuf:"varint,2,opt,name=success,proto3" json:"success,omitempty"` // true only if ALL steps succeeded
Error string `protobuf:"bytes,3,opt,name=error,proto3" json:"error,omitempty"`
// Per-step results for operator visibility.
DeployResult *StepResult `protobuf:"bytes,4,opt,name=deploy_result,json=deployResult,proto3" json:"deploy_result,omitempty"`
EdgeRouteResult *StepResult `protobuf:"bytes,5,opt,name=edge_route_result,json=edgeRouteResult,proto3" json:"edge_route_result,omitempty"`
DnsResult *StepResult `protobuf:"bytes,6,opt,name=dns_result,json=dnsResult,proto3" json:"dns_result,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *MasterDeployResponse) Reset() {
*x = MasterDeployResponse{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[1]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *MasterDeployResponse) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*MasterDeployResponse) ProtoMessage() {}
func (x *MasterDeployResponse) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[1]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use MasterDeployResponse.ProtoReflect.Descriptor instead.
func (*MasterDeployResponse) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{1}
}
func (x *MasterDeployResponse) GetNode() string {
if x != nil {
return x.Node
}
return ""
}
func (x *MasterDeployResponse) GetSuccess() bool {
if x != nil {
return x.Success
}
return false
}
func (x *MasterDeployResponse) GetError() string {
if x != nil {
return x.Error
}
return ""
}
func (x *MasterDeployResponse) GetDeployResult() *StepResult {
if x != nil {
return x.DeployResult
}
return nil
}
func (x *MasterDeployResponse) GetEdgeRouteResult() *StepResult {
if x != nil {
return x.EdgeRouteResult
}
return nil
}
func (x *MasterDeployResponse) GetDnsResult() *StepResult {
if x != nil {
return x.DnsResult
}
return nil
}
type StepResult struct {
state protoimpl.MessageState `protogen:"open.v1"`
Step string `protobuf:"bytes,1,opt,name=step,proto3" json:"step,omitempty"`
Success bool `protobuf:"varint,2,opt,name=success,proto3" json:"success,omitempty"`
Error string `protobuf:"bytes,3,opt,name=error,proto3" json:"error,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *StepResult) Reset() {
*x = StepResult{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[2]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *StepResult) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*StepResult) ProtoMessage() {}
func (x *StepResult) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[2]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use StepResult.ProtoReflect.Descriptor instead.
func (*StepResult) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{2}
}
func (x *StepResult) GetStep() string {
if x != nil {
return x.Step
}
return ""
}
func (x *StepResult) GetSuccess() bool {
if x != nil {
return x.Success
}
return false
}
func (x *StepResult) GetError() string {
if x != nil {
return x.Error
}
return ""
}
type MasterUndeployRequest struct {
state protoimpl.MessageState `protogen:"open.v1"`
ServiceName string `protobuf:"bytes,1,opt,name=service_name,json=serviceName,proto3" json:"service_name,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *MasterUndeployRequest) Reset() {
*x = MasterUndeployRequest{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[3]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *MasterUndeployRequest) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*MasterUndeployRequest) ProtoMessage() {}
func (x *MasterUndeployRequest) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[3]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use MasterUndeployRequest.ProtoReflect.Descriptor instead.
func (*MasterUndeployRequest) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{3}
}
func (x *MasterUndeployRequest) GetServiceName() string {
if x != nil {
return x.ServiceName
}
return ""
}
type MasterUndeployResponse struct {
state protoimpl.MessageState `protogen:"open.v1"`
Success bool `protobuf:"varint,1,opt,name=success,proto3" json:"success,omitempty"`
Error string `protobuf:"bytes,2,opt,name=error,proto3" json:"error,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *MasterUndeployResponse) Reset() {
*x = MasterUndeployResponse{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[4]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *MasterUndeployResponse) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*MasterUndeployResponse) ProtoMessage() {}
func (x *MasterUndeployResponse) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[4]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use MasterUndeployResponse.ProtoReflect.Descriptor instead.
func (*MasterUndeployResponse) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{4}
}
func (x *MasterUndeployResponse) GetSuccess() bool {
if x != nil {
return x.Success
}
return false
}
func (x *MasterUndeployResponse) GetError() string {
if x != nil {
return x.Error
}
return ""
}
type MasterStatusRequest struct {
state protoimpl.MessageState `protogen:"open.v1"`
ServiceName string `protobuf:"bytes,1,opt,name=service_name,json=serviceName,proto3" json:"service_name,omitempty"` // empty = all services
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *MasterStatusRequest) Reset() {
*x = MasterStatusRequest{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[5]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *MasterStatusRequest) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*MasterStatusRequest) ProtoMessage() {}
func (x *MasterStatusRequest) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[5]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use MasterStatusRequest.ProtoReflect.Descriptor instead.
func (*MasterStatusRequest) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{5}
}
func (x *MasterStatusRequest) GetServiceName() string {
if x != nil {
return x.ServiceName
}
return ""
}
type MasterStatusResponse struct {
state protoimpl.MessageState `protogen:"open.v1"`
Services []*ServiceStatus `protobuf:"bytes,1,rep,name=services,proto3" json:"services,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *MasterStatusResponse) Reset() {
*x = MasterStatusResponse{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[6]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *MasterStatusResponse) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*MasterStatusResponse) ProtoMessage() {}
func (x *MasterStatusResponse) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[6]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use MasterStatusResponse.ProtoReflect.Descriptor instead.
func (*MasterStatusResponse) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{6}
}
func (x *MasterStatusResponse) GetServices() []*ServiceStatus {
if x != nil {
return x.Services
}
return nil
}
type ServiceStatus struct {
state protoimpl.MessageState `protogen:"open.v1"`
Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
Node string `protobuf:"bytes,2,opt,name=node,proto3" json:"node,omitempty"`
Tier string `protobuf:"bytes,3,opt,name=tier,proto3" json:"tier,omitempty"`
Status string `protobuf:"bytes,4,opt,name=status,proto3" json:"status,omitempty"` // "running", "stopped", "unhealthy", "unknown"
EdgeRoutes []*EdgeRouteStatus `protobuf:"bytes,5,rep,name=edge_routes,json=edgeRoutes,proto3" json:"edge_routes,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *ServiceStatus) Reset() {
*x = ServiceStatus{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[7]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *ServiceStatus) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*ServiceStatus) ProtoMessage() {}
func (x *ServiceStatus) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[7]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use ServiceStatus.ProtoReflect.Descriptor instead.
func (*ServiceStatus) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{7}
}
func (x *ServiceStatus) GetName() string {
if x != nil {
return x.Name
}
return ""
}
func (x *ServiceStatus) GetNode() string {
if x != nil {
return x.Node
}
return ""
}
func (x *ServiceStatus) GetTier() string {
if x != nil {
return x.Tier
}
return ""
}
func (x *ServiceStatus) GetStatus() string {
if x != nil {
return x.Status
}
return ""
}
func (x *ServiceStatus) GetEdgeRoutes() []*EdgeRouteStatus {
if x != nil {
return x.EdgeRoutes
}
return nil
}
type EdgeRouteStatus struct {
state protoimpl.MessageState `protogen:"open.v1"`
Hostname string `protobuf:"bytes,1,opt,name=hostname,proto3" json:"hostname,omitempty"`
EdgeNode string `protobuf:"bytes,2,opt,name=edge_node,json=edgeNode,proto3" json:"edge_node,omitempty"`
CertExpires string `protobuf:"bytes,3,opt,name=cert_expires,json=certExpires,proto3" json:"cert_expires,omitempty"` // RFC3339
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *EdgeRouteStatus) Reset() {
*x = EdgeRouteStatus{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[8]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *EdgeRouteStatus) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*EdgeRouteStatus) ProtoMessage() {}
func (x *EdgeRouteStatus) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[8]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use EdgeRouteStatus.ProtoReflect.Descriptor instead.
func (*EdgeRouteStatus) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{8}
}
func (x *EdgeRouteStatus) GetHostname() string {
if x != nil {
return x.Hostname
}
return ""
}
func (x *EdgeRouteStatus) GetEdgeNode() string {
if x != nil {
return x.EdgeNode
}
return ""
}
func (x *EdgeRouteStatus) GetCertExpires() string {
if x != nil {
return x.CertExpires
}
return ""
}
type ListNodesRequest struct {
state protoimpl.MessageState `protogen:"open.v1"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *ListNodesRequest) Reset() {
*x = ListNodesRequest{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[9]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *ListNodesRequest) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*ListNodesRequest) ProtoMessage() {}
func (x *ListNodesRequest) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[9]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use ListNodesRequest.ProtoReflect.Descriptor instead.
func (*ListNodesRequest) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{9}
}
type ListNodesResponse struct {
state protoimpl.MessageState `protogen:"open.v1"`
Nodes []*NodeInfo `protobuf:"bytes,1,rep,name=nodes,proto3" json:"nodes,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *ListNodesResponse) Reset() {
*x = ListNodesResponse{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[10]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *ListNodesResponse) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*ListNodesResponse) ProtoMessage() {}
func (x *ListNodesResponse) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[10]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use ListNodesResponse.ProtoReflect.Descriptor instead.
func (*ListNodesResponse) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{10}
}
func (x *ListNodesResponse) GetNodes() []*NodeInfo {
if x != nil {
return x.Nodes
}
return nil
}
type NodeInfo struct {
state protoimpl.MessageState `protogen:"open.v1"`
Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
Role string `protobuf:"bytes,2,opt,name=role,proto3" json:"role,omitempty"`
Address string `protobuf:"bytes,3,opt,name=address,proto3" json:"address,omitempty"`
Arch string `protobuf:"bytes,4,opt,name=arch,proto3" json:"arch,omitempty"`
Status string `protobuf:"bytes,5,opt,name=status,proto3" json:"status,omitempty"` // "healthy", "unhealthy", "unknown"
Containers int32 `protobuf:"varint,6,opt,name=containers,proto3" json:"containers,omitempty"`
LastHeartbeat string `protobuf:"bytes,7,opt,name=last_heartbeat,json=lastHeartbeat,proto3" json:"last_heartbeat,omitempty"` // RFC3339
Services int32 `protobuf:"varint,8,opt,name=services,proto3" json:"services,omitempty"` // placement count
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
func (x *NodeInfo) Reset() {
*x = NodeInfo{}
mi := &file_proto_mcp_v1_master_proto_msgTypes[11]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
func (x *NodeInfo) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*NodeInfo) ProtoMessage() {}
func (x *NodeInfo) ProtoReflect() protoreflect.Message {
mi := &file_proto_mcp_v1_master_proto_msgTypes[11]
if x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use NodeInfo.ProtoReflect.Descriptor instead.
func (*NodeInfo) Descriptor() ([]byte, []int) {
return file_proto_mcp_v1_master_proto_rawDescGZIP(), []int{11}
}
func (x *NodeInfo) GetName() string {
if x != nil {
return x.Name
}
return ""
}
func (x *NodeInfo) GetRole() string {
if x != nil {
return x.Role
}
return ""
}
func (x *NodeInfo) GetAddress() string {
if x != nil {
return x.Address
}
return ""
}
func (x *NodeInfo) GetArch() string {
if x != nil {
return x.Arch
}
return ""
}
func (x *NodeInfo) GetStatus() string {
if x != nil {
return x.Status
}
return ""
}
func (x *NodeInfo) GetContainers() int32 {
if x != nil {
return x.Containers
}
return 0
}
func (x *NodeInfo) GetLastHeartbeat() string {
if x != nil {
return x.LastHeartbeat
}
return ""
}
func (x *NodeInfo) GetServices() int32 {
if x != nil {
return x.Services
}
return 0
}
var File_proto_mcp_v1_master_proto protoreflect.FileDescriptor
const file_proto_mcp_v1_master_proto_rawDesc = "" +
"\n" +
"\x19proto/mcp/v1/master.proto\x12\x06mcp.v1\x1a\x16proto/mcp/v1/mcp.proto\"D\n" +
"\x13MasterDeployRequest\x12-\n" +
"\aservice\x18\x01 \x01(\v2\x13.mcp.v1.ServiceSpecR\aservice\"\x86\x02\n" +
"\x14MasterDeployResponse\x12\x12\n" +
"\x04node\x18\x01 \x01(\tR\x04node\x12\x18\n" +
"\asuccess\x18\x02 \x01(\bR\asuccess\x12\x14\n" +
"\x05error\x18\x03 \x01(\tR\x05error\x127\n" +
"\rdeploy_result\x18\x04 \x01(\v2\x12.mcp.v1.StepResultR\fdeployResult\x12>\n" +
"\x11edge_route_result\x18\x05 \x01(\v2\x12.mcp.v1.StepResultR\x0fedgeRouteResult\x121\n" +
"\n" +
"dns_result\x18\x06 \x01(\v2\x12.mcp.v1.StepResultR\tdnsResult\"P\n" +
"\n" +
"StepResult\x12\x12\n" +
"\x04step\x18\x01 \x01(\tR\x04step\x12\x18\n" +
"\asuccess\x18\x02 \x01(\bR\asuccess\x12\x14\n" +
"\x05error\x18\x03 \x01(\tR\x05error\":\n" +
"\x15MasterUndeployRequest\x12!\n" +
"\fservice_name\x18\x01 \x01(\tR\vserviceName\"H\n" +
"\x16MasterUndeployResponse\x12\x18\n" +
"\asuccess\x18\x01 \x01(\bR\asuccess\x12\x14\n" +
"\x05error\x18\x02 \x01(\tR\x05error\"8\n" +
"\x13MasterStatusRequest\x12!\n" +
"\fservice_name\x18\x01 \x01(\tR\vserviceName\"I\n" +
"\x14MasterStatusResponse\x121\n" +
"\bservices\x18\x01 \x03(\v2\x15.mcp.v1.ServiceStatusR\bservices\"\x9d\x01\n" +
"\rServiceStatus\x12\x12\n" +
"\x04name\x18\x01 \x01(\tR\x04name\x12\x12\n" +
"\x04node\x18\x02 \x01(\tR\x04node\x12\x12\n" +
"\x04tier\x18\x03 \x01(\tR\x04tier\x12\x16\n" +
"\x06status\x18\x04 \x01(\tR\x06status\x128\n" +
"\vedge_routes\x18\x05 \x03(\v2\x17.mcp.v1.EdgeRouteStatusR\n" +
"edgeRoutes\"m\n" +
"\x0fEdgeRouteStatus\x12\x1a\n" +
"\bhostname\x18\x01 \x01(\tR\bhostname\x12\x1b\n" +
"\tedge_node\x18\x02 \x01(\tR\bedgeNode\x12!\n" +
"\fcert_expires\x18\x03 \x01(\tR\vcertExpires\"\x12\n" +
"\x10ListNodesRequest\";\n" +
"\x11ListNodesResponse\x12&\n" +
"\x05nodes\x18\x01 \x03(\v2\x10.mcp.v1.NodeInfoR\x05nodes\"\xdb\x01\n" +
"\bNodeInfo\x12\x12\n" +
"\x04name\x18\x01 \x01(\tR\x04name\x12\x12\n" +
"\x04role\x18\x02 \x01(\tR\x04role\x12\x18\n" +
"\aaddress\x18\x03 \x01(\tR\aaddress\x12\x12\n" +
"\x04arch\x18\x04 \x01(\tR\x04arch\x12\x16\n" +
"\x06status\x18\x05 \x01(\tR\x06status\x12\x1e\n" +
"\n" +
"containers\x18\x06 \x01(\x05R\n" +
"containers\x12%\n" +
"\x0elast_heartbeat\x18\a \x01(\tR\rlastHeartbeat\x12\x1a\n" +
"\bservices\x18\b \x01(\x05R\bservices2\xa9\x02\n" +
"\x10McpMasterService\x12C\n" +
"\x06Deploy\x12\x1b.mcp.v1.MasterDeployRequest\x1a\x1c.mcp.v1.MasterDeployResponse\x12I\n" +
"\bUndeploy\x12\x1d.mcp.v1.MasterUndeployRequest\x1a\x1e.mcp.v1.MasterUndeployResponse\x12C\n" +
"\x06Status\x12\x1b.mcp.v1.MasterStatusRequest\x1a\x1c.mcp.v1.MasterStatusResponse\x12@\n" +
"\tListNodes\x12\x18.mcp.v1.ListNodesRequest\x1a\x19.mcp.v1.ListNodesResponseB*Z(git.wntrmute.dev/mc/mcp/gen/mcp/v1;mcpv1b\x06proto3"
var (
file_proto_mcp_v1_master_proto_rawDescOnce sync.Once
file_proto_mcp_v1_master_proto_rawDescData []byte
)
func file_proto_mcp_v1_master_proto_rawDescGZIP() []byte {
file_proto_mcp_v1_master_proto_rawDescOnce.Do(func() {
file_proto_mcp_v1_master_proto_rawDescData = protoimpl.X.CompressGZIP(unsafe.Slice(unsafe.StringData(file_proto_mcp_v1_master_proto_rawDesc), len(file_proto_mcp_v1_master_proto_rawDesc)))
})
return file_proto_mcp_v1_master_proto_rawDescData
}
var file_proto_mcp_v1_master_proto_msgTypes = make([]protoimpl.MessageInfo, 12)
var file_proto_mcp_v1_master_proto_goTypes = []any{
(*MasterDeployRequest)(nil), // 0: mcp.v1.MasterDeployRequest
(*MasterDeployResponse)(nil), // 1: mcp.v1.MasterDeployResponse
(*StepResult)(nil), // 2: mcp.v1.StepResult
(*MasterUndeployRequest)(nil), // 3: mcp.v1.MasterUndeployRequest
(*MasterUndeployResponse)(nil), // 4: mcp.v1.MasterUndeployResponse
(*MasterStatusRequest)(nil), // 5: mcp.v1.MasterStatusRequest
(*MasterStatusResponse)(nil), // 6: mcp.v1.MasterStatusResponse
(*ServiceStatus)(nil), // 7: mcp.v1.ServiceStatus
(*EdgeRouteStatus)(nil), // 8: mcp.v1.EdgeRouteStatus
(*ListNodesRequest)(nil), // 9: mcp.v1.ListNodesRequest
(*ListNodesResponse)(nil), // 10: mcp.v1.ListNodesResponse
(*NodeInfo)(nil), // 11: mcp.v1.NodeInfo
(*ServiceSpec)(nil), // 12: mcp.v1.ServiceSpec
}
var file_proto_mcp_v1_master_proto_depIdxs = []int32{
12, // 0: mcp.v1.MasterDeployRequest.service:type_name -> mcp.v1.ServiceSpec
2, // 1: mcp.v1.MasterDeployResponse.deploy_result:type_name -> mcp.v1.StepResult
2, // 2: mcp.v1.MasterDeployResponse.edge_route_result:type_name -> mcp.v1.StepResult
2, // 3: mcp.v1.MasterDeployResponse.dns_result:type_name -> mcp.v1.StepResult
7, // 4: mcp.v1.MasterStatusResponse.services:type_name -> mcp.v1.ServiceStatus
8, // 5: mcp.v1.ServiceStatus.edge_routes:type_name -> mcp.v1.EdgeRouteStatus
11, // 6: mcp.v1.ListNodesResponse.nodes:type_name -> mcp.v1.NodeInfo
0, // 7: mcp.v1.McpMasterService.Deploy:input_type -> mcp.v1.MasterDeployRequest
3, // 8: mcp.v1.McpMasterService.Undeploy:input_type -> mcp.v1.MasterUndeployRequest
5, // 9: mcp.v1.McpMasterService.Status:input_type -> mcp.v1.MasterStatusRequest
9, // 10: mcp.v1.McpMasterService.ListNodes:input_type -> mcp.v1.ListNodesRequest
1, // 11: mcp.v1.McpMasterService.Deploy:output_type -> mcp.v1.MasterDeployResponse
4, // 12: mcp.v1.McpMasterService.Undeploy:output_type -> mcp.v1.MasterUndeployResponse
6, // 13: mcp.v1.McpMasterService.Status:output_type -> mcp.v1.MasterStatusResponse
10, // 14: mcp.v1.McpMasterService.ListNodes:output_type -> mcp.v1.ListNodesResponse
11, // [11:15] is the sub-list for method output_type
7, // [7:11] is the sub-list for method input_type
7, // [7:7] is the sub-list for extension type_name
7, // [7:7] is the sub-list for extension extendee
0, // [0:7] is the sub-list for field type_name
}
func init() { file_proto_mcp_v1_master_proto_init() }
func file_proto_mcp_v1_master_proto_init() {
if File_proto_mcp_v1_master_proto != nil {
return
}
file_proto_mcp_v1_mcp_proto_init()
type x struct{}
out := protoimpl.TypeBuilder{
File: protoimpl.DescBuilder{
GoPackagePath: reflect.TypeOf(x{}).PkgPath(),
RawDescriptor: unsafe.Slice(unsafe.StringData(file_proto_mcp_v1_master_proto_rawDesc), len(file_proto_mcp_v1_master_proto_rawDesc)),
NumEnums: 0,
NumMessages: 12,
NumExtensions: 0,
NumServices: 1,
},
GoTypes: file_proto_mcp_v1_master_proto_goTypes,
DependencyIndexes: file_proto_mcp_v1_master_proto_depIdxs,
MessageInfos: file_proto_mcp_v1_master_proto_msgTypes,
}.Build()
File_proto_mcp_v1_master_proto = out.File
file_proto_mcp_v1_master_proto_goTypes = nil
file_proto_mcp_v1_master_proto_depIdxs = nil
}


@@ -0,0 +1,247 @@
// McpMasterService: Multi-node orchestration for the Metacircular platform.
// Code generated by protoc-gen-go-grpc. DO NOT EDIT.
// versions:
// - protoc-gen-go-grpc v1.6.1
// - protoc v6.32.1
// source: proto/mcp/v1/master.proto
package mcpv1
import (
context "context"
grpc "google.golang.org/grpc"
codes "google.golang.org/grpc/codes"
status "google.golang.org/grpc/status"
)
// This is a compile-time assertion to ensure that this generated file
// is compatible with the grpc package it is being compiled against.
// Requires gRPC-Go v1.64.0 or later.
const _ = grpc.SupportPackageIsVersion9
const (
McpMasterService_Deploy_FullMethodName = "/mcp.v1.McpMasterService/Deploy"
McpMasterService_Undeploy_FullMethodName = "/mcp.v1.McpMasterService/Undeploy"
McpMasterService_Status_FullMethodName = "/mcp.v1.McpMasterService/Status"
McpMasterService_ListNodes_FullMethodName = "/mcp.v1.McpMasterService/ListNodes"
)
// McpMasterServiceClient is the client API for McpMasterService service.
//
// For semantics around ctx use and closing/ending streaming RPCs, please refer to https://pkg.go.dev/google.golang.org/grpc/?tab=doc#ClientConn.NewStream.
//
// McpMasterService coordinates multi-node deployments. The CLI sends
// deploy/undeploy/status requests to the master, which places services on
// nodes, forwards to agents, and coordinates edge routing.
type McpMasterServiceClient interface {
// CLI operations.
Deploy(ctx context.Context, in *MasterDeployRequest, opts ...grpc.CallOption) (*MasterDeployResponse, error)
Undeploy(ctx context.Context, in *MasterUndeployRequest, opts ...grpc.CallOption) (*MasterUndeployResponse, error)
Status(ctx context.Context, in *MasterStatusRequest, opts ...grpc.CallOption) (*MasterStatusResponse, error)
ListNodes(ctx context.Context, in *ListNodesRequest, opts ...grpc.CallOption) (*ListNodesResponse, error)
}
type mcpMasterServiceClient struct {
cc grpc.ClientConnInterface
}
func NewMcpMasterServiceClient(cc grpc.ClientConnInterface) McpMasterServiceClient {
return &mcpMasterServiceClient{cc}
}
func (c *mcpMasterServiceClient) Deploy(ctx context.Context, in *MasterDeployRequest, opts ...grpc.CallOption) (*MasterDeployResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(MasterDeployResponse)
err := c.cc.Invoke(ctx, McpMasterService_Deploy_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpMasterServiceClient) Undeploy(ctx context.Context, in *MasterUndeployRequest, opts ...grpc.CallOption) (*MasterUndeployResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(MasterUndeployResponse)
err := c.cc.Invoke(ctx, McpMasterService_Undeploy_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpMasterServiceClient) Status(ctx context.Context, in *MasterStatusRequest, opts ...grpc.CallOption) (*MasterStatusResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(MasterStatusResponse)
err := c.cc.Invoke(ctx, McpMasterService_Status_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpMasterServiceClient) ListNodes(ctx context.Context, in *ListNodesRequest, opts ...grpc.CallOption) (*ListNodesResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(ListNodesResponse)
err := c.cc.Invoke(ctx, McpMasterService_ListNodes_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
// McpMasterServiceServer is the server API for McpMasterService service.
// All implementations must embed UnimplementedMcpMasterServiceServer
// for forward compatibility.
//
// McpMasterService coordinates multi-node deployments. The CLI sends
// deploy/undeploy/status requests to the master, which places services on
// nodes, forwards to agents, and coordinates edge routing.
type McpMasterServiceServer interface {
// CLI operations.
Deploy(context.Context, *MasterDeployRequest) (*MasterDeployResponse, error)
Undeploy(context.Context, *MasterUndeployRequest) (*MasterUndeployResponse, error)
Status(context.Context, *MasterStatusRequest) (*MasterStatusResponse, error)
ListNodes(context.Context, *ListNodesRequest) (*ListNodesResponse, error)
mustEmbedUnimplementedMcpMasterServiceServer()
}
// UnimplementedMcpMasterServiceServer must be embedded to have
// forward compatible implementations.
//
// NOTE: this should be embedded by value instead of pointer to avoid a nil
// pointer dereference when methods are called.
type UnimplementedMcpMasterServiceServer struct{}
func (UnimplementedMcpMasterServiceServer) Deploy(context.Context, *MasterDeployRequest) (*MasterDeployResponse, error) {
return nil, status.Error(codes.Unimplemented, "method Deploy not implemented")
}
func (UnimplementedMcpMasterServiceServer) Undeploy(context.Context, *MasterUndeployRequest) (*MasterUndeployResponse, error) {
return nil, status.Error(codes.Unimplemented, "method Undeploy not implemented")
}
func (UnimplementedMcpMasterServiceServer) Status(context.Context, *MasterStatusRequest) (*MasterStatusResponse, error) {
return nil, status.Error(codes.Unimplemented, "method Status not implemented")
}
func (UnimplementedMcpMasterServiceServer) ListNodes(context.Context, *ListNodesRequest) (*ListNodesResponse, error) {
return nil, status.Error(codes.Unimplemented, "method ListNodes not implemented")
}
func (UnimplementedMcpMasterServiceServer) mustEmbedUnimplementedMcpMasterServiceServer() {}
func (UnimplementedMcpMasterServiceServer) testEmbeddedByValue() {}
// UnsafeMcpMasterServiceServer may be embedded to opt out of forward compatibility for this service.
// Use of this interface is not recommended, as added methods to McpMasterServiceServer will
// result in compilation errors.
type UnsafeMcpMasterServiceServer interface {
mustEmbedUnimplementedMcpMasterServiceServer()
}
func RegisterMcpMasterServiceServer(s grpc.ServiceRegistrar, srv McpMasterServiceServer) {
// If the following call panics, it indicates UnimplementedMcpMasterServiceServer was
// embedded by pointer and is nil. This will cause panics if an
// unimplemented method is ever invoked, so we test this at initialization
// time to prevent it from happening at runtime later due to I/O.
if t, ok := srv.(interface{ testEmbeddedByValue() }); ok {
t.testEmbeddedByValue()
}
s.RegisterService(&McpMasterService_ServiceDesc, srv)
}
func _McpMasterService_Deploy_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(MasterDeployRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpMasterServiceServer).Deploy(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpMasterService_Deploy_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpMasterServiceServer).Deploy(ctx, req.(*MasterDeployRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpMasterService_Undeploy_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(MasterUndeployRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpMasterServiceServer).Undeploy(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpMasterService_Undeploy_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpMasterServiceServer).Undeploy(ctx, req.(*MasterUndeployRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpMasterService_Status_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(MasterStatusRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpMasterServiceServer).Status(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpMasterService_Status_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpMasterServiceServer).Status(ctx, req.(*MasterStatusRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpMasterService_ListNodes_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(ListNodesRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpMasterServiceServer).ListNodes(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpMasterService_ListNodes_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpMasterServiceServer).ListNodes(ctx, req.(*ListNodesRequest))
}
return interceptor(ctx, in, info, handler)
}
// McpMasterService_ServiceDesc is the grpc.ServiceDesc for McpMasterService service.
// It's only intended for direct use with grpc.RegisterService,
// and not to be introspected or modified (even as a copy)
var McpMasterService_ServiceDesc = grpc.ServiceDesc{
ServiceName: "mcp.v1.McpMasterService",
HandlerType: (*McpMasterServiceServer)(nil),
Methods: []grpc.MethodDesc{
{
MethodName: "Deploy",
Handler: _McpMasterService_Deploy_Handler,
},
{
MethodName: "Undeploy",
Handler: _McpMasterService_Undeploy_Handler,
},
{
MethodName: "Status",
Handler: _McpMasterService_Status_Handler,
},
{
MethodName: "ListNodes",
Handler: _McpMasterService_ListNodes_Handler,
},
},
Streams: []grpc.StreamDesc{},
Metadata: "proto/mcp/v1/master.proto",
}

File diff suppressed because it is too large


@@ -20,6 +20,7 @@ const _ = grpc.SupportPackageIsVersion9
const (
McpAgentService_Deploy_FullMethodName = "/mcp.v1.McpAgentService/Deploy"
McpAgentService_UndeployService_FullMethodName = "/mcp.v1.McpAgentService/UndeployService"
McpAgentService_StopService_FullMethodName = "/mcp.v1.McpAgentService/StopService"
McpAgentService_StartService_FullMethodName = "/mcp.v1.McpAgentService/StartService"
McpAgentService_RestartService_FullMethodName = "/mcp.v1.McpAgentService/RestartService"
@@ -28,9 +29,19 @@ const (
McpAgentService_GetServiceStatus_FullMethodName = "/mcp.v1.McpAgentService/GetServiceStatus"
McpAgentService_LiveCheck_FullMethodName = "/mcp.v1.McpAgentService/LiveCheck"
McpAgentService_AdoptContainers_FullMethodName = "/mcp.v1.McpAgentService/AdoptContainers"
McpAgentService_PurgeComponent_FullMethodName = "/mcp.v1.McpAgentService/PurgeComponent"
McpAgentService_PushFile_FullMethodName = "/mcp.v1.McpAgentService/PushFile"
McpAgentService_PullFile_FullMethodName = "/mcp.v1.McpAgentService/PullFile"
McpAgentService_NodeStatus_FullMethodName = "/mcp.v1.McpAgentService/NodeStatus"
McpAgentService_ListDNSRecords_FullMethodName = "/mcp.v1.McpAgentService/ListDNSRecords"
McpAgentService_ListProxyRoutes_FullMethodName = "/mcp.v1.McpAgentService/ListProxyRoutes"
McpAgentService_AddProxyRoute_FullMethodName = "/mcp.v1.McpAgentService/AddProxyRoute"
McpAgentService_RemoveProxyRoute_FullMethodName = "/mcp.v1.McpAgentService/RemoveProxyRoute"
McpAgentService_SetupEdgeRoute_FullMethodName = "/mcp.v1.McpAgentService/SetupEdgeRoute"
McpAgentService_RemoveEdgeRoute_FullMethodName = "/mcp.v1.McpAgentService/RemoveEdgeRoute"
McpAgentService_ListEdgeRoutes_FullMethodName = "/mcp.v1.McpAgentService/ListEdgeRoutes"
McpAgentService_HealthCheck_FullMethodName = "/mcp.v1.McpAgentService/HealthCheck"
McpAgentService_Logs_FullMethodName = "/mcp.v1.McpAgentService/Logs"
)
// McpAgentServiceClient is the client API for McpAgentService service.
@@ -39,6 +50,7 @@ const (
type McpAgentServiceClient interface {
// Service lifecycle
Deploy(ctx context.Context, in *DeployRequest, opts ...grpc.CallOption) (*DeployResponse, error)
UndeployService(ctx context.Context, in *UndeployServiceRequest, opts ...grpc.CallOption) (*UndeployServiceResponse, error)
StopService(ctx context.Context, in *StopServiceRequest, opts ...grpc.CallOption) (*StopServiceResponse, error)
StartService(ctx context.Context, in *StartServiceRequest, opts ...grpc.CallOption) (*StartServiceResponse, error)
RestartService(ctx context.Context, in *RestartServiceRequest, opts ...grpc.CallOption) (*RestartServiceResponse, error)
@@ -50,11 +62,27 @@ type McpAgentServiceClient interface {
LiveCheck(ctx context.Context, in *LiveCheckRequest, opts ...grpc.CallOption) (*LiveCheckResponse, error)
// Adopt
AdoptContainers(ctx context.Context, in *AdoptContainersRequest, opts ...grpc.CallOption) (*AdoptContainersResponse, error)
// Purge
PurgeComponent(ctx context.Context, in *PurgeRequest, opts ...grpc.CallOption) (*PurgeResponse, error)
// File transfer
PushFile(ctx context.Context, in *PushFileRequest, opts ...grpc.CallOption) (*PushFileResponse, error)
PullFile(ctx context.Context, in *PullFileRequest, opts ...grpc.CallOption) (*PullFileResponse, error)
// Node
NodeStatus(ctx context.Context, in *NodeStatusRequest, opts ...grpc.CallOption) (*NodeStatusResponse, error)
// DNS (query MCNS)
ListDNSRecords(ctx context.Context, in *ListDNSRecordsRequest, opts ...grpc.CallOption) (*ListDNSRecordsResponse, error)
// Proxy routes (query mc-proxy)
ListProxyRoutes(ctx context.Context, in *ListProxyRoutesRequest, opts ...grpc.CallOption) (*ListProxyRoutesResponse, error)
AddProxyRoute(ctx context.Context, in *AddProxyRouteRequest, opts ...grpc.CallOption) (*AddProxyRouteResponse, error)
RemoveProxyRoute(ctx context.Context, in *RemoveProxyRouteRequest, opts ...grpc.CallOption) (*RemoveProxyRouteResponse, error)
// Edge routing (called by master on edge nodes)
SetupEdgeRoute(ctx context.Context, in *SetupEdgeRouteRequest, opts ...grpc.CallOption) (*SetupEdgeRouteResponse, error)
RemoveEdgeRoute(ctx context.Context, in *RemoveEdgeRouteRequest, opts ...grpc.CallOption) (*RemoveEdgeRouteResponse, error)
ListEdgeRoutes(ctx context.Context, in *ListEdgeRoutesRequest, opts ...grpc.CallOption) (*ListEdgeRoutesResponse, error)
// Health (called by master on missed heartbeats)
HealthCheck(ctx context.Context, in *HealthCheckRequest, opts ...grpc.CallOption) (*HealthCheckResponse, error)
// Logs
Logs(ctx context.Context, in *LogsRequest, opts ...grpc.CallOption) (grpc.ServerStreamingClient[LogsResponse], error)
}
type mcpAgentServiceClient struct {
@@ -75,6 +103,16 @@ func (c *mcpAgentServiceClient) Deploy(ctx context.Context, in *DeployRequest, o
return out, nil
}
func (c *mcpAgentServiceClient) UndeployService(ctx context.Context, in *UndeployServiceRequest, opts ...grpc.CallOption) (*UndeployServiceResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(UndeployServiceResponse)
err := c.cc.Invoke(ctx, McpAgentService_UndeployService_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) StopService(ctx context.Context, in *StopServiceRequest, opts ...grpc.CallOption) (*StopServiceResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(StopServiceResponse)
@@ -155,6 +193,16 @@ func (c *mcpAgentServiceClient) AdoptContainers(ctx context.Context, in *AdoptCo
return out, nil
}
func (c *mcpAgentServiceClient) PurgeComponent(ctx context.Context, in *PurgeRequest, opts ...grpc.CallOption) (*PurgeResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(PurgeResponse)
err := c.cc.Invoke(ctx, McpAgentService_PurgeComponent_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) PushFile(ctx context.Context, in *PushFileRequest, opts ...grpc.CallOption) (*PushFileResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(PushFileResponse)
@@ -185,12 +233,112 @@ func (c *mcpAgentServiceClient) NodeStatus(ctx context.Context, in *NodeStatusRe
return out, nil
}
func (c *mcpAgentServiceClient) ListDNSRecords(ctx context.Context, in *ListDNSRecordsRequest, opts ...grpc.CallOption) (*ListDNSRecordsResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(ListDNSRecordsResponse)
err := c.cc.Invoke(ctx, McpAgentService_ListDNSRecords_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) ListProxyRoutes(ctx context.Context, in *ListProxyRoutesRequest, opts ...grpc.CallOption) (*ListProxyRoutesResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(ListProxyRoutesResponse)
err := c.cc.Invoke(ctx, McpAgentService_ListProxyRoutes_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) AddProxyRoute(ctx context.Context, in *AddProxyRouteRequest, opts ...grpc.CallOption) (*AddProxyRouteResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(AddProxyRouteResponse)
err := c.cc.Invoke(ctx, McpAgentService_AddProxyRoute_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) RemoveProxyRoute(ctx context.Context, in *RemoveProxyRouteRequest, opts ...grpc.CallOption) (*RemoveProxyRouteResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(RemoveProxyRouteResponse)
err := c.cc.Invoke(ctx, McpAgentService_RemoveProxyRoute_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) SetupEdgeRoute(ctx context.Context, in *SetupEdgeRouteRequest, opts ...grpc.CallOption) (*SetupEdgeRouteResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(SetupEdgeRouteResponse)
err := c.cc.Invoke(ctx, McpAgentService_SetupEdgeRoute_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) RemoveEdgeRoute(ctx context.Context, in *RemoveEdgeRouteRequest, opts ...grpc.CallOption) (*RemoveEdgeRouteResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(RemoveEdgeRouteResponse)
err := c.cc.Invoke(ctx, McpAgentService_RemoveEdgeRoute_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) ListEdgeRoutes(ctx context.Context, in *ListEdgeRoutesRequest, opts ...grpc.CallOption) (*ListEdgeRoutesResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(ListEdgeRoutesResponse)
err := c.cc.Invoke(ctx, McpAgentService_ListEdgeRoutes_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) HealthCheck(ctx context.Context, in *HealthCheckRequest, opts ...grpc.CallOption) (*HealthCheckResponse, error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
out := new(HealthCheckResponse)
err := c.cc.Invoke(ctx, McpAgentService_HealthCheck_FullMethodName, in, out, cOpts...)
if err != nil {
return nil, err
}
return out, nil
}
func (c *mcpAgentServiceClient) Logs(ctx context.Context, in *LogsRequest, opts ...grpc.CallOption) (grpc.ServerStreamingClient[LogsResponse], error) {
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
stream, err := c.cc.NewStream(ctx, &McpAgentService_ServiceDesc.Streams[0], McpAgentService_Logs_FullMethodName, cOpts...)
if err != nil {
return nil, err
}
x := &grpc.GenericClientStream[LogsRequest, LogsResponse]{ClientStream: stream}
if err := x.ClientStream.SendMsg(in); err != nil {
return nil, err
}
if err := x.ClientStream.CloseSend(); err != nil {
return nil, err
}
return x, nil
}
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
type McpAgentService_LogsClient = grpc.ServerStreamingClient[LogsResponse]
// McpAgentServiceServer is the server API for McpAgentService service.
// All implementations must embed UnimplementedMcpAgentServiceServer
// for forward compatibility.
type McpAgentServiceServer interface {
// Service lifecycle
Deploy(context.Context, *DeployRequest) (*DeployResponse, error)
UndeployService(context.Context, *UndeployServiceRequest) (*UndeployServiceResponse, error)
StopService(context.Context, *StopServiceRequest) (*StopServiceResponse, error)
StartService(context.Context, *StartServiceRequest) (*StartServiceResponse, error)
RestartService(context.Context, *RestartServiceRequest) (*RestartServiceResponse, error)
@@ -202,11 +350,27 @@ type McpAgentServiceServer interface {
LiveCheck(context.Context, *LiveCheckRequest) (*LiveCheckResponse, error)
// Adopt
AdoptContainers(context.Context, *AdoptContainersRequest) (*AdoptContainersResponse, error)
// Purge
PurgeComponent(context.Context, *PurgeRequest) (*PurgeResponse, error)
// File transfer
PushFile(context.Context, *PushFileRequest) (*PushFileResponse, error)
PullFile(context.Context, *PullFileRequest) (*PullFileResponse, error)
// Node
NodeStatus(context.Context, *NodeStatusRequest) (*NodeStatusResponse, error)
// DNS (query MCNS)
ListDNSRecords(context.Context, *ListDNSRecordsRequest) (*ListDNSRecordsResponse, error)
// Proxy routes (query mc-proxy)
ListProxyRoutes(context.Context, *ListProxyRoutesRequest) (*ListProxyRoutesResponse, error)
AddProxyRoute(context.Context, *AddProxyRouteRequest) (*AddProxyRouteResponse, error)
RemoveProxyRoute(context.Context, *RemoveProxyRouteRequest) (*RemoveProxyRouteResponse, error)
// Edge routing (called by master on edge nodes)
SetupEdgeRoute(context.Context, *SetupEdgeRouteRequest) (*SetupEdgeRouteResponse, error)
RemoveEdgeRoute(context.Context, *RemoveEdgeRouteRequest) (*RemoveEdgeRouteResponse, error)
ListEdgeRoutes(context.Context, *ListEdgeRoutesRequest) (*ListEdgeRoutesResponse, error)
// Health (called by master on missed heartbeats)
HealthCheck(context.Context, *HealthCheckRequest) (*HealthCheckResponse, error)
// Logs
Logs(*LogsRequest, grpc.ServerStreamingServer[LogsResponse]) error
mustEmbedUnimplementedMcpAgentServiceServer()
}
@@ -220,6 +384,9 @@ type UnimplementedMcpAgentServiceServer struct{}
func (UnimplementedMcpAgentServiceServer) Deploy(context.Context, *DeployRequest) (*DeployResponse, error) {
return nil, status.Error(codes.Unimplemented, "method Deploy not implemented")
}
func (UnimplementedMcpAgentServiceServer) UndeployService(context.Context, *UndeployServiceRequest) (*UndeployServiceResponse, error) {
return nil, status.Error(codes.Unimplemented, "method UndeployService not implemented")
}
func (UnimplementedMcpAgentServiceServer) StopService(context.Context, *StopServiceRequest) (*StopServiceResponse, error) {
return nil, status.Error(codes.Unimplemented, "method StopService not implemented")
}
@@ -244,6 +411,9 @@ func (UnimplementedMcpAgentServiceServer) LiveCheck(context.Context, *LiveCheckR
func (UnimplementedMcpAgentServiceServer) AdoptContainers(context.Context, *AdoptContainersRequest) (*AdoptContainersResponse, error) {
return nil, status.Error(codes.Unimplemented, "method AdoptContainers not implemented")
}
func (UnimplementedMcpAgentServiceServer) PurgeComponent(context.Context, *PurgeRequest) (*PurgeResponse, error) {
return nil, status.Error(codes.Unimplemented, "method PurgeComponent not implemented")
}
func (UnimplementedMcpAgentServiceServer) PushFile(context.Context, *PushFileRequest) (*PushFileResponse, error) {
return nil, status.Error(codes.Unimplemented, "method PushFile not implemented")
}
@@ -253,6 +423,33 @@ func (UnimplementedMcpAgentServiceServer) PullFile(context.Context, *PullFileReq
func (UnimplementedMcpAgentServiceServer) NodeStatus(context.Context, *NodeStatusRequest) (*NodeStatusResponse, error) {
return nil, status.Error(codes.Unimplemented, "method NodeStatus not implemented")
}
func (UnimplementedMcpAgentServiceServer) ListDNSRecords(context.Context, *ListDNSRecordsRequest) (*ListDNSRecordsResponse, error) {
return nil, status.Error(codes.Unimplemented, "method ListDNSRecords not implemented")
}
func (UnimplementedMcpAgentServiceServer) ListProxyRoutes(context.Context, *ListProxyRoutesRequest) (*ListProxyRoutesResponse, error) {
return nil, status.Error(codes.Unimplemented, "method ListProxyRoutes not implemented")
}
func (UnimplementedMcpAgentServiceServer) AddProxyRoute(context.Context, *AddProxyRouteRequest) (*AddProxyRouteResponse, error) {
return nil, status.Error(codes.Unimplemented, "method AddProxyRoute not implemented")
}
func (UnimplementedMcpAgentServiceServer) RemoveProxyRoute(context.Context, *RemoveProxyRouteRequest) (*RemoveProxyRouteResponse, error) {
return nil, status.Error(codes.Unimplemented, "method RemoveProxyRoute not implemented")
}
func (UnimplementedMcpAgentServiceServer) SetupEdgeRoute(context.Context, *SetupEdgeRouteRequest) (*SetupEdgeRouteResponse, error) {
return nil, status.Error(codes.Unimplemented, "method SetupEdgeRoute not implemented")
}
func (UnimplementedMcpAgentServiceServer) RemoveEdgeRoute(context.Context, *RemoveEdgeRouteRequest) (*RemoveEdgeRouteResponse, error) {
return nil, status.Error(codes.Unimplemented, "method RemoveEdgeRoute not implemented")
}
func (UnimplementedMcpAgentServiceServer) ListEdgeRoutes(context.Context, *ListEdgeRoutesRequest) (*ListEdgeRoutesResponse, error) {
return nil, status.Error(codes.Unimplemented, "method ListEdgeRoutes not implemented")
}
func (UnimplementedMcpAgentServiceServer) HealthCheck(context.Context, *HealthCheckRequest) (*HealthCheckResponse, error) {
return nil, status.Error(codes.Unimplemented, "method HealthCheck not implemented")
}
func (UnimplementedMcpAgentServiceServer) Logs(*LogsRequest, grpc.ServerStreamingServer[LogsResponse]) error {
return status.Error(codes.Unimplemented, "method Logs not implemented")
}
func (UnimplementedMcpAgentServiceServer) mustEmbedUnimplementedMcpAgentServiceServer() {}
func (UnimplementedMcpAgentServiceServer) testEmbeddedByValue() {}
@@ -292,6 +489,24 @@ func _McpAgentService_Deploy_Handler(srv interface{}, ctx context.Context, dec f
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_UndeployService_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(UndeployServiceRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).UndeployService(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_UndeployService_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).UndeployService(ctx, req.(*UndeployServiceRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_StopService_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(StopServiceRequest)
if err := dec(in); err != nil {
@@ -436,6 +651,24 @@ func _McpAgentService_AdoptContainers_Handler(srv interface{}, ctx context.Conte
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_PurgeComponent_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(PurgeRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).PurgeComponent(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_PurgeComponent_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).PurgeComponent(ctx, req.(*PurgeRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_PushFile_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(PushFileRequest)
if err := dec(in); err != nil {
@@ -490,6 +723,161 @@ func _McpAgentService_NodeStatus_Handler(srv interface{}, ctx context.Context, d
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_ListDNSRecords_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(ListDNSRecordsRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).ListDNSRecords(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_ListDNSRecords_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).ListDNSRecords(ctx, req.(*ListDNSRecordsRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_ListProxyRoutes_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(ListProxyRoutesRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).ListProxyRoutes(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_ListProxyRoutes_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).ListProxyRoutes(ctx, req.(*ListProxyRoutesRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_AddProxyRoute_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(AddProxyRouteRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).AddProxyRoute(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_AddProxyRoute_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).AddProxyRoute(ctx, req.(*AddProxyRouteRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_RemoveProxyRoute_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(RemoveProxyRouteRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).RemoveProxyRoute(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_RemoveProxyRoute_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).RemoveProxyRoute(ctx, req.(*RemoveProxyRouteRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_SetupEdgeRoute_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(SetupEdgeRouteRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).SetupEdgeRoute(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_SetupEdgeRoute_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).SetupEdgeRoute(ctx, req.(*SetupEdgeRouteRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_RemoveEdgeRoute_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(RemoveEdgeRouteRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).RemoveEdgeRoute(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_RemoveEdgeRoute_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).RemoveEdgeRoute(ctx, req.(*RemoveEdgeRouteRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_ListEdgeRoutes_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(ListEdgeRoutesRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).ListEdgeRoutes(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_ListEdgeRoutes_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).ListEdgeRoutes(ctx, req.(*ListEdgeRoutesRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_HealthCheck_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
in := new(HealthCheckRequest)
if err := dec(in); err != nil {
return nil, err
}
if interceptor == nil {
return srv.(McpAgentServiceServer).HealthCheck(ctx, in)
}
info := &grpc.UnaryServerInfo{
Server: srv,
FullMethod: McpAgentService_HealthCheck_FullMethodName,
}
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
return srv.(McpAgentServiceServer).HealthCheck(ctx, req.(*HealthCheckRequest))
}
return interceptor(ctx, in, info, handler)
}
func _McpAgentService_Logs_Handler(srv interface{}, stream grpc.ServerStream) error {
m := new(LogsRequest)
if err := stream.RecvMsg(m); err != nil {
return err
}
return srv.(McpAgentServiceServer).Logs(m, &grpc.GenericServerStream[LogsRequest, LogsResponse]{ServerStream: stream})
}
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
type McpAgentService_LogsServer = grpc.ServerStreamingServer[LogsResponse]
// McpAgentService_ServiceDesc is the grpc.ServiceDesc for McpAgentService service.
// It's only intended for direct use with grpc.RegisterService,
// and not to be introspected or modified (even as a copy)
@@ -501,6 +889,10 @@ var McpAgentService_ServiceDesc = grpc.ServiceDesc{
MethodName: "Deploy",
Handler: _McpAgentService_Deploy_Handler,
},
{
MethodName: "UndeployService",
Handler: _McpAgentService_UndeployService_Handler,
},
{
MethodName: "StopService",
Handler: _McpAgentService_StopService_Handler,
@@ -533,6 +925,10 @@ var McpAgentService_ServiceDesc = grpc.ServiceDesc{
MethodName: "AdoptContainers",
Handler: _McpAgentService_AdoptContainers_Handler,
},
{
MethodName: "PurgeComponent",
Handler: _McpAgentService_PurgeComponent_Handler,
},
{
MethodName: "PushFile",
Handler: _McpAgentService_PushFile_Handler,
@@ -545,7 +941,45 @@ var McpAgentService_ServiceDesc = grpc.ServiceDesc{
MethodName: "NodeStatus",
Handler: _McpAgentService_NodeStatus_Handler,
},
{
MethodName: "ListDNSRecords",
Handler: _McpAgentService_ListDNSRecords_Handler,
},
{
MethodName: "ListProxyRoutes",
Handler: _McpAgentService_ListProxyRoutes_Handler,
},
{
MethodName: "AddProxyRoute",
Handler: _McpAgentService_AddProxyRoute_Handler,
},
{
MethodName: "RemoveProxyRoute",
Handler: _McpAgentService_RemoveProxyRoute_Handler,
},
{
MethodName: "SetupEdgeRoute",
Handler: _McpAgentService_SetupEdgeRoute_Handler,
},
{
MethodName: "RemoveEdgeRoute",
Handler: _McpAgentService_RemoveEdgeRoute_Handler,
},
{
MethodName: "ListEdgeRoutes",
Handler: _McpAgentService_ListEdgeRoutes_Handler,
},
{
MethodName: "HealthCheck",
Handler: _McpAgentService_HealthCheck_Handler,
},
},
Streams: []grpc.StreamDesc{
{
StreamName: "Logs",
Handler: _McpAgentService_Logs_Handler,
ServerStreams: true,
},
},
Streams: []grpc.StreamDesc{},
Metadata: "proto/mcp/v1/mcp.proto",
}

go.mod

@@ -1,8 +1,10 @@
module git.wntrmute.dev/kyle/mcp
module git.wntrmute.dev/mc/mcp
go 1.25.7
require (
git.wntrmute.dev/mc/mc-proxy v1.2.0
git.wntrmute.dev/mc/mcdsl v1.3.0
github.com/pelletier/go-toml/v2 v2.3.0
github.com/spf13/cobra v1.10.2
golang.org/x/sys v0.42.0
@@ -20,6 +22,7 @@ require (
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
github.com/spf13/pflag v1.0.9 // indirect
golang.org/x/net v0.48.0 // indirect
golang.org/x/term v0.41.0 // indirect
golang.org/x/text v0.32.0 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20251202230838-ff82c1b0f217 // indirect
modernc.org/libc v1.70.0 // indirect

go.sum

@@ -1,3 +1,9 @@
git.wntrmute.dev/mc/mc-proxy v1.2.0 h1:TVfwdZzYqMs/ksZ0a6aSR7hKGDDMG8X0Od5RIxlbXKQ=
git.wntrmute.dev/mc/mc-proxy v1.2.0/go.mod h1:6w8smZ/DNJVBb4n5std/faye0ROLEXfk3iJY1XNc1JU=
git.wntrmute.dev/mc/mcdsl v1.3.0 h1:QYmRdGDHjDEyNQpiKqHqPflpwNJcP0cFR9hcfMza/x4=
git.wntrmute.dev/mc/mcdsl v1.3.0/go.mod h1:MhYahIu7Sg53lE2zpQ20nlrsoNRjQzOJBAlCmom2wJc=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
github.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g=
@@ -21,10 +27,22 @@ github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2
github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
github.com/ncruces/go-strftime v1.0.0 h1:HMFp8mLCTPp341M/ZnA4qaf7ZlsbTc+miZjCLOFAw7w=
github.com/ncruces/go-strftime v1.0.0/go.mod h1:Fwc5htZGVVkseilnfgOVb9mKy6w1naJmn9CehxcKcls=
github.com/oschwald/maxminddb-golang v1.13.1 h1:G3wwjdN9JmIK2o/ermkHM+98oX5fS+k5MbwsmL4MRQE=
github.com/oschwald/maxminddb-golang v1.13.1/go.mod h1:K4pgV9N/GcK694KSTmVSDTODk4IsCNThNdTmnaBZ/F8=
github.com/pelletier/go-toml/v2 v2.3.0 h1:k59bC/lIZREW0/iVaQR8nDHxVq8OVlIzYCOJf421CaM=
github.com/pelletier/go-toml/v2 v2.3.0/go.mod h1:2gIqNv+qfxSVS7cM2xJQKtLSTLUE9V8t9Stt+h56mCY=
github.com/prometheus/client_golang v1.23.2 h1:Je96obch5RDVy3FDMndoUsjAhG5Edi49h0RJWRi/o0o=
github.com/prometheus/client_golang v1.23.2/go.mod h1:Tb1a6LWHB3/SPIzCoaDXI4I8UHKeFTEQ1YCr+0Gyqmg=
github.com/prometheus/client_model v0.6.2 h1:oBsgwpGs7iVziMvrGhE53c/GrLUsZdHnqNwqPLxwZyk=
github.com/prometheus/client_model v0.6.2/go.mod h1:y3m2F6Gdpfy6Ut/GBsUqTWZqCUvMVzSfMLjcu6wAwpE=
github.com/prometheus/common v0.66.1 h1:h5E0h5/Y8niHc5DlaLlWLArTQI7tMrsfQjHV+d9ZoGs=
github.com/prometheus/common v0.66.1/go.mod h1:gcaUsgf3KfRSwHY4dIMXLPV0K/Wg1oZ8+SbZk/HH/dA=
github.com/prometheus/procfs v0.16.1 h1:hZ15bTNuirocR6u0JZ6BAHHmwS1p8B4P6MRqxtzMyRg=
github.com/prometheus/procfs v0.16.1/go.mod h1:teAbpZRB1iIAJYREa1LsoWUXykVXA1KlTmWl8x/U+Is=
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec h1:W09IVJc94icq4NjY3clb7Lk8O1qJ8BdBEF8z0ibU0rE=
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec/go.mod h1:qqbHyh8v60DhA7CoWK5oRCqLrMHRGoxYCSS9EjAz6Eo=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
@@ -44,6 +62,8 @@ go.opentelemetry.io/otel/sdk/metric v1.39.0 h1:cXMVVFVgsIf2YL6QkRF4Urbr/aMInf+2W
go.opentelemetry.io/otel/sdk/metric v1.39.0/go.mod h1:xq9HEVH7qeX69/JnwEfp6fVq5wosJsY1mt4lLfYdVew=
go.opentelemetry.io/otel/trace v1.39.0 h1:2d2vfpEDmCJ5zVYz7ijaJdOF59xLomrvj7bjt6/qCJI=
go.opentelemetry.io/otel/trace v1.39.0/go.mod h1:88w4/PnZSazkGzz/w84VHpQafiU4EtqqlVdxWy+rNOA=
go.yaml.in/yaml/v2 v2.4.2 h1:DzmwEr2rDGHl7lsFgAHxmNz/1NlQ7xLIrlN2h5d1eGI=
go.yaml.in/yaml/v2 v2.4.2/go.mod h1:081UH+NErpNdqlCXm3TtEran0rJZGxAYx9hb/ELlsPU=
go.yaml.in/yaml/v3 v3.0.4/go.mod h1:DhzuOOF2ATzADvBadXxruRBLzYTpT36CKvDb3+aBEFg=
golang.org/x/mod v0.33.0 h1:tHFzIWbBifEmbwtGz65eaWyGiGZatSrT9prnU8DbVL8=
golang.org/x/mod v0.33.0/go.mod h1:swjeQEj+6r7fODbD2cqrnje9PnziFuw4bmLbBZFrQ5w=
@@ -54,6 +74,8 @@ golang.org/x/sync v0.19.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.42.0 h1:omrd2nAlyT5ESRdCLYdm3+fMfNFE/+Rf4bDIQImRJeo=
golang.org/x/sys v0.42.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
golang.org/x/term v0.41.0 h1:QCgPso/Q3RTJx2Th4bDLqML4W6iJiaXFq2/ftQF13YU=
golang.org/x/term v0.41.0/go.mod h1:3pfBgksrReYfZ5lvYM0kSO0LIkAl4Yl2bXOkKP7Ec2A=
golang.org/x/text v0.32.0 h1:ZD01bjUt1FQ9WJ0ClOL5vxgxOI/sVCNgX1YtKwcY0mU=
golang.org/x/text v0.32.0/go.mod h1:o/rUWzghvpD5TXrTIBuJU77MTaN0ljMWE47kxGJQ7jY=
golang.org/x/tools v0.42.0 h1:uNgphsn75Tdz5Ji2q36v/nsFSfR/9BRFvqhGBaJGd5k=


@@ -5,9 +5,9 @@ import (
"fmt"
"strings"
mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/kyle/mcp/internal/registry"
"git.wntrmute.dev/mc/mcp/internal/registry"
"git.wntrmute.dev/kyle/mcp/internal/runtime"
"git.wntrmute.dev/mc/mcp/internal/runtime"
)
// AdoptContainers discovers running containers that match the given service


@@ -4,9 +4,9 @@ import (
"context"
"testing"
mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/kyle/mcp/internal/registry"
"git.wntrmute.dev/mc/mcp/internal/registry"
"git.wntrmute.dev/kyle/mcp/internal/runtime"
"git.wntrmute.dev/mc/mcp/internal/runtime"
)
func TestAdoptContainers(t *testing.T) {


@@ -11,12 +11,12 @@ import (
"os/signal"
"syscall"
mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/kyle/mcp/internal/auth"
"git.wntrmute.dev/mc/mcp/internal/auth"
"git.wntrmute.dev/kyle/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/kyle/mcp/internal/monitor"
"git.wntrmute.dev/mc/mcp/internal/monitor"
"git.wntrmute.dev/kyle/mcp/internal/registry"
"git.wntrmute.dev/mc/mcp/internal/registry"
"git.wntrmute.dev/kyle/mcp/internal/runtime"
"git.wntrmute.dev/mc/mcp/internal/runtime"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials"
)
@@ -31,11 +31,16 @@ type Agent struct {
Runtime runtime.Runtime
Monitor *monitor.Monitor
Logger *slog.Logger
PortAlloc *PortAllocator
Proxy *ProxyRouter
Certs *CertProvisioner
DNS *DNSRegistrar
Version string
}
// Run starts the agent: opens the database, sets up the gRPC server with
// TLS and auth, and blocks until SIGINT/SIGTERM.
func Run(cfg *config.AgentConfig) error {
func Run(cfg *config.AgentConfig, version string) error {
logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: parseLogLevel(cfg.Log.Level),
}))
@@ -50,12 +55,32 @@ func Run(cfg *config.AgentConfig) error {
mon := monitor.New(db, rt, cfg.Monitor, cfg.Agent.NodeName, logger)
proxy, err := NewProxyRouter(cfg.MCProxy.Socket, cfg.MCProxy.CertDir, logger)
if err != nil {
return fmt.Errorf("connect to mc-proxy: %w", err)
}
certs, err := NewCertProvisioner(cfg.Metacrypt, cfg.MCProxy.CertDir, logger)
if err != nil {
return fmt.Errorf("create cert provisioner: %w", err)
}
dns, err := NewDNSRegistrar(cfg.MCNS, logger)
if err != nil {
return fmt.Errorf("create DNS registrar: %w", err)
}
a := &Agent{
Config: cfg,
DB: db,
Runtime: rt,
Monitor: mon,
Logger: logger,
PortAlloc: NewPortAllocator(),
Proxy: proxy,
Certs: certs,
DNS: dns,
Version: version,
}
tlsCert, err := tls.LoadX509KeyPair(cfg.Server.TLSCert, cfg.Server.TLSKey)
@@ -77,6 +102,9 @@ func Run(cfg *config.AgentConfig) error {
grpc.ChainUnaryInterceptor(
auth.AuthInterceptor(validator),
),
grpc.ChainStreamInterceptor(
auth.StreamAuthInterceptor(validator),
),
)
mcpv1.RegisterMcpAgentServiceServer(server, a)
@@ -106,6 +134,7 @@ func Run(cfg *config.AgentConfig) error {
logger.Info("shutting down")
mon.Stop()
server.GracefulStop()
_ = proxy.Close()
return nil
case err := <-errCh:
mon.Stop()

internal/agent/certs.go

@@ -0,0 +1,263 @@
package agent
import (
"bytes"
"context"
"crypto/tls"
"crypto/x509"
"encoding/json"
"encoding/pem"
"fmt"
"io"
"log/slog"
"net/http"
"os"
"path/filepath"
"strings"
"time"
"git.wntrmute.dev/mc/mcp/internal/auth"
"git.wntrmute.dev/mc/mcp/internal/config"
)
// renewWindow is how far before expiry a cert is considered stale and
// should be re-issued.
const renewWindow = 30 * 24 * time.Hour // 30 days
// CertProvisioner requests TLS certificates from Metacrypt's CA API
// and writes them to the mc-proxy cert directory. It is nil-safe: all
// methods are no-ops when the receiver is nil.
type CertProvisioner struct {
serverURL string
token string
mount string
issuer string
certDir string
httpClient *http.Client
logger *slog.Logger
}
// NewCertProvisioner creates a CertProvisioner. Returns (nil, nil) if
// cfg.ServerURL is empty (cert provisioning disabled).
func NewCertProvisioner(cfg config.MetacryptConfig, certDir string, logger *slog.Logger) (*CertProvisioner, error) {
if cfg.ServerURL == "" {
logger.Info("metacrypt not configured, cert provisioning disabled")
return nil, nil
}
token, err := auth.LoadToken(cfg.TokenPath)
if err != nil {
return nil, fmt.Errorf("load metacrypt token: %w", err)
}
httpClient, err := newTLSClient(cfg.CACert)
if err != nil {
return nil, fmt.Errorf("create metacrypt HTTP client: %w", err)
}
logger.Info("metacrypt cert provisioner enabled", "server", cfg.ServerURL, "mount", cfg.Mount, "issuer", cfg.Issuer)
return &CertProvisioner{
serverURL: strings.TrimRight(cfg.ServerURL, "/"),
token: token,
mount: cfg.Mount,
issuer: cfg.Issuer,
certDir: certDir,
httpClient: httpClient,
logger: logger,
}, nil
}
// EnsureCert checks whether a valid TLS certificate exists for the
// service. If the cert is missing or near expiry, it requests a new
// one from Metacrypt.
func (p *CertProvisioner) EnsureCert(ctx context.Context, serviceName string, hostnames []string) error {
if p == nil || len(hostnames) == 0 {
return nil
}
certPath := filepath.Join(p.certDir, serviceName+".pem")
if remaining, ok := certTimeRemaining(certPath); ok {
if remaining > renewWindow {
p.logger.Debug("cert valid, skipping provisioning",
"service", serviceName,
"expires_in", remaining.Round(time.Hour),
)
return nil
}
p.logger.Info("cert near expiry, re-issuing",
"service", serviceName,
"expires_in", remaining.Round(time.Hour),
)
}
return p.issueCert(ctx, serviceName, hostnames[0], hostnames)
}
// issueCert calls Metacrypt's CA API to issue a certificate and writes
// the chain and key to the cert directory.
func (p *CertProvisioner) issueCert(ctx context.Context, serviceName, commonName string, dnsNames []string) error {
p.logger.Info("provisioning TLS cert",
"service", serviceName,
"cn", commonName,
"sans", dnsNames,
)
reqBody := map[string]interface{}{
"mount": p.mount,
"operation": "issue",
"data": map[string]interface{}{
"issuer": p.issuer,
"common_name": commonName,
"dns_names": dnsNames,
"profile": "server",
"ttl": "2160h",
},
}
body, err := json.Marshal(reqBody)
if err != nil {
return fmt.Errorf("marshal issue request: %w", err)
}
url := p.serverURL + "/v1/engine/request"
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
if err != nil {
return fmt.Errorf("create issue request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+p.token)
resp, err := p.httpClient.Do(req)
if err != nil {
return fmt.Errorf("issue cert: %w", err)
}
defer func() { _ = resp.Body.Close() }()
respBody, err := io.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("read issue response: %w", err)
}
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("issue cert: metacrypt returned %d: %s", resp.StatusCode, string(respBody))
}
var result struct {
ChainPEM string `json:"chain_pem"`
KeyPEM string `json:"key_pem"`
Serial string `json:"serial"`
ExpiresAt string `json:"expires_at"`
}
if err := json.Unmarshal(respBody, &result); err != nil {
return fmt.Errorf("parse issue response: %w", err)
}
if result.ChainPEM == "" || result.KeyPEM == "" {
return fmt.Errorf("issue cert: response missing chain_pem or key_pem")
}
// Write cert and key atomically (temp file + rename).
certPath := filepath.Join(p.certDir, serviceName+".pem")
keyPath := filepath.Join(p.certDir, serviceName+".key")
if err := atomicWrite(certPath, []byte(result.ChainPEM), 0644); err != nil {
return fmt.Errorf("write cert: %w", err)
}
if err := atomicWrite(keyPath, []byte(result.KeyPEM), 0600); err != nil {
return fmt.Errorf("write key: %w", err)
}
p.logger.Info("cert provisioned",
"service", serviceName,
"serial", result.Serial,
"expires_at", result.ExpiresAt,
)
return nil
}
// RemoveCert removes TLS certificate and key files for a service.
func (p *CertProvisioner) RemoveCert(serviceName string) error {
if p == nil {
return nil
}
certPath := filepath.Join(p.certDir, serviceName+".pem")
keyPath := filepath.Join(p.certDir, serviceName+".key")
for _, path := range []string{certPath, keyPath} {
if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("remove %s: %w", path, err)
}
}
p.logger.Info("cert removed", "service", serviceName)
return nil
}
// certTimeRemaining returns the time until the leaf certificate at
// path expires. Returns (0, false) if the cert cannot be read or parsed.
func certTimeRemaining(path string) (time.Duration, bool) {
data, err := os.ReadFile(path) //nolint:gosec // path from trusted config
if err != nil {
return 0, false
}
block, _ := pem.Decode(data)
if block == nil {
return 0, false
}
cert, err := x509.ParseCertificate(block.Bytes)
if err != nil {
return 0, false
}
remaining := time.Until(cert.NotAfter)
if remaining <= 0 {
return 0, true // expired
}
return remaining, true
}
// atomicWrite writes data to a temporary file then renames it to path,
// ensuring readers never see a partial file.
func atomicWrite(path string, data []byte, perm os.FileMode) error {
tmp := path + ".tmp"
if err := os.WriteFile(tmp, data, perm); err != nil {
return fmt.Errorf("write %s: %w", tmp, err)
}
if err := os.Rename(tmp, path); err != nil {
_ = os.Remove(tmp)
return fmt.Errorf("rename %s -> %s: %w", tmp, path, err)
}
return nil
}
// newTLSClient creates an HTTP client with TLS 1.3 minimum. If
// caCertPath is non-empty, the CA certificate is loaded into the
// root CA pool.
func newTLSClient(caCertPath string) (*http.Client, error) {
tlsConfig := &tls.Config{
MinVersion: tls.VersionTLS13,
}
if caCertPath != "" {
caCert, err := os.ReadFile(caCertPath) //nolint:gosec // path from trusted config
if err != nil {
return nil, fmt.Errorf("read CA cert %q: %w", caCertPath, err)
}
pool := x509.NewCertPool()
if !pool.AppendCertsFromPEM(caCert) {
return nil, fmt.Errorf("parse CA cert %q: no valid certificates found", caCertPath)
}
tlsConfig.RootCAs = pool
}
return &http.Client{
Timeout: 30 * time.Second,
Transport: &http.Transport{
TLSClientConfig: tlsConfig,
},
}, nil
}


@@ -0,0 +1,392 @@
package agent
import (
"context"
"crypto/ecdsa"
"crypto/elliptic"
"crypto/rand"
"crypto/x509"
"crypto/x509/pkix"
"encoding/json"
"encoding/pem"
"log/slog"
"math/big"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"time"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
func TestNilCertProvisionerIsNoop(t *testing.T) {
var p *CertProvisioner
if err := p.EnsureCert(context.Background(), "svc", []string{"svc.example.com"}); err != nil {
t.Fatalf("EnsureCert on nil: %v", err)
}
}
func TestNewCertProvisionerDisabledWhenUnconfigured(t *testing.T) {
p, err := NewCertProvisioner(config.MetacryptConfig{}, "/tmp", slog.Default())
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if p != nil {
t.Fatal("expected nil provisioner for empty config")
}
}
func TestEnsureCertSkipsValidCert(t *testing.T) {
certDir := t.TempDir()
certPath := filepath.Join(certDir, "svc.pem")
keyPath := filepath.Join(certDir, "svc.key")
// Generate a cert that expires in 90 days.
writeSelfSignedCert(t, certPath, keyPath, "svc.example.com", 90*24*time.Hour)
// Create a provisioner that would fail if it tried to issue.
p := &CertProvisioner{
serverURL: "https://will-fail-if-called:9999",
certDir: certDir,
logger: slog.Default(),
}
if err := p.EnsureCert(context.Background(), "svc", []string{"svc.example.com"}); err != nil {
t.Fatalf("EnsureCert: %v", err)
}
}
func TestEnsureCertReissuesExpiring(t *testing.T) {
certDir := t.TempDir()
certPath := filepath.Join(certDir, "svc.pem")
keyPath := filepath.Join(certDir, "svc.key")
// Generate a cert that expires in 10 days (within 30-day renewal window).
writeSelfSignedCert(t, certPath, keyPath, "svc.example.com", 10*24*time.Hour)
// Mock Metacrypt API.
newCert, newKey := generateCertPEM(t, "svc.example.com", 90*24*time.Hour)
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
resp := map[string]string{
"chain_pem": newCert,
"key_pem": newKey,
"serial": "abc123",
"expires_at": time.Now().Add(90 * 24 * time.Hour).Format(time.RFC3339),
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(resp)
}))
defer srv.Close()
p := &CertProvisioner{
serverURL: srv.URL,
token: "test-token",
mount: "pki",
issuer: "infra",
certDir: certDir,
httpClient: srv.Client(),
logger: slog.Default(),
}
if err := p.EnsureCert(context.Background(), "svc", []string{"svc.example.com"}); err != nil {
t.Fatalf("EnsureCert: %v", err)
}
// Verify new cert was written.
got, err := os.ReadFile(certPath)
if err != nil {
t.Fatalf("read cert: %v", err)
}
if string(got) != newCert {
t.Fatal("cert file was not updated with new cert")
}
}
func TestIssueCertWritesFiles(t *testing.T) {
certDir := t.TempDir()
// Mock Metacrypt API.
certPEM, keyPEM := generateCertPEM(t, "svc.example.com", 90*24*time.Hour)
var gotAuth string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotAuth = r.Header.Get("Authorization")
var req map[string]interface{}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "bad request", http.StatusBadRequest)
return
}
// Verify request structure.
if req["mount"] != "pki" || req["operation"] != "issue" {
t.Errorf("unexpected request: %v", req)
}
resp := map[string]string{
"chain_pem": certPEM,
"key_pem": keyPEM,
"serial": "deadbeef",
"expires_at": time.Now().Add(90 * 24 * time.Hour).Format(time.RFC3339),
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(resp)
}))
defer srv.Close()
p := &CertProvisioner{
serverURL: srv.URL,
token: "my-service-token",
mount: "pki",
issuer: "infra",
certDir: certDir,
httpClient: srv.Client(),
logger: slog.Default(),
}
if err := p.EnsureCert(context.Background(), "svc", []string{"svc.example.com"}); err != nil {
t.Fatalf("EnsureCert: %v", err)
}
// Verify auth header.
if gotAuth != "Bearer my-service-token" {
t.Fatalf("auth header: got %q, want %q", gotAuth, "Bearer my-service-token")
}
// Verify cert file.
certData, err := os.ReadFile(filepath.Join(certDir, "svc.pem"))
if err != nil {
t.Fatalf("read cert: %v", err)
}
if string(certData) != certPEM {
t.Fatal("cert content mismatch")
}
// Verify key file.
keyData, err := os.ReadFile(filepath.Join(certDir, "svc.key"))
if err != nil {
t.Fatalf("read key: %v", err)
}
if string(keyData) != keyPEM {
t.Fatal("key content mismatch")
}
// Verify key file permissions.
info, err := os.Stat(filepath.Join(certDir, "svc.key"))
if err != nil {
t.Fatalf("stat key: %v", err)
}
if perm := info.Mode().Perm(); perm != 0600 {
t.Fatalf("key permissions: got %o, want 0600", perm)
}
}
func TestIssueCertAPIError(t *testing.T) {
certDir := t.TempDir()
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.Error(w, `{"error":"sealed"}`, http.StatusServiceUnavailable)
}))
defer srv.Close()
p := &CertProvisioner{
serverURL: srv.URL,
token: "test-token",
mount: "pki",
issuer: "infra",
certDir: certDir,
httpClient: srv.Client(),
logger: slog.Default(),
}
err := p.EnsureCert(context.Background(), "svc", []string{"svc.example.com"})
if err == nil {
t.Fatal("expected error for sealed metacrypt")
}
}
func TestCertTimeRemaining(t *testing.T) {
t.Run("missing file", func(t *testing.T) {
if _, ok := certTimeRemaining("/nonexistent/cert.pem"); ok {
t.Fatal("expected false for missing file")
}
})
t.Run("valid cert", func(t *testing.T) {
certDir := t.TempDir()
path := filepath.Join(certDir, "test.pem")
writeSelfSignedCert(t, path, filepath.Join(certDir, "test.key"), "test.example.com", 90*24*time.Hour)
remaining, ok := certTimeRemaining(path)
if !ok {
t.Fatal("expected true for valid cert")
}
// Should be close to 90 days.
if remaining < 89*24*time.Hour || remaining > 91*24*time.Hour {
t.Fatalf("remaining: got %v, want ~90 days", remaining)
}
})
t.Run("expired cert", func(t *testing.T) {
certDir := t.TempDir()
path := filepath.Join(certDir, "expired.pem")
// Write a cert that's already expired (valid from -2h to -1h).
writeExpiredCert(t, path, filepath.Join(certDir, "expired.key"), "expired.example.com")
remaining, ok := certTimeRemaining(path)
if !ok {
t.Fatal("expected true for expired cert")
}
if remaining > 0 {
t.Fatalf("remaining: got %v, want <= 0", remaining)
}
})
}
func TestHasL7Routes(t *testing.T) {
tests := []struct {
name string
routes []registry.Route
want bool
}{
{"nil", nil, false},
{"empty", []registry.Route{}, false},
{"l4 only", []registry.Route{{Mode: "l4"}}, false},
{"l7 only", []registry.Route{{Mode: "l7"}}, true},
{"mixed", []registry.Route{{Mode: "l4"}, {Mode: "l7"}}, true},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := hasL7Routes(tt.routes); got != tt.want {
t.Fatalf("hasL7Routes = %v, want %v", got, tt.want)
}
})
}
}
func TestL7Hostnames(t *testing.T) {
routes := []registry.Route{
{Mode: "l7", Hostname: ""},
{Mode: "l4", Hostname: "ignored.example.com"},
{Mode: "l7", Hostname: "custom.example.com"},
{Mode: "l7", Hostname: ""}, // duplicate default
}
got := l7Hostnames("myservice", routes)
want := []string{"myservice.svc.mcp.metacircular.net", "custom.example.com"}
if len(got) != len(want) {
t.Fatalf("got %v, want %v", got, want)
}
for i := range want {
if got[i] != want[i] {
t.Fatalf("got[%d] = %q, want %q", i, got[i], want[i])
}
}
}
func TestAtomicWrite(t *testing.T) {
dir := t.TempDir()
path := filepath.Join(dir, "test.txt")
if err := atomicWrite(path, []byte("hello"), 0644); err != nil {
t.Fatalf("atomicWrite: %v", err)
}
data, err := os.ReadFile(path)
if err != nil {
t.Fatalf("read: %v", err)
}
if string(data) != "hello" {
t.Fatalf("got %q, want %q", string(data), "hello")
}
// Verify no .tmp file left behind.
if _, err := os.Stat(path + ".tmp"); !os.IsNotExist(err) {
t.Fatal("temp file should not exist after atomic write")
}
}
// --- test helpers ---
// writeSelfSignedCert generates a self-signed cert/key and writes them to disk.
func writeSelfSignedCert(t *testing.T, certPath, keyPath, hostname string, validity time.Duration) {
t.Helper()
certPEM, keyPEM := generateCertPEM(t, hostname, validity)
if err := os.WriteFile(certPath, []byte(certPEM), 0644); err != nil {
t.Fatalf("write cert: %v", err)
}
if err := os.WriteFile(keyPath, []byte(keyPEM), 0600); err != nil {
t.Fatalf("write key: %v", err)
}
}
// writeExpiredCert generates a cert that is already expired.
func writeExpiredCert(t *testing.T, certPath, keyPath, hostname string) {
t.Helper()
key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
if err != nil {
t.Fatalf("generate key: %v", err)
}
tmpl := &x509.Certificate{
SerialNumber: big.NewInt(1),
Subject: pkix.Name{CommonName: hostname},
DNSNames: []string{hostname},
NotBefore: time.Now().Add(-2 * time.Hour),
NotAfter: time.Now().Add(-1 * time.Hour),
}
der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
if err != nil {
t.Fatalf("create cert: %v", err)
}
certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
keyDER, err := x509.MarshalECPrivateKey(key)
if err != nil {
t.Fatalf("marshal key: %v", err)
}
keyPEM := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
if err := os.WriteFile(certPath, certPEM, 0644); err != nil {
t.Fatalf("write cert: %v", err)
}
if err := os.WriteFile(keyPath, keyPEM, 0600); err != nil {
t.Fatalf("write key: %v", err)
}
}
// generateCertPEM generates a self-signed cert and returns PEM strings.
func generateCertPEM(t *testing.T, hostname string, validity time.Duration) (certPEM, keyPEM string) {
t.Helper()
key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
if err != nil {
t.Fatalf("generate key: %v", err)
}
tmpl := &x509.Certificate{
SerialNumber: big.NewInt(1),
Subject: pkix.Name{CommonName: hostname},
DNSNames: []string{hostname},
NotBefore: time.Now().Add(-1 * time.Hour),
NotAfter: time.Now().Add(validity),
}
der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
if err != nil {
t.Fatalf("create cert: %v", err)
}
certBlock := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
keyDER, err := x509.MarshalECPrivateKey(key)
if err != nil {
t.Fatalf("marshal key: %v", err)
}
keyBlock := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
return string(certBlock), string(keyBlock)
}

internal/agent/deploy.go

@@ -5,10 +5,11 @@ import (
 	"database/sql"
 	"errors"
 	"fmt"
+	"strings"
-	mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
-	"git.wntrmute.dev/kyle/mcp/internal/registry"
-	"git.wntrmute.dev/kyle/mcp/internal/runtime"
+	mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
+	"git.wntrmute.dev/mc/mcp/internal/registry"
+	"git.wntrmute.dev/mc/mcp/internal/runtime"
 )
 // Deploy deploys a service (or a single component of it) to this node.
@@ -33,6 +34,9 @@ func (a *Agent) Deploy(ctx context.Context, req *mcpv1.DeployRequest) (*mcpv1.De
 				filtered = append(filtered, cs)
 			}
 		}
+		if len(filtered) == 0 {
+			return nil, fmt.Errorf("component %q not found in service %q", target, serviceName)
+		}
 		components = filtered
 	}
@@ -49,7 +53,7 @@ func (a *Agent) Deploy(ctx context.Context, req *mcpv1.DeployRequest) (*mcpv1.De
 // deployComponent handles the full deploy lifecycle for a single component.
 func (a *Agent) deployComponent(ctx context.Context, serviceName string, cs *mcpv1.ComponentSpec, active bool) *mcpv1.ComponentResult {
 	compName := cs.GetName()
-	containerName := serviceName + "-" + compName
+	containerName := ContainerNameFor(serviceName, compName)
 	desiredState := "running"
 	if !active {
@@ -58,6 +62,25 @@ func (a *Agent) deployComponent(ctx context.Context, serviceName string, cs *mcp
 	a.Logger.Info("deploying component", "service", serviceName, "component", compName, "desired", desiredState)
+	// Convert proto routes to registry routes.
+	var regRoutes []registry.Route
+	for _, r := range cs.GetRoutes() {
+		mode := r.GetMode()
+		if mode == "" {
+			mode = "l4"
+		}
+		name := r.GetName()
+		if name == "" {
+			name = "default"
+		}
+		regRoutes = append(regRoutes, registry.Route{
+			Name:     name,
+			Port:     int(r.GetPort()),
+			Mode:     mode,
+			Hostname: r.GetHostname(),
+		})
+	}
 	regComp := &registry.Component{
 		Name:    compName,
 		Service: serviceName,
@@ -70,6 +93,7 @@ func (a *Agent) deployComponent(ctx context.Context, serviceName string, cs *mcp
 		Ports:   cs.GetPorts(),
 		Volumes: cs.GetVolumes(),
 		Cmd:     cs.GetCmd(),
+		Routes:  regRoutes,
 	}
 	if err := ensureComponent(a.DB, regComp); err != nil {
@@ -89,16 +113,35 @@ func (a *Agent) deployComponent(ctx context.Context, serviceName string, cs *mcp
 	_ = a.Runtime.Stop(ctx, containerName)   // may not exist yet
 	_ = a.Runtime.Remove(ctx, containerName) // may not exist yet
+	// Build the container spec. If the component has routes, use route-based
+	// port allocation and env injection. Otherwise, fall back to legacy ports.
 	runSpec := runtime.ContainerSpec{
 		Name:    containerName,
 		Image:   cs.GetImage(),
 		Network: cs.GetNetwork(),
 		User:    cs.GetUser(),
 		Restart: cs.GetRestart(),
-		Ports:   cs.GetPorts(),
 		Volumes: cs.GetVolumes(),
 		Cmd:     cs.GetCmd(),
+		Env:     cs.GetEnv(),
 	}
+	if len(regRoutes) > 0 && a.PortAlloc != nil {
+		ports, env, err := a.allocateRoutePorts(serviceName, compName, regRoutes)
+		if err != nil {
+			return &mcpv1.ComponentResult{
+				Name:  compName,
+				Error: fmt.Sprintf("allocate route ports: %v", err),
+			}
+		}
+		// Merge explicit ports from the spec with route-allocated ports.
+		runSpec.Ports = append(cs.GetPorts(), ports...)
+		runSpec.Env = append(runSpec.Env, env...)
+	} else {
+		// Legacy: use ports directly from the spec.
+		runSpec.Ports = cs.GetPorts()
+	}
 	if err := a.Runtime.Run(ctx, runSpec); err != nil {
 		_ = registry.UpdateComponentState(a.DB, serviceName, compName, "", "removed")
 		return &mcpv1.ComponentResult{
@@ -107,6 +150,31 @@ func (a *Agent) deployComponent(ctx context.Context, serviceName string, cs *mcp
 		}
 	}
+	// Provision TLS certs for L7 routes before registering with mc-proxy.
+	if a.Certs != nil && hasL7Routes(regRoutes) {
+		hostnames := l7Hostnames(serviceName, regRoutes)
+		if err := a.Certs.EnsureCert(ctx, serviceName, hostnames); err != nil {
+			a.Logger.Warn("failed to provision TLS cert", "service", serviceName, "err", err)
+		}
+	}
+	// Register routes with mc-proxy after the container is running.
+	if len(regRoutes) > 0 && a.Proxy != nil {
+		hostPorts, err := registry.GetRouteHostPorts(a.DB, serviceName, compName)
+		if err != nil {
+			a.Logger.Warn("failed to get host ports for route registration", "service", serviceName, "component", compName, "err", err)
+		} else if err := a.Proxy.RegisterRoutes(ctx, serviceName, regRoutes, hostPorts); err != nil {
+			a.Logger.Warn("failed to register routes with mc-proxy", "service", serviceName, "component", compName, "err", err)
+		}
+	}
+	// Register DNS record for the service.
+	if a.DNS != nil && len(regRoutes) > 0 {
+		if err := a.DNS.EnsureRecord(ctx, serviceName); err != nil {
+			a.Logger.Warn("failed to register DNS record", "service", serviceName, "err", err)
+		}
+	}
 	if err := registry.UpdateComponentState(a.DB, serviceName, compName, "running", "running"); err != nil {
 		a.Logger.Warn("failed to update component state", "service", serviceName, "component", compName, "err", err)
 	}
@@ -117,6 +185,39 @@ func (a *Agent) deployComponent(ctx context.Context, serviceName string, cs *mcp
 	}
 }
+// allocateRoutePorts allocates host ports for each route, stores them in
+// the registry, and returns the port mappings and env vars for the container.
+func (a *Agent) allocateRoutePorts(service, component string, routes []registry.Route) ([]string, []string, error) {
+	var ports []string
+	var env []string
+	for _, r := range routes {
+		hostPort, err := a.PortAlloc.Allocate()
+		if err != nil {
+			return nil, nil, fmt.Errorf("allocate port for route %q: %w", r.Name, err)
+		}
+		if err := registry.UpdateRouteHostPort(a.DB, service, component, r.Name, hostPort); err != nil {
+			a.PortAlloc.Release(hostPort)
+			return nil, nil, fmt.Errorf("store host port for route %q: %w", r.Name, err)
+		}
+		// The container port must match hostPort (which is also set as $PORT),
+		// so the app's listen address matches the podman port mapping.
+		// r.Port is the mc-proxy listener port, NOT the container port.
+		ports = append(ports, fmt.Sprintf("127.0.0.1:%d:%d", hostPort, hostPort))
+		if len(routes) == 1 {
+			env = append(env, fmt.Sprintf("PORT=%d", hostPort))
+		} else {
+			envName := "PORT_" + strings.ToUpper(r.Name)
+			env = append(env, fmt.Sprintf("%s=%d", envName, hostPort))
+		}
+	}
+	return ports, env, nil
+}
 // ensureService creates the service if it does not exist, or updates its
 // active flag if it does.
 func ensureService(db *sql.DB, name string, active bool) error {
@@ -130,6 +231,37 @@ func ensureService(db *sql.DB, name string, active bool) error {
 	return registry.UpdateServiceActive(db, name, active)
 }
+// hasL7Routes reports whether any route uses L7 (TLS-terminating) mode.
+func hasL7Routes(routes []registry.Route) bool {
+	for _, r := range routes {
+		if r.Mode == "l7" {
+			return true
+		}
+	}
+	return false
+}
+// l7Hostnames returns the unique hostnames from L7 routes, applying
+// the default hostname convention when a route has no explicit hostname.
+func l7Hostnames(serviceName string, routes []registry.Route) []string {
+	seen := make(map[string]bool)
+	var hostnames []string
+	for _, r := range routes {
+		if r.Mode != "l7" {
+			continue
+		}
+		h := r.Hostname
+		if h == "" {
+			h = serviceName + ".svc.mcp.metacircular.net"
+		}
+		if !seen[h] {
+			seen[h] = true
+			hostnames = append(hostnames, h)
+		}
+	}
+	return hostnames
+}
 // ensureComponent creates the component if it does not exist, or updates its
 // spec if it does.
 func ensureComponent(db *sql.DB, c *registry.Component) error {
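The route-port convention in allocateRoutePorts (container port mirrors the allocated host port; a single route gets $PORT, multiple routes get $PORT_&lt;NAME&gt;) can be isolated as a pure function. A sketch under those rules — the function and parameter names here are illustrative, not from the repo:

```go
package main

import (
	"fmt"
	"strings"
)

// routePortEnv maps each route to a loopback port binding where host and
// container port are the same value, and derives the env var the app reads:
// PORT for a single route, PORT_<ROUTENAME> when there are several.
func routePortEnv(routeNames []string, hostPorts []int) (ports, env []string) {
	for i, name := range routeNames {
		hp := hostPorts[i]
		ports = append(ports, fmt.Sprintf("127.0.0.1:%d:%d", hp, hp))
		if len(routeNames) == 1 {
			env = append(env, fmt.Sprintf("PORT=%d", hp))
		} else {
			env = append(env, fmt.Sprintf("PORT_%s=%d", strings.ToUpper(name), hp))
		}
	}
	return ports, env
}

func main() {
	ports, env := routePortEnv([]string{"rest", "grpc"}, []int{10001, 10002})
	fmt.Println(ports[0]) // 127.0.0.1:10001:10001
	fmt.Println(env[0])   // PORT_REST=10001
	fmt.Println(env[1])   // PORT_GRPC=10002
}
```

Note that the route's own Port field never appears here: it is the mc-proxy listener port, which is why the tests below insist the container port is never 443.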


@@ -0,0 +1,159 @@
package agent
import (
"database/sql"
"fmt"
"log/slog"
"os"
"path/filepath"
"testing"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
func openTestDB(t *testing.T) *sql.DB {
t.Helper()
db, err := registry.Open(filepath.Join(t.TempDir(), "test.db"))
if err != nil {
t.Fatalf("open db: %v", err)
}
t.Cleanup(func() { _ = db.Close() })
return db
}
func testAgent(t *testing.T) *Agent {
t.Helper()
return &Agent{
DB: openTestDB(t),
PortAlloc: NewPortAllocator(),
Logger: slog.New(slog.NewTextHandler(os.Stderr, nil)),
}
}
// seedComponent creates the service and component in the registry so that
// allocateRoutePorts can store host ports for it.
func seedComponent(t *testing.T, db *sql.DB, service, component string, routes []registry.Route) {
t.Helper()
if err := registry.CreateService(db, service, true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(db, &registry.Component{
Name: component,
Service: service,
Image: "img:latest",
DesiredState: "running",
ObservedState: "unknown",
Routes: routes,
}); err != nil {
t.Fatalf("create component: %v", err)
}
}
func TestAllocateRoutePorts_SingleRoute(t *testing.T) {
a := testAgent(t)
routes := []registry.Route{
{Name: "default", Port: 443, Mode: "l7"},
}
seedComponent(t, a.DB, "mcdoc", "mcdoc", routes)
ports, env, err := a.allocateRoutePorts("mcdoc", "mcdoc", routes)
if err != nil {
t.Fatalf("allocateRoutePorts: %v", err)
}
if len(ports) != 1 {
t.Fatalf("expected 1 port mapping, got %d", len(ports))
}
if len(env) != 1 {
t.Fatalf("expected 1 env var, got %d", len(env))
}
// Parse the port mapping: should be "127.0.0.1:<hostPort>:<hostPort>"
// NOT "127.0.0.1:<hostPort>:443"
var hostPort, containerPort int
n, _ := fmt.Sscanf(ports[0], "127.0.0.1:%d:%d", &hostPort, &containerPort)
if n != 2 {
t.Fatalf("failed to parse port mapping %q", ports[0])
}
if hostPort != containerPort {
t.Errorf("host port (%d) != container port (%d); container port must match host port for $PORT consistency", hostPort, containerPort)
}
// Env var should be PORT=<hostPort>
var envPort int
n, _ = fmt.Sscanf(env[0], "PORT=%d", &envPort)
if n != 1 {
t.Fatalf("failed to parse env var %q", env[0])
}
if envPort != hostPort {
t.Errorf("PORT env (%d) != host port (%d)", envPort, hostPort)
}
}
func TestAllocateRoutePorts_MultiRoute(t *testing.T) {
a := testAgent(t)
routes := []registry.Route{
{Name: "rest", Port: 8443, Mode: "l4"},
{Name: "grpc", Port: 9443, Mode: "l4"},
}
seedComponent(t, a.DB, "metacrypt", "api", routes)
ports, env, err := a.allocateRoutePorts("metacrypt", "api", routes)
if err != nil {
t.Fatalf("allocateRoutePorts: %v", err)
}
if len(ports) != 2 {
t.Fatalf("expected 2 port mappings, got %d", len(ports))
}
if len(env) != 2 {
t.Fatalf("expected 2 env vars, got %d", len(env))
}
// Each port mapping should have host port == container port.
for i, p := range ports {
var hp, cp int
n, _ := fmt.Sscanf(p, "127.0.0.1:%d:%d", &hp, &cp)
if n != 2 {
t.Fatalf("port[%d]: failed to parse %q", i, p)
}
if hp != cp {
t.Errorf("port[%d]: host port (%d) != container port (%d)", i, hp, cp)
}
}
// Env vars should be PORT_REST and PORT_GRPC (not bare PORT).
if env[0][:10] != "PORT_REST=" {
t.Errorf("env[0] = %q, want PORT_REST=...", env[0])
}
if env[1][:10] != "PORT_GRPC=" {
t.Errorf("env[1] = %q, want PORT_GRPC=...", env[1])
}
}
func TestAllocateRoutePorts_L7PortNotUsedAsContainerPort(t *testing.T) {
a := testAgent(t)
routes := []registry.Route{
{Name: "default", Port: 443, Mode: "l7"},
}
seedComponent(t, a.DB, "svc", "web", routes)
ports, _, err := a.allocateRoutePorts("svc", "web", routes)
if err != nil {
t.Fatalf("allocateRoutePorts: %v", err)
}
// The container port must NOT be 443 (the mc-proxy listener port).
// It must be the host port (which is in range 10000-60000).
var hostPort, containerPort int
n, _ := fmt.Sscanf(ports[0], "127.0.0.1:%d:%d", &hostPort, &containerPort)
if n != 2 {
t.Fatalf("failed to parse port mapping %q", ports[0])
}
if containerPort == 443 {
t.Errorf("container port is 443 (mc-proxy listener); should be %d (host port)", hostPort)
}
if containerPort < portRangeMin || containerPort >= portRangeMax {
t.Errorf("container port %d outside allocation range [%d, %d)", containerPort, portRangeMin, portRangeMax)
}
}

internal/agent/dns.go (new file, 344 lines)

@@ -0,0 +1,344 @@
package agent
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"log/slog"
"net/http"
"strings"
"git.wntrmute.dev/mc/mcp/internal/auth"
"git.wntrmute.dev/mc/mcp/internal/config"
)
// DNSRegistrar creates and removes A records in MCNS during deploy
// and stop. It is nil-safe: all methods are no-ops when the receiver
// is nil.
type DNSRegistrar struct {
serverURL string
token string
zone string
nodeAddr string
httpClient *http.Client
logger *slog.Logger
}
// DNSRecord is the JSON representation of an MCNS record.
type DNSRecord struct {
ID int `json:"ID"`
Name string `json:"Name"`
Type string `json:"Type"`
Value string `json:"Value"`
TTL int `json:"TTL"`
}
// NewDNSRegistrar creates a DNSRegistrar. Returns (nil, nil) if
// cfg.ServerURL is empty (DNS registration disabled).
func NewDNSRegistrar(cfg config.MCNSConfig, logger *slog.Logger) (*DNSRegistrar, error) {
if cfg.ServerURL == "" {
logger.Info("mcns not configured, DNS registration disabled")
return nil, nil
}
token, err := auth.LoadToken(cfg.TokenPath)
if err != nil {
return nil, fmt.Errorf("load mcns token: %w", err)
}
httpClient, err := newTLSClient(cfg.CACert)
if err != nil {
return nil, fmt.Errorf("create mcns HTTP client: %w", err)
}
logger.Info("mcns DNS registrar enabled", "server", cfg.ServerURL, "zone", cfg.Zone, "node_addr", cfg.NodeAddr)
return &DNSRegistrar{
serverURL: strings.TrimRight(cfg.ServerURL, "/"),
token: token,
zone: cfg.Zone,
nodeAddr: cfg.NodeAddr,
httpClient: httpClient,
logger: logger,
}, nil
}
// EnsureRecord ensures an A record exists for the service in the
// configured zone, pointing to the node's address.
func (d *DNSRegistrar) EnsureRecord(ctx context.Context, serviceName string) error {
if d == nil {
return nil
}
existing, err := d.listRecords(ctx, serviceName)
if err != nil {
return fmt.Errorf("list DNS records: %w", err)
}
// Check if any existing record already has the correct value.
for _, r := range existing {
if r.Value == d.nodeAddr {
d.logger.Debug("DNS record exists, skipping",
"service", serviceName,
"record", r.Name+"."+d.zone,
"value", r.Value,
)
return nil
}
}
// No record with the correct value — update the first one if it exists.
if len(existing) > 0 {
d.logger.Info("updating DNS record",
"service", serviceName,
"old_value", existing[0].Value,
"new_value", d.nodeAddr,
)
return d.updateRecord(ctx, existing[0].ID, serviceName)
}
// No existing record — create one.
d.logger.Info("creating DNS record",
"service", serviceName,
"record", serviceName+"."+d.zone,
"value", d.nodeAddr,
)
return d.createRecord(ctx, serviceName)
}
// RemoveRecord removes A records for the service from the configured zone.
func (d *DNSRegistrar) RemoveRecord(ctx context.Context, serviceName string) error {
if d == nil {
return nil
}
existing, err := d.listRecords(ctx, serviceName)
if err != nil {
return fmt.Errorf("list DNS records: %w", err)
}
if len(existing) == 0 {
d.logger.Debug("no DNS record to remove", "service", serviceName)
return nil
}
for _, r := range existing {
d.logger.Info("removing DNS record",
"service", serviceName,
"record", r.Name+"."+d.zone,
"id", r.ID,
)
if err := d.deleteRecord(ctx, r.ID); err != nil {
return err
}
}
return nil
}
// DNSZone is the JSON representation of an MCNS zone.
type DNSZone struct {
Name string `json:"Name"`
}
// ListZones returns all zones from MCNS.
func (d *DNSRegistrar) ListZones(ctx context.Context) ([]DNSZone, error) {
if d == nil {
return nil, fmt.Errorf("DNS registrar not configured")
}
url := fmt.Sprintf("%s/v1/zones", d.serverURL)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return nil, fmt.Errorf("create list zones request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("list zones: %w", err)
}
defer func() { _ = resp.Body.Close() }()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("read list zones response: %w", err)
}
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("list zones: mcns returned %d: %s", resp.StatusCode, string(body))
}
var envelope struct {
Zones []DNSZone `json:"zones"`
}
if err := json.Unmarshal(body, &envelope); err != nil {
return nil, fmt.Errorf("parse list zones response: %w", err)
}
return envelope.Zones, nil
}
// ListZoneRecords returns all records in the given zone (no filters).
func (d *DNSRegistrar) ListZoneRecords(ctx context.Context, zone string) ([]DNSRecord, error) {
if d == nil {
return nil, fmt.Errorf("DNS registrar not configured")
}
url := fmt.Sprintf("%s/v1/zones/%s/records", d.serverURL, zone)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return nil, fmt.Errorf("create list zone records request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("list zone records: %w", err)
}
defer func() { _ = resp.Body.Close() }()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("read list zone records response: %w", err)
}
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("list zone records: mcns returned %d: %s", resp.StatusCode, string(body))
}
var envelope struct {
Records []DNSRecord `json:"records"`
}
if err := json.Unmarshal(body, &envelope); err != nil {
return nil, fmt.Errorf("parse list zone records response: %w", err)
}
return envelope.Records, nil
}
// listRecords returns A records matching the service name in the zone.
func (d *DNSRegistrar) listRecords(ctx context.Context, serviceName string) ([]DNSRecord, error) {
url := fmt.Sprintf("%s/v1/zones/%s/records?name=%s&type=A", d.serverURL, d.zone, serviceName)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return nil, fmt.Errorf("create list request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("list records: %w", err)
}
defer func() { _ = resp.Body.Close() }()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("read list response: %w", err)
}
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("list records: mcns returned %d: %s", resp.StatusCode, string(body))
}
var envelope struct {
Records []DNSRecord `json:"records"`
}
if err := json.Unmarshal(body, &envelope); err != nil {
return nil, fmt.Errorf("parse list response: %w", err)
}
return envelope.Records, nil
}
// createRecord creates an A record in the zone.
func (d *DNSRegistrar) createRecord(ctx context.Context, serviceName string) error {
reqBody := map[string]interface{}{
"name": serviceName,
"type": "A",
"value": d.nodeAddr,
"ttl": 300,
}
body, err := json.Marshal(reqBody)
if err != nil {
return fmt.Errorf("marshal create request: %w", err)
}
url := fmt.Sprintf("%s/v1/zones/%s/records", d.serverURL, d.zone)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
if err != nil {
return fmt.Errorf("create record request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return fmt.Errorf("create record: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusCreated && resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return fmt.Errorf("create record: mcns returned %d: %s", resp.StatusCode, string(respBody))
}
return nil
}
// updateRecord updates an existing record's value.
func (d *DNSRegistrar) updateRecord(ctx context.Context, recordID int, serviceName string) error {
reqBody := map[string]interface{}{
"name": serviceName,
"type": "A",
"value": d.nodeAddr,
"ttl": 300,
}
body, err := json.Marshal(reqBody)
if err != nil {
return fmt.Errorf("marshal update request: %w", err)
}
url := fmt.Sprintf("%s/v1/zones/%s/records/%d", d.serverURL, d.zone, recordID)
req, err := http.NewRequestWithContext(ctx, http.MethodPut, url, bytes.NewReader(body))
if err != nil {
return fmt.Errorf("create update request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return fmt.Errorf("update record: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return fmt.Errorf("update record: mcns returned %d: %s", resp.StatusCode, string(respBody))
}
return nil
}
// deleteRecord deletes a record by ID.
func (d *DNSRegistrar) deleteRecord(ctx context.Context, recordID int) error {
url := fmt.Sprintf("%s/v1/zones/%s/records/%d", d.serverURL, d.zone, recordID)
req, err := http.NewRequestWithContext(ctx, http.MethodDelete, url, nil)
if err != nil {
return fmt.Errorf("create delete request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return fmt.Errorf("delete record: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusNoContent && resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return fmt.Errorf("delete record: mcns returned %d: %s", resp.StatusCode, string(respBody))
}
return nil
}

internal/agent/dns_rpc.go (new file, 40 lines)

@@ -0,0 +1,40 @@
package agent
import (
"context"
"fmt"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
)
// ListDNSRecords queries MCNS for all zones and their records.
func (a *Agent) ListDNSRecords(ctx context.Context, _ *mcpv1.ListDNSRecordsRequest) (*mcpv1.ListDNSRecordsResponse, error) {
a.Logger.Debug("ListDNSRecords called")
zones, err := a.DNS.ListZones(ctx)
if err != nil {
return nil, fmt.Errorf("list zones: %w", err)
}
resp := &mcpv1.ListDNSRecordsResponse{}
for _, z := range zones {
records, err := a.DNS.ListZoneRecords(ctx, z.Name)
if err != nil {
return nil, fmt.Errorf("list records for zone %q: %w", z.Name, err)
}
zone := &mcpv1.DNSZone{Name: z.Name}
for _, r := range records {
zone.Records = append(zone.Records, &mcpv1.DNSRecord{
Id: int64(r.ID),
Name: r.Name,
Type: r.Type,
Value: r.Value,
Ttl: int32(r.TTL), //nolint:gosec // TTL is bounded
})
}
resp.Zones = append(resp.Zones, zone)
}
return resp, nil
}

internal/agent/dns_test.go (new file, 214 lines)

@@ -0,0 +1,214 @@
package agent
import (
"context"
"encoding/json"
"log/slog"
"net/http"
"net/http/httptest"
"testing"
"git.wntrmute.dev/mc/mcp/internal/config"
)
func TestNilDNSRegistrarIsNoop(t *testing.T) {
var d *DNSRegistrar
if err := d.EnsureRecord(context.Background(), "svc"); err != nil {
t.Fatalf("EnsureRecord on nil: %v", err)
}
if err := d.RemoveRecord(context.Background(), "svc"); err != nil {
t.Fatalf("RemoveRecord on nil: %v", err)
}
}
func TestNewDNSRegistrarDisabledWhenUnconfigured(t *testing.T) {
d, err := NewDNSRegistrar(config.MCNSConfig{}, slog.Default())
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if d != nil {
t.Fatal("expected nil registrar for empty config")
}
}
func TestEnsureRecordCreatesWhenMissing(t *testing.T) {
var gotMethod, gotPath, gotAuth string
var gotBody map[string]interface{}
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method == http.MethodGet {
// List returns empty — no existing records.
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"records":[]}`))
return
}
gotMethod = r.Method
gotPath = r.URL.Path
gotAuth = r.Header.Get("Authorization")
_ = json.NewDecoder(r.Body).Decode(&gotBody)
w.WriteHeader(http.StatusCreated)
_, _ = w.Write([]byte(`{"id":1}`))
}))
defer srv.Close()
d := &DNSRegistrar{
serverURL: srv.URL,
token: "test-token",
zone: "svc.mcp.metacircular.net",
nodeAddr: "192.168.88.181",
httpClient: srv.Client(),
logger: slog.Default(),
}
if err := d.EnsureRecord(context.Background(), "myservice"); err != nil {
t.Fatalf("EnsureRecord: %v", err)
}
if gotMethod != http.MethodPost {
t.Fatalf("method: got %q, want POST", gotMethod)
}
if gotPath != "/v1/zones/svc.mcp.metacircular.net/records" {
t.Fatalf("path: got %q", gotPath)
}
if gotAuth != "Bearer test-token" {
t.Fatalf("auth: got %q", gotAuth)
}
if gotBody["name"] != "myservice" {
t.Fatalf("name: got %v", gotBody["name"])
}
if gotBody["type"] != "A" {
t.Fatalf("type: got %v", gotBody["type"])
}
if gotBody["value"] != "192.168.88.181" {
t.Fatalf("value: got %v", gotBody["value"])
}
}
func TestEnsureRecordSkipsWhenExists(t *testing.T) {
createCalled := false
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method == http.MethodGet {
// Return an existing record with the correct value.
resp := map[string][]DNSRecord{"records": {{ID: 1, Name: "myservice", Type: "A", Value: "192.168.88.181", TTL: 300}}}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(resp)
return
}
createCalled = true
w.WriteHeader(http.StatusCreated)
}))
defer srv.Close()
d := &DNSRegistrar{
serverURL: srv.URL,
token: "test-token",
zone: "svc.mcp.metacircular.net",
nodeAddr: "192.168.88.181",
httpClient: srv.Client(),
logger: slog.Default(),
}
if err := d.EnsureRecord(context.Background(), "myservice"); err != nil {
t.Fatalf("EnsureRecord: %v", err)
}
if createCalled {
t.Fatal("should not create when record already exists with correct value")
}
}
func TestEnsureRecordUpdatesWrongValue(t *testing.T) {
var gotMethod string
var gotPath string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method == http.MethodGet {
// Return a record with a stale value.
resp := map[string][]DNSRecord{"records": {{ID: 42, Name: "myservice", Type: "A", Value: "10.0.0.1", TTL: 300}}}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(resp)
return
}
gotMethod = r.Method
gotPath = r.URL.Path
w.WriteHeader(http.StatusOK)
}))
defer srv.Close()
d := &DNSRegistrar{
serverURL: srv.URL,
token: "test-token",
zone: "svc.mcp.metacircular.net",
nodeAddr: "192.168.88.181",
httpClient: srv.Client(),
logger: slog.Default(),
}
if err := d.EnsureRecord(context.Background(), "myservice"); err != nil {
t.Fatalf("EnsureRecord: %v", err)
}
if gotMethod != http.MethodPut {
t.Fatalf("method: got %q, want PUT", gotMethod)
}
if gotPath != "/v1/zones/svc.mcp.metacircular.net/records/42" {
t.Fatalf("path: got %q", gotPath)
}
}
func TestRemoveRecordDeletes(t *testing.T) {
var gotMethod, gotPath string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Method == http.MethodGet {
resp := map[string][]DNSRecord{"records": {{ID: 7, Name: "myservice", Type: "A", Value: "192.168.88.181", TTL: 300}}}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(resp)
return
}
gotMethod = r.Method
gotPath = r.URL.Path
w.WriteHeader(http.StatusNoContent)
}))
defer srv.Close()
d := &DNSRegistrar{
serverURL: srv.URL,
token: "test-token",
zone: "svc.mcp.metacircular.net",
nodeAddr: "192.168.88.181",
httpClient: srv.Client(),
logger: slog.Default(),
}
if err := d.RemoveRecord(context.Background(), "myservice"); err != nil {
t.Fatalf("RemoveRecord: %v", err)
}
if gotMethod != http.MethodDelete {
t.Fatalf("method: got %q, want DELETE", gotMethod)
}
if gotPath != "/v1/zones/svc.mcp.metacircular.net/records/7" {
t.Fatalf("path: got %q", gotPath)
}
}
func TestRemoveRecordNoopWhenMissing(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// List returns empty.
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"records":[]}`))
}))
defer srv.Close()
d := &DNSRegistrar{
serverURL: srv.URL,
token: "test-token",
zone: "svc.mcp.metacircular.net",
nodeAddr: "192.168.88.181",
httpClient: srv.Client(),
logger: slog.Default(),
}
if err := d.RemoveRecord(context.Background(), "myservice"); err != nil {
t.Fatalf("RemoveRecord: %v", err)
}
}

internal/agent/edge_rpc.go (new file, 196 lines)

@@ -0,0 +1,196 @@
package agent
import (
"context"
"crypto/x509"
"encoding/pem"
"fmt"
"net"
"os"
"time"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
mcproxy "git.wntrmute.dev/mc/mc-proxy/client/mcproxy"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
// SetupEdgeRoute provisions a TLS cert and registers an mc-proxy route for a
// public hostname. Called by the master on edge nodes.
func (a *Agent) SetupEdgeRoute(ctx context.Context, req *mcpv1.SetupEdgeRouteRequest) (*mcpv1.SetupEdgeRouteResponse, error) {
a.Logger.Info("SetupEdgeRoute", "hostname", req.GetHostname(),
"backend_hostname", req.GetBackendHostname(), "backend_port", req.GetBackendPort())
// Validate required fields.
if req.GetHostname() == "" {
return nil, status.Error(codes.InvalidArgument, "hostname is required")
}
if req.GetBackendHostname() == "" {
return nil, status.Error(codes.InvalidArgument, "backend_hostname is required")
}
if req.GetBackendPort() == 0 {
return nil, status.Error(codes.InvalidArgument, "backend_port is required")
}
if !req.GetBackendTls() {
return nil, status.Error(codes.InvalidArgument, "backend_tls must be true")
}
if a.Proxy == nil {
return nil, status.Error(codes.FailedPrecondition, "mc-proxy not configured")
}
// Resolve the backend hostname to a Tailnet IP.
ips, err := net.LookupHost(req.GetBackendHostname())
if err != nil || len(ips) == 0 {
return nil, status.Errorf(codes.InvalidArgument, "cannot resolve backend_hostname %q: %v", req.GetBackendHostname(), err)
}
backendIP := ips[0]
// Validate the resolved IP is a Tailnet address (100.64.0.0/10).
ip := net.ParseIP(backendIP)
if ip == nil {
return nil, status.Errorf(codes.InvalidArgument, "resolved IP %q is not valid", backendIP)
}
_, tailnet, _ := net.ParseCIDR("100.64.0.0/10")
if !tailnet.Contains(ip) {
return nil, status.Errorf(codes.InvalidArgument, "resolved IP %s is not a Tailnet address", backendIP)
}
backend := fmt.Sprintf("%s:%d", backendIP, req.GetBackendPort())
// Provision TLS cert for the public hostname if cert provisioner is available.
certPath := ""
keyPath := ""
if a.Certs != nil {
if err := a.Certs.EnsureCert(ctx, req.GetHostname(), []string{req.GetHostname()}); err != nil {
return nil, status.Errorf(codes.Internal, "provision cert for %s: %v", req.GetHostname(), err)
}
certPath = a.Proxy.CertPath(req.GetHostname())
keyPath = a.Proxy.KeyPath(req.GetHostname())
} else {
// No cert provisioner — check if certs already exist on disk.
certPath = a.Proxy.CertPath(req.GetHostname())
keyPath = a.Proxy.KeyPath(req.GetHostname())
if _, err := os.Stat(certPath); err != nil {
return nil, status.Errorf(codes.FailedPrecondition, "no cert provisioner and cert not found at %s", certPath)
}
}
// Register the L7 route in mc-proxy.
route := mcproxy.Route{
Hostname: req.GetHostname(),
Backend: backend,
Mode: "l7",
TLSCert: certPath,
TLSKey: keyPath,
BackendTLS: true,
}
if err := a.Proxy.AddRoute(ctx, ":443", route); err != nil {
return nil, status.Errorf(codes.Internal, "add mc-proxy route: %v", err)
}
// Persist the edge route in the registry.
if err := registry.CreateEdgeRoute(a.DB, req.GetHostname(), req.GetBackendHostname(), int(req.GetBackendPort()), certPath, keyPath); err != nil {
a.Logger.Warn("failed to persist edge route", "hostname", req.GetHostname(), "err", err)
}
a.Logger.Info("edge route established",
"hostname", req.GetHostname(), "backend", backend, "cert", certPath)
return &mcpv1.SetupEdgeRouteResponse{}, nil
}
// RemoveEdgeRoute removes an mc-proxy route and cleans up the TLS cert for a
// public hostname. Called by the master on edge nodes.
func (a *Agent) RemoveEdgeRoute(ctx context.Context, req *mcpv1.RemoveEdgeRouteRequest) (*mcpv1.RemoveEdgeRouteResponse, error) {
a.Logger.Info("RemoveEdgeRoute", "hostname", req.GetHostname())
if req.GetHostname() == "" {
return nil, status.Error(codes.InvalidArgument, "hostname is required")
}
if a.Proxy == nil {
return nil, status.Error(codes.FailedPrecondition, "mc-proxy not configured")
}
// Remove the mc-proxy route.
if err := a.Proxy.RemoveRoute(ctx, ":443", req.GetHostname()); err != nil {
a.Logger.Warn("remove mc-proxy route", "hostname", req.GetHostname(), "err", err)
// Continue — clean up cert and registry even if route removal fails.
}
// Remove the TLS cert.
if a.Certs != nil {
if err := a.Certs.RemoveCert(req.GetHostname()); err != nil {
a.Logger.Warn("remove cert", "hostname", req.GetHostname(), "err", err)
}
}
// Remove from registry.
if err := registry.DeleteEdgeRoute(a.DB, req.GetHostname()); err != nil {
a.Logger.Warn("delete edge route from registry", "hostname", req.GetHostname(), "err", err)
}
a.Logger.Info("edge route removed", "hostname", req.GetHostname())
return &mcpv1.RemoveEdgeRouteResponse{}, nil
}
// ListEdgeRoutes returns all edge routes managed by this agent.
func (a *Agent) ListEdgeRoutes(_ context.Context, _ *mcpv1.ListEdgeRoutesRequest) (*mcpv1.ListEdgeRoutesResponse, error) {
a.Logger.Debug("ListEdgeRoutes called")
routes, err := registry.ListEdgeRoutes(a.DB)
if err != nil {
return nil, status.Errorf(codes.Internal, "list edge routes: %v", err)
}
resp := &mcpv1.ListEdgeRoutesResponse{}
for _, r := range routes {
er := &mcpv1.EdgeRoute{
Hostname: r.Hostname,
BackendHostname: r.BackendHostname,
BackendPort: int32(r.BackendPort), //nolint:gosec // port is a small positive integer
}
// Read cert metadata if available.
if r.TLSCert != "" {
if certData, readErr := os.ReadFile(r.TLSCert); readErr == nil { //nolint:gosec // path from registry, not user input
if block, _ := pem.Decode(certData); block != nil {
if cert, parseErr := x509.ParseCertificate(block.Bytes); parseErr == nil {
er.CertSerial = cert.SerialNumber.String()
er.CertExpires = cert.NotAfter.UTC().Format(time.RFC3339)
}
}
}
}
resp.Routes = append(resp.Routes, er)
}
return resp, nil
}
// HealthCheck returns the agent's health status. Called by the master when
// heartbeats are missed.
func (a *Agent) HealthCheck(_ context.Context, _ *mcpv1.HealthCheckRequest) (*mcpv1.HealthCheckResponse, error) {
a.Logger.Debug("HealthCheck called")
st := "healthy"
containers := int32(0)
// Count running containers if the runtime is available.
if a.Runtime != nil {
if list, err := a.Runtime.List(context.Background()); err == nil {
containers = int32(len(list)) //nolint:gosec // container count is small
} else {
st = "degraded"
}
}
return &mcpv1.HealthCheckResponse{
Status: st,
Containers: containers,
}, nil
}


@@ -8,7 +8,7 @@ import (
 "path/filepath"
 "strings"
-mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
+mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
 "google.golang.org/grpc/codes"
 "google.golang.org/grpc/status"
 )


@@ -5,16 +5,16 @@ import (
 "database/sql"
 "fmt"
-mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
-"git.wntrmute.dev/kyle/mcp/internal/registry"
-"git.wntrmute.dev/kyle/mcp/internal/runtime"
+mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
+"git.wntrmute.dev/mc/mcp/internal/registry"
+"git.wntrmute.dev/mc/mcp/internal/runtime"
 "google.golang.org/grpc/codes"
 "google.golang.org/grpc/status"
 )
-// StopService stops all components of a service.
+// StopService stops all components of a service, or a single component if specified.
 func (a *Agent) StopService(ctx context.Context, req *mcpv1.StopServiceRequest) (*mcpv1.StopServiceResponse, error) {
-a.Logger.Info("StopService", "service", req.GetName())
+a.Logger.Info("StopService", "service", req.GetName(), "component", req.GetComponent())
 if req.GetName() == "" {
 return nil, status.Error(codes.InvalidArgument, "service name is required")
@@ -25,11 +25,32 @@ func (a *Agent) StopService(ctx context.Context, req *mcpv1.StopServiceRequest)
 return nil, status.Errorf(codes.Internal, "list components: %v", err)
 }
+if target := req.GetComponent(); target != "" {
+components, err = filterComponents(components, req.GetName(), target)
+if err != nil {
+return nil, err
+}
+}
 var results []*mcpv1.ComponentResult
 for _, c := range components {
-containerName := req.GetName() + "-" + c.Name
+containerName := ContainerNameFor(req.GetName(), c.Name)
 r := &mcpv1.ComponentResult{Name: c.Name, Success: true}
+// Remove routes from mc-proxy before stopping the container.
+if len(c.Routes) > 0 && a.Proxy != nil {
+if err := a.Proxy.RemoveRoutes(ctx, req.GetName(), c.Routes); err != nil {
+a.Logger.Warn("failed to remove routes", "service", req.GetName(), "component", c.Name, "err", err)
+}
+}
+// Remove DNS record when stopping the service.
+if len(c.Routes) > 0 && a.DNS != nil {
+if err := a.DNS.RemoveRecord(ctx, req.GetName()); err != nil {
+a.Logger.Warn("failed to remove DNS record", "service", req.GetName(), "err", err)
+}
+}
 if err := a.Runtime.Stop(ctx, containerName); err != nil {
 a.Logger.Info("stop container (ignored)", "container", containerName, "error", err)
 }
@@ -45,10 +66,10 @@ func (a *Agent) StopService(ctx context.Context, req *mcpv1.StopServiceRequest)
 return &mcpv1.StopServiceResponse{Results: results}, nil
 }
-// StartService starts all components of a service. If a container already
-// exists but is stopped, it is removed first so a fresh one can be created.
+// StartService starts all components of a service, or a single component if specified.
+// If a container already exists but is stopped, it is removed first so a fresh one can be created.
 func (a *Agent) StartService(ctx context.Context, req *mcpv1.StartServiceRequest) (*mcpv1.StartServiceResponse, error) {
-a.Logger.Info("StartService", "service", req.GetName())
+a.Logger.Info("StartService", "service", req.GetName(), "component", req.GetComponent())
 if req.GetName() == "" {
 return nil, status.Error(codes.InvalidArgument, "service name is required")
@@ -59,6 +80,13 @@ func (a *Agent) StartService(ctx context.Context, req *mcpv1.StartServiceRequest
 return nil, status.Errorf(codes.Internal, "list components: %v", err)
 }
+if target := req.GetComponent(); target != "" {
+components, err = filterComponents(components, req.GetName(), target)
+if err != nil {
+return nil, err
+}
+}
 var results []*mcpv1.ComponentResult
 for _, c := range components {
 r := startComponent(ctx, a, req.GetName(), &c)
@@ -68,10 +96,10 @@ func (a *Agent) StartService(ctx context.Context, req *mcpv1.StartServiceRequest
 return &mcpv1.StartServiceResponse{Results: results}, nil
 }
-// RestartService restarts all components of a service by stopping, removing,
-// and re-creating each container. The desired_state is not changed.
+// RestartService restarts all components of a service, or a single component if specified,
+// by stopping, removing, and re-creating each container. The desired_state is not changed.
 func (a *Agent) RestartService(ctx context.Context, req *mcpv1.RestartServiceRequest) (*mcpv1.RestartServiceResponse, error) {
-a.Logger.Info("RestartService", "service", req.GetName())
+a.Logger.Info("RestartService", "service", req.GetName(), "component", req.GetComponent())
 if req.GetName() == "" {
 return nil, status.Error(codes.InvalidArgument, "service name is required")
@@ -82,6 +110,13 @@ func (a *Agent) RestartService(ctx context.Context, req *mcpv1.RestartServiceReq
 return nil, status.Errorf(codes.Internal, "list components: %v", err)
 }
+if target := req.GetComponent(); target != "" {
+components, err = filterComponents(components, req.GetName(), target)
+if err != nil {
+return nil, err
+}
+}
 var results []*mcpv1.ComponentResult
 for _, c := range components {
 r := restartComponent(ctx, a, req.GetName(), &c)
@@ -94,7 +129,7 @@ func (a *Agent) RestartService(ctx context.Context, req *mcpv1.RestartServiceReq
 // startComponent removes any existing container and runs a fresh one from
 // the registry spec, then updates state to running.
 func startComponent(ctx context.Context, a *Agent, service string, c *registry.Component) *mcpv1.ComponentResult {
-containerName := service + "-" + c.Name
+containerName := ContainerNameFor(service, c.Name)
 r := &mcpv1.ComponentResult{Name: c.Name, Success: true}
 // Remove any pre-existing container; ignore errors for non-existent ones.
@@ -118,7 +153,7 @@ func startComponent(ctx context.Context, a *Agent, service string, c *registry.C
 // restartComponent stops, removes, and re-creates a container without
 // changing the desired_state in the registry.
 func restartComponent(ctx context.Context, a *Agent, service string, c *registry.Component) *mcpv1.ComponentResult {
-containerName := service + "-" + c.Name
+containerName := ContainerNameFor(service, c.Name)
 r := &mcpv1.ComponentResult{Name: c.Name, Success: true}
 _ = a.Runtime.Stop(ctx, containerName)
@@ -142,7 +177,7 @@ func restartComponent(ctx context.Context, a *Agent, service string, c *registry
 // componentToSpec builds a runtime.ContainerSpec from a registry Component.
 func componentToSpec(service string, c *registry.Component) runtime.ContainerSpec {
 return runtime.ContainerSpec{
-Name: service + "-" + c.Name,
+Name: ContainerNameFor(service, c.Name),
 Image: c.Image,
 Network: c.Network,
 User: c.UserSpec,
@@ -153,6 +188,16 @@ func componentToSpec(service string, c *registry.Component) runtime.ContainerSpe
 }
 }
+// filterComponents returns only the component matching target, or an error if not found.
+func filterComponents(components []registry.Component, service, target string) ([]registry.Component, error) {
+for _, c := range components {
+if c.Name == target {
+return []registry.Component{c}, nil
+}
+}
+return nil, status.Errorf(codes.NotFound, "component %q not found in service %q", target, service)
+}
 // componentExists checks whether a component already exists in the registry.
 func componentExists(db *sql.DB, service, name string) bool {
 _, err := registry.GetComponent(db, service, name)

internal/agent/logs.go (new file, 79 lines)

@@ -0,0 +1,79 @@
package agent
import (
"bufio"
"io"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/registry"
"git.wntrmute.dev/mc/mcp/internal/runtime"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
)
// Logs streams container logs for a service component.
func (a *Agent) Logs(req *mcpv1.LogsRequest, stream mcpv1.McpAgentService_LogsServer) error {
if req.GetService() == "" {
return status.Error(codes.InvalidArgument, "service name is required")
}
// Resolve component name.
component := req.GetComponent()
if component == "" {
components, err := registry.ListComponents(a.DB, req.GetService())
if err != nil {
return status.Errorf(codes.Internal, "list components: %v", err)
}
if len(components) == 0 {
return status.Error(codes.NotFound, "no components found for service")
}
component = components[0].Name
}
containerName := ContainerNameFor(req.GetService(), component)
podman, ok := a.Runtime.(*runtime.Podman)
if !ok {
return status.Error(codes.Internal, "logs requires podman runtime")
}
cmd := podman.Logs(stream.Context(), containerName, int(req.GetTail()), req.GetFollow(), req.GetTimestamps(), req.GetSince())
a.Logger.Info("running podman logs", "container", containerName, "args", cmd.Args)
// Podman writes container stdout to its stdout and container stderr
// to its stderr. Merge both into a single pipe.
pr, pw := io.Pipe()
cmd.Stdout = pw
cmd.Stderr = pw
if err := cmd.Start(); err != nil {
pw.Close()
return status.Errorf(codes.Internal, "start podman logs: %v", err)
}
// Close the write end when the command exits so the scanner finishes.
go func() {
err := cmd.Wait()
if err != nil {
a.Logger.Warn("podman logs exited", "container", containerName, "error", err)
}
pw.Close()
}()
scanner := bufio.NewScanner(pr)
for scanner.Scan() {
line := scanner.Bytes()
if len(line) == 0 {
continue
}
if err := stream.Send(&mcpv1.LogsResponse{
Data: append(line, '\n'),
}); err != nil {
_ = cmd.Process.Kill()
return err
}
}
return nil
}

internal/agent/names.go (new file, 34 lines)

@@ -0,0 +1,34 @@
package agent
import "strings"
// ContainerNameFor returns the expected container name for a service and
// component. For single-component services where the component name equals
// the service name, the container name is just the service name (e.g.,
// "mc-proxy" not "mc-proxy-mc-proxy").
func ContainerNameFor(service, component string) string {
if service == component {
return service
}
return service + "-" + component
}
// SplitContainerName splits a container name into service and component parts.
// It checks known service names first to handle names like "mc-proxy" where a
// naive split on "-" would produce the wrong result. If no known service
// matches, it falls back to splitting on the first "-".
func SplitContainerName(name string, knownServices map[string]bool) (service, component string) {
if knownServices[name] {
return name, name
}
for svc := range knownServices {
prefix := svc + "-"
if strings.HasPrefix(name, prefix) && len(name) > len(prefix) {
return svc, name[len(prefix):]
}
}
if i := strings.Index(name, "-"); i >= 0 {
return name[:i], name[i+1:]
}
return name, name
}


@@ -7,8 +7,8 @@ import (
 "strings"
 "time"
-mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
-"git.wntrmute.dev/kyle/mcp/internal/registry"
+mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
+"git.wntrmute.dev/mc/mcp/internal/registry"
 "golang.org/x/sys/unix"
 "google.golang.org/protobuf/types/known/timestamppb"
 )
@@ -31,6 +31,7 @@ func (a *Agent) NodeStatus(ctx context.Context, _ *mcpv1.NodeStatusRequest) (*mc
 Runtime: a.Config.Agent.ContainerRuntime,
 ServiceCount: uint32(len(services)), //nolint:gosec // bounded
 ComponentCount: componentCount,
+AgentVersion: a.Version,
 }
 // Runtime version.


@@ -0,0 +1,68 @@
package agent
import (
"fmt"
"math/rand/v2"
"net"
"sync"
)
const (
portRangeMin = 10000
portRangeMax = 60000
maxRetries = 10
)
// PortAllocator manages host port allocation for route-based deployments.
// It tracks allocated ports within the agent session to avoid double-allocation.
type PortAllocator struct {
mu sync.Mutex
allocated map[int]bool
}
// NewPortAllocator creates a new PortAllocator.
func NewPortAllocator() *PortAllocator {
return &PortAllocator{
allocated: make(map[int]bool),
}
}
// Allocate picks a free port in range [10000, 60000).
// It tries random ports, checks availability with net.Listen, and retries up to 10 times.
func (pa *PortAllocator) Allocate() (int, error) {
pa.mu.Lock()
defer pa.mu.Unlock()
for range maxRetries {
port := portRangeMin + rand.IntN(portRangeMax-portRangeMin) //nolint:gosec // port selection, not security
if pa.allocated[port] {
continue
}
if !isPortFree(port) {
continue
}
pa.allocated[port] = true
return port, nil
}
return 0, fmt.Errorf("failed to allocate port after %d attempts", maxRetries)
}
// Release marks a port as available again.
func (pa *PortAllocator) Release(port int) {
pa.mu.Lock()
defer pa.mu.Unlock()
delete(pa.allocated, port)
}
// isPortFree checks if a TCP port is available by attempting to listen on it.
func isPortFree(port int) bool {
ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
if err != nil {
return false
}
_ = ln.Close()
return true
}


@@ -0,0 +1,65 @@
package agent
import (
"testing"
)
func TestPortAllocator_Allocate(t *testing.T) {
pa := NewPortAllocator()
port, err := pa.Allocate()
if err != nil {
t.Fatalf("allocate: %v", err)
}
if port < portRangeMin || port >= portRangeMax {
t.Fatalf("port %d out of range [%d, %d)", port, portRangeMin, portRangeMax)
}
}
func TestPortAllocator_NoDuplicates(t *testing.T) {
pa := NewPortAllocator()
ports := make(map[int]bool)
for range 20 {
port, err := pa.Allocate()
if err != nil {
t.Fatalf("allocate: %v", err)
}
if ports[port] {
t.Fatalf("duplicate port allocated: %d", port)
}
ports[port] = true
}
}
func TestPortAllocator_Release(t *testing.T) {
pa := NewPortAllocator()
port, err := pa.Allocate()
if err != nil {
t.Fatalf("allocate: %v", err)
}
pa.Release(port)
// After release, the port should no longer be tracked as allocated.
pa.mu.Lock()
if pa.allocated[port] {
t.Fatal("port should not be tracked after release")
}
pa.mu.Unlock()
}
func TestPortAllocator_PortIsFree(t *testing.T) {
pa := NewPortAllocator()
port, err := pa.Allocate()
if err != nil {
t.Fatalf("allocate: %v", err)
}
// The port should be free (we only track it, we don't hold the listener).
if !isPortFree(port) {
t.Fatalf("allocated port %d should be free on the system", port)
}
}

internal/agent/proxy.go (new file, 172 lines)

@@ -0,0 +1,172 @@
package agent
import (
"context"
"fmt"
"log/slog"
"path/filepath"
"git.wntrmute.dev/mc/mc-proxy/client/mcproxy"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
// ProxyRouter registers and removes routes with mc-proxy.
// If the mc-proxy socket is not configured, it logs and returns nil
// (route registration is optional).
type ProxyRouter struct {
client *mcproxy.Client
certDir string
logger *slog.Logger
}
// NewProxyRouter connects to mc-proxy via Unix socket. Returns nil
// if socketPath is empty (route registration disabled).
func NewProxyRouter(socketPath, certDir string, logger *slog.Logger) (*ProxyRouter, error) {
if socketPath == "" {
logger.Info("mc-proxy socket not configured, route registration disabled")
return nil, nil
}
client, err := mcproxy.Dial(socketPath)
if err != nil {
return nil, fmt.Errorf("connect to mc-proxy at %s: %w", socketPath, err)
}
logger.Info("connected to mc-proxy", "socket", socketPath)
return &ProxyRouter{
client: client,
certDir: certDir,
logger: logger,
}, nil
}
// Close closes the mc-proxy connection.
func (p *ProxyRouter) Close() error {
if p == nil || p.client == nil {
return nil
}
return p.client.Close()
}
// CertPath returns the expected TLS certificate path for a given name.
func (p *ProxyRouter) CertPath(name string) string {
return filepath.Join(p.certDir, name+".pem")
}
// KeyPath returns the expected TLS key path for a given name.
func (p *ProxyRouter) KeyPath(name string) string {
return filepath.Join(p.certDir, name+".key")
}
// GetStatus returns the mc-proxy server status.
func (p *ProxyRouter) GetStatus(ctx context.Context) (*mcproxy.Status, error) {
if p == nil {
return nil, fmt.Errorf("mc-proxy not configured")
}
return p.client.GetStatus(ctx)
}
// AddRoute adds a single route to mc-proxy.
func (p *ProxyRouter) AddRoute(ctx context.Context, listenerAddr string, route mcproxy.Route) error {
if p == nil {
return fmt.Errorf("mc-proxy not configured")
}
return p.client.AddRoute(ctx, listenerAddr, route)
}
// RemoveRoute removes a single route from mc-proxy.
func (p *ProxyRouter) RemoveRoute(ctx context.Context, listenerAddr, hostname string) error {
if p == nil {
return fmt.Errorf("mc-proxy not configured")
}
return p.client.RemoveRoute(ctx, listenerAddr, hostname)
}
// RegisterRoutes registers all routes for a service component with mc-proxy.
// It uses the assigned host ports from the registry.
func (p *ProxyRouter) RegisterRoutes(ctx context.Context, serviceName string, routes []registry.Route, hostPorts map[string]int) error {
if p == nil {
return nil
}
for _, r := range routes {
hostPort, ok := hostPorts[r.Name]
if !ok || hostPort == 0 {
continue
}
hostname := r.Hostname
if hostname == "" {
hostname = serviceName + ".svc.mcp.metacircular.net"
}
listenerAddr := listenerForMode(r.Mode, r.Port)
backend := fmt.Sprintf("127.0.0.1:%d", hostPort)
route := mcproxy.Route{
Hostname: hostname,
Backend: backend,
Mode: r.Mode,
BackendTLS: r.Mode == "l4", // L4 passthrough: backend handles TLS. L7: mc-proxy terminates.
}
// L7 routes need TLS cert/key for mc-proxy to terminate TLS.
if r.Mode == "l7" {
route.TLSCert = filepath.Join(p.certDir, serviceName+".pem")
route.TLSKey = filepath.Join(p.certDir, serviceName+".key")
}
p.logger.Info("registering route",
"service", serviceName,
"hostname", hostname,
"listener", listenerAddr,
"backend", backend,
"mode", r.Mode,
)
if err := p.client.AddRoute(ctx, listenerAddr, route); err != nil {
return fmt.Errorf("register route %s on %s: %w", hostname, listenerAddr, err)
}
}
return nil
}
// RemoveRoutes removes all routes for a service component from mc-proxy.
func (p *ProxyRouter) RemoveRoutes(ctx context.Context, serviceName string, routes []registry.Route) error {
if p == nil {
return nil
}
for _, r := range routes {
hostname := r.Hostname
if hostname == "" {
hostname = serviceName + ".svc.mcp.metacircular.net"
}
listenerAddr := listenerForMode(r.Mode, r.Port)
p.logger.Info("removing route",
"service", serviceName,
"hostname", hostname,
"listener", listenerAddr,
)
if err := p.client.RemoveRoute(ctx, listenerAddr, hostname); err != nil {
// Log but don't fail — the route may already be gone.
p.logger.Warn("failed to remove route",
"hostname", hostname,
"listener", listenerAddr,
"err", err,
)
}
}
return nil
}
// listenerForMode returns the mc-proxy listener address for a given
// route mode and external port. The mode parameter is currently unused:
// l4 and l7 routes both listen on the bare ":port" address.
func listenerForMode(mode string, port int) string {
return fmt.Sprintf(":%d", port)
}
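The route-construction defaults in RegisterRoutes can be sketched standalone. The helper names here are illustrative, not from the source; they mirror the hostname fallback and the loopback backend address built from the assigned host port:

```go
package main

import "fmt"

// defaultHostname mirrors the fallback in RegisterRoutes/RemoveRoutes:
// routes without an explicit hostname get
// "<service>.svc.mcp.metacircular.net".
func defaultHostname(serviceName, routeHostname string) string {
	if routeHostname != "" {
		return routeHostname
	}
	return serviceName + ".svc.mcp.metacircular.net"
}

// backendAddr mirrors the backend form used when registering a route:
// the container's published host port on loopback.
func backendAddr(hostPort int) string {
	return fmt.Sprintf("127.0.0.1:%d", hostPort)
}
```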

internal/agent/proxy_rpc.go Normal file

@@ -0,0 +1,113 @@
package agent
import (
"context"
"fmt"
"git.wntrmute.dev/mc/mc-proxy/client/mcproxy"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
"google.golang.org/protobuf/types/known/timestamppb"
)
// ListProxyRoutes queries mc-proxy for its current status and routes.
func (a *Agent) ListProxyRoutes(ctx context.Context, _ *mcpv1.ListProxyRoutesRequest) (*mcpv1.ListProxyRoutesResponse, error) {
a.Logger.Debug("ListProxyRoutes called")
status, err := a.Proxy.GetStatus(ctx)
if err != nil {
return nil, fmt.Errorf("get mc-proxy status: %w", err)
}
resp := &mcpv1.ListProxyRoutesResponse{
Version: status.Version,
TotalConnections: status.TotalConnections,
}
if !status.StartedAt.IsZero() {
resp.StartedAt = timestamppb.New(status.StartedAt)
}
for _, ls := range status.Listeners {
listener := &mcpv1.ProxyListenerInfo{
Addr: ls.Addr,
RouteCount: int32(ls.RouteCount), //nolint:gosec // bounded
ActiveConnections: ls.ActiveConnections,
}
for _, r := range ls.Routes {
listener.Routes = append(listener.Routes, &mcpv1.ProxyRouteInfo{
Hostname: r.Hostname,
Backend: r.Backend,
Mode: r.Mode,
BackendTls: r.BackendTLS,
})
}
resp.Listeners = append(resp.Listeners, listener)
}
return resp, nil
}
// AddProxyRoute adds a route to mc-proxy.
func (a *Agent) AddProxyRoute(ctx context.Context, req *mcpv1.AddProxyRouteRequest) (*mcpv1.AddProxyRouteResponse, error) {
if req.GetListenerAddr() == "" {
return nil, status.Error(codes.InvalidArgument, "listener_addr is required")
}
if req.GetHostname() == "" {
return nil, status.Error(codes.InvalidArgument, "hostname is required")
}
if req.GetBackend() == "" {
return nil, status.Error(codes.InvalidArgument, "backend is required")
}
if a.Proxy == nil {
return nil, status.Error(codes.FailedPrecondition, "mc-proxy not configured")
}
route := mcproxy.Route{
Hostname: req.GetHostname(),
Backend: req.GetBackend(),
Mode: req.GetMode(),
BackendTLS: req.GetBackendTls(),
TLSCert: req.GetTlsCert(),
TLSKey: req.GetTlsKey(),
}
if err := a.Proxy.AddRoute(ctx, req.GetListenerAddr(), route); err != nil {
return nil, fmt.Errorf("add route: %w", err)
}
a.Logger.Info("route added",
"listener", req.GetListenerAddr(),
"hostname", req.GetHostname(),
"backend", req.GetBackend(),
"mode", req.GetMode(),
)
return &mcpv1.AddProxyRouteResponse{}, nil
}
// RemoveProxyRoute removes a route from mc-proxy.
func (a *Agent) RemoveProxyRoute(ctx context.Context, req *mcpv1.RemoveProxyRouteRequest) (*mcpv1.RemoveProxyRouteResponse, error) {
if req.GetListenerAddr() == "" {
return nil, status.Error(codes.InvalidArgument, "listener_addr is required")
}
if req.GetHostname() == "" {
return nil, status.Error(codes.InvalidArgument, "hostname is required")
}
if a.Proxy == nil {
return nil, status.Error(codes.FailedPrecondition, "mc-proxy not configured")
}
if err := a.Proxy.RemoveRoute(ctx, req.GetListenerAddr(), req.GetHostname()); err != nil {
return nil, fmt.Errorf("remove route: %w", err)
}
a.Logger.Info("route removed",
"listener", req.GetListenerAddr(),
"hostname", req.GetHostname(),
)
return &mcpv1.RemoveProxyRouteResponse{}, nil
}

@@ -0,0 +1,57 @@
package agent
import (
"testing"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
func TestListenerForMode(t *testing.T) {
tests := []struct {
mode string
port int
want string
}{
{"l4", 8443, ":8443"},
{"l7", 443, ":443"},
{"l4", 9443, ":9443"},
}
for _, tt := range tests {
got := listenerForMode(tt.mode, tt.port)
if got != tt.want {
t.Errorf("listenerForMode(%q, %d) = %q, want %q", tt.mode, tt.port, got, tt.want)
}
}
}
func TestNilProxyRouterIsNoop(t *testing.T) {
var p *ProxyRouter
// The batch route methods and Close should be no-ops (return nil) on a nil ProxyRouter.
if err := p.RegisterRoutes(nil, "svc", nil, nil); err != nil {
t.Errorf("RegisterRoutes on nil: %v", err)
}
if err := p.RemoveRoutes(nil, "svc", nil); err != nil {
t.Errorf("RemoveRoutes on nil: %v", err)
}
if err := p.Close(); err != nil {
t.Errorf("Close on nil: %v", err)
}
}
func TestRegisterRoutesSkipsZeroHostPort(t *testing.T) {
// With a nil ProxyRouter, RegisterRoutes is a no-op, so this only
// exercises the zero-host-port skip logic indirectly: it must return
// nil rather than fail when a route's host port was never assigned.
var p *ProxyRouter
routes := []registry.Route{
{Name: "rest", Port: 8443, Mode: "l4"},
}
hostPorts := map[string]int{"rest": 0}
if err := p.RegisterRoutes(nil, "svc", routes, hostPorts); err != nil {
t.Errorf("RegisterRoutes: %v", err)
}
}

internal/agent/purge.go Normal file

@@ -0,0 +1,155 @@
package agent
import (
"context"
"fmt"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
// PurgeComponent removes stale registry entries for components that are both
// gone (observed state is removed/unknown/exited) and unwanted (not in any
// current service definition). It never touches running containers.
func (a *Agent) PurgeComponent(ctx context.Context, req *mcpv1.PurgeRequest) (*mcpv1.PurgeResponse, error) {
a.Logger.Info("PurgeComponent",
"service", req.GetService(),
"component", req.GetComponent(),
"dry_run", req.GetDryRun(),
)
// Build a set of defined service/component pairs for quick lookup.
defined := make(map[string]bool, len(req.GetDefinedComponents()))
for _, dc := range req.GetDefinedComponents() {
defined[dc] = true
}
// Determine which services to examine.
var services []registry.Service
if req.GetService() != "" {
svc, err := registry.GetService(a.DB, req.GetService())
if err != nil {
return nil, fmt.Errorf("get service %q: %w", req.GetService(), err)
}
services = []registry.Service{*svc}
} else {
var err error
services, err = registry.ListServices(a.DB)
if err != nil {
return nil, fmt.Errorf("list services: %w", err)
}
}
var results []*mcpv1.PurgeResult
for _, svc := range services {
components, err := registry.ListComponents(a.DB, svc.Name)
if err != nil {
return nil, fmt.Errorf("list components for %q: %w", svc.Name, err)
}
// If a specific component was requested, filter to just that one.
if req.GetComponent() != "" {
var filtered []registry.Component
for _, c := range components {
if c.Name == req.GetComponent() {
filtered = append(filtered, c)
}
}
components = filtered
}
for _, comp := range components {
result := a.evaluatePurge(svc.Name, &comp, defined, req.GetDryRun())
results = append(results, result)
}
// If all components of this service were purged (not dry-run),
// check if the service should be cleaned up too.
if !req.GetDryRun() {
remaining, err := registry.ListComponents(a.DB, svc.Name)
if err != nil {
a.Logger.Warn("failed to check remaining components", "service", svc.Name, "err", err)
continue
}
if len(remaining) == 0 {
if err := registry.DeleteService(a.DB, svc.Name); err != nil {
a.Logger.Warn("failed to delete empty service", "service", svc.Name, "err", err)
} else {
a.Logger.Info("purged empty service", "service", svc.Name)
}
}
}
}
return &mcpv1.PurgeResponse{Results: results}, nil
}
// purgeableStates are observed states that indicate a component's container
// is gone and the registry entry can be safely removed.
var purgeableStates = map[string]bool{
"removed": true,
"unknown": true,
"exited": true,
}
// evaluatePurge checks whether a single component is eligible for purge and,
// if not in dry-run mode, deletes it.
func (a *Agent) evaluatePurge(service string, comp *registry.Component, defined map[string]bool, dryRun bool) *mcpv1.PurgeResult {
key := service + "/" + comp.Name
// Safety: refuse to purge components with a live container.
if !purgeableStates[comp.ObservedState] {
return &mcpv1.PurgeResult{
Service: service,
Component: comp.Name,
Purged: false,
Reason: fmt.Sprintf("observed=%s, container still exists", comp.ObservedState),
}
}
// Don't purge components that are still in service definitions.
if defined[key] {
return &mcpv1.PurgeResult{
Service: service,
Component: comp.Name,
Purged: false,
Reason: "still in service definitions",
}
}
reason := fmt.Sprintf("observed=%s, not in service definitions", comp.ObservedState)
if dryRun {
return &mcpv1.PurgeResult{
Service: service,
Component: comp.Name,
Purged: true,
Reason: reason,
}
}
// Delete events first (events table has no FK to components).
if err := registry.DeleteComponentEvents(a.DB, service, comp.Name); err != nil {
a.Logger.Warn("failed to delete events during purge", "service", service, "component", comp.Name, "err", err)
}
// Delete the component (CASCADE handles ports, volumes, cmd).
if err := registry.DeleteComponent(a.DB, service, comp.Name); err != nil {
return &mcpv1.PurgeResult{
Service: service,
Component: comp.Name,
Purged: false,
Reason: fmt.Sprintf("delete failed: %v", err),
}
}
a.Logger.Info("purged component", "service", service, "component", comp.Name, "reason", reason)
return &mcpv1.PurgeResult{
Service: service,
Component: comp.Name,
Purged: true,
Reason: reason,
}
}
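The two guards in evaluatePurge compose into a single eligibility rule: the container must be gone (observed state in purgeableStates) and the component must not appear in any current service definition. A standalone sketch (purgeEligible is an illustrative name; the keys match the "service/component" form of DefinedComponents):

```go
package main

// purgeableStates mirrors purge.go: only containers that are gone may
// have their registry entries removed.
var purgeableStates = map[string]bool{
	"removed": true,
	"unknown": true,
	"exited":  true,
}

// purgeEligible reproduces the two guards in evaluatePurge: the
// container must be gone AND the component must not appear in any
// current service definition (keys are "service/component").
func purgeEligible(service, component, observedState string, defined map[string]bool) bool {
	if !purgeableStates[observedState] {
		return false
	}
	return !defined[service+"/"+component]
}
```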

@@ -0,0 +1,405 @@
package agent
import (
"context"
"testing"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
func TestPurgeComponentRemoved(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
// Set up a service with a stale component.
if err := registry.CreateService(a.DB, "mcns", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "coredns",
Service: "mcns",
Image: "coredns:latest",
DesiredState: "running",
ObservedState: "removed",
}); err != nil {
t.Fatalf("create component: %v", err)
}
// Insert an event for this component.
if err := registry.InsertEvent(a.DB, "mcns", "coredns", "running", "removed"); err != nil {
t.Fatalf("insert event: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{
DefinedComponents: []string{"mcns/mcns"},
})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 1 {
t.Fatalf("expected 1 result, got %d", len(resp.Results))
}
r := resp.Results[0]
if !r.Purged {
t.Fatalf("expected purged=true, got reason: %s", r.Reason)
}
if r.Service != "mcns" || r.Component != "coredns" {
t.Fatalf("unexpected result: %s/%s", r.Service, r.Component)
}
// Verify component was deleted.
_, err = registry.GetComponent(a.DB, "mcns", "coredns")
if err == nil {
t.Fatal("component should have been deleted")
}
// Service should also be deleted since it has no remaining components.
_, err = registry.GetService(a.DB, "mcns")
if err == nil {
t.Fatal("service should have been deleted (no remaining components)")
}
}
func TestPurgeRefusesRunning(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "mcr", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "api",
Service: "mcr",
Image: "mcr:latest",
DesiredState: "running",
ObservedState: "running",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{
Service: "mcr",
Component: "api",
})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 1 {
t.Fatalf("expected 1 result, got %d", len(resp.Results))
}
if resp.Results[0].Purged {
t.Fatal("should not purge a running component")
}
// Verify component still exists.
_, err = registry.GetComponent(a.DB, "mcr", "api")
if err != nil {
t.Fatalf("component should still exist: %v", err)
}
}
func TestPurgeRefusesStopped(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "mcr", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "api",
Service: "mcr",
Image: "mcr:latest",
DesiredState: "stopped",
ObservedState: "stopped",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{
Service: "mcr",
Component: "api",
})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if resp.Results[0].Purged {
t.Fatal("should not purge a stopped component")
}
}
func TestPurgeSkipsDefinedComponent(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "mcns", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "mcns",
Service: "mcns",
Image: "mcns:latest",
DesiredState: "running",
ObservedState: "exited",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{
DefinedComponents: []string{"mcns/mcns"},
})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 1 {
t.Fatalf("expected 1 result, got %d", len(resp.Results))
}
if resp.Results[0].Purged {
t.Fatal("should not purge a component that is still in service definitions")
}
if resp.Results[0].Reason != "still in service definitions" {
t.Fatalf("unexpected reason: %s", resp.Results[0].Reason)
}
}
func TestPurgeDryRun(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "mcns", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "coredns",
Service: "mcns",
Image: "coredns:latest",
DesiredState: "running",
ObservedState: "removed",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{
DryRun: true,
DefinedComponents: []string{"mcns/mcns"},
})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 1 {
t.Fatalf("expected 1 result, got %d", len(resp.Results))
}
if !resp.Results[0].Purged {
t.Fatal("dry run should report purged=true for eligible components")
}
// Verify component was NOT deleted (dry run).
_, err = registry.GetComponent(a.DB, "mcns", "coredns")
if err != nil {
t.Fatalf("component should still exist after dry run: %v", err)
}
}
func TestPurgeServiceFilter(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
// Create two services.
if err := registry.CreateService(a.DB, "mcns", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "coredns", Service: "mcns", Image: "coredns:latest",
DesiredState: "running", ObservedState: "removed",
}); err != nil {
t.Fatalf("create component: %v", err)
}
if err := registry.CreateService(a.DB, "mcr", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "old", Service: "mcr", Image: "old:latest",
DesiredState: "running", ObservedState: "removed",
}); err != nil {
t.Fatalf("create component: %v", err)
}
// Purge only mcns.
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{
Service: "mcns",
})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 1 {
t.Fatalf("expected 1 result, got %d", len(resp.Results))
}
if resp.Results[0].Service != "mcns" {
t.Fatalf("expected mcns, got %s", resp.Results[0].Service)
}
// mcr/old should still exist.
_, err = registry.GetComponent(a.DB, "mcr", "old")
if err != nil {
t.Fatalf("mcr/old should still exist: %v", err)
}
}
func TestPurgeServiceDeletedWhenEmpty(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "mcns", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "coredns", Service: "mcns", Image: "coredns:latest",
DesiredState: "running", ObservedState: "removed",
}); err != nil {
t.Fatalf("create component: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "old-thing", Service: "mcns", Image: "old:latest",
DesiredState: "stopped", ObservedState: "unknown",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
// Both components should be purged.
if len(resp.Results) != 2 {
t.Fatalf("expected 2 results, got %d", len(resp.Results))
}
for _, r := range resp.Results {
if !r.Purged {
t.Fatalf("expected purged=true for %s/%s: %s", r.Service, r.Component, r.Reason)
}
}
// Service should be deleted.
_, err = registry.GetService(a.DB, "mcns")
if err == nil {
t.Fatal("service should have been deleted")
}
}
func TestPurgeServiceKeptWhenComponentsRemain(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "mcns", true); err != nil {
t.Fatalf("create service: %v", err)
}
// Stale component (will be purged).
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "coredns", Service: "mcns", Image: "coredns:latest",
DesiredState: "running", ObservedState: "removed",
}); err != nil {
t.Fatalf("create component: %v", err)
}
// Live component (will not be purged).
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "mcns", Service: "mcns", Image: "mcns:latest",
DesiredState: "running", ObservedState: "running",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 2 {
t.Fatalf("expected 2 results, got %d", len(resp.Results))
}
// coredns should be purged, mcns should not.
purged := 0
for _, r := range resp.Results {
if r.Purged {
purged++
if r.Component != "coredns" {
t.Fatalf("expected coredns to be purged, got %s", r.Component)
}
}
}
if purged != 1 {
t.Fatalf("expected 1 purged, got %d", purged)
}
// Service should still exist.
_, err = registry.GetService(a.DB, "mcns")
if err != nil {
t.Fatalf("service should still exist: %v", err)
}
}
func TestPurgeExitedState(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "test", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "old", Service: "test", Image: "old:latest",
DesiredState: "stopped", ObservedState: "exited",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 1 || !resp.Results[0].Purged {
t.Fatalf("exited component should be purgeable")
}
}
func TestPurgeUnknownState(t *testing.T) {
rt := &fakeRuntime{}
a := newTestAgent(t, rt)
ctx := context.Background()
if err := registry.CreateService(a.DB, "test", true); err != nil {
t.Fatalf("create service: %v", err)
}
if err := registry.CreateComponent(a.DB, &registry.Component{
Name: "ghost", Service: "test", Image: "ghost:latest",
DesiredState: "running", ObservedState: "unknown",
}); err != nil {
t.Fatalf("create component: %v", err)
}
resp, err := a.PurgeComponent(ctx, &mcpv1.PurgeRequest{})
if err != nil {
t.Fatalf("PurgeComponent: %v", err)
}
if len(resp.Results) != 1 || !resp.Results[0].Purged {
t.Fatalf("unknown component should be purgeable")
}
}

@@ -3,12 +3,11 @@ package agent
import (
	"context"
	"fmt"
-	"strings"
	"time"
-	mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
-	"git.wntrmute.dev/kyle/mcp/internal/registry"
-	"git.wntrmute.dev/kyle/mcp/internal/runtime"
+	mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
+	"git.wntrmute.dev/mc/mcp/internal/registry"
+	"git.wntrmute.dev/mc/mcp/internal/runtime"
	"google.golang.org/protobuf/types/known/timestamppb"
)
@@ -75,7 +74,10 @@ func (a *Agent) liveCheckServices(ctx context.Context) ([]*mcpv1.ServiceInfo, error) {
	}
	var result []*mcpv1.ServiceInfo
+	knownServices := make(map[string]bool, len(services))
	for _, svc := range services {
+		knownServices[svc.Name] = true
		components, err := registry.ListComponents(a.DB, svc.Name)
		if err != nil {
			return nil, fmt.Errorf("list components for %q: %w", svc.Name, err)
@@ -87,7 +89,7 @@ func (a *Agent) liveCheckServices(ctx context.Context) ([]*mcpv1.ServiceInfo, error) {
		}
		for _, comp := range components {
-			containerName := svc.Name + "-" + comp.Name
+			containerName := ContainerNameFor(svc.Name, comp.Name)
			ci := &mcpv1.ComponentInfo{
				Name:  comp.Name,
				Image: comp.Image,
@@ -97,6 +99,12 @@ func (a *Agent) liveCheckServices(ctx context.Context) ([]*mcpv1.ServiceInfo, error) {
			if rc, ok := runtimeByName[containerName]; ok {
				ci.ObservedState = rc.State
+				if rc.Version != "" {
+					ci.Version = rc.Version
+				}
+				if rc.Image != "" {
+					ci.Image = rc.Image
+				}
				if !rc.Started.IsZero() {
					ci.Started = timestamppb.New(rc.Started)
				}
@@ -116,7 +124,7 @@ func (a *Agent) liveCheckServices(ctx context.Context) ([]*mcpv1.ServiceInfo, error) {
				continue
			}
-			svcName, compName := splitContainerName(c.Name)
+			svcName, compName := SplitContainerName(c.Name, knownServices)
			result = append(result, &mcpv1.ServiceInfo{
				Name: svcName,
@@ -210,13 +218,3 @@ func (a *Agent) GetServiceStatus(ctx context.Context, req *mcpv1.GetServiceStatusRequest)
		RecentEvents: protoEvents,
	}, nil
}
-// splitContainerName splits a container name like "metacrypt-api" into service
-// and component parts. If there is no hyphen, the whole name is used as both
-// the service and component name.
-func splitContainerName(name string) (service, component string) {
-	if i := strings.Index(name, "-"); i >= 0 {
-		return name[:i], name[i+1:]
-	}
-	return name, name
-}

@@ -4,9 +4,9 @@ import (
	"context"
	"testing"
-	mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
-	"git.wntrmute.dev/kyle/mcp/internal/registry"
-	"git.wntrmute.dev/kyle/mcp/internal/runtime"
+	mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
+	"git.wntrmute.dev/mc/mcp/internal/registry"
+	"git.wntrmute.dev/mc/mcp/internal/runtime"
)
func TestListServices(t *testing.T) {
@@ -253,22 +253,47 @@ func TestGetServiceStatus_IgnoreSkipsDrift(t *testing.T) {
}
func TestSplitContainerName(t *testing.T) {
+	known := map[string]bool{
+		"metacrypt": true,
+		"mc-proxy":  true,
+		"mcr":       true,
+	}
	tests := []struct {
		name    string
		service string
		comp    string
	}{
		{"metacrypt-api", "metacrypt", "api"},
-		{"metacrypt-web-ui", "metacrypt", "web-ui"},
+		{"metacrypt-web", "metacrypt", "web"},
+		{"mc-proxy", "mc-proxy", "mc-proxy"},
+		{"mcr-api", "mcr", "api"},
		{"standalone", "standalone", "standalone"},
+		{"unknown-thing", "unknown", "thing"},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
-			svc, comp := splitContainerName(tt.name)
+			svc, comp := SplitContainerName(tt.name, known)
			if svc != tt.service || comp != tt.comp {
-				t.Fatalf("splitContainerName(%q) = (%q, %q), want (%q, %q)",
+				t.Fatalf("SplitContainerName(%q) = (%q, %q), want (%q, %q)",
					tt.name, svc, comp, tt.service, tt.comp)
			}
		})
	}
}
+func TestContainerNameFor(t *testing.T) {
+	tests := []struct {
+		service, component, want string
+	}{
+		{"metacrypt", "api", "metacrypt-api"},
+		{"mc-proxy", "mc-proxy", "mc-proxy"},
+		{"mcr", "web", "mcr-web"},
+	}
+	for _, tt := range tests {
+		got := ContainerNameFor(tt.service, tt.component)
+		if got != tt.want {
+			t.Fatalf("ContainerNameFor(%q, %q) = %q, want %q",
+				tt.service, tt.component, got, tt.want)
+		}
+	}
+}
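The implementation of the new SplitContainerName is outside this hunk, but a sketch consistent with the updated test table, assuming longest-known-prefix matching over the live service names with a first-hyphen fallback, looks like this (the function name here is hypothetical):

```go
package main

import "strings"

// splitContainerNameSketch: try the longest known service name that is
// either the whole container name or a "<service>-" prefix; otherwise
// fall back to splitting at the first hyphen. (Illustrative sketch;
// the real SplitContainerName lives outside this hunk.)
func splitContainerNameSketch(name string, known map[string]bool) (service, component string) {
	best := ""
	for svc := range known {
		if len(svc) <= len(best) {
			continue
		}
		if name == svc || strings.HasPrefix(name, svc+"-") {
			best = svc
		}
	}
	if best != "" {
		if name == best {
			return best, best
		}
		return best, name[len(best)+1:]
	}
	if i := strings.Index(name, "-"); i >= 0 {
		return name[:i], name[i+1:]
	}
	return name, name
}
```

With the known set, "mc-proxy" resolves to itself rather than being split at its internal hyphen, which is exactly the case the old first-hyphen-only logic got wrong.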

@@ -5,9 +5,9 @@ import (
	"fmt"
	"strings"
-	mcpv1 "git.wntrmute.dev/kyle/mcp/gen/mcp/v1"
-	"git.wntrmute.dev/kyle/mcp/internal/registry"
-	"git.wntrmute.dev/kyle/mcp/internal/runtime"
+	mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
+	"git.wntrmute.dev/mc/mcp/internal/registry"
+	"git.wntrmute.dev/mc/mcp/internal/runtime"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)
@@ -157,6 +157,24 @@ func (a *Agent) reconcileUntracked(ctx context.Context, known map[string]bool) error {
// protoToComponent converts a proto ComponentSpec to a registry Component.
func protoToComponent(service string, cs *mcpv1.ComponentSpec, desiredState string) *registry.Component {
+	var routes []registry.Route
+	for _, r := range cs.GetRoutes() {
+		mode := r.GetMode()
+		if mode == "" {
+			mode = "l4"
+		}
+		name := r.GetName()
+		if name == "" {
+			name = "default"
+		}
+		routes = append(routes, registry.Route{
+			Name:     name,
+			Port:     int(r.GetPort()),
+			Mode:     mode,
+			Hostname: r.GetHostname(),
+		})
+	}
	return &registry.Component{
		Name:    cs.GetName(),
		Service: service,
@@ -167,6 +185,7 @@ func protoToComponent(service string, cs *mcpv1.ComponentSpec, desiredState string) *registry.Component {
		Ports:   cs.GetPorts(),
		Volumes: cs.GetVolumes(),
		Cmd:     cs.GetCmd(),
+		Routes:  routes,
		DesiredState: desiredState,
		Version:      runtime.ExtractVersion(cs.GetImage()),
	}

@@ -6,9 +6,9 @@ import (
	"path/filepath"
	"testing"
-	"git.wntrmute.dev/kyle/mcp/internal/config"
-	"git.wntrmute.dev/kyle/mcp/internal/registry"
-	"git.wntrmute.dev/kyle/mcp/internal/runtime"
+	"git.wntrmute.dev/mc/mcp/internal/config"
+	"git.wntrmute.dev/mc/mcp/internal/registry"
+	"git.wntrmute.dev/mc/mcp/internal/runtime"
)
// fakeRuntime implements runtime.Runtime for testing.
@@ -22,6 +22,10 @@ func (f *fakeRuntime) Pull(_ context.Context, _ string) error { return nil }
func (f *fakeRuntime) Run(_ context.Context, _ runtime.ContainerSpec) error  { return nil }
func (f *fakeRuntime) Stop(_ context.Context, _ string) error                { return nil }
func (f *fakeRuntime) Remove(_ context.Context, _ string) error              { return nil }
+func (f *fakeRuntime) Build(_ context.Context, _, _, _ string) error         { return nil }
+func (f *fakeRuntime) Push(_ context.Context, _ string) error                { return nil }
+func (f *fakeRuntime) ImageExists(_ context.Context, _ string) (bool, error) { return true, nil }
func (f *fakeRuntime) List(_ context.Context) ([]runtime.ContainerInfo, error) {
	return f.containers, f.listErr

internal/agent/undeploy.go Normal file

@@ -0,0 +1,100 @@
package agent
import (
"context"
"fmt"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/registry"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
)
// UndeployService fully tears down a service: removes routes, DNS records,
// TLS certificates, stops and removes containers, releases ports, and marks
// the service inactive. This is the inverse of Deploy.
func (a *Agent) UndeployService(ctx context.Context, req *mcpv1.UndeployServiceRequest) (*mcpv1.UndeployServiceResponse, error) {
a.Logger.Info("UndeployService", "service", req.GetName())
if req.GetName() == "" {
return nil, status.Error(codes.InvalidArgument, "service name is required")
}
serviceName := req.GetName()
components, err := registry.ListComponents(a.DB, serviceName)
if err != nil {
return nil, status.Errorf(codes.Internal, "list components: %v", err)
}
var results []*mcpv1.ComponentResult
dnsRemoved := false
for _, c := range components {
r := a.undeployComponent(ctx, serviceName, &c, &dnsRemoved)
results = append(results, r)
}
// Mark the service as inactive.
if err := registry.UpdateServiceActive(a.DB, serviceName, false); err != nil {
a.Logger.Warn("failed to mark service inactive", "service", serviceName, "err", err)
}
return &mcpv1.UndeployServiceResponse{Results: results}, nil
}
// undeployComponent tears down a single component. The dnsRemoved flag
// tracks whether DNS has already been removed for this service (DNS is
// per-service, not per-component).
func (a *Agent) undeployComponent(ctx context.Context, serviceName string, c *registry.Component, dnsRemoved *bool) *mcpv1.ComponentResult {
containerName := ContainerNameFor(serviceName, c.Name)
r := &mcpv1.ComponentResult{Name: c.Name, Success: true}
// 1. Remove mc-proxy routes.
if len(c.Routes) > 0 && a.Proxy != nil {
if err := a.Proxy.RemoveRoutes(ctx, serviceName, c.Routes); err != nil {
a.Logger.Warn("failed to remove routes", "service", serviceName, "component", c.Name, "err", err)
}
}
// 2. Remove DNS records (once per service).
if len(c.Routes) > 0 && a.DNS != nil && !*dnsRemoved {
if err := a.DNS.RemoveRecord(ctx, serviceName); err != nil {
a.Logger.Warn("failed to remove DNS record", "service", serviceName, "err", err)
}
*dnsRemoved = true
}
// 3. Remove TLS certs (L7 routes only).
if hasL7Routes(c.Routes) && a.Certs != nil {
if err := a.Certs.RemoveCert(serviceName); err != nil {
a.Logger.Warn("failed to remove TLS cert", "service", serviceName, "err", err)
}
}
// 4. Stop and remove the container.
if err := a.Runtime.Stop(ctx, containerName); err != nil {
a.Logger.Info("stop container (ignored)", "container", containerName, "error", err)
}
if err := a.Runtime.Remove(ctx, containerName); err != nil {
a.Logger.Info("remove container (ignored)", "container", containerName, "error", err)
}
// 5. Release allocated ports.
if a.PortAlloc != nil {
hostPorts, err := registry.GetRouteHostPorts(a.DB, serviceName, c.Name)
if err == nil {
for _, port := range hostPorts {
a.PortAlloc.Release(port)
}
}
}
// 6. Update registry state.
if err := registry.UpdateComponentState(a.DB, serviceName, c.Name, "removed", "removed"); err != nil {
r.Success = false
r.Error = fmt.Sprintf("update state: %v", err)
}
return r
}

View File

@@ -206,7 +206,10 @@ func TokenInfoFromContext(ctx context.Context) *TokenInfo {
}
// AuthInterceptor returns a gRPC unary server interceptor that validates
// bearer tokens. Any authenticated user or system account is accepted,
// except guests, which are explicitly rejected. The admin role is not
// required for agent operations; it is reserved for MCIAS account
// management and policy changes.
func AuthInterceptor(validator TokenValidator) grpc.UnaryServerInterceptor {
return func(
ctx context.Context,
@@ -240,9 +243,9 @@ func AuthInterceptor(validator TokenValidator) grpc.UnaryServerInterceptor {
return nil, status.Error(codes.Unauthenticated, "invalid token")
}
if tokenInfo.HasRole("guest") {
slog.Warn("guest access denied", "method", info.FullMethod, "user", tokenInfo.Username)
return nil, status.Error(codes.PermissionDenied, "guest access not permitted")
}
slog.Info("rpc", "method", info.FullMethod, "user", tokenInfo.Username, "account_type", tokenInfo.AccountType)
@@ -252,6 +255,52 @@ func AuthInterceptor(validator TokenValidator) grpc.UnaryServerInterceptor {
}
}
// StreamAuthInterceptor returns a gRPC stream server interceptor with
// the same authentication rules as AuthInterceptor.
func StreamAuthInterceptor(validator TokenValidator) grpc.StreamServerInterceptor {
return func(
srv any,
ss grpc.ServerStream,
info *grpc.StreamServerInfo,
handler grpc.StreamHandler,
) error {
md, ok := metadata.FromIncomingContext(ss.Context())
if !ok {
return status.Error(codes.Unauthenticated, "missing metadata")
}
authValues := md.Get("authorization")
if len(authValues) == 0 {
return status.Error(codes.Unauthenticated, "missing authorization header")
}
authHeader := authValues[0]
if !strings.HasPrefix(authHeader, "Bearer ") {
return status.Error(codes.Unauthenticated, "malformed authorization header")
}
token := strings.TrimPrefix(authHeader, "Bearer ")
tokenInfo, err := validator.ValidateToken(ss.Context(), token)
if err != nil {
slog.Error("token validation failed", "method", info.FullMethod, "error", err)
return status.Error(codes.Unauthenticated, "token validation failed")
}
if !tokenInfo.Valid {
return status.Error(codes.Unauthenticated, "invalid token")
}
if tokenInfo.HasRole("guest") {
slog.Warn("guest access denied", "method", info.FullMethod, "user", tokenInfo.Username)
return status.Error(codes.PermissionDenied, "guest access not permitted")
}
slog.Info("rpc", "method", info.FullMethod, "user", tokenInfo.Username, "account_type", tokenInfo.AccountType)
return handler(srv, ss)
}
}
// Login authenticates with MCIAS and returns a bearer token.
func Login(serverURL, caCertPath, username, password string) (string, error) {
client, err := newHTTPClient(caCertPath)

View File

@@ -126,7 +126,7 @@ func TestInterceptorRejectsInvalidToken(t *testing.T) {
}
}
func TestInterceptorAcceptsRegularUser(t *testing.T) {
server := mockMCIAS(t, func(authHeader string) (any, int) {
return &TokenInfo{
Valid: true,
@@ -142,6 +142,28 @@ func TestInterceptorRejectsNonAdmin(t *testing.T) {
md := metadata.Pairs("authorization", "Bearer user-token")
ctx := metadata.NewIncomingContext(context.Background(), md)
_, err := callInterceptor(ctx, v)
if err != nil {
t.Fatalf("expected regular user to be accepted, got %v", err)
}
}
func TestInterceptorRejectsGuest(t *testing.T) {
server := mockMCIAS(t, func(authHeader string) (any, int) {
return &TokenInfo{
Valid: true,
Username: "visitor",
Roles: []string{"guest"},
AccountType: "human",
}, http.StatusOK
})
defer server.Close()
v := validatorFromServer(t, server)
md := metadata.Pairs("authorization", "Bearer guest-token")
ctx := metadata.NewIncomingContext(context.Background(), md)
_, err := callInterceptor(ctx, v)
if err == nil {
t.Fatal("expected error, got nil")

View File

@@ -14,10 +14,65 @@ type AgentConfig struct {
Database DatabaseConfig `toml:"database"`
MCIAS MCIASConfig `toml:"mcias"`
Agent AgentSettings `toml:"agent"`
MCProxy MCProxyConfig `toml:"mcproxy"`
Metacrypt MetacryptConfig `toml:"metacrypt"`
MCNS MCNSConfig `toml:"mcns"`
Monitor MonitorConfig `toml:"monitor"`
Log LogConfig `toml:"log"`
}
// MetacryptConfig holds the Metacrypt CA integration settings for
// automated TLS cert provisioning. If ServerURL is empty, cert
// provisioning is disabled.
type MetacryptConfig struct {
// ServerURL is the Metacrypt API base URL (e.g. "https://metacrypt:8443").
ServerURL string `toml:"server_url"`
// CACert is the path to the CA certificate for verifying Metacrypt's TLS.
CACert string `toml:"ca_cert"`
// Mount is the CA engine mount name. Defaults to "pki".
Mount string `toml:"mount"`
// Issuer is the intermediate CA issuer name. Defaults to "infra".
Issuer string `toml:"issuer"`
// TokenPath is the path to the MCIAS service token file.
TokenPath string `toml:"token_path"`
}
// MCNSConfig holds the MCNS DNS integration settings for automated
// DNS record registration. If ServerURL is empty, DNS registration
// is disabled.
type MCNSConfig struct {
// ServerURL is the MCNS API base URL (e.g. "https://localhost:28443").
ServerURL string `toml:"server_url"`
// CACert is the path to the CA certificate for verifying MCNS's TLS.
CACert string `toml:"ca_cert"`
// TokenPath is the path to the MCIAS service token file.
TokenPath string `toml:"token_path"`
// Zone is the DNS zone for service records. Defaults to "svc.mcp.metacircular.net".
Zone string `toml:"zone"`
// NodeAddr is the IP address to register as the A record value.
NodeAddr string `toml:"node_addr"`
}
// MCProxyConfig holds the mc-proxy connection settings.
type MCProxyConfig struct {
// Socket is the path to the mc-proxy gRPC admin API Unix socket.
// If empty, route registration is disabled.
Socket string `toml:"socket"`
// CertDir is the directory containing TLS certificates for routes.
// Convention: <service>.pem and <service>.key per service.
// Defaults to /srv/mc-proxy/certs.
CertDir string `toml:"cert_dir"`
}
// ServerConfig holds gRPC server listen address and TLS paths.
type ServerConfig struct {
GRPCAddr string `toml:"grpc_addr"`
@@ -134,6 +189,18 @@ func applyAgentDefaults(cfg *AgentConfig) {
if cfg.Agent.ContainerRuntime == "" {
cfg.Agent.ContainerRuntime = "podman"
}
if cfg.MCProxy.CertDir == "" {
cfg.MCProxy.CertDir = "/srv/mc-proxy/certs"
}
if cfg.Metacrypt.Mount == "" {
cfg.Metacrypt.Mount = "pki"
}
if cfg.Metacrypt.Issuer == "" {
cfg.Metacrypt.Issuer = "infra"
}
if cfg.MCNS.Zone == "" {
cfg.MCNS.Zone = "svc.mcp.metacircular.net"
}
}
func applyAgentEnvOverrides(cfg *AgentConfig) {
@@ -158,6 +225,27 @@ func applyAgentEnvOverrides(cfg *AgentConfig) {
if v := os.Getenv("MCP_AGENT_LOG_LEVEL"); v != "" {
cfg.Log.Level = v
}
if v := os.Getenv("MCP_AGENT_MCPROXY_SOCKET"); v != "" {
cfg.MCProxy.Socket = v
}
if v := os.Getenv("MCP_AGENT_MCPROXY_CERT_DIR"); v != "" {
cfg.MCProxy.CertDir = v
}
if v := os.Getenv("MCP_AGENT_METACRYPT_SERVER_URL"); v != "" {
cfg.Metacrypt.ServerURL = v
}
if v := os.Getenv("MCP_AGENT_METACRYPT_TOKEN_PATH"); v != "" {
cfg.Metacrypt.TokenPath = v
}
if v := os.Getenv("MCP_AGENT_MCNS_SERVER_URL"); v != "" {
cfg.MCNS.ServerURL = v
}
if v := os.Getenv("MCP_AGENT_MCNS_TOKEN_PATH"); v != "" {
cfg.MCNS.TokenPath = v
}
if v := os.Getenv("MCP_AGENT_MCNS_NODE_ADDR"); v != "" {
cfg.MCNS.NodeAddr = v
}
}
func validateAgentConfig(cfg *AgentConfig) error {

View File

@@ -3,6 +3,7 @@ package config
import (
"fmt"
"os"
"strings"
toml "github.com/pelletier/go-toml/v2"
)
@@ -10,9 +11,23 @@ import (
// CLIConfig is the configuration for the mcp CLI binary.
type CLIConfig struct {
Services ServicesConfig `toml:"services"`
Build BuildConfig `toml:"build"`
MCIAS MCIASConfig `toml:"mcias"`
Auth AuthConfig `toml:"auth"`
Nodes []NodeConfig `toml:"nodes"`
Master *CLIMasterConfig `toml:"master,omitempty"`
}
// CLIMasterConfig holds the optional master connection settings.
// When configured, deploy/undeploy/status go through the master
// instead of directly to agents.
type CLIMasterConfig struct {
Address string `toml:"address"` // master gRPC address (e.g. "100.95.252.120:9555")
}
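The struct above maps to a small optional TOML section; a minimal sketch, with an illustrative address taken from the field comment:

```toml
# Optional: when present, deploy/undeploy route through the master.
[master]
address = "100.95.252.120:9555"
```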
// BuildConfig holds settings for building container images.
type BuildConfig struct {
Workspace string `toml:"workspace"`
} }
// ServicesConfig defines where service definition files live. // ServicesConfig defines where service definition files live.
@@ -66,6 +81,9 @@ func applyCLIEnvOverrides(cfg *CLIConfig) {
if v := os.Getenv("MCP_SERVICES_DIR"); v != "" {
cfg.Services.Dir = v
}
if v := os.Getenv("MCP_BUILD_WORKSPACE"); v != "" {
cfg.Build.Workspace = v
}
if v := os.Getenv("MCP_MCIAS_SERVER_URL"); v != "" {
cfg.MCIAS.ServerURL = v
}
@@ -93,5 +111,15 @@ func validateCLIConfig(cfg *CLIConfig) error {
if cfg.Auth.TokenPath == "" {
return fmt.Errorf("auth.token_path is required")
}
// Expand ~ in workspace path.
if strings.HasPrefix(cfg.Build.Workspace, "~/") {
home, err := os.UserHomeDir()
if err != nil {
return fmt.Errorf("expand workspace path: %w", err)
}
cfg.Build.Workspace = home + cfg.Build.Workspace[1:]
}
return nil
}

View File

@@ -163,6 +163,19 @@ func TestLoadAgentConfig(t *testing.T) {
if cfg.Log.Level != "debug" {
t.Fatalf("log.level: got %q", cfg.Log.Level)
}
// Metacrypt defaults when section is omitted.
if cfg.Metacrypt.Mount != "pki" {
t.Fatalf("metacrypt.mount default: got %q, want pki", cfg.Metacrypt.Mount)
}
if cfg.Metacrypt.Issuer != "infra" {
t.Fatalf("metacrypt.issuer default: got %q, want infra", cfg.Metacrypt.Issuer)
}
// MCNS defaults when section is omitted.
if cfg.MCNS.Zone != "svc.mcp.metacircular.net" {
t.Fatalf("mcns.zone default: got %q, want svc.mcp.metacircular.net", cfg.MCNS.Zone)
}
}
func TestCLIConfigValidation(t *testing.T) {
@@ -439,6 +452,155 @@ level = "info"
})
}
func TestAgentConfigMetacrypt(t *testing.T) {
cfgStr := `
[server]
grpc_addr = "0.0.0.0:9444"
tls_cert = "/srv/mcp/cert.pem"
tls_key = "/srv/mcp/key.pem"
[database]
path = "/srv/mcp/mcp.db"
[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcp-agent"
[agent]
node_name = "rift"
[metacrypt]
server_url = "https://metacrypt.metacircular.net:8443"
ca_cert = "/etc/mcp/metacircular-ca.pem"
mount = "custom-pki"
issuer = "custom-issuer"
token_path = "/srv/mcp/metacrypt-token"
`
path := writeTempConfig(t, cfgStr)
cfg, err := LoadAgentConfig(path)
if err != nil {
t.Fatalf("load: %v", err)
}
if cfg.Metacrypt.ServerURL != "https://metacrypt.metacircular.net:8443" {
t.Fatalf("metacrypt.server_url: got %q", cfg.Metacrypt.ServerURL)
}
if cfg.Metacrypt.CACert != "/etc/mcp/metacircular-ca.pem" {
t.Fatalf("metacrypt.ca_cert: got %q", cfg.Metacrypt.CACert)
}
if cfg.Metacrypt.Mount != "custom-pki" {
t.Fatalf("metacrypt.mount: got %q", cfg.Metacrypt.Mount)
}
if cfg.Metacrypt.Issuer != "custom-issuer" {
t.Fatalf("metacrypt.issuer: got %q", cfg.Metacrypt.Issuer)
}
if cfg.Metacrypt.TokenPath != "/srv/mcp/metacrypt-token" {
t.Fatalf("metacrypt.token_path: got %q", cfg.Metacrypt.TokenPath)
}
}
func TestAgentConfigMetacryptEnvOverrides(t *testing.T) {
minimal := `
[server]
grpc_addr = "0.0.0.0:9444"
tls_cert = "/srv/mcp/cert.pem"
tls_key = "/srv/mcp/key.pem"
[database]
path = "/srv/mcp/mcp.db"
[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcp-agent"
[agent]
node_name = "rift"
`
t.Setenv("MCP_AGENT_METACRYPT_SERVER_URL", "https://override.metacrypt:8443")
t.Setenv("MCP_AGENT_METACRYPT_TOKEN_PATH", "/override/token")
path := writeTempConfig(t, minimal)
cfg, err := LoadAgentConfig(path)
if err != nil {
t.Fatalf("load: %v", err)
}
if cfg.Metacrypt.ServerURL != "https://override.metacrypt:8443" {
t.Fatalf("metacrypt.server_url: got %q", cfg.Metacrypt.ServerURL)
}
if cfg.Metacrypt.TokenPath != "/override/token" {
t.Fatalf("metacrypt.token_path: got %q", cfg.Metacrypt.TokenPath)
}
}
func TestAgentConfigMCNS(t *testing.T) {
cfgStr := `
[server]
grpc_addr = "0.0.0.0:9444"
tls_cert = "/srv/mcp/cert.pem"
tls_key = "/srv/mcp/key.pem"
[database]
path = "/srv/mcp/mcp.db"
[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcp-agent"
[agent]
node_name = "rift"
[mcns]
server_url = "https://localhost:28443"
ca_cert = "/srv/mcp/certs/metacircular-ca.pem"
token_path = "/srv/mcp/metacrypt-token"
zone = "custom.zone"
node_addr = "10.0.0.1"
`
path := writeTempConfig(t, cfgStr)
cfg, err := LoadAgentConfig(path)
if err != nil {
t.Fatalf("load: %v", err)
}
if cfg.MCNS.ServerURL != "https://localhost:28443" {
t.Fatalf("mcns.server_url: got %q", cfg.MCNS.ServerURL)
}
if cfg.MCNS.CACert != "/srv/mcp/certs/metacircular-ca.pem" {
t.Fatalf("mcns.ca_cert: got %q", cfg.MCNS.CACert)
}
if cfg.MCNS.Zone != "custom.zone" {
t.Fatalf("mcns.zone: got %q", cfg.MCNS.Zone)
}
if cfg.MCNS.NodeAddr != "10.0.0.1" {
t.Fatalf("mcns.node_addr: got %q", cfg.MCNS.NodeAddr)
}
}
func TestAgentConfigMCNSEnvOverrides(t *testing.T) {
minimal := `
[server]
grpc_addr = "0.0.0.0:9444"
tls_cert = "/srv/mcp/cert.pem"
tls_key = "/srv/mcp/key.pem"
[database]
path = "/srv/mcp/mcp.db"
[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcp-agent"
[agent]
node_name = "rift"
`
t.Setenv("MCP_AGENT_MCNS_SERVER_URL", "https://override:28443")
t.Setenv("MCP_AGENT_MCNS_TOKEN_PATH", "/override/token")
t.Setenv("MCP_AGENT_MCNS_NODE_ADDR", "10.0.0.99")
path := writeTempConfig(t, minimal)
cfg, err := LoadAgentConfig(path)
if err != nil {
t.Fatalf("load: %v", err)
}
if cfg.MCNS.ServerURL != "https://override:28443" {
t.Fatalf("mcns.server_url: got %q", cfg.MCNS.ServerURL)
}
if cfg.MCNS.TokenPath != "/override/token" {
t.Fatalf("mcns.token_path: got %q", cfg.MCNS.TokenPath)
}
if cfg.MCNS.NodeAddr != "10.0.0.99" {
t.Fatalf("mcns.node_addr: got %q", cfg.MCNS.NodeAddr)
}
}
func TestDurationParsing(t *testing.T) {
tests := []struct {
input string

168
internal/config/master.go Normal file
View File

@@ -0,0 +1,168 @@
package config
import (
"fmt"
"os"
"time"
toml "github.com/pelletier/go-toml/v2"
)
// MasterConfig is the configuration for the mcp-master daemon.
type MasterConfig struct {
Server ServerConfig `toml:"server"`
Database DatabaseConfig `toml:"database"`
MCIAS MCIASConfig `toml:"mcias"`
Edge EdgeConfig `toml:"edge"`
Registration RegistrationConfig `toml:"registration"`
Timeouts TimeoutsConfig `toml:"timeouts"`
MCNS MCNSConfig `toml:"mcns"`
Log LogConfig `toml:"log"`
Nodes []MasterNodeConfig `toml:"nodes"`
// Master holds the master's own MCIAS service token for dialing agents.
Master MasterSettings `toml:"master"`
}
// MasterSettings holds settings specific to the master's own identity.
type MasterSettings struct {
// ServiceTokenPath is the path to the MCIAS service token file
// used by the master to authenticate to agents.
ServiceTokenPath string `toml:"service_token_path"`
// CACert is the path to the CA certificate for verifying agent TLS.
CACert string `toml:"ca_cert"`
}
// EdgeConfig holds settings for edge route management.
type EdgeConfig struct {
// AllowedDomains is the list of domains that public hostnames
// must fall under. Validation uses proper domain label matching.
AllowedDomains []string `toml:"allowed_domains"`
}
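The "proper domain label matching" mentioned above is not shown in this hunk; a minimal sketch of what label-boundary matching typically looks like (the `allowed` helper is hypothetical, not the project's `isAllowedDomain`):

```go
package main

import (
	"fmt"
	"strings"
)

// allowed reports whether hostname equals an allowed domain or is a
// subdomain of one. Matching on the "." boundary prevents
// "evilmetacircular.net" from matching "metacircular.net".
func allowed(hostname string, domains []string) bool {
	for _, d := range domains {
		if hostname == d || strings.HasSuffix(hostname, "."+d) {
			return true
		}
	}
	return false
}

func main() {
	domains := []string{"metacircular.net"}
	fmt.Println(allowed("app.metacircular.net", domains))  // true
	fmt.Println(allowed("evilmetacircular.net", domains)) // false
}
```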
// RegistrationConfig holds agent registration settings.
type RegistrationConfig struct {
// AllowedAgents is the list of MCIAS service identities permitted
// to register with the master (e.g., "agent-rift", "agent-svc").
AllowedAgents []string `toml:"allowed_agents"`
// MaxNodes is the maximum number of registered nodes.
MaxNodes int `toml:"max_nodes"`
}
// TimeoutsConfig holds timeout durations for master operations.
type TimeoutsConfig struct {
Deploy Duration `toml:"deploy"`
EdgeRoute Duration `toml:"edge_route"`
HealthCheck Duration `toml:"health_check"`
Undeploy Duration `toml:"undeploy"`
Snapshot Duration `toml:"snapshot"`
}
// MasterNodeConfig is a bootstrap node entry in the master config.
type MasterNodeConfig struct {
Name string `toml:"name"`
Address string `toml:"address"`
Role string `toml:"role"` // "worker", "edge", or "master"
}
// LoadMasterConfig reads and validates a master configuration file.
func LoadMasterConfig(path string) (*MasterConfig, error) {
data, err := os.ReadFile(path) //nolint:gosec // config path from trusted CLI flag
if err != nil {
return nil, fmt.Errorf("read config %q: %w", path, err)
}
var cfg MasterConfig
if err := toml.Unmarshal(data, &cfg); err != nil {
return nil, fmt.Errorf("parse config %q: %w", path, err)
}
applyMasterDefaults(&cfg)
applyMasterEnvOverrides(&cfg)
if err := validateMasterConfig(&cfg); err != nil {
return nil, fmt.Errorf("validate config: %w", err)
}
return &cfg, nil
}
func applyMasterDefaults(cfg *MasterConfig) {
if cfg.Log.Level == "" {
cfg.Log.Level = "info"
}
if cfg.Registration.MaxNodes == 0 {
cfg.Registration.MaxNodes = 16
}
if cfg.Timeouts.Deploy.Duration == 0 {
cfg.Timeouts.Deploy.Duration = 5 * time.Minute
}
if cfg.Timeouts.EdgeRoute.Duration == 0 {
cfg.Timeouts.EdgeRoute.Duration = 30 * time.Second
}
if cfg.Timeouts.HealthCheck.Duration == 0 {
cfg.Timeouts.HealthCheck.Duration = 5 * time.Second
}
if cfg.Timeouts.Undeploy.Duration == 0 {
cfg.Timeouts.Undeploy.Duration = 2 * time.Minute
}
if cfg.Timeouts.Snapshot.Duration == 0 {
cfg.Timeouts.Snapshot.Duration = 10 * time.Minute
}
if cfg.MCNS.Zone == "" {
cfg.MCNS.Zone = "svc.mcp.metacircular.net"
}
for i := range cfg.Nodes {
if cfg.Nodes[i].Role == "" {
cfg.Nodes[i].Role = "worker"
}
}
}
func applyMasterEnvOverrides(cfg *MasterConfig) {
if v := os.Getenv("MCP_MASTER_SERVER_GRPC_ADDR"); v != "" {
cfg.Server.GRPCAddr = v
}
if v := os.Getenv("MCP_MASTER_SERVER_TLS_CERT"); v != "" {
cfg.Server.TLSCert = v
}
if v := os.Getenv("MCP_MASTER_SERVER_TLS_KEY"); v != "" {
cfg.Server.TLSKey = v
}
if v := os.Getenv("MCP_MASTER_DATABASE_PATH"); v != "" {
cfg.Database.Path = v
}
if v := os.Getenv("MCP_MASTER_LOG_LEVEL"); v != "" {
cfg.Log.Level = v
}
}
func validateMasterConfig(cfg *MasterConfig) error {
if cfg.Server.GRPCAddr == "" {
return fmt.Errorf("server.grpc_addr is required")
}
if cfg.Server.TLSCert == "" {
return fmt.Errorf("server.tls_cert is required")
}
if cfg.Server.TLSKey == "" {
return fmt.Errorf("server.tls_key is required")
}
if cfg.Database.Path == "" {
return fmt.Errorf("database.path is required")
}
if cfg.MCIAS.ServerURL == "" {
return fmt.Errorf("mcias.server_url is required")
}
if cfg.MCIAS.ServiceName == "" {
return fmt.Errorf("mcias.service_name is required")
}
if len(cfg.Nodes) == 0 {
return fmt.Errorf("at least one [[nodes]] entry is required")
}
if cfg.Master.ServiceTokenPath == "" {
return fmt.Errorf("master.service_token_path is required")
}
return nil
}
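Putting the required fields and `[[nodes]]` entries together, a minimal config that passes `validateMasterConfig` might look like this (all paths and addresses are illustrative; see deploy/examples/mcp-master.toml for the full version):

```toml
[server]
grpc_addr = "0.0.0.0:9555"
tls_cert = "/srv/mcp-master/cert.pem"
tls_key = "/srv/mcp-master/key.pem"

[database]
path = "/srv/mcp-master/master.db"

[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcp-master"

[master]
service_token_path = "/srv/mcp-master/service-token"
ca_cert = "/etc/mcp/metacircular-ca.pem"

[[nodes]]
name = "rift"
address = "rift.scylla-hammerhead.ts.net:9444"
role = "worker"
```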

View File

@@ -0,0 +1,190 @@
// Package master implements the mcp-master orchestrator.
package master
import (
"context"
"crypto/tls"
"crypto/x509"
"fmt"
"os"
"strings"
"sync"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials"
"google.golang.org/grpc/metadata"
)
// AgentClient wraps a gRPC connection to a single mcp-agent.
type AgentClient struct {
conn *grpc.ClientConn
client mcpv1.McpAgentServiceClient
Node string
}
// DialAgent connects to an agent at the given address using TLS 1.3.
// The token is attached to every outgoing RPC via metadata.
func DialAgent(address, caCertPath, token string) (*AgentClient, error) {
tlsConfig := &tls.Config{
MinVersion: tls.VersionTLS13,
}
if caCertPath != "" {
caCert, err := os.ReadFile(caCertPath) //nolint:gosec // trusted config path
if err != nil {
return nil, fmt.Errorf("read CA cert %q: %w", caCertPath, err)
}
pool := x509.NewCertPool()
if !pool.AppendCertsFromPEM(caCert) {
return nil, fmt.Errorf("invalid CA cert %q", caCertPath)
}
tlsConfig.RootCAs = pool
}
conn, err := grpc.NewClient(
address,
grpc.WithTransportCredentials(credentials.NewTLS(tlsConfig)),
grpc.WithUnaryInterceptor(agentTokenInterceptor(token)),
grpc.WithStreamInterceptor(agentStreamTokenInterceptor(token)),
)
if err != nil {
return nil, fmt.Errorf("dial agent %q: %w", address, err)
}
return &AgentClient{
conn: conn,
client: mcpv1.NewMcpAgentServiceClient(conn),
}, nil
}
// Close closes the underlying gRPC connection.
func (c *AgentClient) Close() error {
if c == nil || c.conn == nil {
return nil
}
return c.conn.Close()
}
// Deploy forwards a deploy request to the agent.
func (c *AgentClient) Deploy(ctx context.Context, req *mcpv1.DeployRequest) (*mcpv1.DeployResponse, error) {
return c.client.Deploy(ctx, req)
}
// UndeployService forwards an undeploy request to the agent.
func (c *AgentClient) UndeployService(ctx context.Context, req *mcpv1.UndeployServiceRequest) (*mcpv1.UndeployServiceResponse, error) {
return c.client.UndeployService(ctx, req)
}
// GetServiceStatus queries a service's status on the agent.
func (c *AgentClient) GetServiceStatus(ctx context.Context, req *mcpv1.GetServiceStatusRequest) (*mcpv1.GetServiceStatusResponse, error) {
return c.client.GetServiceStatus(ctx, req)
}
// ListServices lists all services on the agent.
func (c *AgentClient) ListServices(ctx context.Context, req *mcpv1.ListServicesRequest) (*mcpv1.ListServicesResponse, error) {
return c.client.ListServices(ctx, req)
}
// SetupEdgeRoute sets up an edge route on the agent.
func (c *AgentClient) SetupEdgeRoute(ctx context.Context, req *mcpv1.SetupEdgeRouteRequest) (*mcpv1.SetupEdgeRouteResponse, error) {
return c.client.SetupEdgeRoute(ctx, req)
}
// RemoveEdgeRoute removes an edge route from the agent.
func (c *AgentClient) RemoveEdgeRoute(ctx context.Context, req *mcpv1.RemoveEdgeRouteRequest) (*mcpv1.RemoveEdgeRouteResponse, error) {
return c.client.RemoveEdgeRoute(ctx, req)
}
// ListEdgeRoutes lists edge routes on the agent.
func (c *AgentClient) ListEdgeRoutes(ctx context.Context, req *mcpv1.ListEdgeRoutesRequest) (*mcpv1.ListEdgeRoutesResponse, error) {
return c.client.ListEdgeRoutes(ctx, req)
}
// HealthCheck checks the agent's health.
func (c *AgentClient) HealthCheck(ctx context.Context, req *mcpv1.HealthCheckRequest) (*mcpv1.HealthCheckResponse, error) {
return c.client.HealthCheck(ctx, req)
}
// agentTokenInterceptor attaches the bearer token to outgoing RPCs.
func agentTokenInterceptor(token string) grpc.UnaryClientInterceptor {
return func(ctx context.Context, method string, req, reply any, cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
ctx = metadata.AppendToOutgoingContext(ctx, "authorization", "Bearer "+token)
return invoker(ctx, method, req, reply, cc, opts...)
}
}
func agentStreamTokenInterceptor(token string) grpc.StreamClientInterceptor {
return func(ctx context.Context, desc *grpc.StreamDesc, cc *grpc.ClientConn, method string, streamer grpc.Streamer, opts ...grpc.CallOption) (grpc.ClientStream, error) {
ctx = metadata.AppendToOutgoingContext(ctx, "authorization", "Bearer "+token)
return streamer(ctx, desc, cc, method, opts...)
}
}
// AgentPool manages connections to multiple agents, keyed by node name.
type AgentPool struct {
mu sync.RWMutex
clients map[string]*AgentClient
caCert string
token string
}
// NewAgentPool creates a pool with the given CA cert and service token.
func NewAgentPool(caCertPath, token string) *AgentPool {
return &AgentPool{
clients: make(map[string]*AgentClient),
caCert: caCertPath,
token: token,
}
}
// AddNode dials an agent and adds it to the pool.
func (p *AgentPool) AddNode(name, address string) error {
client, err := DialAgent(address, p.caCert, p.token)
if err != nil {
return fmt.Errorf("add node %s: %w", name, err)
}
client.Node = name
p.mu.Lock()
defer p.mu.Unlock()
// Close existing connection if re-adding.
if old, ok := p.clients[name]; ok {
_ = old.Close()
}
p.clients[name] = client
return nil
}
// Get returns the agent client for a node.
func (p *AgentPool) Get(name string) (*AgentClient, error) {
p.mu.RLock()
defer p.mu.RUnlock()
client, ok := p.clients[name]
if !ok {
return nil, fmt.Errorf("node %q not found in pool", name)
}
return client, nil
}
// Close closes all agent connections.
func (p *AgentPool) Close() {
p.mu.Lock()
defer p.mu.Unlock()
for _, c := range p.clients {
_ = c.Close()
}
p.clients = make(map[string]*AgentClient)
}
// LoadServiceToken reads a token from a file path.
func LoadServiceToken(path string) (string, error) {
data, err := os.ReadFile(path) //nolint:gosec // trusted config path
if err != nil {
return "", fmt.Errorf("read service token %q: %w", path, err)
}
return strings.TrimSpace(string(data)), nil
}
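AddNode's close-on-replace behavior is worth calling out: re-adding a node must close the stale connection so it cannot leak. A self-contained sketch of that semantics, with a stand-in `conn` type instead of the real *AgentClient:

```go
package main

import (
	"fmt"
	"sync"
)

// conn stands in for *AgentClient; only Close matters here.
type conn struct{ closed bool }

func (c *conn) Close() error { c.closed = true; return nil }

// pool mirrors AgentPool's map-plus-mutex shape and its
// close-on-replace rule in add.
type pool struct {
	mu      sync.Mutex
	clients map[string]*conn
}

func (p *pool) add(name string, c *conn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	// Close the existing connection if re-adding under the same name.
	if old, ok := p.clients[name]; ok {
		_ = old.Close()
	}
	p.clients[name] = c
}

func main() {
	p := &pool{clients: make(map[string]*conn)}
	first := &conn{}
	p.add("rift", first)
	p.add("rift", &conn{}) // re-add: first gets closed
	fmt.Println(first.closed)
}
```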

222
internal/master/deploy.go Normal file
View File

@@ -0,0 +1,222 @@
package master
import (
"context"
"fmt"
"net"
"strings"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/masterdb"
)
// Deploy handles the MasterDeployRequest: places the service, forwards to
// the agent, registers DNS, and coordinates edge routing.
func (m *Master) Deploy(ctx context.Context, req *mcpv1.MasterDeployRequest) (*mcpv1.MasterDeployResponse, error) {
spec := req.GetService()
if spec == nil || spec.GetName() == "" {
return nil, fmt.Errorf("service spec with name is required")
}
serviceName := spec.GetName()
tier := spec.GetTier()
if tier == "" {
tier = "worker"
}
m.Logger.Info("Deploy", "service", serviceName, "tier", tier, "node_override", spec.GetNode())
resp := &mcpv1.MasterDeployResponse{}
// Step 1: Place service.
nodeName := spec.GetNode()
if nodeName == "" {
var err error
switch tier {
case "core":
nodeName, err = FindMasterNode(m.DB)
default:
nodeName, err = PickNode(m.DB)
}
if err != nil {
resp.Error = fmt.Sprintf("placement failed: %v", err)
return resp, nil
}
}
resp.Node = nodeName
node, err := masterdb.GetNode(m.DB, nodeName)
if err != nil || node == nil {
resp.Error = fmt.Sprintf("node %q not found", nodeName)
return resp, nil
}
// Resolve the node's address to an IP for DNS registration.
// Node addresses may be Tailscale DNS names (e.g., rift.scylla-hammerhead.ts.net:9444)
// but MCNS needs an IP address for A records.
nodeHost, _, err := net.SplitHostPort(node.Address)
if err != nil {
resp.Error = fmt.Sprintf("invalid node address %q: %v", node.Address, err)
return resp, nil
}
// If nodeHost is not an IP, resolve it.
if net.ParseIP(nodeHost) == nil {
ips, lookupErr := net.LookupHost(nodeHost)
if lookupErr != nil || len(ips) == 0 {
m.Logger.Warn("cannot resolve node address", "host", nodeHost, "err", lookupErr)
} else {
nodeHost = ips[0]
}
}
// Step 2: Forward deploy to the agent.
client, err := m.Pool.Get(nodeName)
if err != nil {
resp.Error = fmt.Sprintf("agent connection: %v", err)
return resp, nil
}
deployCtx, deployCancel := context.WithTimeout(ctx, m.Config.Timeouts.Deploy.Duration)
defer deployCancel()
deployResp, err := client.Deploy(deployCtx, &mcpv1.DeployRequest{
Service: spec,
})
if err != nil {
resp.DeployResult = &mcpv1.StepResult{Step: "deploy", Error: err.Error()}
resp.Error = fmt.Sprintf("agent deploy failed: %v", err)
return resp, nil
}
resp.DeployResult = &mcpv1.StepResult{Step: "deploy", Success: true}
// Check agent-side results for failures.
for _, cr := range deployResp.GetResults() {
if !cr.GetSuccess() {
resp.DeployResult.Success = false
resp.DeployResult.Error = fmt.Sprintf("component %s: %s", cr.GetName(), cr.GetError())
resp.Error = resp.DeployResult.Error
return resp, nil
}
}
// Step 3: Register DNS using the Tailnet IP derived from the node address.
if m.DNS != nil {
if err := m.DNS.EnsureRecord(ctx, serviceName, nodeHost); err != nil {
m.Logger.Warn("DNS registration failed", "service", serviceName, "err", err)
resp.DnsResult = &mcpv1.StepResult{Step: "dns", Error: err.Error()}
} else {
resp.DnsResult = &mcpv1.StepResult{Step: "dns", Success: true}
}
}
// Record placement.
if err := masterdb.CreatePlacement(m.DB, serviceName, nodeName, tier); err != nil {
m.Logger.Error("record placement", "service", serviceName, "err", err)
}
// Steps 4-9: Detect public routes and coordinate edge routing.
edgeResult := m.setupEdgeRoutes(ctx, spec, serviceName, nodeHost)
if edgeResult != nil {
resp.EdgeRouteResult = edgeResult
}
// Compute overall success.
resp.Success = true
if resp.DeployResult != nil && !resp.DeployResult.Success {
resp.Success = false
}
if resp.EdgeRouteResult != nil && !resp.EdgeRouteResult.Success {
resp.Success = false
}
m.Logger.Info("deploy complete", "service", serviceName, "node", nodeName, "success", resp.Success)
return resp, nil
}
// setupEdgeRoutes detects public routes and coordinates edge routing.
func (m *Master) setupEdgeRoutes(ctx context.Context, spec *mcpv1.ServiceSpec, serviceName, nodeHost string) *mcpv1.StepResult {
var publicRoutes []*mcpv1.RouteSpec
for _, comp := range spec.GetComponents() {
for _, route := range comp.GetRoutes() {
if route.GetPublic() && route.GetHostname() != "" {
publicRoutes = append(publicRoutes, route)
}
}
}
if len(publicRoutes) == 0 {
return nil
}
// Find the edge node.
edgeNodeName, err := FindEdgeNode(m.DB)
if err != nil {
return &mcpv1.StepResult{Step: "edge_route", Error: fmt.Sprintf("no edge node: %v", err)}
}
edgeClient, err := m.Pool.Get(edgeNodeName)
if err != nil {
return &mcpv1.StepResult{Step: "edge_route", Error: fmt.Sprintf("edge agent connection: %v", err)}
}
var lastErr string
for _, route := range publicRoutes {
hostname := route.GetHostname()
// Validate hostname against allowed domains.
if !m.isAllowedDomain(hostname) {
lastErr = fmt.Sprintf("hostname %q not under an allowed domain", hostname)
m.Logger.Warn("edge route rejected", "hostname", hostname, "reason", lastErr)
continue
}
// Construct the backend hostname from the service name and the DNS
// zone, matching the A record registered in step 3.
zone := "metacircular.net"
if m.DNS != nil && m.DNS.Zone() != "" {
zone = m.DNS.Zone()
}
backendHostname := serviceName + "." + zone
edgeCtx, edgeCancel := context.WithTimeout(ctx, m.Config.Timeouts.EdgeRoute.Duration)
_, setupErr := edgeClient.SetupEdgeRoute(edgeCtx, &mcpv1.SetupEdgeRouteRequest{
Hostname: hostname,
BackendHostname: backendHostname,
BackendPort: route.GetPort(),
BackendTls: true,
})
edgeCancel()
if setupErr != nil {
lastErr = fmt.Sprintf("setup edge route %s: %v", hostname, setupErr)
m.Logger.Warn("edge route setup failed", "hostname", hostname, "err", setupErr)
continue
}
// Record edge route in master DB.
if dbErr := masterdb.CreateEdgeRoute(m.DB, hostname, serviceName, edgeNodeName, backendHostname, int(route.GetPort())); dbErr != nil {
m.Logger.Warn("record edge route", "hostname", hostname, "err", dbErr)
}
m.Logger.Info("edge route established", "hostname", hostname, "edge_node", edgeNodeName)
}
if lastErr != "" {
return &mcpv1.StepResult{Step: "edge_route", Error: lastErr}
}
return &mcpv1.StepResult{Step: "edge_route", Success: true}
}
// isAllowedDomain checks if hostname falls under one of the configured
// allowed domains using proper domain label matching.
func (m *Master) isAllowedDomain(hostname string) bool {
if len(m.Config.Edge.AllowedDomains) == 0 {
return true // no restrictions configured
}
for _, domain := range m.Config.Edge.AllowedDomains {
if hostname == domain || strings.HasSuffix(hostname, "."+domain) {
return true
}
}
return false
}

internal/master/dns.go Normal file
@@ -0,0 +1,252 @@
package master
import (
"bytes"
"context"
"crypto/tls"
"crypto/x509"
"encoding/json"
"fmt"
"io"
"log/slog"
"net/http"
"os"
"strings"
"time"
"git.wntrmute.dev/mc/mcp/internal/auth"
"git.wntrmute.dev/mc/mcp/internal/config"
)
// DNSClient creates and removes A records in MCNS. Unlike the agent's
// DNSRegistrar, the master registers records for different node IPs
// (the nodeAddr is a per-call parameter, not a fixed config value).
type DNSClient struct {
serverURL string
token string
zone string
httpClient *http.Client
logger *slog.Logger
}
type dnsRecord struct {
ID int `json:"ID"`
Name string `json:"Name"`
Type string `json:"Type"`
Value string `json:"Value"`
TTL int `json:"TTL"`
}
// NewDNSClient creates a DNS client. Returns (nil, nil) if serverURL is empty.
func NewDNSClient(cfg config.MCNSConfig, logger *slog.Logger) (*DNSClient, error) {
if cfg.ServerURL == "" {
logger.Info("mcns not configured, DNS registration disabled")
return nil, nil
}
token, err := auth.LoadToken(cfg.TokenPath)
if err != nil {
return nil, fmt.Errorf("load mcns token: %w", err)
}
httpClient, err := newHTTPClient(cfg.CACert)
if err != nil {
return nil, fmt.Errorf("create mcns HTTP client: %w", err)
}
logger.Info("master DNS client enabled", "server", cfg.ServerURL, "zone", cfg.Zone)
return &DNSClient{
serverURL: strings.TrimRight(cfg.ServerURL, "/"),
token: token,
zone: cfg.Zone,
httpClient: httpClient,
logger: logger,
}, nil
}
// Zone returns the configured DNS zone.
func (d *DNSClient) Zone() string {
if d == nil {
return ""
}
return d.zone
}
// EnsureRecord ensures an A record exists for serviceName pointing to nodeAddr.
func (d *DNSClient) EnsureRecord(ctx context.Context, serviceName, nodeAddr string) error {
if d == nil {
return nil
}
existing, err := d.listRecords(ctx, serviceName)
if err != nil {
return fmt.Errorf("list DNS records: %w", err)
}
for _, r := range existing {
if r.Value == nodeAddr {
d.logger.Debug("DNS record exists", "service", serviceName, "value", r.Value)
return nil
}
}
if len(existing) > 0 {
d.logger.Info("updating DNS record", "service", serviceName,
"old_value", existing[0].Value, "new_value", nodeAddr)
return d.updateRecord(ctx, existing[0].ID, serviceName, nodeAddr)
}
d.logger.Info("creating DNS record", "service", serviceName,
"record", serviceName+"."+d.zone, "value", nodeAddr)
return d.createRecord(ctx, serviceName, nodeAddr)
}
// RemoveRecord removes A records for serviceName.
func (d *DNSClient) RemoveRecord(ctx context.Context, serviceName string) error {
if d == nil {
return nil
}
existing, err := d.listRecords(ctx, serviceName)
if err != nil {
return fmt.Errorf("list DNS records: %w", err)
}
for _, r := range existing {
d.logger.Info("removing DNS record", "service", serviceName, "id", r.ID)
if err := d.deleteRecord(ctx, r.ID); err != nil {
return err
}
}
return nil
}
func (d *DNSClient) listRecords(ctx context.Context, serviceName string) ([]dnsRecord, error) {
url := fmt.Sprintf("%s/v1/zones/%s/records?name=%s&type=A", d.serverURL, d.zone, serviceName)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return nil, fmt.Errorf("create list request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return nil, fmt.Errorf("list records: %w", err)
}
defer func() { _ = resp.Body.Close() }()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("read list response: %w", err)
}
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("list records: mcns returned %d: %s", resp.StatusCode, string(body))
}
var envelope struct {
Records []dnsRecord `json:"records"`
}
if err := json.Unmarshal(body, &envelope); err != nil {
return nil, fmt.Errorf("parse list response: %w", err)
}
return envelope.Records, nil
}
func (d *DNSClient) createRecord(ctx context.Context, serviceName, nodeAddr string) error {
reqBody, _ := json.Marshal(map[string]interface{}{
"name": serviceName, "type": "A", "value": nodeAddr, "ttl": 300,
})
url := fmt.Sprintf("%s/v1/zones/%s/records", d.serverURL, d.zone)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(reqBody))
if err != nil {
return fmt.Errorf("create record request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return fmt.Errorf("create record: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusCreated && resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return fmt.Errorf("create record: mcns returned %d: %s", resp.StatusCode, string(respBody))
}
return nil
}
func (d *DNSClient) updateRecord(ctx context.Context, recordID int, serviceName, nodeAddr string) error {
reqBody, _ := json.Marshal(map[string]interface{}{
"name": serviceName, "type": "A", "value": nodeAddr, "ttl": 300,
})
url := fmt.Sprintf("%s/v1/zones/%s/records/%d", d.serverURL, d.zone, recordID)
req, err := http.NewRequestWithContext(ctx, http.MethodPut, url, bytes.NewReader(reqBody))
if err != nil {
return fmt.Errorf("create update request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return fmt.Errorf("update record: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return fmt.Errorf("update record: mcns returned %d: %s", resp.StatusCode, string(respBody))
}
return nil
}
func (d *DNSClient) deleteRecord(ctx context.Context, recordID int) error {
url := fmt.Sprintf("%s/v1/zones/%s/records/%d", d.serverURL, d.zone, recordID)
req, err := http.NewRequestWithContext(ctx, http.MethodDelete, url, nil)
if err != nil {
return fmt.Errorf("create delete request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+d.token)
resp, err := d.httpClient.Do(req)
if err != nil {
return fmt.Errorf("delete record: %w", err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusNoContent && resp.StatusCode != http.StatusOK {
respBody, _ := io.ReadAll(resp.Body)
return fmt.Errorf("delete record: mcns returned %d: %s", resp.StatusCode, string(respBody))
}
return nil
}
func newHTTPClient(caCertPath string) (*http.Client, error) {
tlsConfig := &tls.Config{
MinVersion: tls.VersionTLS13,
}
if caCertPath != "" {
caCert, err := os.ReadFile(caCertPath) //nolint:gosec // path from trusted config
if err != nil {
return nil, fmt.Errorf("read CA cert %q: %w", caCertPath, err)
}
pool := x509.NewCertPool()
if !pool.AppendCertsFromPEM(caCert) {
return nil, fmt.Errorf("parse CA cert %q: no valid certificates found", caCertPath)
}
tlsConfig.RootCAs = pool
}
return &http.Client{
Timeout: 30 * time.Second,
Transport: &http.Transport{
TLSClientConfig: tlsConfig,
},
}, nil
}

internal/master/master.go Normal file
@@ -0,0 +1,159 @@
package master
import (
"context"
"crypto/tls"
"database/sql"
"fmt"
"log/slog"
"net"
"os"
"os/signal"
"syscall"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/auth"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/masterdb"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials"
)
// Master is the MCP cluster master. It coordinates multi-node deployments,
// manages edge routes, and stores cluster state.
type Master struct {
mcpv1.UnimplementedMcpMasterServiceServer
Config *config.MasterConfig
DB *sql.DB
Pool *AgentPool
DNS *DNSClient
Logger *slog.Logger
Version string
}
// Run starts the master: opens the database, bootstraps nodes, sets up the
// gRPC server with TLS and auth, and blocks until SIGINT/SIGTERM.
func Run(cfg *config.MasterConfig, version string) error {
logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
Level: parseLogLevel(cfg.Log.Level),
}))
// Open master database.
db, err := masterdb.Open(cfg.Database.Path)
if err != nil {
return fmt.Errorf("open master database: %w", err)
}
defer func() { _ = db.Close() }()
// Bootstrap nodes from config.
for _, n := range cfg.Nodes {
if err := masterdb.UpsertNode(db, n.Name, n.Address, n.Role, "amd64"); err != nil {
return fmt.Errorf("bootstrap node %s: %w", n.Name, err)
}
logger.Info("bootstrapped node", "name", n.Name, "address", n.Address, "role", n.Role)
}
// Load service token for dialing agents.
token, err := LoadServiceToken(cfg.Master.ServiceTokenPath)
if err != nil {
return fmt.Errorf("load service token: %w", err)
}
// Create agent connection pool.
pool := NewAgentPool(cfg.Master.CACert, token)
for _, n := range cfg.Nodes {
if addErr := pool.AddNode(n.Name, n.Address); addErr != nil {
logger.Warn("failed to connect to agent", "node", n.Name, "err", addErr)
// Non-fatal: the node may come up later.
}
}
// Create DNS client.
dns, err := NewDNSClient(cfg.MCNS, logger)
if err != nil {
return fmt.Errorf("create DNS client: %w", err)
}
m := &Master{
Config: cfg,
DB: db,
Pool: pool,
DNS: dns,
Logger: logger,
Version: version,
}
// TLS.
tlsCert, err := tls.LoadX509KeyPair(cfg.Server.TLSCert, cfg.Server.TLSKey)
if err != nil {
return fmt.Errorf("load TLS cert: %w", err)
}
tlsConfig := &tls.Config{
Certificates: []tls.Certificate{tlsCert},
MinVersion: tls.VersionTLS13,
}
// Auth interceptor (same as agent — validates MCIAS tokens).
validator, err := auth.NewMCIASValidator(cfg.MCIAS.ServerURL, cfg.MCIAS.CACert)
if err != nil {
return fmt.Errorf("create MCIAS validator: %w", err)
}
// gRPC server.
server := grpc.NewServer(
grpc.Creds(credentials.NewTLS(tlsConfig)),
grpc.ChainUnaryInterceptor(
auth.AuthInterceptor(validator),
),
grpc.ChainStreamInterceptor(
auth.StreamAuthInterceptor(validator),
),
)
mcpv1.RegisterMcpMasterServiceServer(server, m)
// Listen.
lis, err := net.Listen("tcp", cfg.Server.GRPCAddr)
if err != nil {
return fmt.Errorf("listen %q: %w", cfg.Server.GRPCAddr, err)
}
logger.Info("master starting",
"addr", cfg.Server.GRPCAddr,
"version", version,
"nodes", len(cfg.Nodes),
)
// Signal handling.
ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
defer stop()
errCh := make(chan error, 1)
go func() {
errCh <- server.Serve(lis)
}()
select {
case <-ctx.Done():
logger.Info("shutting down")
server.GracefulStop()
pool.Close()
return nil
case err := <-errCh:
pool.Close()
return fmt.Errorf("serve: %w", err)
}
}
func parseLogLevel(level string) slog.Level {
switch level {
case "debug":
return slog.LevelDebug
case "warn":
return slog.LevelWarn
case "error":
return slog.LevelError
default:
return slog.LevelInfo
}
}


@@ -0,0 +1,64 @@
package master
import (
"database/sql"
"fmt"
"sort"
"git.wntrmute.dev/mc/mcp/internal/masterdb"
)
// PickNode selects the best worker node for a new service deployment.
// Algorithm: fewest placed services, ties broken alphabetically.
func PickNode(db *sql.DB) (string, error) {
workers, err := masterdb.ListWorkerNodes(db)
if err != nil {
return "", fmt.Errorf("list workers: %w", err)
}
if len(workers) == 0 {
return "", fmt.Errorf("no worker nodes available")
}
counts, err := masterdb.CountPlacementsPerNode(db)
if err != nil {
return "", fmt.Errorf("count placements: %w", err)
}
// Sort: fewest placements first, then alphabetically.
sort.Slice(workers, func(i, j int) bool {
ci := counts[workers[i].Name]
cj := counts[workers[j].Name]
if ci != cj {
return ci < cj
}
return workers[i].Name < workers[j].Name
})
return workers[0].Name, nil
}
// FindMasterNode returns the name of the node with role "master".
func FindMasterNode(db *sql.DB) (string, error) {
nodes, err := masterdb.ListNodes(db)
if err != nil {
return "", fmt.Errorf("list nodes: %w", err)
}
for _, n := range nodes {
if n.Role == "master" {
return n.Name, nil
}
}
return "", fmt.Errorf("no master node found")
}
// FindEdgeNode returns the name of the first edge node.
func FindEdgeNode(db *sql.DB) (string, error) {
edges, err := masterdb.ListEdgeNodes(db)
if err != nil {
return "", fmt.Errorf("list edge nodes: %w", err)
}
if len(edges) == 0 {
return "", fmt.Errorf("no edge nodes available")
}
return edges[0].Name, nil
}

internal/master/status.go Normal file

@@ -0,0 +1,130 @@
package master
import (
"context"
"fmt"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/masterdb"
)
// Status returns the status of services across the fleet.
func (m *Master) Status(ctx context.Context, req *mcpv1.MasterStatusRequest) (*mcpv1.MasterStatusResponse, error) {
m.Logger.Debug("Status", "service", req.GetServiceName())
resp := &mcpv1.MasterStatusResponse{}
// If a specific service is requested, look up its placement.
if name := req.GetServiceName(); name != "" {
placement, err := masterdb.GetPlacement(m.DB, name)
if err != nil {
return nil, fmt.Errorf("lookup placement: %w", err)
}
if placement == nil {
return resp, nil // empty — service not found
}
ss := m.getServiceStatus(ctx, placement)
resp.Services = append(resp.Services, ss)
return resp, nil
}
// All services.
placements, err := masterdb.ListPlacements(m.DB)
if err != nil {
return nil, fmt.Errorf("list placements: %w", err)
}
for _, p := range placements {
ss := m.getServiceStatus(ctx, p)
resp.Services = append(resp.Services, ss)
}
return resp, nil
}
func (m *Master) getServiceStatus(ctx context.Context, p *masterdb.Placement) *mcpv1.ServiceStatus {
ss := &mcpv1.ServiceStatus{
Name: p.ServiceName,
Node: p.Node,
Tier: p.Tier,
Status: "unknown",
}
// Query the agent for live status.
client, err := m.Pool.Get(p.Node)
if err != nil {
ss.Status = "unreachable"
return ss
}
statusCtx, cancel := context.WithTimeout(ctx, m.Config.Timeouts.HealthCheck.Duration)
defer cancel()
agentResp, err := client.GetServiceStatus(statusCtx, &mcpv1.GetServiceStatusRequest{
Name: p.ServiceName,
})
if err != nil {
ss.Status = "unreachable"
return ss
}
// Map agent status to master status.
for _, info := range agentResp.GetServices() {
if info.GetName() == p.ServiceName {
if info.GetActive() {
ss.Status = "running"
} else {
ss.Status = "stopped"
}
break
}
}
// Attach edge route info.
edgeRoutes, err := masterdb.ListEdgeRoutesForService(m.DB, p.ServiceName)
if err == nil {
for _, er := range edgeRoutes {
ss.EdgeRoutes = append(ss.EdgeRoutes, &mcpv1.EdgeRouteStatus{
Hostname: er.Hostname,
EdgeNode: er.EdgeNode,
})
}
}
return ss
}
// ListNodes returns all nodes in the registry with placement counts.
func (m *Master) ListNodes(_ context.Context, _ *mcpv1.ListNodesRequest) (*mcpv1.ListNodesResponse, error) {
m.Logger.Debug("ListNodes")
nodes, err := masterdb.ListNodes(m.DB)
if err != nil {
return nil, fmt.Errorf("list nodes: %w", err)
}
counts, err := masterdb.CountPlacementsPerNode(m.DB)
if err != nil {
return nil, fmt.Errorf("count placements: %w", err)
}
resp := &mcpv1.ListNodesResponse{}
for _, n := range nodes {
ni := &mcpv1.NodeInfo{
Name: n.Name,
Role: n.Role,
Address: n.Address,
Arch: n.Arch,
Status: n.Status,
Containers: int32(n.Containers), //nolint:gosec // small number
Services: int32(counts[n.Name]), //nolint:gosec // small number
}
if n.LastHeartbeat != nil {
ni.LastHeartbeat = n.LastHeartbeat.Format("2006-01-02T15:04:05Z")
}
resp.Nodes = append(resp.Nodes, ni)
}
return resp, nil
}


@@ -0,0 +1,94 @@
package master
import (
"context"
"fmt"
mcpv1 "git.wntrmute.dev/mc/mcp/gen/mcp/v1"
"git.wntrmute.dev/mc/mcp/internal/masterdb"
)
// Undeploy handles MasterUndeployRequest: removes edge routes, DNS, then
// forwards the undeploy to the worker agent.
func (m *Master) Undeploy(ctx context.Context, req *mcpv1.MasterUndeployRequest) (*mcpv1.MasterUndeployResponse, error) {
serviceName := req.GetServiceName()
if serviceName == "" {
return nil, fmt.Errorf("service_name is required")
}
m.Logger.Info("Undeploy", "service", serviceName)
// Look up placement.
placement, err := masterdb.GetPlacement(m.DB, serviceName)
if err != nil {
return &mcpv1.MasterUndeployResponse{Error: fmt.Sprintf("lookup placement: %v", err)}, nil
}
if placement == nil {
return &mcpv1.MasterUndeployResponse{Error: fmt.Sprintf("service %q not found in placements", serviceName)}, nil
}
// Step 1: Undeploy on worker first (stops the backend).
client, err := m.Pool.Get(placement.Node)
if err != nil {
return &mcpv1.MasterUndeployResponse{Error: fmt.Sprintf("agent connection: %v", err)}, nil
}
undeployCtx, undeployCancel := context.WithTimeout(ctx, m.Config.Timeouts.Undeploy.Duration)
defer undeployCancel()
_, undeployErr := client.UndeployService(undeployCtx, &mcpv1.UndeployServiceRequest{
Name: serviceName,
})
if undeployErr != nil {
m.Logger.Warn("agent undeploy failed", "service", serviceName, "node", placement.Node, "err", undeployErr)
// Continue — still clean up edge routes and records.
}
// Step 2: Remove edge routes.
edgeRoutes, err := masterdb.ListEdgeRoutesForService(m.DB, serviceName)
if err != nil {
m.Logger.Warn("list edge routes for undeploy", "service", serviceName, "err", err)
}
for _, er := range edgeRoutes {
edgeClient, getErr := m.Pool.Get(er.EdgeNode)
if getErr != nil {
m.Logger.Warn("edge agent connection", "edge_node", er.EdgeNode, "err", getErr)
continue
}
edgeCtx, edgeCancel := context.WithTimeout(ctx, m.Config.Timeouts.EdgeRoute.Duration)
_, removeErr := edgeClient.RemoveEdgeRoute(edgeCtx, &mcpv1.RemoveEdgeRouteRequest{
Hostname: er.Hostname,
})
edgeCancel()
if removeErr != nil {
m.Logger.Warn("remove edge route", "hostname", er.Hostname, "err", removeErr)
} else {
m.Logger.Info("edge route removed", "hostname", er.Hostname, "edge_node", er.EdgeNode)
}
}
// Step 3: Remove DNS.
if m.DNS != nil {
if dnsErr := m.DNS.RemoveRecord(ctx, serviceName); dnsErr != nil {
m.Logger.Warn("DNS removal failed", "service", serviceName, "err", dnsErr)
}
}
// Step 4: Clean up records.
_ = masterdb.DeleteEdgeRoutesForService(m.DB, serviceName)
_ = masterdb.DeletePlacement(m.DB, serviceName)
success := undeployErr == nil
var errMsg string
if !success {
errMsg = fmt.Sprintf("agent undeploy: %v", undeployErr)
}
m.Logger.Info("undeploy complete", "service", serviceName, "success", success)
return &mcpv1.MasterUndeployResponse{
Success: success,
Error: errMsg,
}, nil
}

internal/masterdb/db.go Normal file

@@ -0,0 +1,106 @@
// Package masterdb provides the SQLite database for the mcp-master daemon.
// It stores the cluster-wide node registry, service placements, and edge routes.
// This is separate from the agent's registry (internal/registry/) because the
// master and agent have fundamentally different schemas.
package masterdb
import (
"database/sql"
"fmt"
_ "modernc.org/sqlite"
)
// Open opens the master database at the given path and runs migrations.
func Open(path string) (*sql.DB, error) {
db, err := sql.Open("sqlite", path)
if err != nil {
return nil, fmt.Errorf("open database: %w", err)
}
for _, pragma := range []string{
"PRAGMA journal_mode = WAL",
"PRAGMA foreign_keys = ON",
"PRAGMA busy_timeout = 5000",
} {
if _, err := db.Exec(pragma); err != nil {
_ = db.Close()
return nil, fmt.Errorf("exec %q: %w", pragma, err)
}
}
if err := migrate(db); err != nil {
_ = db.Close()
return nil, fmt.Errorf("migrate: %w", err)
}
return db, nil
}
func migrate(db *sql.DB) error {
_, err := db.Exec(`
CREATE TABLE IF NOT EXISTS schema_migrations (
version INTEGER PRIMARY KEY,
applied_at TEXT NOT NULL DEFAULT (datetime('now'))
);
`)
if err != nil {
return fmt.Errorf("create migrations table: %w", err)
}
for i, m := range migrations {
version := i + 1
var count int
if err := db.QueryRow("SELECT COUNT(*) FROM schema_migrations WHERE version = ?", version).Scan(&count); err != nil {
return fmt.Errorf("check migration %d: %w", version, err)
}
if count > 0 {
continue
}
if _, err := db.Exec(m); err != nil {
return fmt.Errorf("run migration %d: %w", version, err)
}
if _, err := db.Exec("INSERT INTO schema_migrations (version) VALUES (?)", version); err != nil {
return fmt.Errorf("record migration %d: %w", version, err)
}
}
return nil
}
var migrations = []string{
// Migration 1: cluster state
`
CREATE TABLE IF NOT EXISTS nodes (
name TEXT PRIMARY KEY,
address TEXT NOT NULL,
role TEXT NOT NULL DEFAULT 'worker',
arch TEXT NOT NULL DEFAULT 'amd64',
status TEXT NOT NULL DEFAULT 'unknown',
containers INTEGER NOT NULL DEFAULT 0,
last_heartbeat TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS placements (
service_name TEXT PRIMARY KEY,
node TEXT NOT NULL REFERENCES nodes(name),
tier TEXT NOT NULL DEFAULT 'worker',
deployed_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS edge_routes (
hostname TEXT PRIMARY KEY,
service_name TEXT NOT NULL,
edge_node TEXT NOT NULL REFERENCES nodes(name),
backend_hostname TEXT NOT NULL,
backend_port INTEGER NOT NULL,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX IF NOT EXISTS idx_edge_routes_service
ON edge_routes(service_name);
`,
}


@@ -0,0 +1,185 @@
package masterdb
import (
"database/sql"
"path/filepath"
"testing"
)
func openTestDB(t *testing.T) *sql.DB {
t.Helper()
path := filepath.Join(t.TempDir(), "test.db")
db, err := Open(path)
if err != nil {
t.Fatalf("Open: %v", err)
}
t.Cleanup(func() { _ = db.Close() })
return db
}
func TestOpenAndMigrate(t *testing.T) {
openTestDB(t)
}
func TestNodeCRUD(t *testing.T) {
db := openTestDB(t)
if err := UpsertNode(db, "rift", "100.95.252.120:9444", "master", "amd64"); err != nil {
t.Fatalf("UpsertNode: %v", err)
}
if err := UpsertNode(db, "svc", "100.106.232.4:9555", "edge", "amd64"); err != nil {
t.Fatalf("UpsertNode: %v", err)
}
if err := UpsertNode(db, "orion", "100.1.2.3:9444", "worker", "amd64"); err != nil {
t.Fatalf("UpsertNode: %v", err)
}
// Get.
n, err := GetNode(db, "rift")
if err != nil {
t.Fatalf("GetNode: %v", err)
}
if n == nil || n.Address != "100.95.252.120:9444" {
t.Errorf("GetNode(rift) = %+v", n)
}
// Get nonexistent.
n, err = GetNode(db, "nonexistent")
if err != nil {
t.Fatalf("GetNode: %v", err)
}
if n != nil {
t.Errorf("expected nil for nonexistent node")
}
// List all.
nodes, err := ListNodes(db)
if err != nil {
t.Fatalf("ListNodes: %v", err)
}
if len(nodes) != 3 {
t.Errorf("ListNodes: got %d, want 3", len(nodes))
}
// List workers (includes master role).
workers, err := ListWorkerNodes(db)
if err != nil {
t.Fatalf("ListWorkerNodes: %v", err)
}
if len(workers) != 2 {
t.Errorf("ListWorkerNodes: got %d, want 2 (rift+orion)", len(workers))
}
// List edge.
edges, err := ListEdgeNodes(db)
if err != nil {
t.Fatalf("ListEdgeNodes: %v", err)
}
if len(edges) != 1 || edges[0].Name != "svc" {
t.Errorf("ListEdgeNodes: got %v", edges)
}
// Update status.
if err := UpdateNodeStatus(db, "rift", "healthy"); err != nil {
t.Fatalf("UpdateNodeStatus: %v", err)
}
n, _ = GetNode(db, "rift")
if n.Status != "healthy" {
t.Errorf("status = %q, want healthy", n.Status)
}
}
func TestPlacementCRUD(t *testing.T) {
db := openTestDB(t)
_ = UpsertNode(db, "rift", "100.95.252.120:9444", "master", "amd64")
_ = UpsertNode(db, "orion", "100.1.2.3:9444", "worker", "amd64")
if err := CreatePlacement(db, "mcq", "rift", "worker"); err != nil {
t.Fatalf("CreatePlacement: %v", err)
}
if err := CreatePlacement(db, "mcdoc", "orion", "worker"); err != nil {
t.Fatalf("CreatePlacement: %v", err)
}
p, err := GetPlacement(db, "mcq")
if err != nil {
t.Fatalf("GetPlacement: %v", err)
}
if p == nil || p.Node != "rift" {
t.Errorf("GetPlacement(mcq) = %+v", p)
}
p, _ = GetPlacement(db, "nonexistent")
if p != nil {
t.Errorf("expected nil for nonexistent placement")
}
counts, err := CountPlacementsPerNode(db)
if err != nil {
t.Fatalf("CountPlacementsPerNode: %v", err)
}
if counts["rift"] != 1 || counts["orion"] != 1 {
t.Errorf("counts = %v", counts)
}
placements, err := ListPlacements(db)
if err != nil {
t.Fatalf("ListPlacements: %v", err)
}
if len(placements) != 2 {
t.Errorf("ListPlacements: got %d", len(placements))
}
if err := DeletePlacement(db, "mcq"); err != nil {
t.Fatalf("DeletePlacement: %v", err)
}
p, _ = GetPlacement(db, "mcq")
if p != nil {
t.Errorf("expected nil after delete")
}
}
func TestEdgeRouteCRUD(t *testing.T) {
db := openTestDB(t)
_ = UpsertNode(db, "svc", "100.106.232.4:9555", "edge", "amd64")
if err := CreateEdgeRoute(db, "mcq.metacircular.net", "mcq", "svc", "mcq.svc.mcp.metacircular.net", 8443); err != nil {
t.Fatalf("CreateEdgeRoute: %v", err)
}
if err := CreateEdgeRoute(db, "docs.metacircular.net", "mcdoc", "svc", "mcdoc.svc.mcp.metacircular.net", 443); err != nil {
t.Fatalf("CreateEdgeRoute: %v", err)
}
routes, err := ListEdgeRoutes(db)
if err != nil {
t.Fatalf("ListEdgeRoutes: %v", err)
}
if len(routes) != 2 {
t.Errorf("ListEdgeRoutes: got %d", len(routes))
}
routes, err = ListEdgeRoutesForService(db, "mcq")
if err != nil {
t.Fatalf("ListEdgeRoutesForService: %v", err)
}
if len(routes) != 1 || routes[0].Hostname != "mcq.metacircular.net" {
t.Errorf("ListEdgeRoutesForService(mcq) = %v", routes)
}
if err := DeleteEdgeRoute(db, "mcq.metacircular.net"); err != nil {
t.Fatalf("DeleteEdgeRoute: %v", err)
}
routes, _ = ListEdgeRoutes(db)
if len(routes) != 1 {
t.Errorf("expected 1 route after delete, got %d", len(routes))
}
_ = CreateEdgeRoute(db, "docs2.metacircular.net", "mcdoc", "svc", "mcdoc.svc.mcp.metacircular.net", 443)
if err := DeleteEdgeRoutesForService(db, "mcdoc"); err != nil {
t.Fatalf("DeleteEdgeRoutesForService: %v", err)
}
routes, _ = ListEdgeRoutes(db)
if len(routes) != 0 {
t.Errorf("expected 0 routes after service delete, got %d", len(routes))
}
}


@@ -0,0 +1,95 @@
package masterdb
import (
"database/sql"
"fmt"
"time"
)
// EdgeRoute records a public route managed by the master.
type EdgeRoute struct {
Hostname string
ServiceName string
EdgeNode string
BackendHostname string
BackendPort int
CreatedAt time.Time
}
// CreateEdgeRoute inserts or replaces an edge route record.
func CreateEdgeRoute(db *sql.DB, hostname, serviceName, edgeNode, backendHostname string, backendPort int) error {
_, err := db.Exec(`
INSERT INTO edge_routes (hostname, service_name, edge_node, backend_hostname, backend_port, created_at)
VALUES (?, ?, ?, ?, ?, datetime('now'))
ON CONFLICT(hostname) DO UPDATE SET
service_name = excluded.service_name,
edge_node = excluded.edge_node,
backend_hostname = excluded.backend_hostname,
backend_port = excluded.backend_port
`, hostname, serviceName, edgeNode, backendHostname, backendPort)
if err != nil {
return fmt.Errorf("create edge route %s: %w", hostname, err)
}
return nil
}
// ListEdgeRoutes returns all edge routes.
func ListEdgeRoutes(db *sql.DB) ([]*EdgeRoute, error) {
return queryEdgeRoutes(db, `SELECT hostname, service_name, edge_node, backend_hostname, backend_port, created_at FROM edge_routes ORDER BY hostname`)
}
// ListEdgeRoutesForService returns edge routes for a specific service.
func ListEdgeRoutesForService(db *sql.DB, serviceName string) ([]*EdgeRoute, error) {
rows, err := db.Query(`
SELECT hostname, service_name, edge_node, backend_hostname, backend_port, created_at
FROM edge_routes WHERE service_name = ? ORDER BY hostname
`, serviceName)
if err != nil {
return nil, fmt.Errorf("list edge routes for %s: %w", serviceName, err)
}
defer func() { _ = rows.Close() }()
return scanEdgeRoutes(rows)
}
// DeleteEdgeRoute removes a single edge route by hostname.
func DeleteEdgeRoute(db *sql.DB, hostname string) error {
_, err := db.Exec(`DELETE FROM edge_routes WHERE hostname = ?`, hostname)
if err != nil {
return fmt.Errorf("delete edge route %s: %w", hostname, err)
}
return nil
}
// DeleteEdgeRoutesForService removes all edge routes for a service.
func DeleteEdgeRoutesForService(db *sql.DB, serviceName string) error {
_, err := db.Exec(`DELETE FROM edge_routes WHERE service_name = ?`, serviceName)
if err != nil {
return fmt.Errorf("delete edge routes for %s: %w", serviceName, err)
}
return nil
}
func queryEdgeRoutes(db *sql.DB, query string) ([]*EdgeRoute, error) {
rows, err := db.Query(query)
if err != nil {
return nil, fmt.Errorf("query edge routes: %w", err)
}
defer func() { _ = rows.Close() }()
return scanEdgeRoutes(rows)
}
func scanEdgeRoutes(rows *sql.Rows) ([]*EdgeRoute, error) {
var routes []*EdgeRoute
for rows.Next() {
var r EdgeRoute
var createdAt string
if err := rows.Scan(&r.Hostname, &r.ServiceName, &r.EdgeNode, &r.BackendHostname, &r.BackendPort, &createdAt); err != nil {
return nil, fmt.Errorf("scan edge route: %w", err)
}
r.CreatedAt, _ = time.Parse("2006-01-02 15:04:05", createdAt)
routes = append(routes, &r)
}
return routes, rows.Err()
}

internal/masterdb/nodes.go Normal file

@@ -0,0 +1,103 @@
package masterdb
import (
"database/sql"
"fmt"
"time"
)
// Node represents a registered node in the cluster.
type Node struct {
Name string
Address string
Role string
Arch string
Status string
Containers int
LastHeartbeat *time.Time
}
// UpsertNode inserts or updates a node in the registry.
func UpsertNode(db *sql.DB, name, address, role, arch string) error {
_, err := db.Exec(`
INSERT INTO nodes (name, address, role, arch, updated_at)
VALUES (?, ?, ?, ?, datetime('now'))
ON CONFLICT(name) DO UPDATE SET
address = excluded.address,
role = excluded.role,
arch = excluded.arch,
updated_at = datetime('now')
`, name, address, role, arch)
if err != nil {
return fmt.Errorf("upsert node %s: %w", name, err)
}
return nil
}
// GetNode returns a single node by name.
func GetNode(db *sql.DB, name string) (*Node, error) {
var n Node
var lastHB sql.NullString
err := db.QueryRow(`
SELECT name, address, role, arch, status, containers, last_heartbeat
FROM nodes WHERE name = ?
`, name).Scan(&n.Name, &n.Address, &n.Role, &n.Arch, &n.Status, &n.Containers, &lastHB)
if err == sql.ErrNoRows {
return nil, nil
}
if err != nil {
return nil, fmt.Errorf("get node %s: %w", name, err)
}
if lastHB.Valid {
t, _ := time.Parse("2006-01-02 15:04:05", lastHB.String)
n.LastHeartbeat = &t
}
return &n, nil
}
// ListNodes returns all nodes.
func ListNodes(db *sql.DB) ([]*Node, error) {
return queryNodes(db, `SELECT name, address, role, arch, status, containers, last_heartbeat FROM nodes ORDER BY name`)
}
// ListWorkerNodes returns nodes with role "worker" or "master" (master is also a worker).
func ListWorkerNodes(db *sql.DB) ([]*Node, error) {
return queryNodes(db, `SELECT name, address, role, arch, status, containers, last_heartbeat FROM nodes WHERE role IN ('worker', 'master') ORDER BY name`)
}
// ListEdgeNodes returns nodes with role "edge".
func ListEdgeNodes(db *sql.DB) ([]*Node, error) {
return queryNodes(db, `SELECT name, address, role, arch, status, containers, last_heartbeat FROM nodes WHERE role = 'edge' ORDER BY name`)
}
func queryNodes(db *sql.DB, query string) ([]*Node, error) {
rows, err := db.Query(query)
if err != nil {
return nil, fmt.Errorf("query nodes: %w", err)
}
defer func() { _ = rows.Close() }()
var nodes []*Node
for rows.Next() {
var n Node
var lastHB sql.NullString
if err := rows.Scan(&n.Name, &n.Address, &n.Role, &n.Arch, &n.Status, &n.Containers, &lastHB); err != nil {
return nil, fmt.Errorf("scan node: %w", err)
}
if lastHB.Valid {
t, _ := time.Parse("2006-01-02 15:04:05", lastHB.String)
n.LastHeartbeat = &t
}
nodes = append(nodes, &n)
}
return nodes, rows.Err()
}
// UpdateNodeStatus updates a node's status field.
func UpdateNodeStatus(db *sql.DB, name, status string) error {
_, err := db.Exec(`UPDATE nodes SET status = ?, updated_at = datetime('now') WHERE name = ?`, status, name)
if err != nil {
return fmt.Errorf("update node status %s: %w", name, err)
}
return nil
}

internal/masterdb/placements.go (new file)

@@ -0,0 +1,99 @@
package masterdb
import (
"database/sql"
"fmt"
"time"
)
// Placement records which node hosts which service.
type Placement struct {
ServiceName string
Node string
Tier string
DeployedAt time.Time
}
// CreatePlacement inserts or replaces a placement record.
func CreatePlacement(db *sql.DB, serviceName, node, tier string) error {
_, err := db.Exec(`
INSERT INTO placements (service_name, node, tier, deployed_at)
VALUES (?, ?, ?, datetime('now'))
ON CONFLICT(service_name) DO UPDATE SET
node = excluded.node,
tier = excluded.tier,
deployed_at = datetime('now')
`, serviceName, node, tier)
if err != nil {
return fmt.Errorf("create placement %s: %w", serviceName, err)
}
return nil
}
// GetPlacement returns the placement for a service.
func GetPlacement(db *sql.DB, serviceName string) (*Placement, error) {
var p Placement
var deployedAt string
err := db.QueryRow(`
SELECT service_name, node, tier, deployed_at
FROM placements WHERE service_name = ?
`, serviceName).Scan(&p.ServiceName, &p.Node, &p.Tier, &deployedAt)
if err == sql.ErrNoRows {
return nil, nil
}
if err != nil {
return nil, fmt.Errorf("get placement %s: %w", serviceName, err)
}
p.DeployedAt, _ = time.Parse("2006-01-02 15:04:05", deployedAt)
return &p, nil
}
// ListPlacements returns all placements.
func ListPlacements(db *sql.DB) ([]*Placement, error) {
rows, err := db.Query(`SELECT service_name, node, tier, deployed_at FROM placements ORDER BY service_name`)
if err != nil {
return nil, fmt.Errorf("list placements: %w", err)
}
defer func() { _ = rows.Close() }()
var placements []*Placement
for rows.Next() {
var p Placement
var deployedAt string
if err := rows.Scan(&p.ServiceName, &p.Node, &p.Tier, &deployedAt); err != nil {
return nil, fmt.Errorf("scan placement: %w", err)
}
p.DeployedAt, _ = time.Parse("2006-01-02 15:04:05", deployedAt)
placements = append(placements, &p)
}
return placements, rows.Err()
}
// DeletePlacement removes a placement record.
func DeletePlacement(db *sql.DB, serviceName string) error {
_, err := db.Exec(`DELETE FROM placements WHERE service_name = ?`, serviceName)
if err != nil {
return fmt.Errorf("delete placement %s: %w", serviceName, err)
}
return nil
}
// CountPlacementsPerNode returns a map of node name → number of placed services.
func CountPlacementsPerNode(db *sql.DB) (map[string]int, error) {
rows, err := db.Query(`SELECT node, COUNT(*) FROM placements GROUP BY node`)
if err != nil {
return nil, fmt.Errorf("count placements: %w", err)
}
defer func() { _ = rows.Close() }()
counts := make(map[string]int)
for rows.Next() {
var node string
var count int
if err := rows.Scan(&node, &count); err != nil {
return nil, fmt.Errorf("scan count: %w", err)
}
counts[node] = count
}
return counts, rows.Err()
}


@@ -8,8 +8,8 @@ import (
"os/exec"
"time"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/registry"
)
// Alerter evaluates state transitions and fires alerts for drift or flapping.


@@ -7,9 +7,9 @@ import (
"log/slog"
"time"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/registry"
"git.wntrmute.dev/mc/mcp/internal/runtime"
)
// Monitor watches container states and compares them to the registry,


@@ -9,9 +9,9 @@ import (
"testing"
"time"
"git.wntrmute.dev/mc/mcp/internal/config"
"git.wntrmute.dev/mc/mcp/internal/registry"
"git.wntrmute.dev/mc/mcp/internal/runtime"
)
func openTestDB(t *testing.T) *sql.DB {
@@ -47,6 +47,10 @@ func (f *fakeRuntime) Pull(_ context.Context, _ string) error { return nil }
func (f *fakeRuntime) Run(_ context.Context, _ runtime.ContainerSpec) error { return nil }
func (f *fakeRuntime) Stop(_ context.Context, _ string) error { return nil }
func (f *fakeRuntime) Remove(_ context.Context, _ string) error { return nil }
func (f *fakeRuntime) Build(_ context.Context, _, _, _ string) error { return nil }
func (f *fakeRuntime) Push(_ context.Context, _ string) error { return nil }
func (f *fakeRuntime) ImageExists(_ context.Context, _ string) (bool, error) { return true, nil }
func (f *fakeRuntime) Inspect(_ context.Context, _ string) (runtime.ContainerInfo, error) {
return runtime.ContainerInfo{}, nil

@@ -6,6 +6,15 @@ import (
"time"
)
// Route represents a route entry for a component in the registry.
type Route struct {
Name string
Port int
Mode string
Hostname string
HostPort int // agent-assigned host port (0 = not yet allocated)
}
// Component represents a component in the registry.
type Component struct {
Name string
@@ -20,6 +29,7 @@ type Component struct {
Ports []string
Volumes []string
Cmd []string
Routes []Route
CreatedAt time.Time
UpdatedAt time.Time
}
@@ -51,6 +61,9 @@ func CreateComponent(db *sql.DB, c *Component) error {
if err := setCmd(tx, c.Service, c.Name, c.Cmd); err != nil {
return err
}
if err := setRoutes(tx, c.Service, c.Name, c.Routes); err != nil {
return err
}
return tx.Commit()
}
@@ -84,6 +97,10 @@ func GetComponent(db *sql.DB, service, name string) (*Component, error) {
if err != nil {
return nil, err
}
c.Routes, err = getRoutes(db, service, name)
if err != nil {
return nil, err
}
return c, nil
}
@@ -115,6 +132,7 @@ func ListComponents(db *sql.DB, service string) ([]Component, error) {
c.Ports, _ = getPorts(db, c.Service, c.Name)
c.Volumes, _ = getVolumes(db, c.Service, c.Name)
c.Cmd, _ = getCmd(db, c.Service, c.Name)
c.Routes, _ = getRoutes(db, c.Service, c.Name)
components = append(components, c)
}
@@ -168,6 +186,9 @@ func UpdateComponentSpec(db *sql.DB, c *Component) error {
if err := setCmd(tx, c.Service, c.Name, c.Cmd); err != nil {
return err
}
if err := setRoutes(tx, c.Service, c.Name, c.Routes); err != nil {
return err
}
return tx.Commit()
}
@@ -274,3 +295,85 @@ func getCmd(db *sql.DB, service, component string) ([]string, error) {
}
return cmd, rows.Err()
}
// helper: set route definitions (delete + re-insert)
func setRoutes(tx *sql.Tx, service, component string, routes []Route) error {
if _, err := tx.Exec("DELETE FROM component_routes WHERE service = ? AND component = ?", service, component); err != nil {
return fmt.Errorf("clear routes %q/%q: %w", service, component, err)
}
for _, r := range routes {
mode := r.Mode
if mode == "" {
mode = "l4"
}
name := r.Name
if name == "" {
name = "default"
}
if _, err := tx.Exec(
"INSERT INTO component_routes (service, component, name, port, mode, hostname, host_port) VALUES (?, ?, ?, ?, ?, ?, ?)",
service, component, name, r.Port, mode, r.Hostname, r.HostPort,
); err != nil {
return fmt.Errorf("insert route %q/%q %q: %w", service, component, name, err)
}
}
return nil
}
func getRoutes(db *sql.DB, service, component string) ([]Route, error) {
rows, err := db.Query(
"SELECT name, port, mode, hostname, host_port FROM component_routes WHERE service = ? AND component = ? ORDER BY name",
service, component,
)
if err != nil {
return nil, fmt.Errorf("get routes %q/%q: %w", service, component, err)
}
defer func() { _ = rows.Close() }()
var routes []Route
for rows.Next() {
var r Route
if err := rows.Scan(&r.Name, &r.Port, &r.Mode, &r.Hostname, &r.HostPort); err != nil {
return nil, err
}
routes = append(routes, r)
}
return routes, rows.Err()
}
// UpdateRouteHostPort updates the agent-assigned host port for a specific route.
func UpdateRouteHostPort(db *sql.DB, service, component, routeName string, hostPort int) error {
res, err := db.Exec(
"UPDATE component_routes SET host_port = ? WHERE service = ? AND component = ? AND name = ?",
hostPort, service, component, routeName,
)
if err != nil {
return fmt.Errorf("update route host_port %q/%q/%q: %w", service, component, routeName, err)
}
n, _ := res.RowsAffected()
if n == 0 {
return fmt.Errorf("update route host_port %q/%q/%q: %w", service, component, routeName, sql.ErrNoRows)
}
return nil
}
// GetRouteHostPorts returns a map of route name to assigned host port for a component.
func GetRouteHostPorts(db *sql.DB, service, component string) (map[string]int, error) {
rows, err := db.Query(
"SELECT name, host_port FROM component_routes WHERE service = ? AND component = ?",
service, component,
)
if err != nil {
return nil, fmt.Errorf("get route host ports %q/%q: %w", service, component, err)
}
defer func() { _ = rows.Close() }()
result := make(map[string]int)
for rows.Next() {
var name string
var port int
if err := rows.Scan(&name, &port); err != nil {
return nil, err
}
result[name] = port
}
return result, rows.Err()
}


@@ -127,4 +127,33 @@ var migrations = []string{
CREATE INDEX IF NOT EXISTS idx_events_component_time
ON events(service, component, timestamp);
`,
// Migration 2: component routes
`
CREATE TABLE IF NOT EXISTS component_routes (
service TEXT NOT NULL,
component TEXT NOT NULL,
name TEXT NOT NULL,
port INTEGER NOT NULL,
mode TEXT NOT NULL DEFAULT 'l4',
hostname TEXT NOT NULL DEFAULT '',
host_port INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY (service, component, name),
FOREIGN KEY (service, component) REFERENCES components(service, name) ON DELETE CASCADE
);
`,
// Migration 3: service comment
`ALTER TABLE services ADD COLUMN comment TEXT NOT NULL DEFAULT '';`,
// Migration 4: edge routes (v2 — public routes managed by the master)
`CREATE TABLE IF NOT EXISTS edge_routes (
hostname TEXT NOT NULL PRIMARY KEY,
backend_hostname TEXT NOT NULL,
backend_port INTEGER NOT NULL,
tls_cert TEXT NOT NULL DEFAULT '',
tls_key TEXT NOT NULL DEFAULT '',
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);`,
}


@@ -0,0 +1,93 @@
package registry
import (
"database/sql"
"fmt"
"time"
)
// EdgeRoute represents a public edge route managed by the master.
type EdgeRoute struct {
Hostname string
BackendHostname string
BackendPort int
TLSCert string
TLSKey string
CreatedAt time.Time
UpdatedAt time.Time
}
// CreateEdgeRoute inserts or replaces an edge route.
func CreateEdgeRoute(db *sql.DB, hostname, backendHostname string, backendPort int, tlsCert, tlsKey string) error {
_, err := db.Exec(`
INSERT INTO edge_routes (hostname, backend_hostname, backend_port, tls_cert, tls_key, created_at, updated_at)
VALUES (?, ?, ?, ?, ?, datetime('now'), datetime('now'))
ON CONFLICT(hostname) DO UPDATE SET
backend_hostname = excluded.backend_hostname,
backend_port = excluded.backend_port,
tls_cert = excluded.tls_cert,
tls_key = excluded.tls_key,
updated_at = datetime('now')
`, hostname, backendHostname, backendPort, tlsCert, tlsKey)
if err != nil {
return fmt.Errorf("create edge route %s: %w", hostname, err)
}
return nil
}
// GetEdgeRoute returns a single edge route by hostname.
func GetEdgeRoute(db *sql.DB, hostname string) (*EdgeRoute, error) {
var r EdgeRoute
var createdAt, updatedAt string
err := db.QueryRow(`
SELECT hostname, backend_hostname, backend_port, tls_cert, tls_key, created_at, updated_at
FROM edge_routes WHERE hostname = ?
`, hostname).Scan(&r.Hostname, &r.BackendHostname, &r.BackendPort, &r.TLSCert, &r.TLSKey, &createdAt, &updatedAt)
if err == sql.ErrNoRows {
return nil, nil
}
if err != nil {
return nil, fmt.Errorf("get edge route %s: %w", hostname, err)
}
r.CreatedAt, _ = time.Parse("2006-01-02 15:04:05", createdAt)
r.UpdatedAt, _ = time.Parse("2006-01-02 15:04:05", updatedAt)
return &r, nil
}
// ListEdgeRoutes returns all edge routes.
func ListEdgeRoutes(db *sql.DB) ([]*EdgeRoute, error) {
rows, err := db.Query(`
SELECT hostname, backend_hostname, backend_port, tls_cert, tls_key, created_at, updated_at
FROM edge_routes ORDER BY hostname
`)
if err != nil {
return nil, fmt.Errorf("list edge routes: %w", err)
}
defer func() { _ = rows.Close() }()
var routes []*EdgeRoute
for rows.Next() {
var r EdgeRoute
var createdAt, updatedAt string
if err := rows.Scan(&r.Hostname, &r.BackendHostname, &r.BackendPort, &r.TLSCert, &r.TLSKey, &createdAt, &updatedAt); err != nil {
return nil, fmt.Errorf("scan edge route: %w", err)
}
r.CreatedAt, _ = time.Parse("2006-01-02 15:04:05", createdAt)
r.UpdatedAt, _ = time.Parse("2006-01-02 15:04:05", updatedAt)
routes = append(routes, &r)
}
return routes, rows.Err()
}
// DeleteEdgeRoute removes an edge route by hostname.
func DeleteEdgeRoute(db *sql.DB, hostname string) error {
result, err := db.Exec(`DELETE FROM edge_routes WHERE hostname = ?`, hostname)
if err != nil {
return fmt.Errorf("delete edge route %s: %w", hostname, err)
}
n, _ := result.RowsAffected()
if n == 0 {
return fmt.Errorf("edge route %s not found", hostname)
}
return nil
}


@@ -83,6 +83,15 @@ func CountEvents(db *sql.DB, service, component string, since time.Time) (int, e
return count, nil
}
// DeleteComponentEvents deletes all events for a specific component.
func DeleteComponentEvents(db *sql.DB, service, component string) error {
_, err := db.Exec("DELETE FROM events WHERE service = ? AND component = ?", service, component)
if err != nil {
return fmt.Errorf("delete events %q/%q: %w", service, component, err)
}
return nil
}
// PruneEvents deletes events older than the given time.
func PruneEvents(db *sql.DB, before time.Time) (int64, error) {
res, err := db.Exec(


@@ -237,6 +237,160 @@ func TestCascadeDelete(t *testing.T) {
}
}
func TestComponentRoutes(t *testing.T) {
db := openTestDB(t)
if err := CreateService(db, "svc", true); err != nil {
t.Fatalf("create service: %v", err)
}
// Create component with routes
c := &Component{
Name: "api",
Service: "svc",
Image: "img:v1",
Restart: "unless-stopped",
DesiredState: "running",
ObservedState: "unknown",
Routes: []Route{
{Name: "rest", Port: 8443, Mode: "l7", Hostname: "api.example.com"},
{Name: "grpc", Port: 9443, Mode: "l4"},
},
}
if err := CreateComponent(db, c); err != nil {
t.Fatalf("create component: %v", err)
}
// Get and verify routes
got, err := GetComponent(db, "svc", "api")
if err != nil {
t.Fatalf("get: %v", err)
}
if len(got.Routes) != 2 {
t.Fatalf("routes: got %d, want 2", len(got.Routes))
}
// Routes are ordered by name: grpc, rest
if got.Routes[0].Name != "grpc" || got.Routes[0].Port != 9443 || got.Routes[0].Mode != "l4" {
t.Fatalf("route[0]: got %+v", got.Routes[0])
}
if got.Routes[1].Name != "rest" || got.Routes[1].Port != 8443 || got.Routes[1].Mode != "l7" || got.Routes[1].Hostname != "api.example.com" {
t.Fatalf("route[1]: got %+v", got.Routes[1])
}
// Update routes via UpdateComponentSpec
c.Routes = []Route{{Name: "http", Port: 8080, Mode: "l7"}}
if err := UpdateComponentSpec(db, c); err != nil {
t.Fatalf("update spec: %v", err)
}
got, _ = GetComponent(db, "svc", "api")
if len(got.Routes) != 1 || got.Routes[0].Name != "http" {
t.Fatalf("updated routes: got %+v", got.Routes)
}
// List components includes routes
comps, err := ListComponents(db, "svc")
if err != nil {
t.Fatalf("list: %v", err)
}
if len(comps) != 1 || len(comps[0].Routes) != 1 {
t.Fatalf("list routes: got %d components, %d routes", len(comps), len(comps[0].Routes))
}
}
func TestRouteHostPort(t *testing.T) {
db := openTestDB(t)
if err := CreateService(db, "svc", true); err != nil {
t.Fatalf("create service: %v", err)
}
c := &Component{
Name: "api",
Service: "svc",
Image: "img:v1",
Restart: "unless-stopped",
DesiredState: "running",
ObservedState: "unknown",
Routes: []Route{
{Name: "rest", Port: 8443, Mode: "l7"},
{Name: "grpc", Port: 9443, Mode: "l4"},
},
}
if err := CreateComponent(db, c); err != nil {
t.Fatalf("create component: %v", err)
}
// Initially host_port is 0
ports, err := GetRouteHostPorts(db, "svc", "api")
if err != nil {
t.Fatalf("get host ports: %v", err)
}
if ports["rest"] != 0 || ports["grpc"] != 0 {
t.Fatalf("initial host ports should be 0: %+v", ports)
}
// Update host ports
if err := UpdateRouteHostPort(db, "svc", "api", "rest", 12345); err != nil {
t.Fatalf("update rest: %v", err)
}
if err := UpdateRouteHostPort(db, "svc", "api", "grpc", 12346); err != nil {
t.Fatalf("update grpc: %v", err)
}
ports, _ = GetRouteHostPorts(db, "svc", "api")
if ports["rest"] != 12345 {
t.Fatalf("rest host_port: got %d, want 12345", ports["rest"])
}
if ports["grpc"] != 12346 {
t.Fatalf("grpc host_port: got %d, want 12346", ports["grpc"])
}
// Verify host_port is visible via GetComponent
got, _ := GetComponent(db, "svc", "api")
for _, r := range got.Routes {
if r.Name == "rest" && r.HostPort != 12345 {
t.Fatalf("GetComponent rest host_port: got %d", r.HostPort)
}
if r.Name == "grpc" && r.HostPort != 12346 {
t.Fatalf("GetComponent grpc host_port: got %d", r.HostPort)
}
}
// Update nonexistent route should fail
err = UpdateRouteHostPort(db, "svc", "api", "nonexistent", 99999)
if err == nil {
t.Fatal("expected error updating nonexistent route")
}
}
func TestRouteCascadeDelete(t *testing.T) {
db := openTestDB(t)
if err := CreateService(db, "svc", true); err != nil {
t.Fatalf("create service: %v", err)
}
c := &Component{
Name: "api", Service: "svc", Image: "img:v1",
Restart: "unless-stopped", DesiredState: "running", ObservedState: "unknown",
Routes: []Route{{Name: "rest", Port: 8443, Mode: "l4"}},
}
if err := CreateComponent(db, c); err != nil {
t.Fatalf("create component: %v", err)
}
// Delete service cascades to routes
if err := DeleteService(db, "svc"); err != nil {
t.Fatalf("delete service: %v", err)
}
// Routes table should be empty
ports, err := GetRouteHostPorts(db, "svc", "api")
if err != nil {
t.Fatalf("get routes after cascade: %v", err)
}
if len(ports) != 0 {
t.Fatalf("routes should be empty after cascade, got %d", len(ports))
}
}
func TestEvents(t *testing.T) {
db := openTestDB(t)


@@ -3,7 +3,9 @@ package runtime
import (
"context"
"encoding/json"
"errors"
"fmt"
"os"
"os/exec"
"strings"
"time"
@@ -49,6 +51,9 @@ func (p *Podman) BuildRunArgs(spec ContainerSpec) []string {
for _, vol := range spec.Volumes {
args = append(args, "-v", vol)
}
for _, env := range spec.Env {
args = append(args, "-e", env)
}
args = append(args, spec.Image)
args = append(args, spec.Cmd...)
@@ -174,12 +179,125 @@ func (p *Podman) Inspect(ctx context.Context, name string) (ContainerInfo, error
return info, nil
}
// Logs returns an exec.Cmd that streams container logs. For containers
// using the journald log driver, it tries journalctl first (podman logs
// can't read journald outside the originating user session). If journalctl
// can't access the journal, it falls back to podman logs.
func (p *Podman) Logs(ctx context.Context, containerName string, tail int, follow, timestamps bool, since string) *exec.Cmd {
// Check if this container uses the journald log driver.
inspectCmd := exec.CommandContext(ctx, p.command(), "inspect", "--format", "{{.HostConfig.LogConfig.Type}}", containerName) //nolint:gosec
if out, err := inspectCmd.Output(); err == nil && strings.TrimSpace(string(out)) == "journald" {
if p.journalAccessible(ctx, containerName) {
return p.journalLogs(ctx, containerName, tail, follow, since)
}
}
return p.podmanLogs(ctx, containerName, tail, follow, timestamps, since)
}
// journalAccessible probes whether journalctl can read logs for the container.
func (p *Podman) journalAccessible(ctx context.Context, containerName string) bool {
args := []string{"--no-pager", "-n", "0"}
if os.Getuid() != 0 {
args = append(args, "--user")
}
args = append(args, "CONTAINER_NAME="+containerName)
cmd := exec.CommandContext(ctx, "journalctl", args...) //nolint:gosec
return cmd.Run() == nil
}
// journalLogs returns a journalctl command filtered by container name.
func (p *Podman) journalLogs(ctx context.Context, containerName string, tail int, follow bool, since string) *exec.Cmd {
args := []string{"--no-pager", "--output", "cat"}
if os.Getuid() != 0 {
args = append(args, "--user")
}
args = append(args, "CONTAINER_NAME="+containerName)
if tail > 0 {
args = append(args, "--lines", fmt.Sprintf("%d", tail))
}
if follow {
args = append(args, "--follow")
}
if since != "" {
args = append(args, "--since", since)
}
return exec.CommandContext(ctx, "journalctl", args...) //nolint:gosec // args built programmatically
}
// podmanLogs returns a podman logs command.
func (p *Podman) podmanLogs(ctx context.Context, containerName string, tail int, follow, timestamps bool, since string) *exec.Cmd {
args := []string{"logs"}
if tail > 0 {
args = append(args, "--tail", fmt.Sprintf("%d", tail))
}
if follow {
args = append(args, "--follow")
}
if timestamps {
args = append(args, "--timestamps")
}
if since != "" {
args = append(args, "--since", since)
}
args = append(args, containerName)
return exec.CommandContext(ctx, p.command(), args...) //nolint:gosec // args built programmatically
}
// Login authenticates to a container registry using the given token as
// the password. This enables non-interactive push with service account
// tokens (MCR accepts MCIAS JWTs as passwords).
func (p *Podman) Login(ctx context.Context, registry, username, token string) error {
cmd := exec.CommandContext(ctx, p.command(), "login", "--username", username, "--password-stdin", registry) //nolint:gosec // args built programmatically
cmd.Stdin = strings.NewReader(token)
if out, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("podman login %q: %w: %s", registry, err, out)
}
return nil
}
// Build builds a container image from a Dockerfile.
func (p *Podman) Build(ctx context.Context, image, contextDir, dockerfile string) error {
args := []string{"build", "-t", image, "-f", dockerfile, contextDir}
cmd := exec.CommandContext(ctx, p.command(), args...) //nolint:gosec // args built programmatically
cmd.Dir = contextDir
if out, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("podman build %q: %w: %s", image, err, out)
}
return nil
}
// Push pushes a container image to a remote registry.
func (p *Podman) Push(ctx context.Context, image string) error {
cmd := exec.CommandContext(ctx, p.command(), "push", image) //nolint:gosec // args built programmatically
if out, err := cmd.CombinedOutput(); err != nil {
return fmt.Errorf("podman push %q: %w: %s", image, err, out)
}
return nil
}
// ImageExists checks whether an image tag exists in a remote registry.
// Uses skopeo inspect which works for both regular images and multi-arch
// manifests, unlike podman manifest inspect which only handles manifests.
func (p *Podman) ImageExists(ctx context.Context, image string) (bool, error) {
cmd := exec.CommandContext(ctx, "skopeo", "inspect", "--tls-verify=false", "docker://"+image) //nolint:gosec // args built programmatically
if err := cmd.Run(); err != nil {
var exitErr *exec.ExitError
if ok := errors.As(err, &exitErr); ok && exitErr.ExitCode() != 0 {
return false, nil
}
return false, fmt.Errorf("skopeo inspect %q: %w", image, err)
}
return true, nil
}
// podmanPSEntry is a single entry from podman ps --format json.
type podmanPSEntry struct {
Names []string `json:"Names"`
Image string `json:"Image"`
State string `json:"State"`
Command []string `json:"Command"`
StartedAt int64 `json:"StartedAt"`
}
// List returns information about all containers.
@@ -201,12 +319,16 @@ func (p *Podman) List(ctx context.Context) ([]ContainerInfo, error) {
if len(e.Names) > 0 {
name = e.Names[0]
}
info := ContainerInfo{
Name: name,
Image: e.Image,
State: e.State,
Version: ExtractVersion(e.Image),
}
if e.StartedAt > 0 {
info.Started = time.Unix(e.StartedAt, 0)
}
infos = append(infos, info)
}
return infos, nil


@@ -16,6 +16,7 @@ type ContainerSpec struct {
Ports []string // "host:container" port mappings
Volumes []string // "host:container" volume mounts
Cmd []string // command and arguments
Env []string // environment variables (KEY=VALUE)
}
// ContainerInfo describes the observed state of a running or stopped container.
@@ -33,7 +34,9 @@ type ContainerInfo struct {
Started time.Time // when the container started (zero if not running)
}
// Runtime is the container runtime abstraction. The first six methods are
// used by the agent for container lifecycle. The last three are used by the
// CLI for building and pushing images.
type Runtime interface {
Pull(ctx context.Context, image string) error
Run(ctx context.Context, spec ContainerSpec) error
@@ -41,6 +44,10 @@ type Runtime interface {
Remove(ctx context.Context, name string) error
Inspect(ctx context.Context, name string) (ContainerInfo, error)
List(ctx context.Context) ([]ContainerInfo, error)
Build(ctx context.Context, image, contextDir, dockerfile string) error
Push(ctx context.Context, image string) error
ImageExists(ctx context.Context, image string) (bool, error)
}
// ExtractVersion parses the tag from an image reference.


@@ -76,6 +76,39 @@ func TestBuildRunArgs(t *testing.T) {
})
})
t.Run("env vars", func(t *testing.T) {
spec := ContainerSpec{
Name: "test-app",
Image: "img:latest",
Env: []string{"PORT=12345", "PORT_GRPC=12346"},
}
requireEqualArgs(t, p.BuildRunArgs(spec), []string{
"run", "-d", "--name", "test-app",
"-e", "PORT=12345", "-e", "PORT_GRPC=12346",
"img:latest",
})
})
t.Run("full spec with env", func(t *testing.T) {
// Route-allocated ports: host port = container port (matches $PORT).
spec := ContainerSpec{
Name: "svc-api",
Image: "img:latest",
Network: "net",
Ports: []string{"127.0.0.1:12345:12345"},
Volumes: []string{"/srv:/srv"},
Env: []string{"PORT=12345"},
}
requireEqualArgs(t, p.BuildRunArgs(spec), []string{
"run", "-d", "--name", "svc-api",
"--network", "net",
"-p", "127.0.0.1:12345:12345",
"-v", "/srv:/srv",
"-e", "PORT=12345",
"img:latest",
})
})
t.Run("cmd after image", func(t *testing.T) {
spec := ContainerSpec{
Name: "test-app",

Some files were not shown because too many files have changed in this diff.