Add boot sequencing to agent

The agent reads [[boot.sequence]] stages from its config and starts
services in dependency order before accepting gRPC connections. Each
stage waits for its services to pass health checks before proceeding:

- tcp: TCP connect to the container's mapped port
- grpc: standard gRPC health check
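
The tcp check above can be sketched with the standard library alone. This is a minimal illustration, not the agent's actual code: `checkTCP` and `probeLoopback` are hypothetical names, and the loopback listener only stands in for a container's mapped port.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// checkTCP (hypothetical name) returns nil if a TCP connection to addr
// can be established within timeout, and a wrapped error otherwise.
func checkTCP(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return fmt.Errorf("tcp health check %s: %w", addr, err)
	}
	return conn.Close()
}

// probeLoopback (hypothetical helper) checks an ephemeral loopback port
// while a listener is open on it and again after the listener is closed.
func probeLoopback() (open, closed bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false, err
	}
	addr := ln.Addr().String()
	open = checkTCP(addr, time.Second) == nil
	if err := ln.Close(); err != nil {
		return open, false, err
	}
	closed = checkTCP(addr, time.Second) == nil
	return open, closed, nil
}

func main() {
	open, closed, err := probeLoopback()
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	fmt.Println("healthy while listening:", open)
	fmt.Println("healthy after close:", closed)
}
```

A successful connect only shows that something is listening on the port. The grpc check is the stronger signal when the service implements the standard health protocol.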

Foundation stage (stage 0): blocks and retries indefinitely if the health
check fails, because all downstream services depend on it.
Non-foundation stages: log a warning and proceed on failure.
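
The failure policy above can be sketched as a small loop. This is an illustration under assumptions, not the code in this commit: the `stage` type, `runStages`, `errUnhealthy`, and the retry delay are hypothetical, and a single `check` call stands in for "health check within the stage timeout".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnhealthy is a placeholder error used by the example checks.
var errUnhealthy = errors.New("unhealthy")

// stage is a hypothetical stand-in for a configured boot stage.
type stage struct {
	name  string
	check func() error // nil means the stage's services are healthy
}

// runStages applies the policy from the commit message: stage 0 is
// retried until it is healthy; any later stage that fails produces a
// warning and the sequence moves on. It returns the warnings.
func runStages(stages []stage, retryDelay time.Duration) []string {
	var warnings []string
	for i, s := range stages {
		for {
			err := s.check()
			if err == nil {
				break
			}
			if i > 0 {
				warnings = append(warnings,
					fmt.Sprintf("stage %q unhealthy, proceeding: %v", s.name, err))
				break
			}
			// Foundation stage: wait and retry with no upper bound.
			time.Sleep(retryDelay)
		}
	}
	return warnings
}

func main() {
	attempts := 0
	stages := []stage{
		{name: "foundation", check: func() error {
			attempts++
			if attempts < 3 {
				return errUnhealthy // fails twice, then recovers
			}
			return nil
		}},
		{name: "apps", check: func() error { return errUnhealthy }},
	}
	for _, w := range runStages(stages, 10*time.Millisecond) {
		fmt.Println("warning:", w)
	}
	fmt.Println("foundation attempts:", attempts)
}
```

Keying the policy on the stage index rather than the stage name matches the "stage 0" wording above. Whether the agent keys on index or name is not stated in the commit message.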

Reuses the recover logic to start containers from the registry, then
runs the health checks to verify readiness.

Config example:
  [[boot.sequence]]
  name = "foundation"
  services = ["mcias", "mcns"]
  timeout = "120s"
  health = "tcp"
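
A sketch of how a stage like the one above might be decoded and validated. The `Duration` wrapper and `validateStage` are assumptions for illustration; the repository's own `Duration` type is not shown in this commit. Go TOML decoders commonly honor `encoding.TextUnmarshaler`, which is what lets a string such as "120s" populate the field.

```go
package main

import (
	"fmt"
	"time"
)

// Duration (assumed shape) wraps time.Duration so that text values such
// as "120s" can be decoded via encoding.TextUnmarshaler.
type Duration struct{ time.Duration }

// UnmarshalText parses text using time.ParseDuration syntax.
func (d *Duration) UnmarshalText(text []byte) error {
	v, err := time.ParseDuration(string(text))
	if err != nil {
		return fmt.Errorf("invalid duration %q: %w", text, err)
	}
	d.Duration = v
	return nil
}

// BootStage mirrors the struct added in this commit.
type BootStage struct {
	Name     string   `toml:"name"`
	Services []string `toml:"services"`
	Timeout  Duration `toml:"timeout"`
	Health   string   `toml:"health"`
}

// validateStage (hypothetical) rejects stages a boot sequence could not run.
func validateStage(s BootStage) error {
	if s.Name == "" {
		return fmt.Errorf("boot stage has no name")
	}
	if len(s.Services) == 0 {
		return fmt.Errorf("boot stage %q lists no services", s.Name)
	}
	if s.Timeout.Duration <= 0 {
		return fmt.Errorf("boot stage %q needs a positive timeout", s.Name)
	}
	switch s.Health {
	case "tcp", "grpc", "http":
		return nil
	}
	return fmt.Errorf("boot stage %q: unknown health check %q", s.Name, s.Health)
}

func main() {
	var timeout Duration
	if err := timeout.UnmarshalText([]byte("120s")); err != nil {
		fmt.Println(err)
		return
	}
	s := BootStage{
		Name:     "foundation",
		Services: []string{"mcias", "mcns"},
		Timeout:  timeout,
		Health:   "tcp",
	}
	fmt.Println("timeout:", s.Timeout.Duration)
	fmt.Println("validation error:", validateStage(s))
}
```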

Architecture v2 Phase 4 feature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 11:53:11 -07:00
parent 9d543998dc
commit fa4d022bc1
3 changed files with 231 additions and 0 deletions

@@ -19,6 +19,23 @@ type AgentConfig struct {
	MCNS    MCNSConfig    `toml:"mcns"`
	Monitor MonitorConfig `toml:"monitor"`
	Log     LogConfig     `toml:"log"`
	Boot    BootConfig    `toml:"boot"`
}

// BootConfig holds the boot sequence for the master node.
// Each stage's services must be healthy before the next stage starts.
// Worker and edge nodes don't use this; they wait for the master.
type BootConfig struct {
	Sequence []BootStage `toml:"sequence"`
}

// BootStage defines a group of services that must be started and healthy
// before the next stage begins.
type BootStage struct {
	Name     string   `toml:"name"`
	Services []string `toml:"services"`
	Timeout  Duration `toml:"timeout"`
	Health   string   `toml:"health"` // "tcp", "grpc", or "http"
}

// MetacryptConfig holds the Metacrypt CA integration settings for