Add boot sequencing to agent

The agent reads [[boot.sequence]] stages from its config and starts
services in dependency order before accepting gRPC connections. Each
stage waits for its services to pass health checks before proceeding:

- tcp: TCP connect to the container's mapped port
- grpc: standard gRPC health check

Foundation stage (stage 0): blocks and retries indefinitely if health
fails, because all downstream services depend on it.
Non-foundation stages: log warning and proceed on failure.

Uses the recover logic to start containers from the registry, then
health-checks to verify readiness.

Config example:

  [[boot.sequence]]
  name = "foundation"
  services = ["mcias", "mcns"]
  timeout = "120s"
  health = "tcp"

Architecture v2 Phase 4 feature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
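
The stage policy described in the message above can be sketched roughly as follows. This is an illustration only, not the agent's actual code: the names Stage, HealthFunc, checkTCP, waitHealthy, runStages and demo, the polling interval, and the service "example-core" are all assumptions made for the example. It shows a TCP-connect health check and a loop in which stage 0 is retried until healthy while later stages only log a warning.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"net"
	"time"
)

// Stage mirrors one [[boot.sequence]] entry; the field names are assumed.
type Stage struct {
	Name     string
	Services []string
	Timeout  time.Duration
}

// HealthFunc (hypothetical) returns nil when the named service is ready.
type HealthFunc func(ctx context.Context, service string) error

// checkTCP is one possible check for health = "tcp": it passes if a TCP
// connection to addr (the container's mapped port) can be opened.
func checkTCP(ctx context.Context, addr string) error {
	var d net.Dialer
	conn, err := d.DialContext(ctx, "tcp", addr)
	if err != nil {
		return err
	}
	return conn.Close()
}

// waitHealthy polls every service in the stage until all pass, or until the
// stage timeout or the parent context expires.
func waitHealthy(ctx context.Context, st Stage, check HealthFunc, interval time.Duration) error {
	ctx, cancel := context.WithTimeout(ctx, st.Timeout)
	defer cancel()
	for {
		var lastErr error
		for _, svc := range st.Services {
			if err := check(ctx, svc); err != nil {
				lastErr = fmt.Errorf("%s: %w", svc, err)
				break
			}
		}
		if lastErr == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("stage %q: %v (last: %v)", st.Name, ctx.Err(), lastErr)
		case <-time.After(interval):
		}
	}
}

// runStages retries stage 0 until it is healthy (or ctx is cancelled);
// later stages log a warning and move on when they fail.
func runStages(ctx context.Context, stages []Stage, check HealthFunc, interval time.Duration) error {
	for i, st := range stages {
		for {
			err := waitHealthy(ctx, st, check, interval)
			if err == nil {
				break
			}
			if ctx.Err() != nil {
				return ctx.Err()
			}
			if i > 0 {
				log.Printf("warning: %v; proceeding", err)
				break
			}
			log.Printf("foundation not healthy: %v; retrying", err)
		}
	}
	return nil
}

// demo runs two stages against a scripted check: "mcias" fails twice and then
// passes, while the hypothetical "example-core" never becomes healthy.
func demo() (foundationChecks int, err error) {
	check := func(_ context.Context, svc string) error {
		if svc == "mcias" {
			foundationChecks++
			if foundationChecks < 3 {
				return errors.New("not ready")
			}
			return nil
		}
		return errors.New("never ready")
	}
	stages := []Stage{
		{Name: "foundation", Services: []string{"mcias"}, Timeout: 20 * time.Millisecond},
		{Name: "core", Services: []string{"example-core"}, Timeout: 20 * time.Millisecond},
	}
	err = runStages(context.Background(), stages, check, 5*time.Millisecond)
	return foundationChecks, err
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```

In this sketch the foundation stage is checked three times before it passes, and the failing second stage produces a warning but no error, so runStages still returns nil. A real agent would plug checkTCP or a gRPC health client in as the HealthFunc instead of the scripted check.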
@@ -119,6 +119,18 @@ func Run(cfg *config.AgentConfig, version string) error {
 		"runtime", cfg.Agent.ContainerRuntime,
 	)
 
+	// Run boot sequence before starting the gRPC server.
+	// On the master node, this starts foundation services (MCIAS, MCNS)
+	// before core services, ensuring dependencies are met.
+	if len(cfg.Boot.Sequence) > 0 {
+		bootCtx, bootCancel := context.WithCancel(context.Background())
+		defer bootCancel()
+		if err := a.RunBootSequence(bootCtx); err != nil {
+			logger.Error("boot sequence failed", "err", err)
+			// Continue starting the gRPC server; partial boot is better than no agent.
+		}
+	}
+
 	mon.Start()
 
 	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)