Add multi-address fallback for node connectivity

NodeConfig and MasterNodeConfig gain an optional addresses[] field
for fallback addresses tried in order after the primary address.
Provides resilience when Tailscale DNS is down or a node is only
reachable via LAN.

- dialAgentMulti: tries each address with a 3s health check, returns
  first success
- forEachNode: uses multi-address dialing
- AgentPool.AddNodeMulti: master tries all addresses when connecting
- AllAddresses(): deduplicates primary + fallback addresses

Config example:
  [[nodes]]
  name = "rift"
  address = "rift.scylla-hammerhead.ts.net:9444"
  addresses = ["100.95.252.120:9444", "192.168.88.181:9444"]

Existing configs without addresses[] work unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-04-03 09:45:50 -07:00
parent 5da307cab5
commit f9f6f339f4
7 changed files with 114 additions and 22 deletions

View File

@@ -63,7 +63,7 @@ func Run(cfg *config.MasterConfig, version string) error {
// Create agent connection pool.
pool := NewAgentPool(cfg.Master.CACert, token)
for _, n := range cfg.Nodes {
if addErr := pool.AddNode(n.Name, n.Address); addErr != nil {
if addErr := pool.AddNodeMulti(n.Name, n.AllAddresses()); addErr != nil {
logger.Warn("failed to connect to agent", "node", n.Name, "err", addErr)
// Non-fatal: the node may come up later.
}