All nodes now list 1.1.1.1 and 8.8.8.8 as fallback nameservers after
MCNS. When MCNS is down, internal names (.svc.mcp.metacircular.net)
fail but external DNS (google.com, github.com, etc.) keeps working.
Lesson from 2026-04-03 incident: without fallbacks, MCNS failure
caused total DNS blackout including external services, forcing
Tailscale to be disabled to restore any DNS resolution.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
UID 995 conflicted with sshd on orion. Pin to 850 (the 800-899 range
is unused on all nodes and well below NixOS auto-assign range).
Pin GID to 850 as well for consistency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
UID 995 conflicted with sshd on orion. Let NixOS auto-assign the UID
for the mcp system user. Use systemd's %U specifier for XDG_RUNTIME_DIR
instead of the hardcoded UID.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The FIDO2 crypttab options are already on the correct UUID-named device
in hardware-configuration.nix; the "crypted" name only applies to
disko-provisioned hosts (rift).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The module used explicit `config = { ... }` but also had duplicate
networking.nameservers and services.resolved.domains at the top level,
causing a NixOS module evaluation error. Merged the Tailscale nameserver
into the config block and removed the duplicates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The master runs as an MCP-managed container, deployed via
mcp deploy mcp-master --direct. The systemd unit was a temporary
bootstrap mechanism.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Runs the MCP v2 master as a systemd service on rift. Uses
ConditionPathExists so the unit is a no-op on worker nodes
(like orion) that import mcp.nix but don't have the binary.
Starts after mcp-agent.service. Security hardened like the agent
but with ProtectHome=true (master doesn't need /run/user).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace fragile environment.etc.crypttab.text with
boot.initrd.luks.devices for the second SSD, matching
the pattern used for the root drive.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The agent binary is now managed by the operator (scp + install to
/srv/mcp/mcp-agent), not by the Nix flake. This allows agent upgrades
without a full NixOS rebuild.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Podman/skopeo don't use the system CA bundle for registry TLS — they
use /etc/containers/certs.d/<host:port>/ca.crt. Add the WNTRMUTE CA
there so podman push/pull to MCR works without --tls-verify=false.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>