Commit Graph

1 Commits

Author SHA1 Message Date
ad34904b0b Add incident report and disaster recovery runbook
incident-2026-04-03-uid-change.md: Detailed post-mortem of the UID
change cascading failure that took down all services on rift. Documents
the timeline, root causes, recovery steps, and lessons learned.

disaster-recovery.md: Step-by-step runbook for bootstrapping the
platform from zero when all containers are gone. Covers the boot
order (MCNS → mc-proxy/MCR/Metacrypt → master → apps), exact podman
run commands for each service, common errors, and verification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 09:18:45 -07:00