# RUNBOOK.md — eng-pad-server

## 1. Service Overview

eng-pad-server receives engineering notebook data from the Engineering Pad
Android app via gRPC, stores it in SQLite, and serves read-only views
through a web UI. Single authenticated user.

- **Host**: deimos.wntrmute.net
- **URL**: https://pad.metacircular.net
- **Ports**: 443 (nginx → 8080 web UI), 8443 (REST/TLS), 9443 (gRPC/TLS)
- **Data**: `/srv/eng-pad-server/`
- **Config**: `/srv/eng-pad-server/eng-pad-server.toml`
- **TLS**: Let's Encrypt (`/etc/letsencrypt/live/pad.metacircular.net/`),
  copied to `/srv/eng-pad-server/certs/`
- **Container**: `eng-pad-server` (Docker, `--restart unless-stopped`)

## 2. Health Checks

1. Check container is running:
   ```
   docker ps | grep eng-pad-server
   ```

2. Check web UI responds:
   ```
   curl -s https://pad.metacircular.net/login | head -1
   ```

3. Check container logs:
   ```
   docker logs eng-pad-server --tail 20
   ```

## 3. Common Operations

### Start / Stop / Restart

```
docker start eng-pad-server
docker stop eng-pad-server
docker restart eng-pad-server
```

### View Logs

```
docker logs eng-pad-server -f
```

### Deploy New Version

```bash
# From local machine:
rsync -az --exclude='.git' --exclude='srv/' . deimos.wntrmute.net:/tmp/eng-pad-server-build/
ssh deimos.wntrmute.net "cd /tmp/eng-pad-server-build && \
  docker build -t eng-pad-server . && \
  docker stop eng-pad-server && docker rm eng-pad-server && \
  docker run -d --name eng-pad-server --restart unless-stopped \
    -p 127.0.0.1:8090:8080 -p 8443:8443 -p 9443:9443 \
    -v /srv/eng-pad-server:/srv/eng-pad-server eng-pad-server"
```

### Create User

```
docker exec -it eng-pad-server \
  eng-pad-server init -c /srv/eng-pad-server/eng-pad-server.toml
```

### Reset User Password

```
docker exec -it eng-pad-server \
  eng-pad-server passwd -c /srv/eng-pad-server/eng-pad-server.toml
```

### Manual Backup

```
docker exec eng-pad-server \
  eng-pad-server snapshot -c /srv/eng-pad-server/eng-pad-server.toml
```

Backup saved to `/srv/eng-pad-server/backups/`.

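
To confirm that a snapshot actually landed in the backup directory, a check along the following lines may help. This is a minimal sketch rather than part of eng-pad-server: the helper name `newest_backup_age_hours` is hypothetical, and it assumes a Linux host with GNU `stat` (for `stat -c %Y`) and backup filenames without embedded newlines.

```shell
#!/bin/sh
# Sketch: print the age, in whole hours, of the newest file in a backup directory.
# newest_backup_age_hours is a hypothetical helper, not an eng-pad-server command.
newest_backup_age_hours() {
    dir="$1"
    # Newest entry by modification time; empty if the directory is missing or empty.
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no backups found in $dir" >&2
        return 1
    fi
    now=$(date +%s)
    mtime=$(stat -c %Y "$dir/$newest")   # GNU stat; assumes a Linux host
    echo $(( (now - mtime) / 3600 ))
}

# Example call (path taken from this runbook):
#   newest_backup_age_hours /srv/eng-pad-server/backups
```

Run right after a manual snapshot, the function should print `0`. Comparing its output against a threshold of your choosing gives a quick staleness check; any particular threshold is a local policy decision, not something the server defines.
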
### Renew TLS Certificates

After certbot renews the Let's Encrypt cert:

```
sudo cp /etc/letsencrypt/live/pad.metacircular.net/{fullchain,privkey}.pem \
  /srv/eng-pad-server/certs/
docker restart eng-pad-server
```

### Register a FIDO2/U2F Security Key

1. Log in to the web UI at https://pad.metacircular.net with password.
2. Navigate to `/keys`.
3. Enter a name for the key (e.g., "YubiKey 5").
4. Click "Register" and touch the key when prompted.

## 4. Alerting

No automated alerting is configured. Monitor via:

- `docker ps | grep eng-pad-server` — container health
- `docker logs eng-pad-server --since 1h 2>&1 | grep ERROR` — errors
- Backup age: `ls -lt /srv/eng-pad-server/backups/ | head`

## 5. Incident Procedures

### Service Won't Start

1. Check logs:
   ```
   docker logs eng-pad-server --tail 50
   ```

2. Common causes:
   - Config file missing or invalid → fix `/srv/eng-pad-server/eng-pad-server.toml`
   - TLS cert/key missing → re-copy from Let's Encrypt (see Renew TLS above)
   - Port already in use → `ss -tlnp | grep -E '8443|9443|8090'`
   - Database locked → check for zombie processes: `fuser /srv/eng-pad-server/eng-pad-server.db`

### Database Corruption

1. Stop the container:
   ```
   docker stop eng-pad-server
   ```

2. Check integrity:
   ```
   sqlite3 /srv/eng-pad-server/eng-pad-server.db "PRAGMA integrity_check"
   ```

3. If corrupted, restore from backup:
   ```
   cp /srv/eng-pad-server/backups/eng-pad-server-LATEST.db /srv/eng-pad-server/eng-pad-server.db
   ```

4. Restart:
   ```
   docker start eng-pad-server
   ```

### Certificate Expiry

1. Check expiry:
   ```
   openssl x509 -in /srv/eng-pad-server/certs/fullchain.pem -noout -dates
   ```

2. Renew via certbot (see "Renew TLS Certificates" above).

3. Restart the container (picks up new certs on start).

### Disk Full

1. Check disk usage:
   ```
   df -h /srv/eng-pad-server/
   du -sh /srv/eng-pad-server/*
   ```

2. Prune old backups:
   ```
   ls -t /srv/eng-pad-server/backups/ | tail -n +8 | xargs -I{} rm /srv/eng-pad-server/backups/{}
   ```

3. Compact the database:
   ```
   sqlite3 /srv/eng-pad-server/eng-pad-server.db "VACUUM"
   ```

### Sync Fails from Android App

1. Verify the app has the correct server URL (`pad.metacircular.net:9443`).
2. Use "Test Connection" in the app's sync settings for a specific error.
3. Check gRPC port is open: `ss -tlnp | grep 9443`
4. Check firewall: `sudo ufw status | grep 9443` (must be ALLOW).
5. Check TLS cert is valid: `openssl x509 -in /srv/eng-pad-server/certs/fullchain.pem -noout -dates`
6. Check server logs for auth failures: `docker logs eng-pad-server 2>&1 | grep -i error`

## 6. Escalation

If the runbook doesn't resolve the issue:

1. Check ARCHITECTURE.md for system design context.
2. Check AUDIT.md for known security considerations.
3. Review recent commits for changes that may have introduced the issue.