# RUNBOOK.md

Operational procedures for mc-proxy. Written for operators, not developers.

## Service Overview

mc-proxy is a Layer 4 TLS SNI proxy. It routes incoming TLS connections to backend services based on the SNI hostname. It does not terminate TLS or inspect application-layer traffic. A global firewall blocks connections by IP, CIDR, or GeoIP country before routing.

## Health Checks

### Via gRPC (requires admin API enabled)

```bash
mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
```

Expected output:

```
mc-proxy v0.1.0
uptime: 4h32m10s
connections: 1247
:443 routes=2 active=12
:8443 routes=1 active=3
:9443 routes=1 active=0
```

### Via systemd

```bash
systemctl status mc-proxy
journalctl -u mc-proxy -n 50 --no-pager
```

### Via process

```bash
ss -tlnp | grep mc-proxy
```

Verify all configured listener ports are in LISTEN state.

## Common Operations

### Start / Stop / Restart

```bash
systemctl start mc-proxy
systemctl stop mc-proxy
systemctl restart mc-proxy
```

Stopping the service triggers graceful shutdown: new connections are refused, in-flight connections drain for up to `shutdown_timeout` (default 30s), then remaining connections are force-closed.

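
After a start or restart, the listener check from the Health Checks section can be scripted. The following is a minimal sketch, not part of mc-proxy: `missing_ports` is a hypothetical helper name, and the ports in the usage comment are assumptions taken from the example status output above. It reads `ss -tln` output on stdin and prints every requested port that has no socket in LISTEN state.

```shell
# missing_ports PORT...
# Hypothetical helper: reads `ss -tln` output on stdin and prints each PORT
# that does not appear as the local port of a LISTEN socket.
missing_ports() (
    # Field 4 of `ss -tln` is "local-address:port"; keep the text after the
    # last colon so IPv4, IPv6 ([::]:443) and wildcard (*:443) forms all work.
    listening=$(awk '$1 == "LISTEN" { n = split($4, a, ":"); print a[n] }')
    for port in "$@"; do
        printf '%s\n' "$listening" | grep -qxF "$port" || echo "$port"
    done
)

# Usage (ports are assumptions; substitute the configured listeners):
#   ss -tln | missing_ports 443 8443 9443
```

Empty output means every listed port is listening. Any port printed is one to investigate in the logs.
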
### View Logs

```bash
# Recent logs
journalctl -u mc-proxy -n 100 --no-pager

# Follow live
journalctl -u mc-proxy -f

# Filter by severity
journalctl -u mc-proxy -p err
```

### Reload GeoIP Database

Send SIGHUP to reload the GeoIP database without restarting:

```bash
systemctl kill -s HUP mc-proxy
```

Or:

```bash
kill -HUP $(pidof mc-proxy)
```

Verify in logs:

```
level=INFO msg="received SIGHUP, reloading GeoIP database"
```

### Create a Database Backup

```bash
# Manual backup
mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml

# Manual backup to a specific path
mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml -o /tmp/mc-proxy-backup.db
```

Automated daily backups run via the systemd timer:

```bash
# Check timer status
systemctl list-timers mc-proxy-backup.timer

# Run backup manually via systemd
systemctl start mc-proxy-backup.service

# View backup logs
journalctl -u mc-proxy-backup.service -n 20 --no-pager
```

Backups are stored in `/srv/mc-proxy/backups/` and pruned after 30 days.

### Restore from Backup

1. Stop the service:

   ```bash
   systemctl stop mc-proxy
   ```

2. Replace the database:

   ```bash
   cp /srv/mc-proxy/backups/mc-proxy-.db /srv/mc-proxy/mc-proxy.db
   chown mc-proxy:mc-proxy /srv/mc-proxy/mc-proxy.db
   chmod 0600 /srv/mc-proxy/mc-proxy.db
   ```

3. Start the service:

   ```bash
   systemctl start mc-proxy
   ```

4. Verify health:

   ```bash
   mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
   ```

### Manage Routes at Runtime (gRPC)

Routes can be added and removed at runtime via the gRPC admin API using `grpcurl` or any gRPC client.

```bash
# List routes for a listener
# (grpcurl flags, including -d, must come before the address)
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443"}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/ListRoutes

# Add a route
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443", "route": {"hostname": "new.metacircular.net", "backend": "127.0.0.1:38443"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddRoute

# Remove a route
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443", "hostname": "old.metacircular.net"}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/RemoveRoute
```

### Manage Firewall Rules at Runtime (gRPC)

```bash
# List rules
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  localhost:9090 mc_proxy.v1.ProxyAdminService/GetFirewallRules

# Block an IP
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "203.0.113.50"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Block a CIDR
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_CIDR", "value": "198.51.100.0/24"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Block a country
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_COUNTRY", "value": "RU"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Remove a rule
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "203.0.113.50"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/RemoveFirewallRule
```

## Deployment with MCP

mc-proxy runs on rift as a single container managed by MCP. The service definition lives at `~/.config/mcp/services/mc-proxy.toml` on rift (reference copy at `deploy/mc-proxy-rift.toml` in this repo).

The container mounts `/srv/mc-proxy`, which holds the config file, SQLite database, GeoIP database, and TLS certificates for backends. It runs as `--user 0:0` under rootless podman.

Listeners: `:443` (L7 terminating), `:8443` (L4 passthrough), `:9443` (L4 passthrough).

### Deploy or Update

```bash
mcp deploy mc-proxy
```

### Restart / Stop

```bash
mcp restart mc-proxy
mcp stop mc-proxy
```

### Check Status

```bash
mcp ps
mcp status mc-proxy
```

### View Logs

```bash
ssh rift 'doas su - mcp -s /bin/sh -c "podman logs mc-proxy"'
```

### Update Routes

Edit the config at `/srv/mc-proxy/mc-proxy.toml` on rift, then restart:

```bash
mcp restart mc-proxy
```

Routes added at runtime via the gRPC admin API are persisted in the database and survive restarts. Editing the TOML config is only necessary for changing listener definitions or static seed routes.

## Incident Procedures

### Proxy Not Starting

1. Check logs for the error:

   ```bash
   journalctl -u mc-proxy -n 50 --no-pager
   ```

2. Common causes:

   - **"database.path is required"**: config file missing or malformed.
   - **"firewall: geoip_db is required"**: country blocks configured but GeoIP database missing.
   - **"address already in use"**: another process holds the port.

     ```bash
     ss -tlnp | grep ':'
     ```

   - **Permission denied on database**: check ownership:

     ```bash
     ls -la /srv/mc-proxy/mc-proxy.db
     chown mc-proxy:mc-proxy /srv/mc-proxy/mc-proxy.db
     ```

### High Connection Count / Resource Exhaustion

1. Check active connections:

   ```bash
   mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
   ```

2. Check system-level connection count:

   ```bash
   ss -tn | grep -c ':'
   ```

3. If under attack, add firewall rules via gRPC to block the source (grpcurl flags, including `-d`, must come before the address):

   ```bash
   grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
     -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": ""}}' \
     localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule
   ```

4. If many IPs come from one region, consider a country block or CIDR block.

### Database Corruption

1. Stop the service:

   ```bash
   systemctl stop mc-proxy
   ```

2. Check database integrity:

   ```bash
   sqlite3 /srv/mc-proxy/mc-proxy.db "PRAGMA integrity_check;"
   ```

3. If corrupted, restore from the most recent backup (see [Restore from Backup](#restore-from-backup)).

4. If no backups exist, delete the database and restart. The service will re-seed from the TOML configuration:

   ```bash
   rm /srv/mc-proxy/mc-proxy.db
   systemctl start mc-proxy
   ```

   Note: any routes or firewall rules added at runtime via gRPC will be lost.

### GeoIP Database Stale or Missing

1. Download a fresh copy of GeoLite2-Country.mmdb from MaxMind.

2. Place it at the configured path:

   ```bash
   cp GeoLite2-Country.mmdb /srv/mc-proxy/GeoLite2-Country.mmdb
   chown mc-proxy:mc-proxy /srv/mc-proxy/GeoLite2-Country.mmdb
   ```

3. Reload without restart:

   ```bash
   systemctl kill -s HUP mc-proxy
   ```

### Certificate Expiry (gRPC Admin API)

The gRPC admin API uses TLS certificates from `/srv/mc-proxy/certs/`. Certificates are loaded at startup; replacing them requires a restart.

1. Replace the certificates:

   ```bash
   cp new-cert.pem /srv/mc-proxy/certs/cert.pem
   cp new-key.pem /srv/mc-proxy/certs/key.pem
   chown mc-proxy:mc-proxy /srv/mc-proxy/certs/*.pem
   chmod 0600 /srv/mc-proxy/certs/key.pem
   ```

2. Restart:

   ```bash
   systemctl restart mc-proxy
   ```

Note: certificate expiry does not affect the proxy listeners, because they do not terminate TLS.

### Backend Unreachable

If a backend service is down, connections to routes pointing at that backend will fail at the dial phase and the client receives a TCP RST. mc-proxy logs the dial failure at `warn` level.

1. Check logs for dial errors:

   ```bash
   journalctl -u mc-proxy -n 100 --no-pager | grep "dial"
   ```

2. Verify the backend is running:

   ```bash
   ss -tlnp | grep ':'
   ```

3. This is not an mc-proxy issue; fix the backend service.

## Escalation

If the runbook does not resolve the issue:

1. Collect logs: `journalctl -u mc-proxy --since "1 hour ago" > /tmp/mc-proxy-logs.txt`
2. Collect status: `mc-proxy status -c /srv/mc-proxy/mc-proxy.toml > /tmp/mc-proxy-status.txt`
3. Collect database state: `mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml -o /tmp/mc-proxy-escalation.db`
4. Escalate with the collected artifacts.
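
The collection steps above can be bundled into one helper so nothing is forgotten under pressure. This is a minimal sketch, not a shipped tool: `collect_artifacts` is a hypothetical name, and continuing past a failed step is a design assumption made because the proxy may be down when artifacts are needed. It runs only the three commands listed in the escalation steps.

```shell
# collect_artifacts [OUT_DIR] [CONFIG]
# Hypothetical helper: runs the three collection commands from the escalation
# steps and writes their output into OUT_DIR (default /tmp).
collect_artifacts() (
    out_dir=${1:-/tmp}
    conf=${2:-/srv/mc-proxy/mc-proxy.toml}

    # Each step reports its own failure and lets the remaining steps run.
    journalctl -u mc-proxy --since "1 hour ago" > "$out_dir/mc-proxy-logs.txt" \
        || echo "warning: log collection failed" >&2
    mc-proxy status -c "$conf" > "$out_dir/mc-proxy-status.txt" \
        || echo "warning: status collection failed" >&2
    mc-proxy snapshot -c "$conf" -o "$out_dir/mc-proxy-escalation.db" \
        || echo "warning: snapshot failed" >&2

    echo "artifacts written to $out_dir"
)

# Usage:
#   collect_artifacts /tmp
```

Calling it with `/tmp` produces the same file names as steps 1 to 3.
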