mc-proxy/RUNBOOK.md
Kyle Isom dc1816b159 Add MCP deployment section to RUNBOOK.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:09:18 -07:00

RUNBOOK.md

Operational procedures for mc-proxy. Written for operators, not developers.

Service Overview

mc-proxy is a Layer 4 TLS SNI proxy. It routes incoming TLS connections to backend services based on the SNI hostname. It does not terminate TLS or inspect application-layer traffic. A global firewall blocks connections by IP, CIDR, or GeoIP country before routing.

Health Checks

Via gRPC (requires admin API enabled)

mc-proxy status -c /srv/mc-proxy/mc-proxy.toml

Expected output:

mc-proxy v0.1.0
uptime:      4h32m10s
connections: 1247

  :443   routes=2  active=12
  :8443  routes=1  active=3
  :9443  routes=1  active=0
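
For monitoring scripts, the per-listener counters can be summed from the status output. The following is a minimal sketch that assumes the line format shown in the sample output (one active=N field per listener line). total_active is a hypothetical helper name, not part of mc-proxy.

```shell
# total_active: read `mc-proxy status` output on stdin and print the sum of
# all active=N fields. Assumes the line format from the sample output.
total_active() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^active=[0-9]+$/) {
        sub(/^active=/, "", $i)
        sum += $i
      }
  } END { print sum + 0 }'
}

# Usage (on the proxy host):
#   mc-proxy status -c /srv/mc-proxy/mc-proxy.toml | total_active
```

If the status format changes in a later release, the field match will need adjusting.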

Via systemd

systemctl status mc-proxy
journalctl -u mc-proxy -n 50 --no-pager

Via process

ss -tlnp | grep mc-proxy

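Verify that every configured listener port appears in LISTEN state. Run the command as root: without privileges, ss omits the process name for sockets owned by other users, so the grep matches nothing even when the proxy is listening.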
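
The port check can be scripted so that a missing listener is reported explicitly. This is a sketch only: missing_ports is a hypothetical helper, and it assumes the usual `ss -tln` column layout in which the local address is the fourth field.

```shell
# missing_ports PORT...: read `ss -tln` output on stdin and print every
# requested port that has no socket in LISTEN state.
missing_ports() {
  sockets=$(cat)
  for port in "$@"; do
    printf '%s\n' "$sockets" |
      awk -v p="$port" '$1 == "LISTEN" && $4 ~ (":" p "$") { found = 1 }
                        END { exit !found }' || echo "$port"
  done
}

# Usage, with the listener ports from the sample status output:
#   ss -tln | missing_ports 443 8443 9443
```

Empty output means all requested ports are listening. The helper matches on port number only, so it does not confirm that mc-proxy is the process holding the port.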

Common Operations

Start / Stop / Restart

systemctl start mc-proxy
systemctl stop mc-proxy
systemctl restart mc-proxy

Stopping the service triggers graceful shutdown: new connections are refused, in-flight connections drain for up to shutdown_timeout (default 30s), then remaining connections are force-closed.

View Logs

# Recent logs
journalctl -u mc-proxy -n 100 --no-pager

# Follow live
journalctl -u mc-proxy -f

# Filter by severity
journalctl -u mc-proxy -p err

Reload GeoIP Database

Send SIGHUP to reload the GeoIP database without restarting:

systemctl kill -s HUP mc-proxy

Or:

kill -HUP $(pidof mc-proxy)

Verify in logs:

level=INFO msg="received SIGHUP, reloading GeoIP database"

Create a Database Backup

# Manual backup
mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml

# Manual backup to a specific path
mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml -o /tmp/mc-proxy-backup.db

Automated daily backups run via the systemd timer:

# Check timer status
systemctl list-timers mc-proxy-backup.timer

# Run backup manually via systemd
systemctl start mc-proxy-backup.service

# View backup logs
journalctl -u mc-proxy-backup.service -n 20 --no-pager

Backups are stored in /srv/mc-proxy/backups/ and pruned after 30 days.
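
This runbook does not show how the 30-day pruning is implemented. A retention sweep of that kind could look like the sketch below. The mc-proxy-*.db file pattern follows the backup file name used in the restore procedure, and prune_backups is a hypothetical helper, not the command the backup unit actually runs.

```shell
# prune_backups DIR [DAYS]: delete backup files directly inside DIR that were
# last modified more than DAYS days ago (find's -mtime +DAYS semantics,
# default 30). Prints each deleted path.
prune_backups() {
  dir=$1
  days=${2:-30}
  find "$dir" -maxdepth 1 -type f -name 'mc-proxy-*.db' \
    -mtime +"$days" -print -delete
}

# Usage:
#   prune_backups /srv/mc-proxy/backups 30
```

Dropping -delete gives a dry run that only lists the files that would be removed.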

Restore from Backup

  1. Stop the service:
    systemctl stop mc-proxy
    
  2. Replace the database:
    cp /srv/mc-proxy/backups/mc-proxy-<timestamp>.db /srv/mc-proxy/mc-proxy.db
    chown mc-proxy:mc-proxy /srv/mc-proxy/mc-proxy.db
    chmod 0600 /srv/mc-proxy/mc-proxy.db
    
  3. Start the service:
    systemctl start mc-proxy
    
  4. Verify health:
    mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
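
When the most recent backup is wanted, the file can be selected by modification time instead of typing the timestamp by hand. A minimal sketch, assuming backups keep the mc-proxy-<timestamp>.db naming used in step 2; latest_backup is a hypothetical helper.

```shell
# latest_backup DIR: print the most recently modified mc-proxy-*.db in DIR.
# Returns non-zero if no backup file is found.
latest_backup() {
  newest=$(ls -1t "$1"/mc-proxy-*.db 2>/dev/null | head -n 1)
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Usage (with the service stopped):
#   cp "$(latest_backup /srv/mc-proxy/backups)" /srv/mc-proxy/mc-proxy.db
```

Selection is by mtime, so a backup file that was touched or copied after creation can sort ahead of a genuinely newer one.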
    

Manage Routes at Runtime (gRPC)

Routes can be added and removed at runtime via the gRPC admin API using grpcurl or any gRPC client. grpcurl only parses flags that appear before the server address, so -d must precede localhost:9090.

# List routes for a listener
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443"}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/ListRoutes

# Add a route
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443", "route": {"hostname": "new.metacircular.net", "backend": "127.0.0.1:38443"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddRoute

# Remove a route
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443", "hostname": "old.metacircular.net"}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/RemoveRoute
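
Hand-written JSON is easy to mistype during an incident. The AddRoute request body can be generated from three arguments, as in this sketch. The field names are copied from the examples in this section, and route_payload is a hypothetical helper.

```shell
# route_payload LISTENER HOSTNAME BACKEND: print the AddRoute request body.
# Arguments are interpolated as-is, so they must not contain quotes or
# backslashes.
route_payload() {
  printf '{"listener_addr": "%s", "route": {"hostname": "%s", "backend": "%s"}}\n' \
    "$1" "$2" "$3"
}

# Usage:
#   grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
#     -d "$(route_payload :443 new.metacircular.net 127.0.0.1:38443)" \
#     localhost:9090 mc_proxy.v1.ProxyAdminService/AddRoute
```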

Manage Firewall Rules at Runtime (gRPC)

# List rules
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  localhost:9090 mc_proxy.v1.ProxyAdminService/GetFirewallRules

# Block an IP (flags such as -d must come before the server address)
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "203.0.113.50"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Block a CIDR
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_CIDR", "value": "198.51.100.0/24"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Block a country
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_COUNTRY", "value": "RU"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Remove a rule
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "203.0.113.50"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/RemoveFirewallRule
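
The three rule types differ only in the shape of the value, so the request body can be derived from the value alone. The sketch below uses a simple heuristic (a slash means CIDR, two uppercase letters mean a country code, anything else is treated as an IP). It does not validate the address itself, and firewall_rule is a hypothetical helper.

```shell
# firewall_rule VALUE: print an AddFirewallRule/RemoveFirewallRule body,
# choosing the rule type from the shape of VALUE.
firewall_rule() {
  case $1 in
    */*)                      type=FIREWALL_RULE_TYPE_CIDR ;;
    [[:upper:]][[:upper:]])   type=FIREWALL_RULE_TYPE_COUNTRY ;;
    *)                        type=FIREWALL_RULE_TYPE_IP ;;
  esac
  printf '{"rule": {"type": "%s", "value": "%s"}}\n' "$type" "$1"
}

# Usage:
#   grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
#     -d "$(firewall_rule 198.51.100.0/24)" \
#     localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule
```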

Deployment with MCP

mc-proxy runs on rift as a single container managed by MCP. The service definition lives at ~/.config/mcp/services/mc-proxy.toml on rift (reference copy at deploy/mc-proxy-rift.toml in this repo). The container mounts /srv/mc-proxy, which holds the config file, SQLite database, GeoIP database, and TLS certificates for backends. It runs with --user 0:0 under rootless podman, so root inside the container maps to the unprivileged host user that runs podman, not to host root.

Listeners: :443 (L7 terminating), :8443 (L4 passthrough), :9443 (L4 passthrough).

Deploy or Update

mcp deploy mc-proxy

Restart / Stop

mcp restart mc-proxy
mcp stop mc-proxy

Check Status

mcp ps
mcp status mc-proxy

View Logs

ssh rift 'doas su - mcp -s /bin/sh -c "podman logs mc-proxy"'

Update Routes

Edit the config at /srv/mc-proxy/mc-proxy.toml on rift, then restart:

mcp restart mc-proxy

Routes added at runtime via the gRPC admin API are persisted in the database and survive restarts. Editing the TOML config is only necessary for changing listener definitions or static seed routes.

Incident Procedures

Proxy Not Starting

  1. Check logs for the error:
    journalctl -u mc-proxy -n 50 --no-pager
    
  2. Common causes:
    • "database.path is required" — config file missing or malformed.
    • "firewall: geoip_db is required" — country blocks configured but GeoIP database missing.
    • "address already in use" — another process holds the port.
      ss -tlnp | grep ':<port>'
      
    • Permission denied on database — check ownership:
      ls -la /srv/mc-proxy/mc-proxy.db
      chown mc-proxy:mc-proxy /srv/mc-proxy/mc-proxy.db
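
The causes listed here can be matched against the journal mechanically. This is a sketch under assumptions: the first three strings are the error messages quoted in this list, while the "permission denied" wording is a guess at how the database error is logged. startup_hint is a hypothetical helper.

```shell
# startup_hint: read mc-proxy log lines on stdin and print one hint per
# known startup error found (case-insensitive fixed-string match).
startup_hint() {
  logs=$(cat)
  has() { printf '%s\n' "$logs" | grep -Fqi -- "$1"; }
  has 'database.path is required' && echo 'config: file missing or malformed'
  has 'geoip_db is required'      && echo 'geoip: country rules set but GeoIP database missing'
  has 'address already in use'    && echo 'port: another process holds a listener port'
  has 'permission denied'         && echo 'perms: check ownership of /srv/mc-proxy/mc-proxy.db'
  return 0
}

# Usage:
#   journalctl -u mc-proxy -n 50 --no-pager | startup_hint
```

No output means none of the known messages appeared and the logs need to be read by hand.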
      

High Connection Count / Resource Exhaustion

  1. Check active connections:
    mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
    
  2. Check system-level connection count:
    ss -Htn '( sport = :<port> )' | wc -l
    
  3. If under attack, add firewall rules via gRPC to block the source:
    grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
      -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "<attacker-ip>"}}' \
      localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule
    
  4. If the traffic comes from many IPs in one network or region, consider a CIDR block or a country block instead of individual IP rules.
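
To decide what to block, it helps to see which peers hold the most connections. A minimal sketch, assuming the standard `ss -tn` column layout in which the peer address is the fifth field; top_peers is a hypothetical helper.

```shell
# top_peers [N]: read `ss -Htn` output on stdin and print the N peer
# addresses (default 10) with the most connections, as "COUNT ADDRESS".
top_peers() {
  awk '{ peer = $5; sub(/:[0-9]+$/, "", peer); print peer }' |
    sort | uniq -c | sort -rn | head -n "${1:-10}" |
    awk '{ print $1, $2 }'
}

# Usage:
#   ss -Htn '( sport = :443 )' | top_peers 5
```

A single address dominating the list suggests an IP rule. Many addresses sharing a prefix suggest a CIDR rule.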

Database Corruption

  1. Stop the service:
    systemctl stop mc-proxy
    
  2. Check database integrity:
    sqlite3 /srv/mc-proxy/mc-proxy.db "PRAGMA integrity_check;"
    
  3. If corrupted, restore from the most recent backup (see Restore from Backup).
  4. If no backups exist, delete the database and restart. The service will re-seed from the TOML configuration:
    rm /srv/mc-proxy/mc-proxy.db
    systemctl start mc-proxy
    
    Note: any routes or firewall rules added at runtime via gRPC will be lost.
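
The decision in steps 2 to 4 can be written down as a small function. SQLite prints the single line "ok" when integrity_check finds no problems. The backup file pattern is assumed from the restore procedure, and corruption_next_step is a hypothetical helper.

```shell
# corruption_next_step RESULT BACKUP_DIR: map the integrity_check output and
# the presence of backups to the action described in the steps above.
corruption_next_step() {
  if [ "$1" = "ok" ]; then
    echo "healthy: start the service"
  elif ls "$2"/mc-proxy-*.db >/dev/null 2>&1; then
    echo "restore: use the most recent backup"
  else
    echo "reseed: delete the database and restart"
  fi
}

# Usage (with the service stopped):
#   result=$(sqlite3 /srv/mc-proxy/mc-proxy.db "PRAGMA integrity_check;")
#   corruption_next_step "$result" /srv/mc-proxy/backups
```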

GeoIP Database Stale or Missing

  1. Download a fresh copy of GeoLite2-Country.mmdb from MaxMind.
  2. Place it at the configured path:
    cp GeoLite2-Country.mmdb /srv/mc-proxy/GeoLite2-Country.mmdb
    chown mc-proxy:mc-proxy /srv/mc-proxy/GeoLite2-Country.mmdb
    
  3. Reload without restart:
    systemctl kill -s HUP mc-proxy
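
Staleness can be checked from the file's modification time before deciding to download a new copy. The 30-day threshold in this sketch is an arbitrary assumption, not a value taken from mc-proxy, and geoip_is_stale is a hypothetical helper.

```shell
# geoip_is_stale FILE [DAYS]: succeed if FILE is missing or was last
# modified more than DAYS days ago (default 30, an arbitrary threshold).
geoip_is_stale() {
  [ -f "$1" ] || return 0
  [ -n "$(find "$1" -mtime +"${2:-30}")" ]
}

# Usage:
#   if geoip_is_stale /srv/mc-proxy/GeoLite2-Country.mmdb; then
#     echo "GeoIP database needs refreshing"
#   fi
```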
    

Certificate Expiry (gRPC Admin API)

The gRPC admin API uses TLS certificates from /srv/mc-proxy/certs/. Certificates are loaded at startup; replacing them requires a restart.

  1. Replace the certificates:
    cp new-cert.pem /srv/mc-proxy/certs/cert.pem
    cp new-key.pem /srv/mc-proxy/certs/key.pem
    chown mc-proxy:mc-proxy /srv/mc-proxy/certs/*.pem
    chmod 0600 /srv/mc-proxy/certs/key.pem
    
  2. Restart:
    systemctl restart mc-proxy
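
Expiry can be caught ahead of time with openssl's -checkend option, which exits non-zero when a certificate expires within the given number of seconds. A minimal sketch; cert_expires_within is a hypothetical helper and the 14-day window in the usage line is only an example.

```shell
# cert_expires_within FILE SECONDS: succeed if the certificate in FILE
# expires within SECONDS from now (or has already expired). Also succeeds
# if openssl cannot read FILE, so treat success as "needs attention".
cert_expires_within() {
  ! openssl x509 -in "$1" -noout -checkend "$2" >/dev/null
}

# Usage:
#   if cert_expires_within /srv/mc-proxy/certs/cert.pem $((14 * 86400)); then
#     echo "admin API certificate expires within 14 days"
#   fi
```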
    

Note: expiry of the admin API certificate does not affect the L4 passthrough listeners, which do not terminate TLS.

Backend Unreachable

If a backend service is down, connections to routes pointing at that backend will fail at the dial phase and the client receives a TCP RST. mc-proxy logs the dial failure at warn level.

  1. Check logs for dial errors:
    journalctl -u mc-proxy -n 100 --no-pager | grep "dial"
    
  2. Verify the backend is running:
    ss -tlnp | grep ':<backend-port>'
    
  3. This is not an mc-proxy issue — fix the backend service.

Escalation

If the runbook does not resolve the issue:

  1. Collect logs: journalctl -u mc-proxy --since "1 hour ago" > /tmp/mc-proxy-logs.txt
  2. Collect status: mc-proxy status -c /srv/mc-proxy/mc-proxy.toml > /tmp/mc-proxy-status.txt
  3. Collect database state: mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml -o /tmp/mc-proxy-escalation.db
  4. Escalate with the collected artifacts.