Add documentation, Docker setup, and tests for server and gRPC packages

Rewrite README with project overview and quick start. Add RUNBOOK with operational procedures and incident playbooks. Fix Dockerfile for Go 1.25 with version injection. Add docker-compose.yml. Clean up golangci.yaml for mc-proxy. Add server tests (10) covering the full proxy pipeline with TCP echo backends, and grpcserver tests (13) covering all admin API RPCs with bufconn and write-through DB verification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

304
RUNBOOK.md
Normal file
@@ -0,0 +1,304 @@
# RUNBOOK.md

Operational procedures for mc-proxy. Written for operators, not developers.

## Service Overview

mc-proxy is a Layer 4 TLS SNI proxy. It routes incoming TLS connections to
backend services based on the SNI hostname. It does not terminate TLS or
inspect application-layer traffic. A global firewall blocks connections by
IP, CIDR, or GeoIP country before routing.

## Health Checks

### Via gRPC (requires admin API enabled)

```bash
mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
```

Expected output:

```
mc-proxy v0.1.0
uptime: 4h32m10s
connections: 1247

:443 routes=2 active=12
:8443 routes=1 active=3
:9443 routes=1 active=0
```

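For monitoring scripts, the per-listener `active=` counts can be totalled. The following is a minimal sketch, assuming the status output keeps the `routes=N active=N` layout shown in the sample; `total_active` is a hypothetical helper name and not part of mc-proxy.

```bash
# total_active: sum the active=N fields from `mc-proxy status` output on stdin.
# Assumes listener lines look like ":443 routes=2 active=12" (layout copied
# from the sample output; it is not a documented, stable interface).
total_active() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^active=[0-9]+$/) {
                    split($i, kv, "=")
                    sum += kv[2]
                }
            }
        }
        END { print sum + 0 }   # prints 0 when no listener lines were seen
    '
}
```

Usage: `mc-proxy status -c /srv/mc-proxy/mc-proxy.toml | total_active`. For the sample output this prints `15`.
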
### Via systemd

```bash
systemctl status mc-proxy
journalctl -u mc-proxy -n 50 --no-pager
```

### Via process

```bash
ss -tlnp | grep mc-proxy
```

Verify all configured listener ports are in LISTEN state.

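The listener check can be scripted rather than read by eye. This is a sketch under the assumption that `ss -tln` prints the state in column 1 and `Local Address:Port` in column 4; `missing_listeners` is a hypothetical helper, and the port numbers in the usage line are the ones from the sample status output.

```bash
# missing_listeners: read `ss -tln` output on stdin and print every port given
# as an argument that has no socket in LISTEN state.
# Returns 0 if all ports are listening, 1 otherwise.
missing_listeners() {
    ss_out=$(cat)
    rc=0
    for port in "$@"; do
        # Match the port at the end of the Local Address:Port column.
        if ! printf '%s\n' "$ss_out" |
            awk -v p="$port" '$1 == "LISTEN" && $4 ~ (":" p "$") { found = 1 } END { exit !found }'
        then
            echo "$port"
            rc=1
        fi
    done
    return "$rc"
}
```

Usage: `ss -tln | missing_listeners 443 8443 9443`. No output and a zero exit status mean every listed port is in LISTEN state.
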
## Common Operations

### Start / Stop / Restart

```bash
systemctl start mc-proxy
systemctl stop mc-proxy
systemctl restart mc-proxy
```

Stopping the service triggers graceful shutdown: new connections are refused,
in-flight connections drain for up to `shutdown_timeout` (default 30s), then
remaining connections are force-closed.

### View Logs

```bash
# Recent logs
journalctl -u mc-proxy -n 100 --no-pager

# Follow live
journalctl -u mc-proxy -f

# Filter by severity
journalctl -u mc-proxy -p err
```

### Reload GeoIP Database

Send SIGHUP to reload the GeoIP database without restarting:

```bash
systemctl kill -s HUP mc-proxy
```

Or:

```bash
kill -HUP $(pidof mc-proxy)
```

Verify in logs:

```
level=INFO msg="received SIGHUP, reloading GeoIP database"
```

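The log check can be folded into a one-line test. A minimal sketch, using the message text quoted in this section; `geoip_reload_logged` is a hypothetical helper name. It only confirms that the signal was received, not that the new database parsed successfully.

```bash
# geoip_reload_logged: succeed if the SIGHUP reload message appears in the log
# text read from stdin. The message string is copied from this runbook.
geoip_reload_logged() {
    grep -q 'received SIGHUP, reloading GeoIP database'
}
```

Usage: `journalctl -u mc-proxy --since "1 minute ago" --no-pager | geoip_reload_logged && echo "reload confirmed"`.
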
### Create a Database Backup

```bash
# Manual backup
mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml

# Manual backup to a specific path
mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml -o /tmp/mc-proxy-backup.db
```

Automated daily backups run via the systemd timer:

```bash
# Check timer status
systemctl list-timers mc-proxy-backup.timer

# Run backup manually via systemd
systemctl start mc-proxy-backup.service

# View backup logs
journalctl -u mc-proxy-backup.service -n 20 --no-pager
```

Backups are stored in `/srv/mc-proxy/backups/` and pruned after 30 days.

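To confirm the timer is actually producing files, or to pick a file for a restore, the newest backup can be located by modification time. This sketch assumes backups follow the `mc-proxy-<timestamp>.db` naming used elsewhere in this runbook; `latest_backup` is a hypothetical helper.

```bash
# latest_backup: print the most recently modified mc-proxy-*.db file in the
# given directory (default: /srv/mc-proxy/backups). Returns 1 if none exist.
# Relies on the `-nt` (newer than) test, which bash, dash, and busybox support.
latest_backup() {
    dir=${1:-/srv/mc-proxy/backups}
    newest=
    for f in "$dir"/mc-proxy-*.db; do
        [ -e "$f" ] || continue   # the glob matched nothing
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

Usage: `ls -l "$(latest_backup)"` shows the timestamp of the newest backup; a file older than a day suggests the timer has stopped firing.
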
### Restore from Backup

1. Stop the service:
   ```bash
   systemctl stop mc-proxy
   ```
2. Replace the database:
   ```bash
   cp /srv/mc-proxy/backups/mc-proxy-<timestamp>.db /srv/mc-proxy/mc-proxy.db
   chown mc-proxy:mc-proxy /srv/mc-proxy/mc-proxy.db
   chmod 0600 /srv/mc-proxy/mc-proxy.db
   ```
3. Start the service:
   ```bash
   systemctl start mc-proxy
   ```
4. Verify health:
   ```bash
   mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
   ```

### Manage Routes at Runtime (gRPC)

Routes can be added and removed at runtime via the gRPC admin API using
`grpcurl` or any gRPC client. With `grpcurl`, all flags (including `-d`) must
come before the server address.

```bash
# List routes for a listener
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443"}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/ListRoutes

# Add a route
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443", "route": {"hostname": "new.metacircular.net", "backend": "127.0.0.1:38443"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddRoute

# Remove a route
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"listener_addr": ":443", "hostname": "old.metacircular.net"}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/RemoveRoute
```

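Hand-typing nested JSON during an incident is error-prone, so a small payload builder can help. This is a sketch only: the field names are copied from the `AddRoute` example in this section, and `add_route_payload` is a hypothetical helper, not something shipped with mc-proxy.

```bash
# add_route_payload: build the AddRoute request body from a listener address,
# a hostname, and a backend address.
# Values are inserted verbatim, so they must not contain quotes or backslashes.
add_route_payload() {
    printf '{"listener_addr": "%s", "route": {"hostname": "%s", "backend": "%s"}}\n' "$1" "$2" "$3"
}
```

Usage: pass the result to `grpcurl` with `-d "$(add_route_payload :443 new.metacircular.net 127.0.0.1:38443)"`, placing `-d` before the server address.
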
### Manage Firewall Rules at Runtime (gRPC)

```bash
# List rules
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  localhost:9090 mc_proxy.v1.ProxyAdminService/GetFirewallRules

# Block an IP
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "203.0.113.50"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Block a CIDR
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_CIDR", "value": "198.51.100.0/24"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Block a country
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_COUNTRY", "value": "RU"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule

# Remove a rule
grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
  -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "203.0.113.50"}}' \
  localhost:9090 mc_proxy.v1.ProxyAdminService/RemoveFirewallRule
```

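The three rule types differ only in the `type` enum, which can be inferred from the shape of the value. A minimal sketch, assuming the enum names shown in this section; `firewall_rule_payload` is a hypothetical helper and the detection is a convenience heuristic, not validation.

```bash
# firewall_rule_payload: build an AddFirewallRule / RemoveFirewallRule request
# body, choosing the rule type from the shape of the value:
#   contains "/"           -> FIREWALL_RULE_TYPE_CIDR
#   exactly two uppercase  -> FIREWALL_RULE_TYPE_COUNTRY
#   anything else          -> FIREWALL_RULE_TYPE_IP
# The address itself is not validated here; the server remains the authority.
firewall_rule_payload() {
    case $1 in
        */*)        rule_type=FIREWALL_RULE_TYPE_CIDR ;;
        [A-Z][A-Z]) rule_type=FIREWALL_RULE_TYPE_COUNTRY ;;
        *)          rule_type=FIREWALL_RULE_TYPE_IP ;;
    esac
    printf '{"rule": {"type": "%s", "value": "%s"}}\n' "$rule_type" "$1"
}
```

Usage: `-d "$(firewall_rule_payload 198.51.100.0/24)"` in place of the hand-written JSON body.
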
## Incident Procedures

### Proxy Not Starting

1. Check logs for the error:
   ```bash
   journalctl -u mc-proxy -n 50 --no-pager
   ```
2. Common causes:
   - **"database.path is required"**: config file missing or malformed.
   - **"firewall: geoip_db is required"**: country blocks configured but GeoIP database missing.
   - **"address already in use"**: another process holds the port.
     ```bash
     ss -tlnp | grep ':<port>'
     ```
   - **Permission denied on database**: check ownership and correct it if needed.
     ```bash
     ls -la /srv/mc-proxy/mc-proxy.db
     chown mc-proxy:mc-proxy /srv/mc-proxy/mc-proxy.db
     ```

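The cause list can be turned into a quick log triage filter. This sketch matches only the error strings quoted in this section and assumes they appear verbatim in the journal; `startup_hints` is a hypothetical helper.

```bash
# startup_hints: scan log text on stdin for the startup errors listed in this
# runbook and print one hint per distinct cause. Returns 1 if none matched.
startup_hints() {
    awk '
        # Print each hint once, even if the service logged the error repeatedly.
        function hint(msg) { found = 1; if (!(msg in seen)) { seen[msg] = 1; print msg } }
        /database\.path is required/     { hint("config file missing or malformed") }
        /firewall: geoip_db is required/ { hint("country blocks configured but GeoIP database missing") }
        /address already in use/         { hint("another process holds the port") }
        /[Pp]ermission denied/           { hint("check ownership of the database file") }
        END { exit !found }
    '
}
```

Usage: `journalctl -u mc-proxy -n 50 --no-pager | startup_hints`. A non-zero exit status means none of the known causes matched and the logs need to be read manually.
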
### High Connection Count / Resource Exhaustion

1. Check active connections:
   ```bash
   mc-proxy status -c /srv/mc-proxy/mc-proxy.toml
   ```
2. Check system-level connection count:
   ```bash
   ss -tn | grep -c ':<port>'
   ```
3. If under attack, add firewall rules via gRPC to block the source:
   ```bash
   grpcurl -cacert ca.pem -cert client.pem -key client-key.pem \
     -d '{"rule": {"type": "FIREWALL_RULE_TYPE_IP", "value": "<attacker-ip>"}}' \
     localhost:9090 mc_proxy.v1.ProxyAdminService/AddFirewallRule
   ```
4. If many IPs from one region, consider a country block or CIDR block.

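Before choosing between an IP, CIDR, or country block, it helps to see which peers hold the most connections. The sketch below assumes the default `ss -tn` column layout (state in column 1, `Peer Address:Port` in column 5); `top_sources` is a hypothetical helper.

```bash
# top_sources: read `ss -tn` output on stdin and print "count address" for the
# peers with the most established connections, busiest first.
# Optional argument: number of rows to print (default 10).
top_sources() {
    awk '
        $1 == "ESTAB" {
            peer = $5                   # Peer Address:Port column
            sub(/:[0-9]+$/, "", peer)   # drop the port, keep the address
            count[peer]++
        }
        END { for (p in count) print count[p], p }
    ' | sort -rn | head -n "${1:-10}"
}
```

Usage: `ss -tn | grep ':<port>' | top_sources 5`. One dominant address suggests an IP rule; many addresses within one range suggest a CIDR rule.
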
### Database Corruption

1. Stop the service:
   ```bash
   systemctl stop mc-proxy
   ```
2. Check database integrity:
   ```bash
   sqlite3 /srv/mc-proxy/mc-proxy.db "PRAGMA integrity_check;"
   ```
3. If corrupted, restore from the most recent backup (see [Restore from Backup](#restore-from-backup)).
4. If no backups exist, delete the database and restart. The service will
   re-seed from the TOML configuration:
   ```bash
   rm /srv/mc-proxy/mc-proxy.db
   systemctl start mc-proxy
   ```
   Note: any routes or firewall rules added at runtime via gRPC will be lost.

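SQLite reports a healthy database as a single row containing `ok`; anything else is a list of problems. A small wrapper makes that check scriptable. This is a sketch, and `integrity_ok` is a hypothetical helper name.

```bash
# integrity_ok: succeed only if the text on stdin is exactly the single line
# "ok", which is what SQLite prints for PRAGMA integrity_check on a healthy
# database. Any other output (error rows, or nothing at all) counts as failure.
integrity_ok() {
    [ "$(cat)" = "ok" ]
}
```

Usage: `sqlite3 /srv/mc-proxy/mc-proxy.db "PRAGMA integrity_check;" | integrity_ok || echo "database needs restore"`. The same check can be run against a backup file before copying it into place.
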
### GeoIP Database Stale or Missing

1. Download a fresh copy of GeoLite2-Country.mmdb from MaxMind.
2. Place it at the configured path:
   ```bash
   cp GeoLite2-Country.mmdb /srv/mc-proxy/GeoLite2-Country.mmdb
   chown mc-proxy:mc-proxy /srv/mc-proxy/GeoLite2-Country.mmdb
   ```
3. Reload without restart:
   ```bash
   systemctl kill -s HUP mc-proxy
   ```

### Certificate Expiry (gRPC Admin API)

The gRPC admin API uses TLS certificates from `/srv/mc-proxy/certs/`.
Certificates are loaded at startup; replacing them requires a restart.

1. Replace the certificates:
   ```bash
   cp new-cert.pem /srv/mc-proxy/certs/cert.pem
   cp new-key.pem /srv/mc-proxy/certs/key.pem
   chown mc-proxy:mc-proxy /srv/mc-proxy/certs/*.pem
   chmod 0600 /srv/mc-proxy/certs/key.pem
   ```
2. Restart:
   ```bash
   systemctl restart mc-proxy
   ```

Note: certificate expiry does not affect the proxy listeners, because they do
not terminate TLS.

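Expiry is easier to handle before it becomes an incident. The following sketch wraps `openssl x509 -checkend` to flag a certificate that will expire soon; `cert_expires_within` is a hypothetical helper, and the 30-day default is an arbitrary choice rather than a project policy.

```bash
# cert_expires_within: succeed if the PEM certificate in $1 expires within
# $2 days (default 30). Returns 2 if the file cannot be read.
cert_expires_within() {
    [ -r "$1" ] || return 2
    # `openssl x509 -checkend N` exits 0 when the certificate is still valid
    # N seconds from now, so a non-zero exit means it expires inside the window.
    # A file that is not a valid certificate is also reported as expiring.
    ! openssl x509 -checkend "$(( ${2:-30} * 86400 ))" -noout -in "$1" >/dev/null 2>&1
}
```

Usage: `cert_expires_within /srv/mc-proxy/certs/cert.pem 14 && echo "renew the admin API certificate"`.
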
### Backend Unreachable

If a backend service is down, connections to routes pointing at that backend
will fail at the dial phase and the client receives a TCP RST. mc-proxy logs
the dial failure at `warn` level.

1. Check logs for dial errors:
   ```bash
   journalctl -u mc-proxy -n 100 --no-pager | grep "dial"
   ```
2. Verify the backend is running:
   ```bash
   ss -tlnp | grep ':<backend-port>'
   ```
3. This is not an mc-proxy issue; fix the backend service.

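When the backend runs on another host, `ss` on the proxy machine cannot see its listener, so a direct connection attempt is more telling. This sketch uses bash's `/dev/tcp` pseudo-device and coreutils `timeout`; `backend_reachable` is a hypothetical helper, and the 3-second limit is an arbitrary choice. It checks TCP reachability only, not the TLS handshake.

```bash
# backend_reachable: succeed if a TCP connection to host $1, port $2 can be
# opened within 3 seconds. Requires bash (for /dev/tcp) and `timeout`.
backend_reachable() {
    # The inner bash receives host and port as $1 and $2; "_" fills $0.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

Usage: `backend_reachable 127.0.0.1 38443 || echo "backend is not accepting connections"`, run from the proxy host so the result reflects what mc-proxy sees when it dials.
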
## Escalation

If the runbook does not resolve the issue:

1. Collect logs: `journalctl -u mc-proxy --since "1 hour ago" > /tmp/mc-proxy-logs.txt`
2. Collect status: `mc-proxy status -c /srv/mc-proxy/mc-proxy.toml > /tmp/mc-proxy-status.txt`
3. Collect database state: `mc-proxy snapshot -c /srv/mc-proxy/mc-proxy.toml -o /tmp/mc-proxy-escalation.db`
4. Escalate with the collected artifacts.