mc-proxy/ARCHITECTURE.md
Kyle Isom b25e1b0e79 Add per-IP rate limiting and Unix socket support for gRPC admin API
Rate limiting: per-source-IP connection rate limiter in the firewall layer
with configurable limit and sliding window. Blocklisted IPs are rejected
before rate limit evaluation to avoid wasting quota. Unix socket: the gRPC
admin API can now listen on a Unix domain socket (no TLS required), secured
by file permissions (0600), as a simpler alternative for local-only access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 14:37:21 -07:00


# ARCHITECTURE.md
mc-proxy is a Layer 4 TLS proxy and router for Metacircular Dynamics
services. It inspects the SNI field of incoming TLS ClientHello messages to
determine the target backend, then proxies raw TCP between the client and
the appropriate container. A global firewall evaluates every connection
before routing.
## Table of Contents
1. [System Overview](#system-overview)
2. [Connection Lifecycle](#connection-lifecycle)
3. [Firewall](#firewall)
4. [Routing](#routing)
5. [gRPC Admin API](#grpc-admin-api)
6. [Configuration](#configuration)
7. [Storage](#storage)
8. [Deployment](#deployment)
9. [Security Model](#security-model)
10. [Future Work](#future-work)
---
## System Overview
```
                     ┌─────────────────────────────────────┐
                     │              mc-proxy               │
 Clients ──────┐     │                                     │
               │     │  ┌──────────┐   ┌───────┐   ┌─────┐ │     ┌────────────┐
               ├────▶│  │ Firewall │──▶│  SNI  │──▶│Route│─│────▶│ Backend A  │
               │     │  │ (global) │   │Extract│   │Table│ │     │   :8443    │
               ├────▶│  └──────────┘   └───────┘   └─────┘ │     ├────────────┤
               │     │        │ RST                   │    │     │ Backend B  │
 Clients ──────┘     │        ▼                       └────│────▶│   :9443    │
                     │    (blocked)                        │     └────────────┘
                     └─────────────────────────────────────┘

 Listener 1 (:443)  ─┐
 Listener 2 (:8443) ─┼─ Each listener runs the same pipeline
 Listener N (:9443) ─┘
```
Key properties:
- **Layer 4 only.** mc-proxy never terminates TLS. It reads just enough of
the ClientHello to extract the SNI hostname, then proxies the raw TCP
stream to the matched backend. The backend handles TLS termination.
- **TLS-only.** Non-TLS connections are not supported. If the first bytes of
a connection are not a TLS ClientHello, the connection is reset.
- **Multiple listeners.** A single mc-proxy instance binds to one or more
ports. Each listener runs the same firewall → SNI → route pipeline.
- **Global firewall.** Firewall rules apply to all listeners uniformly.
There are no per-route firewall rules.
- **No authentication.** mc-proxy is pre-auth infrastructure. It sits in
front of services that handle their own authentication via MCIAS.
---
## Connection Lifecycle
Every inbound connection follows this sequence:
```
1. ACCEPT          Listener accepts TCP connection.
2. FIREWALL        Check source IP against blocklists:
                   a. IP/CIDR block check.
                   b. GeoIP country block check.
                   If blocked → RST, done.
3. SNI EXTRACT     Read the TLS ClientHello (without consuming it).
                   Extract the SNI hostname.
                   If no valid ClientHello or no SNI → RST, done.
4. ROUTE LOOKUP    Match SNI hostname against the route table.
                   If no match → RST, done.
5. BACKEND DIAL    Open TCP connection to the matched backend address.
                   If dial fails → RST, done.
6. PROXY           Bidirectional byte copy: client ↔ backend.
                   The buffered ClientHello bytes are forwarded first,
                   then both directions copy concurrently.
7. CLOSE           Either side closes → half-close propagation → done.
```
### SNI Extraction
The proxy peeks at the initial bytes of the connection without consuming
them. It parses just enough of the TLS record layer and ClientHello to
extract the `server_name` extension. The full ClientHello (including the
SNI) is then forwarded to the backend so the backend's TLS handshake
proceeds normally.
If the ClientHello spans multiple TCP segments, the proxy buffers up to
16 KiB (the maximum TLS record size) before giving up.
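The peek-then-replay idea can be sketched in Go as follows. This is a minimal illustration, not mc-proxy's actual parser: instead of hand-parsing the record layer, it lets `crypto/tls` parse the ClientHello through a read-only connection wrapper and captures the bytes with `io.TeeReader` so they can be replayed to the backend. The names `peekSNI`, `readOnlyConn`, `maxHelloBytes`, and `sniOverPipe` are hypothetical, and the read deadline for slow clients is omitted.

```go
package main

import (
	"bytes"
	"crypto/tls"
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// maxHelloBytes caps how much is buffered while waiting for a ClientHello.
const maxHelloBytes = 16 << 10

// readOnlyConn lets crypto/tls read a ClientHello from r while guaranteeing
// that nothing is ever written back to the client.
type readOnlyConn struct{ r io.Reader }

func (c readOnlyConn) Read(p []byte) (int, error)       { return c.r.Read(p) }
func (c readOnlyConn) Write([]byte) (int, error)        { return 0, io.ErrClosedPipe }
func (c readOnlyConn) Close() error                     { return nil }
func (c readOnlyConn) LocalAddr() net.Addr              { return nil }
func (c readOnlyConn) RemoteAddr() net.Addr             { return nil }
func (c readOnlyConn) SetDeadline(time.Time) error      { return nil }
func (c readOnlyConn) SetReadDeadline(time.Time) error  { return nil }
func (c readOnlyConn) SetWriteDeadline(time.Time) error { return nil }

// peekSNI returns the SNI hostname plus a reader that replays the buffered
// ClientHello bytes before continuing with the rest of the stream.
func peekSNI(r io.Reader) (string, io.Reader, error) {
	var buf bytes.Buffer
	var sni string
	var seen bool
	tee := io.TeeReader(io.LimitReader(r, maxHelloBytes), &buf)
	err := tls.Server(readOnlyConn{tee}, &tls.Config{
		GetConfigForClient: func(h *tls.ClientHelloInfo) (*tls.Config, error) {
			sni, seen = h.ServerName, true
			return nil, nil
		},
	}).Handshake() // always fails after the hello: the conn is read-only
	if !seen {
		return "", nil, fmt.Errorf("no valid ClientHello: %w", err)
	}
	if sni == "" {
		return "", nil, errors.New("ClientHello carries no SNI")
	}
	return sni, io.MultiReader(&buf, r), nil
}

// sniOverPipe is a demo helper: it runs a real crypto/tls client over an
// in-memory pipe and returns the hostname that peekSNI observes.
func sniOverPipe(serverName string) (string, error) {
	clientSide, proxySide := net.Pipe()
	defer proxySide.Close()
	go func() {
		defer clientSide.Close()
		// The handshake cannot complete; only the ClientHello matters here.
		_ = tls.Client(clientSide, &tls.Config{ServerName: serverName}).Handshake()
	}()
	sni, _, err := peekSNI(proxySide)
	return sni, err
}

func main() {
	fmt.Println(sniOverPipe("metacrypt.metacircular.net"))
}
```

The reader returned by `peekSNI` is what a proxy would copy to the backend, so the backend sees the ClientHello byte for byte.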
### Bidirectional Copy
After the backend connection is established, the proxy runs two concurrent
copy loops (client→backend and backend→client). When either direction
encounters an EOF or error:
1. The proxy half-closes the write side of the destination connection (for example, when the client sends EOF, the write side toward the backend is closed).
2. The remaining direction drains to completion.
3. Both connections are closed.
An idle timeout (`idle_timeout`) applies to the copy phase so that idle
connections do not accumulate indefinitely (see [Configuration](#configuration)).
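The half-close sequence above might look like this in Go. It is a sketch under assumptions: the function names `copyHalf`, `proxy`, and `echoViaProxy` are hypothetical, both sides are plain `*net.TCPConn` values, and the idle timeout is left out for brevity. The demo helper sends a message through a one-shot proxy to a loopback echo backend to show that the reply still arrives after the client has half-closed.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// copyHalf copies src to dst until EOF or error, then half-closes dst's
// write side so the peer sees EOF while the other direction keeps draining.
func copyHalf(dst, src *net.TCPConn) int64 {
	n, _ := io.Copy(dst, src)
	_ = dst.CloseWrite()
	return n
}

// proxy runs both copy loops and closes both connections once they finish.
func proxy(client, backend *net.TCPConn) (toBackend, toClient int64) {
	defer client.Close()
	defer backend.Close()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); toBackend = copyHalf(backend, client) }()
	go func() { defer wg.Done(); toClient = copyHalf(client, backend) }()
	wg.Wait()
	return toBackend, toClient
}

// echoViaProxy is a demo helper: it puts a one-shot proxy in front of a
// loopback echo backend, sends msg, and returns whatever comes back.
func echoViaProxy(msg string) (string, error) {
	backendLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer backendLn.Close()
	proxyLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer proxyLn.Close()

	go func() { // echo backend: send back everything received, then close
		if c, err := backendLn.Accept(); err == nil {
			io.Copy(c, c)
			c.Close()
		}
	}()
	go func() { // proxy: accept one client and splice it to the backend
		c, err := proxyLn.Accept()
		if err != nil {
			return
		}
		b, err := net.Dial("tcp", backendLn.Addr().String())
		if err != nil {
			c.Close()
			return
		}
		proxy(c.(*net.TCPConn), b.(*net.TCPConn))
	}()

	c, err := net.Dial("tcp", proxyLn.Addr().String())
	if err != nil {
		return "", err
	}
	defer c.Close()
	if _, err := c.Write([]byte(msg)); err != nil {
		return "", err
	}
	c.(*net.TCPConn).CloseWrite() // client is done sending; reply still flows
	reply, err := io.ReadAll(c)
	return string(reply), err
}

func main() {
	fmt.Println(echoViaProxy("hello"))
}
```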

---
## Firewall
The firewall is a global, ordered rule set evaluated on every new
connection before SNI extraction. Rules are evaluated in definition order;
the first matching rule determines the outcome. If no rule matches, the
connection is **allowed** (default allow — the firewall is an explicit
blocklist, not an allowlist).
### Rule Types
| Config Field | Match Field | Example |
|--------------|-------------|---------|
| `blocked_ips` | Source IP address | `"192.0.2.1"` |
| `blocked_cidrs` | Source IP prefix | `"198.51.100.0/24"` |
| `blocked_countries` | Source country (ISO 3166-1 alpha-2) | `"KP"`, `"CN"`, `"IN"`, `"IL"` |
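A default-allow blocklist over these three rule types could be sketched with `net/netip` as shown below. The `Firewall` type, its constructor, and the `countryOf` callback are hypothetical stand-ins (the callback replaces a real GeoIP lookup), and blocking unparseable source addresses is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Firewall is a default-allow blocklist covering the three rule types.
type Firewall struct {
	ips       map[netip.Addr]bool
	prefixes  []netip.Prefix
	countries map[string]bool
	// countryOf is a hypothetical stand-in for the GeoIP lookup: it maps an
	// IP string to an ISO 3166-1 alpha-2 code ("" when unknown).
	countryOf func(ip string) string
}

func NewFirewall(ips, cidrs, countries []string, countryOf func(string) string) (*Firewall, error) {
	f := &Firewall{ips: map[netip.Addr]bool{}, countries: map[string]bool{}, countryOf: countryOf}
	for _, s := range ips {
		a, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("blocked_ips: %w", err)
		}
		f.ips[a.Unmap()] = true
	}
	for _, s := range cidrs {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			return nil, fmt.Errorf("blocked_cidrs: %w", err)
		}
		f.prefixes = append(f.prefixes, p)
	}
	for _, c := range countries {
		f.countries[c] = true
	}
	return f, nil
}

// Blocked reports whether a connection from remoteAddr ("ip:port") should be
// reset. Unparseable addresses are blocked (fail closed, an assumption).
func (f *Firewall) Blocked(remoteAddr string) bool {
	ap, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return true
	}
	ip := ap.Addr().Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	if f.ips[ip] {
		return true
	}
	for _, p := range f.prefixes {
		if p.Contains(ip) {
			return true
		}
	}
	if f.countryOf != nil && len(f.countries) > 0 {
		return f.countries[f.countryOf(ip.String())]
	}
	return false // no rule matched: default allow
}

func main() {
	fw, err := NewFirewall([]string{"192.0.2.1"}, []string{"198.51.100.0/24"}, nil, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(fw.Blocked("198.51.100.77:51234"), fw.Blocked("203.0.113.10:51234"))
}
```

Because every rule is a block rule, the order in which the three checks run does not change the outcome in this sketch; it only affects how cheaply a blocked source is rejected.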
### Blocked Connection Handling
Blocked connections receive a TCP RST. No TLS alert, no HTTP error page, no
indication of why the connection was refused. This is intentional — blocked
sources should receive minimal information.
### GeoIP Database
mc-proxy uses the [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
free database for country-level IP geolocation. The database file is
distributed separately from the binary and referenced by path in the
configuration.
The GeoIP database is loaded into memory at startup and can be reloaded
via `SIGHUP` without restarting the proxy. If the database file is missing
or unreadable at startup and GeoIP rules are configured, the proxy refuses
to start.

---
## Routing
Each listener has its own route table mapping SNI hostnames to backend
addresses. A route entry consists of:
| Field | Type | Description |
|-------|------|-------------|
| `hostname` | string | Exact SNI hostname to match (e.g. `metacrypt.metacircular.net`) |
| `backend` | string | Backend address as `host:port` (e.g. `127.0.0.1:8443`) |
Routes are scoped to the listener that accepted the connection. The same
hostname can appear on different listeners with different backends, allowing
the proxy to route the same service name to different backends depending
on which port the client connected to.
### Match Semantics
- Hostname matching is **exact** and **case-insensitive** (per RFC 6066,
SNI hostnames are DNS names and compared case-insensitively).
- Wildcard matching is not supported in the initial implementation.
- If duplicate hostnames appear within the same listener, the proxy refuses
to start.
- If no route matches an incoming SNI hostname, the connection is reset.
### Route Table Source
Route tables are persisted in the SQLite database. On first run, they are
seeded from the TOML configuration. On subsequent runs, the database is
the source of truth. Routes can be added or removed at runtime via the
gRPC admin API.

---
## gRPC Admin API
The admin API is optional (disabled if `[grpc]` is omitted from the config).
When enabled, it requires TLS and supports optional mTLS for client
authentication. TLS 1.3 is enforced. The API provides runtime management
of routes and firewall rules without restarting the proxy.
### RPCs
| RPC | Description |
|-----|-------------|
| `ListRoutes` | List all routes for a given listener |
| `AddRoute` | Add a route to a listener (write-through to DB) |
| `RemoveRoute` | Remove a route from a listener (write-through to DB) |
| `GetFirewallRules` | List all firewall rules |
| `AddFirewallRule` | Add a firewall rule (write-through to DB) |
| `RemoveFirewallRule` | Remove a firewall rule (write-through to DB) |
| `GetStatus` | Return version, uptime, listener status, connection counts |
### Input Validation
The admin API validates all inputs before persisting:
- **Route backends** must be valid `host:port` tuples.
- **IP firewall rules** must be valid IP addresses (`netip.ParseAddr`).
- **CIDR firewall rules** must be valid prefixes in canonical form.
- **Country firewall rules** must be exactly 2 uppercase letters (ISO 3166-1 alpha-2).
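The four checks above can be sketched with the Go standard library. The function names are hypothetical. Two details are assumptions of this sketch: "canonical form" is read as "no host bits set" (so `198.51.100.7/24` is rejected), and a backend must have a non-empty host and a port between 1 and 65535.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"strconv"
)

// validateBackend accepts "host:port" with a non-empty host and a port in
// the range 1-65535 (both constraints are assumptions of this sketch).
func validateBackend(s string) error {
	host, port, err := net.SplitHostPort(s)
	if err != nil {
		return fmt.Errorf("backend %q: %w", s, err)
	}
	if host == "" {
		return fmt.Errorf("backend %q: empty host", s)
	}
	if n, err := strconv.ParseUint(port, 10, 16); err != nil || n == 0 {
		return fmt.Errorf("backend %q: port must be 1-65535", s)
	}
	return nil
}

// validateIP accepts anything netip.ParseAddr accepts.
func validateIP(s string) error {
	_, err := netip.ParseAddr(s)
	return err
}

// validateCIDR accepts prefixes whose host bits are all zero.
func validateCIDR(s string) error {
	p, err := netip.ParsePrefix(s)
	if err != nil {
		return err
	}
	if p != p.Masked() {
		return fmt.Errorf("cidr %q: not canonical, want %s", s, p.Masked())
	}
	return nil
}

// validateCountry accepts exactly two uppercase ASCII letters.
func validateCountry(s string) error {
	if len(s) != 2 || s[0] < 'A' || s[0] > 'Z' || s[1] < 'A' || s[1] > 'Z' {
		return fmt.Errorf("country %q: want 2 uppercase letters", s)
	}
	return nil
}

func main() {
	fmt.Println(validateBackend("127.0.0.1:8443"))
	fmt.Println(validateCIDR("198.51.100.7/24"))
	fmt.Println(validateCountry("kp"))
}
```

The country check only validates the shape of the code; it does not verify that the code is actually assigned in ISO 3166-1.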
### Security
The gRPC admin API has no MCIAS integration — mc-proxy is pre-auth
infrastructure. Access control relies on:
1. **Network binding**: bind to `127.0.0.1` (default) to restrict to local access.
2. **mTLS**: configure `client_ca` to require client certificates.
If the admin API is exposed on a non-loopback interface without mTLS,
any network client can modify routing and firewall rules.

---
## Configuration
mc-proxy reads a TOML configuration file at startup and refuses to start if
required fields are missing or invalid.
```toml
# Database. Required.
[database]
path = "/srv/mc-proxy/mc-proxy.db"
# Listeners. Each has its own route table (seeds DB on first run).
[[listeners]]
addr = ":443"
[[listeners.routes]]
hostname = "metacrypt.metacircular.net"
backend = "127.0.0.1:18443"
[[listeners.routes]]
hostname = "mcias.metacircular.net"
backend = "127.0.0.1:28443"
[[listeners]]
addr = ":8443"
[[listeners.routes]]
hostname = "metacrypt.metacircular.net"
backend = "127.0.0.1:18443"
[[listeners]]
addr = ":9443"
[[listeners.routes]]
hostname = "mcias.metacircular.net"
backend = "127.0.0.1:28443"
# gRPC admin API. Optional — omit or leave addr empty to disable.
# If enabled, tls_cert and tls_key are required (TLS 1.3 only).
# client_ca enables mTLS and is strongly recommended for non-loopback addresses.
# ca_cert is used by the `status` CLI command to verify the server certificate.
[grpc]
addr = "127.0.0.1:9090"
tls_cert = "/srv/mc-proxy/certs/cert.pem"
tls_key = "/srv/mc-proxy/certs/key.pem"
client_ca = "/srv/mc-proxy/certs/ca.pem"
ca_cert = "/srv/mc-proxy/certs/ca.pem"
# Firewall. Global blocklist, evaluated before routing. Default allow.
[firewall]
geoip_db = "/srv/mc-proxy/GeoLite2-Country.mmdb"
blocked_ips = ["192.0.2.1"]
blocked_cidrs = ["198.51.100.0/24"]
blocked_countries = ["KP", "CN", "IN", "IL"]
# Proxy behavior.
[proxy]
connect_timeout = "5s" # Timeout for dialing backend
idle_timeout = "300s" # Close connections idle longer than this
shutdown_timeout = "30s" # Graceful shutdown drain period
# Logging.
[log]
level = "info" # debug, info, warn, error
```
### Environment Variable Overrides
Configuration values can be overridden via environment variables using the
prefix `MCPROXY_` with underscore-separated paths:
```
MCPROXY_LOG_LEVEL=debug
MCPROXY_PROXY_IDLE_TIMEOUT=600s
```
Environment variables cannot define listeners, routes, or firewall rules —
these are structural and must be in the TOML file.
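One way to apply such overrides is an explicit table of overridable scalar settings, which also makes it obvious that listeners, routes, and firewall rules are out of reach. The sketch below is hypothetical: the `Config` struct and `applyEnv` are illustrative, and the variable names for `connect_timeout` and `shutdown_timeout` are derived from the stated naming rule rather than documented examples.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// Config holds only the scalar settings that can be overridden (a
// hypothetical subset of the real configuration).
type Config struct {
	Log   struct{ Level string }
	Proxy struct{ ConnectTimeout, IdleTimeout, ShutdownTimeout time.Duration }
}

// applyEnv overlays MCPROXY_* variables onto cfg. lookup has the same
// signature as os.LookupEnv so it can be replaced in tests.
func applyEnv(cfg *Config, lookup func(string) (string, bool)) error {
	if v, ok := lookup("MCPROXY_LOG_LEVEL"); ok {
		cfg.Log.Level = v
	}
	durations := map[string]*time.Duration{
		"MCPROXY_PROXY_CONNECT_TIMEOUT":  &cfg.Proxy.ConnectTimeout,
		"MCPROXY_PROXY_IDLE_TIMEOUT":     &cfg.Proxy.IdleTimeout,
		"MCPROXY_PROXY_SHUTDOWN_TIMEOUT": &cfg.Proxy.ShutdownTimeout,
	}
	for name, dst := range durations {
		v, ok := lookup(name)
		if !ok {
			continue
		}
		d, err := time.ParseDuration(v)
		if err != nil {
			return fmt.Errorf("%s: %w", name, err)
		}
		*dst = d
	}
	return nil
}

func main() {
	var cfg Config
	cfg.Log.Level = "info"
	cfg.Proxy.IdleTimeout = 300 * time.Second
	if err := applyEnv(&cfg, os.LookupEnv); err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```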

---
## Storage
### SQLite Database
Listeners, routes, and firewall rules are persisted in a SQLite database
(WAL mode, foreign keys enabled, busy timeout 5000 ms). The database is
accessed through the pure-Go driver `modernc.org/sqlite`, so cgo is not required.
**Startup behavior:**
1. Open the database at the configured path. Run migrations.
2. If the database is empty (first run): seed from the TOML config.
3. If the database has data: load from it. TOML listener/route/firewall
fields are ignored.
The TOML config continues to own operational settings: proxy timeouts,
log level, gRPC config, GeoIP database path.
**Write-through pattern:** The gRPC admin API writes to the database first,
then updates in-memory state. If the database write fails, the in-memory
state is not modified.
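The write-through ordering can be illustrated with a small Go sketch. Everything here is hypothetical: `routeStore` stands in for the SQLite layer, `fakeStore` simulates it (including failures), and `Admin` represents the in-memory state that the admin API mutates.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// routeStore is a hypothetical persistence interface standing in for SQLite.
type routeStore interface {
	InsertRoute(listener, hostname, backend string) error
}

// fakeStore simulates the database; set fail to make writes error out.
type fakeStore struct {
	fail bool
	rows int
}

func (s *fakeStore) InsertRoute(listener, hostname, backend string) error {
	if s.fail {
		return errors.New("database unavailable")
	}
	s.rows++
	return nil
}

// Admin owns the in-memory route tables used by the data path.
type Admin struct {
	mu     sync.RWMutex
	db     routeStore
	routes map[string]map[string]string // listener addr -> hostname -> backend
}

func NewAdmin(db routeStore) *Admin {
	return &Admin{db: db, routes: map[string]map[string]string{}}
}

// AddRoute persists first and touches memory only after the write succeeded.
func (a *Admin) AddRoute(listener, hostname, backend string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if err := a.db.InsertRoute(listener, hostname, backend); err != nil {
		return fmt.Errorf("persist route: %w", err) // memory left untouched
	}
	if a.routes[listener] == nil {
		a.routes[listener] = map[string]string{}
	}
	a.routes[listener][hostname] = backend
	return nil
}

// Lookup reads the in-memory state only.
func (a *Admin) Lookup(listener, hostname string) (string, bool) {
	a.mu.RLock()
	defer a.mu.RUnlock()
	b, ok := a.routes[listener][hostname]
	return b, ok
}

func main() {
	db := &fakeStore{}
	admin := NewAdmin(db)
	fmt.Println(admin.AddRoute(":443", "mcias.metacircular.net", "127.0.0.1:28443"))
	db.fail = true
	fmt.Println(admin.AddRoute(":443", "metacrypt.metacircular.net", "127.0.0.1:18443"))
	fmt.Println(admin.Lookup(":443", "metacrypt.metacircular.net"))
}
```

Holding the lock across the database write keeps the two states from diverging under concurrent admin calls, at the cost of serializing writers.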
### Schema
```sql
CREATE TABLE listeners (
    id   INTEGER PRIMARY KEY,
    addr TEXT NOT NULL UNIQUE
);

CREATE TABLE routes (
    id          INTEGER PRIMARY KEY,
    listener_id INTEGER NOT NULL REFERENCES listeners(id) ON DELETE CASCADE,
    hostname    TEXT NOT NULL,
    backend     TEXT NOT NULL,
    UNIQUE(listener_id, hostname)
);

CREATE TABLE firewall_rules (
    id    INTEGER PRIMARY KEY,
    type  TEXT NOT NULL CHECK(type IN ('ip', 'cidr', 'country')),
    value TEXT NOT NULL,
    UNIQUE(type, value)
);
```
### Data Directory
```
/srv/mc-proxy/
├── mc-proxy.toml            Configuration
├── mc-proxy.db              SQLite database
├── certs/                   TLS certificates (for gRPC admin API)
├── GeoLite2-Country.mmdb    GeoIP database (if using country blocks)
└── backups/                 Database snapshots
```
mc-proxy does not terminate TLS on the proxy listeners, so no proxy
certificates are needed. The `certs/` directory is for the gRPC admin
API's TLS and optional mTLS keypair.

---
## Deployment
### Binary
Single static binary, built with `CGO_ENABLED=0`. No runtime dependencies
beyond the configuration file and optional GeoIP database.
### Container
Multi-stage Docker build:
1. **Builder**: `golang:<version>-alpine`, static compilation.
2. **Runtime**: `alpine`, non-root user (`mc-proxy`), port exposure
determined by configuration.
### systemd
| File | Purpose |
|------|---------|
| `mc-proxy.service` | Main proxy service |
| `mc-proxy-backup.service` | Oneshot database backup (VACUUM INTO) |
| `mc-proxy-backup.timer` | Daily backup timer (02:00 UTC, 5-minute jitter) |
The proxy binds to privileged ports (443) and should use `AmbientCapabilities=CAP_NET_BIND_SERVICE`
in the systemd unit rather than running as root.
Standard security hardening directives apply per engineering standards
(`NoNewPrivileges=true`, `ProtectSystem=strict`, etc.).
### Graceful Shutdown
On `SIGINT` or `SIGTERM`:
1. Stop accepting new connections on all listeners.
2. Wait for in-flight connections to complete (up to `shutdown_timeout`).
3. Force-close remaining connections.
4. Exit.
On `SIGHUP`:
1. Reload the GeoIP database from disk.
2. Continue serving with the updated database.
Routes and firewall rules can be modified at runtime via the gRPC admin API
(write-through to SQLite). Listener changes (adding/removing ports) require
a full restart. TOML configuration changes (timeouts, log level, GeoIP path)
also require a restart.
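The `SIGINT`/`SIGTERM` sequence above (stop accepting, drain up to `shutdown_timeout`, force-close the rest) can be sketched as follows. The `tracker` type and the `demoShutdown` helper are hypothetical; signal handling itself is omitted, and a real server would call `shutdown` from its signal handler and have each connection handler call `release` when it returns.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
	"time"
)

// tracker records in-flight connections so shutdown can drain or close them.
type tracker struct {
	mu    sync.Mutex
	conns map[io.Closer]struct{}
	wg    sync.WaitGroup
}

func newTracker() *tracker { return &tracker{conns: map[io.Closer]struct{}{}} }

// track registers a connection when its handler starts.
func (t *tracker) track(c io.Closer) {
	t.mu.Lock()
	t.conns[c] = struct{}{}
	t.mu.Unlock()
	t.wg.Add(1)
}

// release unregisters a connection when its handler returns.
func (t *tracker) release(c io.Closer) {
	t.mu.Lock()
	delete(t.conns, c)
	t.mu.Unlock()
	t.wg.Done()
}

// shutdown closes the listeners, waits up to timeout for in-flight
// connections, then force-closes the rest. It returns how many were forced.
func (t *tracker) shutdown(listeners []io.Closer, timeout time.Duration) int {
	for _, ln := range listeners {
		ln.Close() // step 1: stop accepting new connections
	}
	drained := make(chan struct{})
	go func() { t.wg.Wait(); close(drained) }()
	select {
	case <-drained: // step 2: everything finished within the timeout
		return 0
	case <-time.After(timeout):
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	for c := range t.conns {
		c.Close() // step 3: force-close whatever is still open
	}
	return len(t.conns)
}

// demoShutdown is a demo helper: one connection finishes before shutdown,
// the other never does and has to be force-closed.
func demoShutdown(timeout time.Duration) int {
	t := newTracker()
	fast, fastPeer := net.Pipe()
	slow, slowPeer := net.Pipe()
	defer fastPeer.Close()
	defer slowPeer.Close()
	t.track(fast)
	t.track(slow)
	fast.Close()
	t.release(fast)
	return t.shutdown(nil, timeout)
}

func main() {
	fmt.Println("force-closed:", demoShutdown(50*time.Millisecond))
}
```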

---
## Security Model
mc-proxy is infrastructure that sits in front of authenticated services.
It has no authentication or authorization of its own.
### Threat Mitigations
| Threat | Mitigation |
|--------|------------|
| SNI spoofing | Backend performs its own TLS handshake — a spoofed SNI will fail certificate validation at the backend. mc-proxy does not trust SNI for security decisions beyond routing. |
| Resource exhaustion (connection flood) | Idle timeout closes stale connections. Per-listener connection limits (future). Rate limiting (future). |
| GeoIP evasion via IPv6 | GeoLite2 database includes IPv6 mappings. Both IPv4 and IPv6 source addresses are checked. |
| GeoIP evasion via VPN/proxy | Accepted risk. GeoIP blocking is a compliance measure, not a security boundary. Determined adversaries will bypass it. |
| Slowloris / slow ClientHello | Hardcoded 10-second timeout on the SNI extraction phase. If a complete ClientHello is not received within this window, the connection is reset. |
| Backend unavailability | Connect timeout prevents indefinite hangs. Connection is reset if the backend is unreachable. |
| Information leakage | Blocked connections receive only a TCP RST. No version strings, no error messages, no TLS alerts. |
### Security Invariants
1. mc-proxy never terminates TLS. It cannot read application-layer traffic.
2. mc-proxy never modifies the byte stream between client and backend.
3. Firewall rules are always evaluated before any routing decision.
4. The proxy never logs connection content — only metadata (source IP,
SNI hostname, backend, timestamps, bytes transferred).
---
## Future Work
Items are listed roughly in priority order:
| Item | Description |
|------|-------------|
| **MCP integration** | Wire the gRPC admin API into the Metacircular Control Plane for centralized management. |
| **L7 HTTPS support** | TLS-terminating mode for selected routes, enabling HTTP-level features (user-agent blocking, header inspection, request routing). |
| **ACME integration** | Automatic certificate provisioning via Let's Encrypt for L7 routes. |
| **User-agent blocking** | Block connections based on user-agent string (requires L7 mode). |
| **Connection rate limiting** | Per-source-IP rate limits to mitigate connection floods. |
| **Per-listener connection limits** | Cap maximum concurrent connections per listener. |
| **Health check endpoint** | Lightweight TCP or HTTP health check for load balancers and monitoring. |
| **Metrics** | Prometheus-compatible metrics: connections per listener, firewall blocks by rule, backend dial latency, active connections. |