# ARCHITECTURE.md

mc-proxy is a Layer 4 TLS proxy and router for Metacircular Dynamics services. It inspects the SNI field of incoming TLS ClientHello messages to determine the target backend, then proxies raw TCP between the client and the appropriate container. A global firewall evaluates every connection before routing.

## Table of Contents

1. [System Overview](#system-overview)
2. [Connection Lifecycle](#connection-lifecycle)
3. [Firewall](#firewall)
4. [Routing](#routing)
5. [gRPC Admin API](#grpc-admin-api)
6. [Configuration](#configuration)
7. [Storage](#storage)
8. [Deployment](#deployment)
9. [Security Model](#security-model)
10. [Future Work](#future-work)

---

## System Overview

```
                    ┌─────────────────────────────────────┐
                    │              mc-proxy               │
Clients ──────┐     │                                     │
              │     │  ┌──────────┐   ┌───────┐   ┌─────┐ │     ┌────────────┐
              ├────▶│  │ Firewall │──▶│  SNI  │──▶│Route│─│────▶│ Backend A  │
              │     │  │ (global) │   │Extract│   │Table│ │     │ :8443      │
              ├────▶│  └──────────┘   └───────┘   └─────┘ │     ├────────────┤
              │     │       │ RST                    │    │     │ Backend B  │
Clients ──────┘     │       ▼                        └────│────▶│ :9443      │
                    │   (blocked)                         │     └────────────┘
                    └─────────────────────────────────────┘

Listener 1 (:443)  ─┐
Listener 2 (:8443) ─┼─ Each listener runs the same pipeline
Listener N (:9443) ─┘
```

Key properties:

- **Layer 4 only.** mc-proxy never terminates TLS. It reads just enough of the ClientHello to extract the SNI hostname, then proxies the raw TCP stream to the matched backend. The backend handles TLS termination.
- **TLS-only.** Non-TLS connections are not supported. If the first bytes of a connection are not a TLS ClientHello, the connection is reset.
- **Multiple listeners.** A single mc-proxy instance binds to one or more ports. Each listener runs the same firewall → SNI → route pipeline.
- **Global firewall.** Firewall rules apply to all listeners uniformly. There are no per-route firewall rules.
- **No authentication.** mc-proxy is pre-auth infrastructure. It sits in front of services that handle their own authentication via MCIAS.

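
To make the "Layer 4 only" property concrete, here is a minimal sketch of reading the SNI hostname out of already-buffered ClientHello bytes without terminating TLS. It reuses the standard library's `crypto/tls` parser and aborts the handshake from `GetConfigForClient` before any reply is written. The names `stubConn`, `extractSNI`, `recorder`, and `sampleClientHello` are hypothetical and exist only for this illustration; this is one possible approach under those assumptions, not necessarily how mc-proxy implements extraction.

```go
package main

import (
	"bytes"
	"crypto/tls"
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// stubConn is a net.Conn whose reads come from r and whose writes always fail,
// so crypto/tls can parse a ClientHello without ever replying to the client.
type stubConn struct{ r io.Reader }

func (c stubConn) Read(p []byte) (int, error)       { return c.r.Read(p) }
func (c stubConn) Write(p []byte) (int, error)      { return 0, io.ErrClosedPipe }
func (c stubConn) Close() error                     { return nil }
func (c stubConn) LocalAddr() net.Addr              { return nil }
func (c stubConn) RemoteAddr() net.Addr             { return nil }
func (c stubConn) SetDeadline(time.Time) error      { return nil }
func (c stubConn) SetReadDeadline(time.Time) error  { return nil }
func (c stubConn) SetWriteDeadline(time.Time) error { return nil }

// extractSNI parses buffered ClientHello bytes and returns the SNI hostname.
// The handshake is aborted as soon as the ClientHello has been parsed.
func extractSNI(hello []byte) (string, error) {
	var sni string
	var parsed bool
	cfg := &tls.Config{
		GetConfigForClient: func(info *tls.ClientHelloInfo) (*tls.Config, error) {
			sni, parsed = info.ServerName, true
			return nil, errors.New("stop after ClientHello")
		},
	}
	err := tls.Server(stubConn{r: bytes.NewReader(hello)}, cfg).Handshake()
	if !parsed {
		return "", fmt.Errorf("no valid ClientHello: %w", err)
	}
	if sni == "" {
		return "", errors.New("ClientHello carries no SNI")
	}
	return sni, nil
}

// recorder captures what a TLS client writes; its reads hit EOF immediately.
type recorder struct {
	stubConn
	out bytes.Buffer
}

func (c *recorder) Write(p []byte) (int, error) { return c.out.Write(p) }

// sampleClientHello produces real ClientHello bytes for the demo only.
// It is a test helper, not part of any proxy logic.
func sampleClientHello(serverName string) []byte {
	rec := &recorder{stubConn: stubConn{r: bytes.NewReader(nil)}}
	cfg := &tls.Config{ServerName: serverName, InsecureSkipVerify: true}
	_ = tls.Client(rec, cfg).Handshake() // fails at EOF after the ClientHello is written
	return rec.out.Bytes()
}

func main() {
	sni, err := extractSNI(sampleClientHello("metacrypt.metacircular.net"))
	fmt.Println(sni, err)
}
```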

---

## Connection Lifecycle

Every inbound connection follows this sequence:

```
1. ACCEPT          Listener accepts TCP connection.
2. FIREWALL        Check source IP against blocklists:
                     a. IP/CIDR block check.
                     b. GeoIP country block check.
                   If blocked → RST, done.
3. SNI EXTRACT     Read the TLS ClientHello (without consuming it).
                   Extract the SNI hostname.
                   If no valid ClientHello or no SNI → RST, done.
4. ROUTE LOOKUP    Match SNI hostname against the route table.
                   If no match → RST, done.
5. BACKEND DIAL    Open TCP connection to the matched backend address.
                   If dial fails → RST, done.
6. PROXY           Bidirectional byte copy: client ↔ backend.
                   The buffered ClientHello bytes are forwarded first,
                   then both directions copy concurrently.
7. CLOSE           Either side closes → half-close propagation → done.
```

### SNI Extraction

The proxy peeks at the initial bytes of the connection without consuming them. It parses just enough of the TLS record layer and ClientHello to extract the `server_name` extension. The full ClientHello (including the SNI) is then forwarded to the backend so the backend's TLS handshake proceeds normally.

If the ClientHello spans multiple TCP segments, the proxy buffers up to 16 KiB (the maximum TLS record size) before giving up.

### Bidirectional Copy

After the backend connection is established, the proxy runs two concurrent copy loops (client→backend and backend→client). When either direction encounters an EOF or error:

1. The write side of the opposite direction is half-closed.
2. The remaining direction drains to completion.
3. Both connections are closed.

Timeouts apply to the copy phase to prevent idle connections from accumulating indefinitely (see [Configuration](#configuration)).

---

## Firewall

The firewall is a global, ordered rule set evaluated on every new connection before SNI extraction. Rules are evaluated in definition order; the first matching rule determines the outcome.
If no rule matches, the connection is **allowed** (default allow: the firewall is an explicit blocklist, not an allowlist).

### Rule Types

| Config Field | Match Field | Example |
|--------------|-------------|---------|
| `blocked_ips` | Source IP address | `"192.0.2.1"` |
| `blocked_cidrs` | Source IP prefix | `"198.51.100.0/24"` |
| `blocked_countries` | Source country (ISO 3166-1 alpha-2) | `"KP"`, `"CN"`, `"IN"`, `"IL"` |

### Blocked Connection Handling

Blocked connections receive a TCP RST. No TLS alert, no HTTP error page, no indication of why the connection was refused. This is intentional: blocked sources should receive minimal information.

### GeoIP Database

mc-proxy uses the [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data) free database for country-level IP geolocation. The database file is distributed separately from the binary and referenced by path in the configuration.

The GeoIP database is loaded into memory at startup and can be reloaded via `SIGHUP` without restarting the proxy. If the database file is missing or unreadable at startup and GeoIP rules are configured, the proxy refuses to start.

---

## Routing

Each listener has its own route table mapping SNI hostnames to backend addresses. A route entry consists of:

| Field | Type | Description |
|-------|------|-------------|
| `hostname` | string | Exact SNI hostname to match (e.g. `metacrypt.metacircular.net`) |
| `backend` | string | Backend address as `host:port` (e.g. `127.0.0.1:8443`) |

Routes are scoped to the listener that accepted the connection. The same hostname can appear on different listeners with different backends, allowing the proxy to route the same service name to different backends depending on which port the client connected to.

### Match Semantics

- Hostname matching is **exact** and **case-insensitive** (per RFC 6066, SNI hostnames are DNS names and compared case-insensitively).
- Wildcard matching is not supported in the initial implementation.
- If duplicate hostnames appear within the same listener, the proxy refuses to start.
- If no route matches an incoming SNI hostname, the connection is reset.

### Route Table Source

Route tables are persisted in the SQLite database. On first run, they are seeded from the TOML configuration. On subsequent runs, the database is the source of truth. Routes can be added or removed at runtime via the gRPC admin API.

---

## gRPC Admin API

The admin API is optional (disabled if `[grpc]` is omitted from the config). It listens on a Unix domain socket for security; access is controlled via filesystem permissions (0600, owner-only).

The API provides runtime management of routes and firewall rules without restarting the proxy.

### RPCs

| RPC | Description |
|-----|-------------|
| `ListRoutes` | List all routes for a given listener |
| `AddRoute` | Add a route to a listener (write-through to DB) |
| `RemoveRoute` | Remove a route from a listener (write-through to DB) |
| `GetFirewallRules` | List all firewall rules |
| `AddFirewallRule` | Add a firewall rule (write-through to DB) |
| `RemoveFirewallRule` | Remove a firewall rule (write-through to DB) |
| `GetStatus` | Return version, uptime, listener status, connection counts |
| `grpc.health.v1.Health` | Standard gRPC health check (Check, Watch) |

### Input Validation

The admin API validates all inputs before persisting:

- **Route backends** must be valid `host:port` tuples.
- **IP firewall rules** must be valid IP addresses (`netip.ParseAddr`).
- **CIDR firewall rules** must be valid prefixes in canonical form.
- **Country firewall rules** must be exactly 2 uppercase letters (ISO 3166-1 alpha-2).

### Security

The gRPC admin API has no MCIAS integration: mc-proxy is pre-auth infrastructure.
Access control relies on Unix socket filesystem permissions:

- Socket is created with mode `0600` (read/write for owner only)
- Only processes running as the same user can connect
- No network exposure: the API is not accessible over TCP

---

## Configuration

TOML configuration file, loaded at startup. The proxy refuses to start if required fields are missing or invalid.

```toml
# Database. Required.
[database]
path = "/srv/mc-proxy/mc-proxy.db"

# Listeners. Each has its own route table (seeds DB on first run).
[[listeners]]
addr = ":443"

  [[listeners.routes]]
  hostname = "metacrypt.metacircular.net"
  backend  = "127.0.0.1:18443"

  [[listeners.routes]]
  hostname = "mcias.metacircular.net"
  backend  = "127.0.0.1:28443"

[[listeners]]
addr = ":8443"

  [[listeners.routes]]
  hostname = "metacrypt.metacircular.net"
  backend  = "127.0.0.1:18443"

[[listeners]]
addr = ":9443"

  [[listeners.routes]]
  hostname = "mcias.metacircular.net"
  backend  = "127.0.0.1:28443"

# gRPC admin API. Optional: omit or leave addr empty to disable.
# Listens on a Unix socket; access controlled via filesystem permissions.
[grpc]
addr = "/var/run/mc-proxy.sock"

# Firewall. Global blocklist, evaluated before routing. Default allow.
[firewall]
geoip_db          = "/srv/mc-proxy/GeoLite2-Country.mmdb"
blocked_ips       = ["192.0.2.1"]
blocked_cidrs     = ["198.51.100.0/24"]
blocked_countries = ["KP", "CN", "IN", "IL"]

# Proxy behavior.
[proxy]
connect_timeout  = "5s"     # Timeout for dialing backend
idle_timeout     = "300s"   # Close connections idle longer than this
shutdown_timeout = "30s"    # Graceful shutdown drain period

# Logging.
[log]
level = "info"   # debug, info, warn, error
```

### Environment Variable Overrides

Configuration values can be overridden via environment variables using the prefix `MCPROXY_` with underscore-separated paths:

```
MCPROXY_LOG_LEVEL=debug
MCPROXY_PROXY_IDLE_TIMEOUT=600s
```

Environment variables cannot define listeners, routes, or firewall rules; these are structural and must be in the TOML file.

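
The naming rule above can be sketched as follows. Because key names such as `idle_timeout` themselves contain underscores, this sketch goes from a known config path to its variable name rather than trying to parse variable names back into paths. The helper names `envName` and `applyOverrides`, the flat `map[string]string` of settings, and the exact set of overridable keys are assumptions made for illustration, not mc-proxy's actual loader.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envName converts a dotted config path such as "proxy.idle_timeout"
// into its override variable name, "MCPROXY_PROXY_IDLE_TIMEOUT".
func envName(path string) string {
	return "MCPROXY_" + strings.ToUpper(strings.ReplaceAll(path, ".", "_"))
}

// applyOverrides replaces each known scalar setting with the value of its
// environment variable, if one is set. Variables that do not correspond to a
// known path are ignored. lookup is os.LookupEnv outside of tests.
func applyOverrides(settings map[string]string, lookup func(string) (string, bool)) {
	for path := range settings {
		if v, ok := lookup(envName(path)); ok {
			settings[path] = v
		}
	}
}

func main() {
	// Scalar values as they would be after parsing the TOML example
	// (hypothetical flat representation for this sketch).
	settings := map[string]string{
		"log.level":              "info",
		"proxy.connect_timeout":  "5s",
		"proxy.idle_timeout":     "300s",
		"proxy.shutdown_timeout": "30s",
	}
	applyOverrides(settings, os.LookupEnv)

	order := []string{
		"log.level",
		"proxy.connect_timeout",
		"proxy.idle_timeout",
		"proxy.shutdown_timeout",
	}
	for _, path := range order {
		fmt.Printf("%s = %s\n", path, settings[path])
	}
}
```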

---

## Storage

### SQLite Database

Listeners, routes, and firewall rules are persisted in a SQLite database (WAL mode, foreign keys enabled, busy timeout 5000ms). The pure-Go driver `modernc.org/sqlite` is used (no CGo).

**Startup behavior:**

1. Open the database at the configured path. Run migrations.
2. If the database is empty (first run): seed from the TOML config.
3. If the database has data: load from it. TOML listener/route/firewall fields are ignored.

The TOML config continues to own operational settings: proxy timeouts, log level, gRPC config, GeoIP database path.

**Write-through pattern:** The gRPC admin API writes to the database first, then updates in-memory state. If the database write fails, the in-memory state is not modified.

### Schema

```sql
CREATE TABLE listeners (
    id   INTEGER PRIMARY KEY,
    addr TEXT NOT NULL UNIQUE
);

CREATE TABLE routes (
    id          INTEGER PRIMARY KEY,
    listener_id INTEGER NOT NULL REFERENCES listeners(id) ON DELETE CASCADE,
    hostname    TEXT NOT NULL,
    backend     TEXT NOT NULL,
    UNIQUE(listener_id, hostname)
);

CREATE TABLE firewall_rules (
    id    INTEGER PRIMARY KEY,
    type  TEXT NOT NULL CHECK(type IN ('ip', 'cidr', 'country')),
    value TEXT NOT NULL,
    UNIQUE(type, value)
);
```

### Data Directory

```
/srv/mc-proxy/
├── mc-proxy.toml          Configuration
├── mc-proxy.db            SQLite database
├── mc-proxy.sock          Unix socket for gRPC admin API
├── GeoLite2-Country.mmdb  GeoIP database (if using country blocks)
└── backups/               Database snapshots
```

mc-proxy does not terminate TLS on any listener. The proxy listeners pass through raw TLS streams, and the gRPC admin API uses a Unix socket (filesystem permissions for access control).

---

## Deployment

### Binary

Single static binary, built with `CGO_ENABLED=0`. No runtime dependencies beyond the configuration file and optional GeoIP database.

### Container

Multi-stage Docker build:

1. **Builder**: `golang:-alpine`, static compilation.
2. **Runtime**: `alpine`, non-root user (`mc-proxy`), port exposure determined by configuration.

### systemd

| File | Purpose |
|------|---------|
| `mc-proxy.service` | Main proxy service |
| `mc-proxy-backup.service` | Oneshot database backup (VACUUM INTO) |
| `mc-proxy-backup.timer` | Daily backup timer (02:00 UTC, 5-minute jitter) |

The proxy binds to privileged ports (443) and should use `AmbientCapabilities=CAP_NET_BIND_SERVICE` in the systemd unit rather than running as root. Standard security hardening directives apply per engineering standards (`NoNewPrivileges=true`, `ProtectSystem=strict`, etc.).

### Graceful Shutdown

On `SIGINT` or `SIGTERM`:

1. Stop accepting new connections on all listeners.
2. Wait for in-flight connections to complete (up to `shutdown_timeout`).
3. Force-close remaining connections.
4. Exit.

On `SIGHUP`:

1. Reload the GeoIP database from disk.
2. Continue serving with the updated database.

Routes and firewall rules can be modified at runtime via the gRPC admin API (write-through to SQLite). Listener changes (adding/removing ports) require a full restart. TOML configuration changes (timeouts, log level, GeoIP path) also require a restart.

---

## Security Model

mc-proxy is infrastructure that sits in front of authenticated services. It has no authentication or authorization of its own.

### Threat Mitigations

| Threat | Mitigation |
|--------|------------|
| SNI spoofing | Backend performs its own TLS handshake; a spoofed SNI will fail certificate validation at the backend. mc-proxy does not trust SNI for security decisions beyond routing. |
| Resource exhaustion (connection flood) | Idle timeout closes stale connections. Per-listener connection limits (future). Rate limiting (future). |
| GeoIP evasion via IPv6 | GeoLite2 database includes IPv6 mappings. Both IPv4 and IPv6 source addresses are checked. |
| GeoIP evasion via VPN/proxy | Accepted risk. GeoIP blocking is a compliance measure, not a security boundary. Determined adversaries will bypass it. |
| Slowloris / slow ClientHello | Hardcoded 10-second timeout on the SNI extraction phase. If a complete ClientHello is not received within this window, the connection is reset. |
| Backend unavailability | Connect timeout prevents indefinite hangs. Connection is reset if the backend is unreachable. |
| Information leakage | Blocked connections receive only a TCP RST. No version strings, no error messages, no TLS alerts. |

### Security Invariants

1. mc-proxy never terminates TLS. It cannot read application-layer traffic.
2. mc-proxy never modifies the byte stream between client and backend.
3. Firewall rules are always evaluated before any routing decision.
4. The proxy never logs connection content, only metadata (source IP, SNI hostname, backend, timestamps, bytes transferred).

---

## Future Work

Items are listed roughly in priority order:

| Item | Description |
|------|-------------|
| **MCP integration** | Wire the gRPC admin API into the Metacircular Control Plane for centralized management. |
| **L7 HTTPS support** | TLS-terminating mode for selected routes, enabling HTTP-level features (user-agent blocking, header inspection, request routing). |
| **ACME integration** | Automatic certificate provisioning via Let's Encrypt for L7 routes. |
| **User-agent blocking** | Block connections based on user-agent string (requires L7 mode). |
| **Connection rate limiting** | Per-source-IP rate limits to mitigate connection floods. |
| **Per-listener connection limits** | Cap maximum concurrent connections per listener. |
| **Metrics** | Prometheus-compatible metrics: connections per listener, firewall blocks by rule, backend dial latency, active connections. |