# ARCHITECTURE.md

mc-proxy is a Layer 4 TLS proxy and router for Metacircular Dynamics
services. It inspects the SNI field of incoming TLS ClientHello messages to
determine the target backend, then proxies raw TCP between the client and
the appropriate container. A global firewall evaluates every connection
before routing.

## Table of Contents

1. [System Overview](#system-overview)
2. [Connection Lifecycle](#connection-lifecycle)
3. [Firewall](#firewall)
4. [Routing](#routing)
5. [Configuration](#configuration)
6. [Storage](#storage)
7. [Deployment](#deployment)
8. [Security Model](#security-model)
9. [Future Work](#future-work)

---

## System Overview

```
                      ┌─────────────────────────────────────┐
                      │              mc-proxy               │
  Clients ──────┐     │                                     │
                │     │  ┌──────────┐   ┌───────┐   ┌─────┐ │     ┌────────────┐
                ├────▶│  │ Firewall │──▶│  SNI  │──▶│Route│─│────▶│ Backend A  │
                │     │  │ (global) │   │Extract│   │Table│ │     │ :8443      │
                ├────▶│  └──────────┘   └───────┘   └─────┘ │     ├────────────┤
                │     │       │ RST                    │    │     │ Backend B  │
  Clients ──────┘     │       ▼                        └────│────▶│ :9443      │
                      │   (blocked)                         │     └────────────┘
                      └─────────────────────────────────────┘

  Listener 1 (:443)  ─┐
  Listener 2 (:8443) ─┼─ Each listener runs the same pipeline
  Listener N (:9443) ─┘
```

Key properties:

- **Layer 4 only.** mc-proxy never terminates TLS. It reads just enough of
  the ClientHello to extract the SNI hostname, then proxies the raw TCP
  stream to the matched backend. The backend handles TLS termination.
- **TLS-only.** Non-TLS connections are not supported. If the first bytes of
  a connection are not a TLS ClientHello, the connection is reset.
- **Multiple listeners.** A single mc-proxy instance binds to one or more
  ports. Each listener runs the same firewall → SNI → route pipeline.
- **Global firewall.** Firewall rules apply to all listeners uniformly.
  There are no per-route firewall rules.
- **No authentication.** mc-proxy is pre-auth infrastructure. It sits in
  front of services that handle their own authentication via MCIAS.

---

## Connection Lifecycle

Every inbound connection follows this sequence:

```
1. ACCEPT         Listener accepts TCP connection.
2. FIREWALL       Check source IP against blocklists:
                    a. IP/CIDR block check.
                    b. GeoIP country block check.
                  If blocked → RST, done.
3. SNI EXTRACT    Read the TLS ClientHello (without consuming it).
                  Extract the SNI hostname.
                  If no valid ClientHello or no SNI → RST, done.
4. ROUTE LOOKUP   Match SNI hostname against the route table.
                  If no match → RST, done.
5. BACKEND DIAL   Open TCP connection to the matched backend address.
                  If dial fails → RST, done.
6. PROXY          Bidirectional byte copy: client ↔ backend.
                  The buffered ClientHello bytes are forwarded first,
                  then both directions copy concurrently.
7. CLOSE          Either side closes → half-close propagation → done.
```

### SNI Extraction

The proxy peeks at the initial bytes of the connection without consuming
them. It parses just enough of the TLS record layer and ClientHello to
extract the `server_name` extension. The full ClientHello (including the
SNI) is then forwarded to the backend so the backend's TLS handshake
proceeds normally.

If the ClientHello spans multiple TCP segments, the proxy buffers up to
16 KiB (the maximum TLS record size) before giving up.

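To make the parsing step concrete, here is a hedged Go sketch of SNI extraction from a single buffered TLS record. It is an illustration under simplifying assumptions (the whole ClientHello fits in one record and is already in memory), and the names `extractSNI`, `readVec`, and `buildClientHello` are hypothetical, not taken from the mc-proxy source.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errMalformed = errors.New("malformed or incomplete ClientHello")
	errNoSNI     = errors.New("no server_name extension")
)

// readVec reads a vector whose length prefix is n bytes wide and returns the
// vector body plus the remaining input.
func readVec(p []byte, n int) (vec, rest []byte, ok bool) {
	if len(p) < n {
		return nil, nil, false
	}
	l := 0
	for _, b := range p[:n] {
		l = l<<8 | int(b)
	}
	if len(p) < n+l {
		return nil, nil, false
	}
	return p[n : n+l], p[n+l:], true
}

// extractSNI returns the first host_name in the server_name extension.
func extractSNI(rec []byte) (string, error) {
	// Record header: content type 0x16 (handshake), version (2), length (2).
	if len(rec) < 5 || rec[0] != 0x16 {
		return "", errMalformed
	}
	p, _, ok := readVec(rec[3:], 2)
	// Handshake header (type 0x01 + 3-byte length), version (2), random (32).
	if !ok || len(p) < 4+34 || p[0] != 0x01 {
		return "", errMalformed
	}
	p = p[4+34:]
	// Skip session_id (1-byte length), cipher_suites (2), compression (1).
	for _, n := range []int{1, 2, 1} {
		if _, p, ok = readVec(p, n); !ok {
			return "", errMalformed
		}
	}
	if len(p) == 0 {
		return "", errNoSNI // ClientHello without extensions
	}
	exts, _, ok := readVec(p, 2)
	if !ok {
		return "", errMalformed
	}
	for len(exts) >= 4 {
		typ := int(exts[0])<<8 | int(exts[1])
		var body []byte
		if body, exts, ok = readVec(exts[2:], 2); !ok {
			return "", errMalformed
		}
		if typ != 0 { // extension type 0 is server_name
			continue
		}
		// ServerNameList: list length (2), name_type (1), name (2-byte length).
		list, _, ok := readVec(body, 2)
		if !ok || len(list) < 3 || list[0] != 0 { // name_type 0 is host_name
			return "", errMalformed
		}
		name, _, ok := readVec(list[1:], 2)
		if !ok || len(name) == 0 {
			return "", errMalformed
		}
		return string(name), nil
	}
	return "", errNoSNI
}

// buildClientHello assembles a minimal ClientHello record whose only
// extension is server_name. It exists only to exercise extractSNI.
func buildClientHello(host string) []byte {
	u16 := func(n int) []byte { return []byte{byte(n >> 8), byte(n)} }
	name := append(append([]byte{0}, u16(len(host))...), host...)
	list := append(u16(len(name)), name...)
	ext := append(append([]byte{0, 0}, u16(len(list))...), list...)
	body := append([]byte{3, 3}, make([]byte, 32)...) // version + random
	body = append(body, 0, 0, 2, 0x13, 0x01, 1, 0)    // session_id, one suite, null compression
	body = append(append(body, u16(len(ext))...), ext...)
	hs := append(append([]byte{1, 0}, u16(len(body))...), body...)
	return append(append([]byte{0x16, 3, 1}, u16(len(hs))...), hs...)
}

func main() {
	host, err := extractSNI(buildClientHello("metacrypt.metacircular.net"))
	fmt.Println(host, err)
}
```

A production parser would also need to handle a ClientHello split across reads, which is where the 16 KiB buffer limit described above applies.
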
### Bidirectional Copy

After the backend connection is established, the proxy runs two concurrent
copy loops (client→backend and backend→client). When either direction
encounters an EOF or error:

1. The proxy half-closes the write side of the connection that direction
   was copying to, so the peer on that side sees EOF.
2. The other direction continues until it drains to completion.
3. Both connections are closed.

Timeouts apply to the copy phase to prevent idle connections from
accumulating indefinitely (see [Configuration](#configuration)).

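The three steps above can be sketched in Go as follows. This is a minimal illustration rather than the mc-proxy source: `relay` and `roundTrip` are hypothetical names, the idle timeout is omitted (so a stalled peer would hold the relay open), and `roundTrip` is only a loopback demo that sends a message through the relay to an echo backend. `CloseWrite` on `*net.TCPConn` is the standard-library call that performs the half-close.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// relay copies bytes in both directions between two TCP connections. When one
// direction ends, the write side of its destination is shut down so that peer
// sees EOF while the reverse direction keeps draining.
func relay(client, backend *net.TCPConn) {
	var wg sync.WaitGroup
	pipe := func(dst, src *net.TCPConn) {
		defer wg.Done()
		io.Copy(dst, src) // returns on EOF from src or on any I/O error
		dst.CloseWrite()  // half-close toward dst's peer
	}
	wg.Add(2)
	go pipe(backend, client)
	go pipe(client, backend)
	wg.Wait() // both directions have finished
	client.Close()
	backend.Close()
}

// roundTrip is a loopback demo: client -> relay -> echo backend -> client.
func roundTrip(msg string) (string, error) {
	backendLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer backendLn.Close()
	proxyLn, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer proxyLn.Close()

	go func() { // echo backend: echoes until it sees EOF, then closes
		c, err := backendLn.Accept()
		if err != nil {
			return
		}
		io.Copy(c, c)
		c.Close()
	}()
	go func() { // proxy: accept one client, dial the backend, relay
		c, err := proxyLn.Accept()
		if err != nil {
			return
		}
		b, err := net.Dial("tcp", backendLn.Addr().String())
		if err != nil {
			c.Close()
			return
		}
		relay(c.(*net.TCPConn), b.(*net.TCPConn))
	}()

	conn, err := net.Dial("tcp", proxyLn.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	conn.(*net.TCPConn).CloseWrite() // client is done sending
	reply, err := io.ReadAll(conn)
	return string(reply), err
}

func main() {
	reply, err := roundTrip("ping")
	fmt.Println(reply, err)
}
```
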
---

## Firewall

The firewall is a global, ordered rule set evaluated on every new
connection before SNI extraction. Rules are evaluated in definition order;
the first matching rule determines the outcome. If no rule matches, the
connection is **allowed** (default allow: the firewall is an explicit
blocklist, not an allowlist).

### Rule Types

| Config Field | Match Field | Example |
|--------------|-------------|---------|
| `blocked_ips` | Source IP address | `"192.0.2.1"` |
| `blocked_cidrs` | Source IP prefix | `"198.51.100.0/24"` |
| `blocked_countries` | Source country (ISO 3166-1 alpha-2) | `"KP"`, `"CN"`, `"IN"`, `"IL"` |

### Blocked Connection Handling

Blocked connections receive a TCP RST. No TLS alert, no HTTP error page, no
indication of why the connection was refused. This is intentional: blocked
sources should receive minimal information.

### GeoIP Database

mc-proxy uses the [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
free database for country-level IP geolocation. The database file is
distributed separately from the binary and referenced by path in the
configuration.

The GeoIP database is loaded into memory at startup and can be reloaded
via `SIGHUP` without restarting the proxy. If the database file is missing
or unreadable at startup and GeoIP rules are configured, the proxy refuses
to start.

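One way to reload the database without pausing lookups is an atomic pointer swap, sketched below in Go. The `geoDB` and `geoStore` types and the `load` function are hypothetical placeholders (a real loader would parse the `.mmdb` file), and keeping the previous database when a reload fails is an assumption of this sketch; the behavior on a failed `SIGHUP` reload is not specified above. In a running process, `Reload` would be called from a goroutine that receives `syscall.SIGHUP` through `signal.Notify`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// geoDB is a placeholder for a parsed GeoIP database.
type geoDB struct {
	path string
}

// geoStore holds the active database behind an atomic pointer so lookups
// never observe a half-loaded state.
type geoStore struct {
	cur  atomic.Pointer[geoDB]
	load func(path string) (*geoDB, error) // hypothetical loader
}

// Reload loads the file at path and swaps it in only if loading succeeded.
func (s *geoStore) Reload(path string) error {
	db, err := s.load(path)
	if err != nil {
		return err // assumption: keep serving with the previous database
	}
	s.cur.Store(db)
	return nil
}

// Current returns the database used for lookups right now (nil before the
// first successful load).
func (s *geoStore) Current() *geoDB { return s.cur.Load() }

// newDemoStore uses a fake loader that fails for one well-known path.
func newDemoStore() *geoStore {
	return &geoStore{load: func(path string) (*geoDB, error) {
		if path == "missing.mmdb" {
			return nil, fmt.Errorf("open %s: no such file", path)
		}
		return &geoDB{path: path}, nil
	}}
}

func main() {
	s := newDemoStore()
	fmt.Println(s.Reload("GeoLite2-Country.mmdb"), s.Current().path)
	fmt.Println(s.Reload("missing.mmdb"), s.Current().path)
}
```
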
---

## Routing

Each listener has its own route table mapping SNI hostnames to backend
addresses. A route entry consists of:

| Field | Type | Description |
|-------|------|-------------|
| `hostname` | string | Exact SNI hostname to match (e.g. `metacrypt.metacircular.net`) |
| `backend` | string | Backend address as `host:port` (e.g. `127.0.0.1:8443`) |

Routes are scoped to the listener that accepted the connection. The same
hostname can appear on different listeners with different backends, allowing
the proxy to route the same service name to different backends depending
on which port the client connected to.

### Match Semantics

- Hostname matching is **exact** and **case-insensitive** (per RFC 6066,
  SNI hostnames are DNS names and compared case-insensitively).
- Wildcard matching is not supported in the initial implementation.
- If duplicate hostnames appear within the same listener, the proxy refuses
  to start.
- If no route matches an incoming SNI hostname, the connection is reset.

### Route Table Source

Route tables are defined inline under each listener in the TOML
configuration file. The design anticipates future migration to a SQLite
database for dynamic route management via the control plane API.

---

## Configuration

TOML configuration file, loaded at startup. The proxy refuses to start if
required fields are missing or invalid.

```toml
# Listeners. Each has its own route table.
[[listeners]]
addr = ":443"

[[listeners.routes]]
hostname = "metacrypt.metacircular.net"
backend = "127.0.0.1:18443"

[[listeners.routes]]
hostname = "mcias.metacircular.net"
backend = "127.0.0.1:28443"

[[listeners]]
addr = ":8443"

[[listeners.routes]]
hostname = "metacrypt.metacircular.net"
backend = "127.0.0.1:18443"

[[listeners]]
addr = ":9443"

[[listeners.routes]]
hostname = "mcias.metacircular.net"
backend = "127.0.0.1:28443"

# Firewall. Global blocklist, evaluated before routing. Default allow.
[firewall]
geoip_db = "/srv/mc-proxy/GeoLite2-Country.mmdb"
blocked_ips = ["192.0.2.1"]
blocked_cidrs = ["198.51.100.0/24"]
blocked_countries = ["KP", "CN", "IN", "IL"]

# Proxy behavior.
[proxy]
connect_timeout = "5s"    # Timeout for dialing backend
idle_timeout = "300s"     # Close connections idle longer than this
shutdown_timeout = "30s"  # Graceful shutdown drain period

# Logging.
[log]
level = "info"  # debug, info, warn, error
```

### Environment Variable Overrides

Configuration values can be overridden via environment variables using the
prefix `MCPROXY_` with underscore-separated paths:

```
MCPROXY_LOG_LEVEL=debug
MCPROXY_PROXY_IDLE_TIMEOUT=600s
```

Environment variables cannot define listeners, routes, or firewall rules;
these are structural and must be in the TOML file.

---

## Storage

mc-proxy has minimal storage requirements. There is no database in the
initial implementation.

```
/srv/mc-proxy/
├── mc-proxy.toml            Configuration
├── GeoLite2-Country.mmdb    GeoIP database (if using country blocks)
└── backups/                 Reserved for future use
```

No TLS certificates are stored, because mc-proxy does not terminate TLS.

---

## Deployment

### Binary

Single static binary, built with `CGO_ENABLED=0`. No runtime dependencies
beyond the configuration file and optional GeoIP database.

### Container

Multi-stage Docker build:

1. **Builder**: `golang:<version>-alpine`, static compilation.
2. **Runtime**: `alpine`, non-root user (`mc-proxy`), port exposure
   determined by configuration.

### systemd

| File | Purpose |
|------|---------|
| `mc-proxy.service` | Main proxy service |

The proxy binds to privileged ports (443) and should use `AmbientCapabilities=CAP_NET_BIND_SERVICE`
in the systemd unit rather than running as root.

Standard security hardening directives apply per engineering standards
(`NoNewPrivileges=true`, `ProtectSystem=strict`, etc.).

### Graceful Shutdown

On `SIGINT` or `SIGTERM`:

1. Stop accepting new connections on all listeners.
2. Wait for in-flight connections to complete (up to `shutdown_timeout`).
3. Force-close remaining connections.
4. Exit.

On `SIGHUP`:

1. Reload the GeoIP database from disk.
2. Continue serving with the updated database.

Configuration changes (routes, listeners, firewall rules) require a full
restart. Hot reload of routing rules is deferred to the future SQLite-backed
implementation.

---

## Security Model

mc-proxy is infrastructure that sits in front of authenticated services.
It has no authentication or authorization of its own.

### Threat Mitigations

| Threat | Mitigation |
|--------|------------|
| SNI spoofing | The backend performs its own TLS handshake, so a spoofed SNI only selects which backend receives the connection; the client must still complete TLS and authenticate there. mc-proxy does not trust SNI for security decisions beyond routing. |
| Resource exhaustion (connection flood) | Idle timeout closes stale connections. Per-listener connection limits (future). Rate limiting (future). |
| GeoIP evasion via IPv6 | GeoLite2 database includes IPv6 mappings. Both IPv4 and IPv6 source addresses are checked. |
| GeoIP evasion via VPN/proxy | Accepted risk. GeoIP blocking is a compliance measure, not a security boundary. Determined adversaries will bypass it. |
| Slowloris / slow ClientHello | Timeout on the SNI extraction phase. If a complete ClientHello is not received within a reasonable window (e.g. 10s), the connection is reset. |
| Backend unavailability | Connect timeout prevents indefinite hangs. Connection is reset if the backend is unreachable. |
| Information leakage | Blocked connections receive only a TCP RST. No version strings, no error messages, no TLS alerts. |

### Security Invariants

1. mc-proxy never terminates TLS. It cannot read application-layer traffic.
2. mc-proxy never modifies the byte stream between client and backend.
3. Firewall rules are always evaluated before any routing decision.
4. The proxy never logs connection content, only metadata (source IP,
   SNI hostname, backend, timestamps, bytes transferred).

---

## Future Work

Items are listed roughly in priority order:

| Item | Description |
|------|-------------|
| **gRPC admin API** | Internal-only API for managing routes and firewall rules at runtime, integrated with the Metacircular Control Plane. |
| **SQLite route storage** | Migrate route table from TOML to SQLite for dynamic management via the admin API. |
| **L7 HTTPS support** | TLS-terminating mode for selected routes, enabling HTTP-level features (user-agent blocking, header inspection, request routing). |
| **ACME integration** | Automatic certificate provisioning via Let's Encrypt for L7 routes. |
| **User-agent blocking** | Block connections based on user-agent string (requires L7 mode). |
| **Connection rate limiting** | Per-source-IP rate limits to mitigate connection floods. |
| **Per-listener connection limits** | Cap maximum concurrent connections per listener. |
| **Health check endpoint** | Lightweight TCP or HTTP health check for load balancers and monitoring. |
| **Metrics** | Prometheus-compatible metrics: connections per listener, firewall blocks by rule, backend dial latency, active connections. |