Initial implementation of mc-proxy
Layer 4 TLS SNI proxy with global firewall (IP/CIDR/GeoIP blocking), per-listener route tables, bidirectional TCP relay with half-close propagation, and a gRPC admin API (routes, firewall, status) with TLS/mTLS support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
345
ARCHITECTURE.md
Normal file
@@ -0,0 +1,345 @@
# ARCHITECTURE.md

mc-proxy is a Layer 4 TLS proxy and router for Metacircular Dynamics
services. It inspects the SNI field of incoming TLS ClientHello messages to
determine the target backend, then proxies raw TCP between the client and
the appropriate container. A global firewall evaluates every connection
before routing.

## Table of Contents

1. [System Overview](#system-overview)
2. [Connection Lifecycle](#connection-lifecycle)
3. [Firewall](#firewall)
4. [Routing](#routing)
5. [Configuration](#configuration)
6. [Storage](#storage)
7. [Deployment](#deployment)
8. [Security Model](#security-model)
9. [Future Work](#future-work)

---

## System Overview

```
                      ┌─────────────────────────────────────┐
                      │              mc-proxy               │
  Clients ──────┐     │                                     │
                │     │  ┌──────────┐   ┌───────┐   ┌─────┐ │     ┌────────────┐
                ├────▶│  │ Firewall │──▶│  SNI  │──▶│Route│─│────▶│ Backend A  │
                │     │  │ (global) │   │Extract│   │Table│ │     │ :8443      │
                ├────▶│  └──────────┘   └───────┘   └─────┘ │     ├────────────┤
                │     │       │ RST                    │    │     │ Backend B  │
  Clients ──────┘     │       ▼                        └────│────▶│ :9443      │
                      │   (blocked)                         │     └────────────┘
                      └─────────────────────────────────────┘

  Listener 1 (:443)  ─┐
  Listener 2 (:8443) ─┼─ Each listener runs the same pipeline
  Listener N (:9443) ─┘
```

Key properties:

- **Layer 4 only.** mc-proxy never terminates TLS. It reads just enough of
  the ClientHello to extract the SNI hostname, then proxies the raw TCP
  stream to the matched backend. The backend handles TLS termination.
- **TLS-only.** Non-TLS connections are not supported. If the first bytes of
  a connection are not a TLS ClientHello, the connection is reset.
- **Multiple listeners.** A single mc-proxy instance binds to one or more
  ports. Each listener runs the same firewall → SNI → route pipeline.
- **Global firewall.** Firewall rules apply to all listeners uniformly.
  There are no per-route firewall rules.
- **No authentication.** mc-proxy is pre-auth infrastructure. It sits in
  front of services that handle their own authentication via MCIAS.

---

## Connection Lifecycle

Every inbound connection follows this sequence:

```
1. ACCEPT          Listener accepts TCP connection.
2. FIREWALL        Check source IP against blocklists:
                     a. IP/CIDR block check.
                     b. GeoIP country block check.
                   If blocked → RST, done.
3. SNI EXTRACT     Read the TLS ClientHello (without consuming it).
                   Extract the SNI hostname.
                   If no valid ClientHello or no SNI → RST, done.
4. ROUTE LOOKUP    Match SNI hostname against the route table.
                   If no match → RST, done.
5. BACKEND DIAL    Open TCP connection to the matched backend address.
                   If dial fails → RST, done.
6. PROXY           Bidirectional byte copy: client ↔ backend.
                   The buffered ClientHello bytes are forwarded first,
                   then both directions copy concurrently.
7. CLOSE           Either side closes → half-close propagation → done.
```

### SNI Extraction

The proxy peeks at the initial bytes of the connection without consuming
them. It parses just enough of the TLS record layer and ClientHello to
extract the `server_name` extension. The full ClientHello (including the
SNI) is then forwarded to the backend so the backend's TLS handshake
proceeds normally.

If the ClientHello spans multiple TCP segments, the proxy buffers up to
16 KiB (the maximum payload of a single TLS record) before giving up and
resetting the connection.

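The parsing step can be sketched as follows. This is a minimal illustration, not mc-proxy's actual parser: it assumes the whole ClientHello already sits in one buffered TLS record, and the names `cursor`, `extractSNI`, and `buildHello` are hypothetical. `buildHello` exists only to produce a small test input.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoSNI = errors.New("ClientHello carries no SNI hostname")

// cursor is a bounds-checked reader over a byte slice. After any short
// read, ok stays false and later reads return zero values.
type cursor struct {
	b  []byte
	ok bool
}

func (c *cursor) take(n int) []byte {
	if !c.ok || len(c.b) < n {
		c.ok = false
		return nil
	}
	out := c.b[:n]
	c.b = c.b[n:]
	return out
}

// num reads an n-byte big-endian integer.
func (c *cursor) num(n int) int {
	v := 0
	for _, x := range c.take(n) {
		v = v<<8 | int(x)
	}
	return v
}

// vec reads a vector with an n-byte length prefix as a sub-cursor.
func (c *cursor) vec(n int) *cursor {
	body := c.take(c.num(n))
	return &cursor{b: body, ok: c.ok}
}

// extractSNI returns the first host_name entry of the server_name extension.
func extractSNI(buf []byte) (string, error) {
	rec := &cursor{b: buf, ok: true}
	if rec.num(1) != 0x16 { // record type 22 = handshake
		return "", errors.New("not a TLS handshake record")
	}
	rec.take(2)              // legacy record version
	frag := rec.vec(2)       // record payload
	if frag.num(1) != 0x01 { // handshake type 1 = ClientHello
		return "", errors.New("not a ClientHello")
	}
	hello := frag.vec(3)
	hello.take(2 + 32) // legacy_version + random
	hello.vec(1)       // session_id
	hello.vec(2)       // cipher_suites
	hello.vec(1)       // compression_methods
	exts := hello.vec(2)
	for exts.ok && len(exts.b) > 0 {
		extType := exts.num(2)
		data := exts.vec(2)
		if extType != 0 { // extension 0 = server_name
			continue
		}
		names := data.vec(2)
		for names.ok && len(names.b) > 0 {
			nameType := names.num(1)
			name := names.vec(2)
			if name.ok && nameType == 0 && len(name.b) > 0 {
				return string(name.b), nil
			}
		}
	}
	if !exts.ok {
		return "", errors.New("truncated ClientHello")
	}
	return "", errNoSNI
}

// buildHello assembles a minimal ClientHello record with one SNI entry.
func buildHello(host string) []byte {
	u16 := func(n int) []byte { return []byte{byte(n >> 8), byte(n)} }
	entry := append([]byte{0}, u16(len(host))...) // name_type host_name
	entry = append(entry, host...)
	list := append(u16(len(entry)), entry...)
	ext := append([]byte{0, 0}, u16(len(list))...) // extension type 0
	ext = append(ext, list...)
	body := append([]byte{3, 3}, make([]byte, 32)...) // version + random
	body = append(body, 0, 0, 2, 0x13, 0x01, 1, 0)    // session, suites, compression
	body = append(body, u16(len(ext))...)
	body = append(body, ext...)
	hs := append([]byte{1, 0}, u16(len(body))...) // type + 3-byte length
	hs = append(hs, body...)
	rec := append([]byte{0x16, 3, 1}, u16(len(hs))...)
	return append(rec, hs...)
}

func main() {
	host, err := extractSNI(buildHello("metacrypt.metacircular.net"))
	fmt.Println(host, err)
}
```

A real implementation would also keep reading until the record is complete (bounded by the 16 KiB limit) and would retain the buffered bytes so they can be replayed to the backend.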

### Bidirectional Copy

After the backend connection is established, the proxy runs two concurrent
copy loops (client→backend and backend→client). When either direction
encounters an EOF or error:

1. The proxy half-closes the write side of that direction's destination
   connection, so the peer sees EOF (for example, when the client stops
   sending, the proxy shuts down its write side toward the backend).
2. The remaining direction drains to completion.
3. Both connections are closed.

Timeouts apply to the copy phase to prevent idle connections from
accumulating indefinitely (see [Configuration](#configuration)).

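A minimal sketch of such a relay, assuming both sides are TCP connections (so `CloseWrite` is available). The names `copyHalf`, `relay`, `pair`, and `demo` are illustrative and not taken from the mc-proxy source. The idle timeout is omitted, and `demo` wires two loopback connections together only to show that a client-side half-close reaches the backend while the reply still flows back.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// halfCloser is satisfied by *net.TCPConn.
type halfCloser interface{ CloseWrite() error }

// copyHalf copies src→dst. On EOF or error it half-closes dst's write side
// so the peer sees EOF while the reverse direction keeps draining.
func copyHalf(dst, src net.Conn, done chan<- int64) {
	n, _ := io.Copy(dst, src)
	if hc, ok := dst.(halfCloser); ok {
		hc.CloseWrite()
	} else {
		dst.Close()
	}
	done <- n
}

// relay runs both directions, waits for both, then closes both connections.
func relay(client, backend net.Conn) (toBackend, toClient int64) {
	up, down := make(chan int64, 1), make(chan int64, 1)
	go copyHalf(backend, client, up)
	go copyHalf(client, backend, down)
	toBackend, toClient = <-up, <-down
	client.Close()
	backend.Close()
	return toBackend, toClient
}

// pair returns the two ends of a loopback TCP connection.
func pair() (net.Conn, net.Conn) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	a, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	b, err := ln.Accept()
	if err != nil {
		panic(err)
	}
	return a, b
}

// demo sends "ping" through the relay and returns the backend's reply.
func demo() string {
	client, proxyFront := pair()
	proxyBack, backend := pair()
	go relay(proxyFront, proxyBack)

	go func() { // backend: read until EOF, then answer and close
		req, _ := io.ReadAll(backend)
		backend.Write(append([]byte("echo:"), req...))
		backend.Close()
	}()

	client.Write([]byte("ping"))
	client.(*net.TCPConn).CloseWrite() // client is done sending
	resp, _ := io.ReadAll(client)
	client.Close()
	return string(resp)
}

func main() {
	fmt.Println(demo())
}
```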

---

## Firewall

The firewall is a global, ordered rule set evaluated on every new
connection before SNI extraction. Rules are evaluated in definition order;
the first matching rule determines the outcome. If no rule matches, the
connection is **allowed** (default allow: the firewall is an explicit
blocklist, not an allowlist).

### Rule Types

| Config Field | Match Field | Example |
|--------------|-------------|---------|
| `blocked_ips` | Source IP address | `"192.0.2.1"` |
| `blocked_cidrs` | Source IP prefix | `"198.51.100.0/24"` |
| `blocked_countries` | Source country (ISO 3166-1 alpha-2) | `"KP"`, `"CN"`, `"IN"`, `"IL"` |

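The three rule types can be sketched with the standard `net/netip` package. This is an illustration under stated assumptions, not the project's code: the `Firewall` type, `NewFirewall`, and `Blocked` are hypothetical names, the country lookup is an injected function standing in for a GeoLite2 query, and treating an unparseable source address as blocked is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Firewall is a default-allow blocklist over source addresses.
type Firewall struct {
	ips       map[netip.Addr]struct{}
	cidrs     []netip.Prefix
	countries map[string]struct{}
	country   func(netip.Addr) string // hypothetical GeoIP hook; may be nil
}

// NewFirewall parses the configured strings and fails on invalid entries.
func NewFirewall(ips, cidrs, countries []string, country func(netip.Addr) string) (*Firewall, error) {
	f := &Firewall{
		ips:       map[netip.Addr]struct{}{},
		countries: map[string]struct{}{},
		country:   country,
	}
	for _, s := range ips {
		a, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("blocked_ips: %w", err)
		}
		f.ips[a.Unmap()] = struct{}{}
	}
	for _, s := range cidrs {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			return nil, fmt.Errorf("blocked_cidrs: %w", err)
		}
		f.cidrs = append(f.cidrs, p.Masked())
	}
	for _, c := range countries {
		f.countries[c] = struct{}{}
	}
	return f, nil
}

// Blocked reports whether src (a textual IP) matches any rule. The checks
// run in the order IP, CIDR, country; no match means the source is allowed.
func (f *Firewall) Blocked(src string) bool {
	addr, err := netip.ParseAddr(src)
	if err != nil {
		return true // sketch assumption: fail closed on garbage input
	}
	addr = addr.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	if _, hit := f.ips[addr]; hit {
		return true
	}
	for _, p := range f.cidrs {
		if p.Contains(addr) {
			return true
		}
	}
	if f.country != nil {
		_, hit := f.countries[f.country(addr)]
		return hit
	}
	return false
}

func main() {
	// Stand-in for a GeoLite2 lookup: one documentation address maps to "KP".
	country := func(a netip.Addr) string {
		if a.String() == "203.0.113.5" {
			return "KP"
		}
		return ""
	}
	fw, err := NewFirewall([]string{"192.0.2.1"}, []string{"198.51.100.0/24"}, []string{"KP"}, country)
	if err != nil {
		panic(err)
	}
	for _, src := range []string{"192.0.2.1", "198.51.100.77", "203.0.113.5", "203.0.113.6"} {
		fmt.Println(src, fw.Blocked(src))
	}
}
```

Exact IPs sit in a map for constant-time lookup; the prefix list is scanned linearly, which is adequate for short blocklists.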

### Blocked Connection Handling

Blocked connections receive a TCP RST. No TLS alert, no HTTP error page, no
indication of why the connection was refused. This is intentional: blocked
sources should receive minimal information.


### GeoIP Database

mc-proxy uses the [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
free database for country-level IP geolocation. The database file is
distributed separately from the binary and referenced by path in the
configuration.

The GeoIP database is loaded into memory at startup and can be reloaded
via `SIGHUP` without restarting the proxy. If the database file is missing
or unreadable at startup and GeoIP rules are configured, the proxy refuses
to start.

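The reload path can be sketched with an atomic pointer swap, so lookups never see a half-loaded database. Everything here is illustrative: `geoDB` is a placeholder for a parsed GeoLite2 reader (which would come from a third-party MMDB library), `geoStore` is a hypothetical name, and keeping the old database when a reload fails is an assumption, since the text only specifies the startup failure case. The demo in `main` uses a temporary file in place of the real database path.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

// geoDB stands in for a parsed GeoLite2 database.
type geoDB struct{ raw []byte }

// geoStore holds the current database behind an atomic pointer.
type geoStore struct {
	path string
	cur  atomic.Pointer[geoDB]
}

// reload reads the file and swaps it in. On failure the previous database,
// if any, stays in place.
func (s *geoStore) reload() error {
	raw, err := os.ReadFile(s.path)
	if err != nil {
		return err
	}
	if len(raw) == 0 {
		return errors.New("geoip: empty database file")
	}
	s.cur.Store(&geoDB{raw: raw})
	return nil
}

// watch reloads the database on every SIGHUP.
func (s *geoStore) watch() {
	hup := make(chan os.Signal, 1)
	signal.Notify(hup, syscall.SIGHUP)
	go func() {
		for range hup {
			if err := s.reload(); err != nil {
				fmt.Fprintln(os.Stderr, "geoip reload failed, keeping old database:", err)
			}
		}
	}()
}

func main() {
	// Temporary stand-in for /srv/mc-proxy/GeoLite2-Country.mmdb.
	f, err := os.CreateTemp("", "geo-*.mmdb")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.Write([]byte("placeholder"))
	f.Close()

	s := &geoStore{path: f.Name()}
	if err := s.reload(); err != nil { // startup failure is fatal
		fmt.Fprintln(os.Stderr, "refusing to start:", err)
		os.Exit(1)
	}
	s.watch()
	fmt.Println("loaded", len(s.cur.Load().raw), "bytes")
}
```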

---

## Routing

Each listener has its own route table mapping SNI hostnames to backend
addresses. A route entry consists of:

| Field | Type | Description |
|-------|------|-------------|
| `hostname` | string | Exact SNI hostname to match (e.g. `metacrypt.metacircular.net`) |
| `backend` | string | Backend address as `host:port` (e.g. `127.0.0.1:8443`) |

Routes are scoped to the listener that accepted the connection. The same
hostname can appear on different listeners with different backends, allowing
the proxy to route the same service name to different backends depending
on which port the client connected to.

### Match Semantics

- Hostname matching is **exact** and **case-insensitive** (per RFC 6066,
  SNI hostnames are DNS names and compared case-insensitively).
- Wildcard matching is not supported in the initial implementation.
- If duplicate hostnames appear within the same listener, the proxy refuses
  to start.
- If no route matches an incoming SNI hostname, the connection is reset.

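These rules map naturally onto a per-listener map keyed by the lowercased hostname. A minimal sketch, with `Route`, `RouteTable`, `NewRouteTable`, and `Lookup` as hypothetical names:

```go
package main

import (
	"fmt"
	"strings"
)

// Route mirrors one [[listeners.routes]] entry.
type Route struct {
	Hostname string
	Backend  string
}

// RouteTable maps a lowercased SNI hostname to a backend address.
type RouteTable map[string]string

// NewRouteTable builds the table for one listener and rejects duplicate
// hostnames (compared case-insensitively), which is a startup error.
func NewRouteTable(routes []Route) (RouteTable, error) {
	tbl := make(RouteTable, len(routes))
	for _, r := range routes {
		key := strings.ToLower(r.Hostname)
		if _, dup := tbl[key]; dup {
			return nil, fmt.Errorf("duplicate route hostname %q", r.Hostname)
		}
		tbl[key] = r.Backend
	}
	return tbl, nil
}

// Lookup performs an exact, case-insensitive match. ok is false when no
// route exists, in which case the caller resets the connection.
func (t RouteTable) Lookup(sni string) (backend string, ok bool) {
	backend, ok = t[strings.ToLower(sni)]
	return backend, ok
}

func main() {
	tbl, err := NewRouteTable([]Route{
		{Hostname: "metacrypt.metacircular.net", Backend: "127.0.0.1:18443"},
		{Hostname: "mcias.metacircular.net", Backend: "127.0.0.1:28443"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(tbl.Lookup("MetaCrypt.Metacircular.NET"))
	fmt.Println(tbl.Lookup("unknown.example"))
}
```

Normalizing case once at load time keeps the per-connection lookup to a single map access.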

### Route Table Source

Route tables are defined inline under each listener in the TOML
configuration file. The design anticipates future migration to a SQLite
database for dynamic route management via the control plane API.

---

## Configuration

mc-proxy reads a TOML configuration file at startup. The proxy refuses to
start if required fields are missing or invalid.

```toml
# Listeners. Each has its own route table.
[[listeners]]
addr = ":443"

[[listeners.routes]]
hostname = "metacrypt.metacircular.net"
backend = "127.0.0.1:18443"

[[listeners.routes]]
hostname = "mcias.metacircular.net"
backend = "127.0.0.1:28443"

[[listeners]]
addr = ":8443"

[[listeners.routes]]
hostname = "metacrypt.metacircular.net"
backend = "127.0.0.1:18443"

[[listeners]]
addr = ":9443"

[[listeners.routes]]
hostname = "mcias.metacircular.net"
backend = "127.0.0.1:28443"

# Firewall. Global blocklist, evaluated before routing. Default allow.
[firewall]
geoip_db = "/srv/mc-proxy/GeoLite2-Country.mmdb"
blocked_ips = ["192.0.2.1"]
blocked_cidrs = ["198.51.100.0/24"]
blocked_countries = ["KP", "CN", "IN", "IL"]

# Proxy behavior.
[proxy]
connect_timeout = "5s"    # Timeout for dialing backend
idle_timeout = "300s"     # Close connections idle longer than this
shutdown_timeout = "30s"  # Graceful shutdown drain period

# Logging.
[log]
level = "info"  # debug, info, warn, error
```

### Environment Variable Overrides

Configuration values can be overridden via environment variables using the
prefix `MCPROXY_` with underscore-separated paths:

```
MCPROXY_LOG_LEVEL=debug
MCPROXY_PROXY_IDLE_TIMEOUT=600s
```

Environment variables cannot define listeners, routes, or firewall rules;
these are structural and must be in the TOML file.

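A sketch of how such overrides might be applied on top of values already loaded from TOML. The `Config` struct and `applyEnv` are hypothetical. Only `MCPROXY_LOG_LEVEL` and `MCPROXY_PROXY_IDLE_TIMEOUT` appear in the text above; the other two variable names are inferred from the stated naming scheme and are assumptions. Taking the lookup function as a parameter lets a test supply a fake environment.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// Config holds only the scalar settings that the environment may override.
type Config struct {
	LogLevel        string
	ConnectTimeout  time.Duration
	IdleTimeout     time.Duration
	ShutdownTimeout time.Duration
}

// applyEnv overlays MCPROXY_* variables onto cfg. lookup has the same shape
// as os.LookupEnv.
func applyEnv(cfg *Config, lookup func(string) (string, bool)) error {
	if v, ok := lookup("MCPROXY_LOG_LEVEL"); ok {
		cfg.LogLevel = v
	}
	durations := map[string]*time.Duration{
		"MCPROXY_PROXY_CONNECT_TIMEOUT":  &cfg.ConnectTimeout, // inferred name
		"MCPROXY_PROXY_IDLE_TIMEOUT":     &cfg.IdleTimeout,
		"MCPROXY_PROXY_SHUTDOWN_TIMEOUT": &cfg.ShutdownTimeout, // inferred name
	}
	for name, dst := range durations {
		v, ok := lookup(name)
		if !ok {
			continue
		}
		d, err := time.ParseDuration(v)
		if err != nil {
			return fmt.Errorf("%s: %w", name, err)
		}
		*dst = d
	}
	return nil
}

func main() {
	// Defaults mirror the example TOML file.
	cfg := Config{
		LogLevel:        "info",
		ConnectTimeout:  5 * time.Second,
		IdleTimeout:     300 * time.Second,
		ShutdownTimeout: 30 * time.Second,
	}
	if err := applyEnv(&cfg, os.LookupEnv); err != nil {
		fmt.Fprintln(os.Stderr, "invalid environment override:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```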

---

## Storage

mc-proxy has minimal storage requirements. There is no database in the
initial implementation.

```
/srv/mc-proxy/
├── mc-proxy.toml            Configuration
├── GeoLite2-Country.mmdb    GeoIP database (if using country blocks)
└── backups/                 Reserved for future use
```

No TLS certificates are stored because mc-proxy does not terminate TLS.

---

## Deployment

### Binary

Single static binary, built with `CGO_ENABLED=0`. No runtime dependencies
beyond the configuration file and optional GeoIP database.

### Container

Multi-stage Docker build:

1. **Builder**: `golang:<version>-alpine`, static compilation.
2. **Runtime**: `alpine`, non-root user (`mc-proxy`), port exposure
   determined by configuration.

### systemd

| File | Purpose |
|------|---------|
| `mc-proxy.service` | Main proxy service |

The proxy binds to privileged ports (443) and should use `AmbientCapabilities=CAP_NET_BIND_SERVICE`
in the systemd unit rather than running as root.

Standard security hardening directives apply per engineering standards
(`NoNewPrivileges=true`, `ProtectSystem=strict`, etc.).

### Graceful Shutdown

On `SIGINT` or `SIGTERM`:

1. Stop accepting new connections on all listeners.
2. Wait for in-flight connections to complete (up to `shutdown_timeout`).
3. Force-close remaining connections.
4. Exit.

On `SIGHUP`:

1. Reload the GeoIP database from disk.
2. Continue serving with the updated database.

Configuration changes (routes, listeners, firewall rules) require a full
restart. Hot reload of routing rules is deferred to the future SQLite-backed
implementation.

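The `SIGINT`/`SIGTERM` sequence above can be sketched with a wait group and a tracked connection set. The `server` type and its methods are hypothetical, and signal wiring is left out; a real program would call `shutdown` after receiving the signal. `demo` exercises both outcomes: an idle server that drains at once, and one stuck connection that must be force-closed.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"time"
)

// server tracks in-flight connections so shutdown can drain or close them.
type server struct {
	mu    sync.Mutex
	conns map[net.Conn]struct{}
	wg    sync.WaitGroup
}

func newServer() *server { return &server{conns: map[net.Conn]struct{}{}} }

// track registers a connection before its handler starts.
func (s *server) track(c net.Conn) {
	s.mu.Lock()
	s.conns[c] = struct{}{}
	s.mu.Unlock()
	s.wg.Add(1)
}

// untrack is called by the handler when the connection is finished.
func (s *server) untrack(c net.Conn) {
	s.mu.Lock()
	delete(s.conns, c)
	s.mu.Unlock()
	s.wg.Done()
}

// shutdown stops accepting, waits up to timeout for handlers to finish, and
// force-closes whatever remains. It reports whether the drain was clean.
func (s *server) shutdown(listeners []net.Listener, timeout time.Duration) bool {
	for _, ln := range listeners {
		ln.Close() // step 1: no new connections
	}
	drained := make(chan struct{})
	go func() { s.wg.Wait(); close(drained) }()
	select {
	case <-drained: // step 2: everything finished in time
		return true
	case <-time.After(timeout):
	}
	s.mu.Lock()
	for c := range s.conns { // step 3: force-close stragglers
		c.Close()
	}
	s.mu.Unlock()
	return false
}

// demo returns the drain result for an idle server and for a server with
// one connection that never finishes on its own.
func demo() (idleDrained, stuckDrained bool) {
	idleDrained = newServer().shutdown(nil, time.Second)

	busy := newServer()
	a, b := net.Pipe()
	defer b.Close()
	busy.track(a)
	go func() {
		defer busy.untrack(a)
		a.Read(make([]byte, 1)) // blocks until the connection is closed
	}()
	stuckDrained = busy.shutdown(nil, 50*time.Millisecond)
	return idleDrained, stuckDrained
}

func main() {
	fmt.Println(demo())
}
```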

---

## Security Model

mc-proxy is infrastructure that sits in front of authenticated services.
It has no authentication or authorization of its own.

### Threat Mitigations

| Threat | Mitigation |
|--------|------------|
| SNI spoofing | Backend performs its own TLS handshake, so a spoofed SNI will fail certificate validation at the backend. mc-proxy does not trust SNI for security decisions beyond routing. |
| Resource exhaustion (connection flood) | Idle timeout closes stale connections. Per-listener connection limits (future). Rate limiting (future). |
| GeoIP evasion via IPv6 | GeoLite2 database includes IPv6 mappings. Both IPv4 and IPv6 source addresses are checked. |
| GeoIP evasion via VPN/proxy | Accepted risk. GeoIP blocking is a compliance measure, not a security boundary. Determined adversaries will bypass it. |
| Slowloris / slow ClientHello | Timeout on the SNI extraction phase. If a complete ClientHello is not received within a reasonable window (e.g. 10s), the connection is reset. |
| Backend unavailability | Connect timeout prevents indefinite hangs. Connection is reset if the backend is unreachable. |
| Information leakage | Blocked connections receive only a TCP RST. No version strings, no error messages, no TLS alerts. |

### Security Invariants

1. mc-proxy never terminates TLS. It cannot read application-layer traffic.
2. mc-proxy never modifies the byte stream between client and backend.
3. Firewall rules are always evaluated before any routing decision.
4. The proxy never logs connection content, only metadata (source IP,
   SNI hostname, backend, timestamps, bytes transferred).

---

## Future Work

Items are listed roughly in priority order:

| Item | Description |
|------|-------------|
| **gRPC admin API** | Internal-only API for managing routes and firewall rules at runtime, integrated with the Metacircular Control Plane. |
| **SQLite route storage** | Migrate route table from TOML to SQLite for dynamic management via the admin API. |
| **L7 HTTPS support** | TLS-terminating mode for selected routes, enabling HTTP-level features (user-agent blocking, header inspection, request routing). |
| **ACME integration** | Automatic certificate provisioning via Let's Encrypt for L7 routes. |
| **User-agent blocking** | Block connections based on user-agent string (requires L7 mode). |
| **Connection rate limiting** | Per-source-IP rate limits to mitigate connection floods. |
| **Per-listener connection limits** | Cap maximum concurrent connections per listener. |
| **Health check endpoint** | Lightweight TCP or HTTP health check for load balancers and monitoring. |
| **Metrics** | Prometheus-compatible metrics: connections per listener, firewall blocks by rule, backend dial latency, active connections. |