Harden deployment and fix PEN-01

- Fix Bearer token extraction to validate prefix (PEN-01) - Add TestExtractBearerFromRequest covering PEN-01 edge cases - Fix flaky TestRenewToken timing (2s → 4s lifetime) - Move default config/install paths to /srv/mcias - Add RUNBOOK.md for operational procedures - Update AUDIT.md with penetration test round 4 Security: extractBearerFromRequest now uses case-insensitive prefix validation instead of fixed-offset slicing, rejecting non-Bearer Authorization schemes that were previously accepted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 22:33:24 -07:00
parent 2a85d4bf2b
commit 1121b7d4fd
14 changed files with 774 additions and 117 deletions
--- a/AUDIT.md
+++ b/AUDIT.md
@@ -1,24 +1,148 @@
 # MCIAS Security Audit Report

-**Date:** 2026-03-14 (updated — all findings remediated)
+**Date:** 2026-03-14 (updated — penetration test round 4)
 **Original audit date:** 2026-03-13
 **Auditor role:** Penetration tester (code review + live instance probing)
-**Scope:** Full codebase and running instance at localhost:8443 — authentication flows, token lifecycle, cryptography, database layer, REST/gRPC/UI servers, authorization, headers, and operational security.
+**Scope:** Full codebase and running instance at mcias.metacircular.net:8443 — authentication flows, token lifecycle, cryptography, database layer, REST/gRPC/UI servers, authorization, headers, and operational security.
 **Methodology:** Static code analysis, live HTTP probing, architectural review.

 ---

 ## Executive Summary

-MCIAS has a strong security posture. All findings from three audit rounds (CRIT-01/CRIT-02, DEF-01 through DEF-10, and SEC-01 through SEC-12) have been remediated. The cryptographic foundations are sound, JWT validation is correct, SQL injection is not possible, XSS is prevented by Go's html/template auto-escaping, and CSRF protection is well-implemented.
+MCIAS has a strong security posture. All findings from the first three audit rounds (CRIT-01/CRIT-02, DEF-01 through DEF-10, and SEC-01 through SEC-12) have been remediated. The cryptographic foundations are sound, JWT validation is correct, SQL injection is not possible, XSS is prevented by Go's html/template auto-escaping, and CSRF protection is well-implemented.

-**All findings from this audit have been remediated.** See the remediation table below for details.
+A fourth-round penetration test (PEN-01 through PEN-07) against the live instance at `mcias.metacircular.net:8443` identified 7 new findings: 2 medium, 2 low, and 3 informational. **Unauthorized access was not achieved** — the system's defense-in-depth held. See the open findings table below for details.
+
+---
+
+## Open Findings (PEN-01 through PEN-07)
+
+Identified during the fourth-round penetration test on 2026-03-14 against the live instance at `mcias.metacircular.net:8443` and the source code at the same commit.
+
+| ID | Severity | Finding | Status |
+|----|----------|---------|--------|
+| PEN-01 | Medium | `extractBearerFromRequest` does not validate "Bearer " prefix | **Fixed** — uses `strings.SplitN` + `strings.EqualFold` prefix validation, matching middleware implementation |
+| PEN-02 | Medium | Security headers missing from live instance responses | **Open** (code/deployment discrepancy) |
+| PEN-03 | Low | CSP `unsafe-inline` on `/docs` Swagger UI endpoint | **Open** |
+| PEN-04 | Info | OpenAPI spec publicly accessible without authentication | **Open** |
+| PEN-05 | Info | gRPC port 9443 publicly accessible | **Open** |
+| PEN-06 | Low | REST login increments lockout counter for missing TOTP code | **Open** |
+| PEN-07 | Info | Rate limiter is per-IP only, no per-account limiting | **Open** |
+
+<details>
+<summary>Finding descriptions (click to expand)</summary>
+
+### PEN-01 — `extractBearerFromRequest` Does Not Validate "Bearer " Prefix (Medium)
+
+**File:** `internal/server/server.go` (lines 1414–1425)
+
+The server-level `extractBearerFromRequest` function extracts the token by slicing the `Authorization` header at offset 7 (`len("Bearer ")`) without first verifying that the header actually starts with `"Bearer "`. Any 8+ character `Authorization` value is accepted — e.g., `Authorization: XXXXXXXX` would extract `X` as the token string.
+
+```go
+// Current (vulnerable):
+if len(auth) <= len(prefix) {
+    return "", fmt.Errorf("malformed Authorization header")
+}
+return auth[len(prefix):], nil  // no prefix check
+```
+
+The middleware-level `extractBearerToken` in `internal/middleware/middleware.go` (lines 303–316) correctly uses `strings.SplitN` and `strings.EqualFold` to validate the prefix. The server-level function should be replaced with a call to the middleware version, or the same validation logic should be applied.
+
+**Impact:** Low in practice because the extracted garbage is then passed to JWT validation which will reject it. However, it violates defense-in-depth: a future change to token validation could widen the attack surface, and the inconsistency between the two extraction functions is a maintenance hazard.
+
+**Recommendation:** Replace `extractBearerFromRequest` with a call to `middleware.extractBearerToken` (after exporting it or moving the function), or replicate the prefix validation.
+
+**Fix:** `extractBearerFromRequest` now uses `strings.SplitN` and `strings.EqualFold` to validate the `"Bearer"` prefix before extracting the token, matching the middleware implementation. Test `TestExtractBearerFromRequest` covers valid tokens, missing headers, non-Bearer schemes (Token, Basic), empty tokens, case-insensitive matching, and the previously-accepted garbage input.
+
+---
+
+### PEN-02 — Security Headers Missing from Live Instance Responses (Medium)
+
+**Live probe:** `https://mcias.metacircular.net:8443/login`
+
+The live instance's `/login` response did not include the security headers (`X-Content-Type-Options`, `Strict-Transport-Security`, `Cache-Control`, `Permissions-Policy`) that the source code's `globalSecurityHeaders` and UI `securityHeaders` middleware should be applying (SEC-04 and SEC-10 fixes).
+
+This is likely a code/deployment discrepancy — the deployed binary may predate the SEC-04/SEC-10 fixes, or the middleware may not be wired into the route chain correctly for all paths.
+
+**Impact:** Without HSTS, browsers will not enforce HTTPS-only access. Without `X-Content-Type-Options: nosniff`, MIME-type sniffing attacks are possible. Without `Cache-Control: no-store`, authenticated responses may be cached by proxies or browsers.
+
+**Recommendation:** Redeploy the current source to the live instance and verify headers with `curl -I`.
+
+---
+
+### PEN-03 — CSP `unsafe-inline` on `/docs` Swagger UI Endpoint (Low)
+
+**File:** `internal/server/server.go` (lines 1450–1452)
+
+The `docsSecurityHeaders` wrapper sets a Content-Security-Policy that includes `script-src 'self' 'unsafe-inline'` and `style-src 'self' 'unsafe-inline'`. This is required by Swagger UI's rendering approach, but it weakens CSP protection on the docs endpoint.
+
+**Impact:** If an attacker can inject content into the Swagger UI page (e.g., via a reflected parameter in the OpenAPI spec URL), inline scripts would execute. The blast radius is limited to the `/docs` path, which requires no authentication (see PEN-04).
+
+**Recommendation:** Consider serving Swagger UI from a separate subdomain or using CSP nonces instead of `unsafe-inline`. Alternatively, accept the risk given the limited scope.
+
+---
+
+### PEN-04 — OpenAPI Spec Publicly Accessible Without Authentication (Informational)
+
+**Live probe:** `GET /openapi.yaml` returns the full API specification without authentication.
+
+The OpenAPI spec reveals all API endpoints, request/response schemas, authentication flows, and error codes. While security-through-obscurity is not a defense, exposing the full API surface to unauthenticated users provides a roadmap for attackers.
+
+**Recommendation:** Consider requiring authentication for `/openapi.yaml` and `/docs`, or accept the risk if the API surface is intended to be public.
+
+---
+
+### PEN-05 — gRPC Port 9443 Publicly Accessible (Informational)
+
+**Live probe:** Port 9443 accepts TLS connections and serves gRPC.
+
+The gRPC interface is accessible from the public internet. While it requires authentication for all RPCs, exposing it increases the attack surface (gRPC-specific vulnerabilities, protocol-level attacks).
+
+**Recommendation:** If gRPC is only used for server-to-server communication, restrict access at the firewall/network level. If it must be public, ensure gRPC-specific rate limiting and monitoring are in place (SEC-06 fix applies here).
+
+---
+
+### PEN-06 — REST Login Increments Lockout Counter for Missing TOTP Code (Low)
+
+**File:** `internal/server/server.go` (lines 271–277)
+
+When a TOTP-enrolled account submits a login request without a TOTP code, the REST handler calls `s.db.RecordLoginFailure(acct.ID)` before returning the `"TOTP code required"` error. This increments the lockout counter even though the user has not actually failed authentication — they simply omitted the TOTP field.
+
+The gRPC handler was fixed for this exact issue in DEF-08, but the REST handler was not updated to match.
+
+```go
+// Current (REST — increments lockout for missing TOTP):
+if acct.TOTPRequired {
+    if req.TOTPCode == "" {
+        s.writeAudit(r, model.EventLoginFail, &acct.ID, nil, `{"reason":"totp_missing"}`)
+        _ = s.db.RecordLoginFailure(acct.ID)  // should not increment
+        middleware.WriteError(w, http.StatusUnauthorized, "TOTP code required", "totp_required")
+        return
+    }
+```
+
+**Impact:** An attacker who knows a username with TOTP enabled can lock the account by sending 10 login requests with a valid password but no TOTP code. The password must be correct (wrong passwords also increment the counter), but this lowers the bar from "must guess TOTP" to "must omit TOTP." More practically, legitimate users with buggy clients that forget the TOTP field could self-lock.
+
+**Recommendation:** Remove the `RecordLoginFailure` call from the TOTP-missing branch, matching the gRPC handler's behavior after the DEF-08 fix.
+
+---
+
+### PEN-07 — Rate Limiter Is Per-IP Only, No Per-Account Limiting (Informational)
+
+The rate limiter uses a per-IP token bucket. An attacker with access to multiple IP addresses (botnet, cloud instances, rotating proxies) can distribute brute-force attempts across IPs to bypass the per-IP limit.
+
+The account lockout mechanism (10 failures in 15 minutes) provides a secondary defense, but it is a blunt instrument — it locks out the legitimate user as well.
+
+**Recommendation:** Consider adding per-account rate limiting as a complement to per-IP limiting. This would cap login attempts per username regardless of source IP, without affecting other users. The account lockout already partially serves this role, but a softer rate limit (e.g., 1 req/s per username) would slow distributed attacks without locking out the user.
+
+</details>

 ---

 ## Remediated Findings (SEC-01 through SEC-12)

-All findings from this audit have been remediated. The original descriptions are preserved below for reference.
+All findings from the SEC audit round have been remediated. The original descriptions are preserved below for reference.

 | ID | Severity | Finding | Status |
 |----|----------|---------|--------|
@@ -186,9 +310,37 @@ These implementation details are exemplary and should be maintained:

 ---

+## Penetration Test — Attacks That Failed (2026-03-14)
+
+The following attacks were attempted against the live instance and failed, confirming the effectiveness of existing defenses:
+
+| Attack | Result |
+|--------|--------|
+| JWT `alg:none` bypass | Rejected — `ValidateToken` enforces `alg=EdDSA` |
+| JWT `alg:HS256` key-confusion | Rejected — only EdDSA accepted |
+| Forged JWT with random Ed25519 key | Rejected — signature verification failed |
+| Username enumeration via timing | Not possible — ~355ms for both existing and non-existing users (dummy Argon2 working) |
+| Username enumeration via error messages | Not possible — identical `"invalid credentials"` for all failure modes |
+| Account lockout enumeration | Not possible — locked accounts return same response as wrong password (SEC-02 fix confirmed) |
+| SQL injection via login fields | Not possible — parameterized queries throughout |
+| JSON body bomb (oversized payload) | Rejected — `http.MaxBytesReader` returns 413 (SEC-05 fix confirmed) |
+| Unknown JSON fields | Rejected — `DisallowUnknownFields` active on decoder |
+| Rate limit bypass | Working correctly — 429 after burst exhaustion, `Retry-After` header present |
+| Admin endpoint access without auth | Properly returns 401 |
+| Directory traversal on static files | Not possible — `noDirListing` wrapper returns 404 (SEC-07 fix confirmed) |
+| Public key endpoint | Returns Ed25519 PKIX key (expected; public by design) |
+
+---
+
 ## Remediation Status

-**All findings remediated.** No open items remain. Next audit should focus on:
+**CRIT/DEF/SEC series:** All 24 findings remediated. No open items.
+
+**PEN series (2026-03-14):** 1 of 7 findings remediated; 6 open (1 medium, 2 low, 3 informational). Unauthorized access was not achieved. Priority remediation items:
+1. **PEN-02** (Medium): Redeploy current source to live instance and verify security headers
+2. **PEN-06** (Low): Remove `RecordLoginFailure` from REST TOTP-missing branch
+
+Next audit should focus on:
+- Verifying PEN-01 through PEN-07 remediation
 - Any new features added since 2026-03-14
 - Dependency updates and CVE review
- Live penetration testing of remediated endpoints