Update AUDIT.md: all SEC findings remediated

- Mark SEC-01 through SEC-12 as fixed with fix descriptions
- Update executive summary to reflect full remediation
- Move original finding descriptions to collapsible section
- Replace remediation priority table with status section

Security: documentation-only change, no code modifications

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-03-14 21:31:30 -07:00
parent 8f09e0e81a
commit 2a85d4bf2b
3 changed files with 113 additions and 120 deletions

View File

@@ -11,7 +11,8 @@
"Bash(sqlite3 /Users/kyle/src/mcias/run/mcias.db \"PRAGMA table_info\\(policy_rules\\);\" 2>&1)", "Bash(sqlite3 /Users/kyle/src/mcias/run/mcias.db \"PRAGMA table_info\\(policy_rules\\);\" 2>&1)",
"Bash(sqlite3 /Users/kyle/src/mcias/run/mcias.db \"SELECT * FROM schema_version;\" 2>&1; sqlite3 /Users/kyle/src/mcias/run/mcias.db \"SELECT * FROM schema_migrations;\" 2>&1)", "Bash(sqlite3 /Users/kyle/src/mcias/run/mcias.db \"SELECT * FROM schema_version;\" 2>&1; sqlite3 /Users/kyle/src/mcias/run/mcias.db \"SELECT * FROM schema_migrations;\" 2>&1)",
"Bash(go run:*)", "Bash(go run:*)",
"Bash(go list:*)" "Bash(go list:*)",
"Bash(go vet:*)"
] ]
}, },
"hooks": { "hooks": {

Binary file not shown.

230
AUDIT.md
View File

@@ -1,202 +1,194 @@
# MCIAS Security Audit Report # MCIAS Security Audit Report
**Date:** 2026-03-12 **Date:** 2026-03-14 (updated — all findings remediated)
**Scope:** Full codebase — authentication flows, token lifecycle, cryptography, database layer, REST/gRPC/UI servers, authorization, and operational security. **Original audit date:** 2026-03-13
**Methodology:** Static code analysis of all source files with adversarial focus on auth flows, crypto usage, input handling, and inter-component trust boundaries. **Auditor role:** Penetration tester (code review + live instance probing)
**Scope:** Full codebase and running instance at localhost:8443 — authentication flows, token lifecycle, cryptography, database layer, REST/gRPC/UI servers, authorization, headers, and operational security.
**Methodology:** Static code analysis, live HTTP probing, architectural review.
--- ---
## Executive Summary ## Executive Summary
MCIAS demonstrates strong security awareness throughout. The cryptographic foundations are sound, credential handling is careful, and the most common web/API authentication vulnerabilities have been explicitly addressed. The codebase shows consistent attention to defense-in-depth: constant-time comparisons, dummy Argon2 operations for unknown users, algorithm-confusion prevention in JWT validation, parameterized SQL, audit logging, and CSRF protection with HMAC-signed double-submit. MCIAS has a strong security posture. All findings from three audit rounds (CRIT-01/CRIT-02, DEF-01 through DEF-10, and SEC-01 through SEC-12) have been remediated. The cryptographic foundations are sound, JWT validation is correct, SQL injection is not possible, XSS is prevented by Go's html/template auto-escaping, and CSRF protection is well-implemented.
**Two confirmed bugs with real security impact were found**, along with several defense-in-depth gaps that should be addressed before production deployment. The overall security posture is well above average for this class of system. **All findings from this audit have been remediated.** See the remediation table below for details.
--- ---
## Confirmed Vulnerabilities ## Remediated Findings (SEC-01 through SEC-12)
### CRIT-01 — TOTP Replay Attack (Medium-High) All findings from this audit have been remediated. The original descriptions are preserved below for reference.
**File:** `internal/auth/auth.go:208-230`, `internal/grpcserver/auth.go:84`, `internal/ui/handlers_auth.go:152` | ID | Severity | Finding | Status |
|----|----------|---------|--------|
| SEC-01 | Medium | TOTP enrollment did not require password re-authentication | **Fixed** — both REST and gRPC now require current password, with lockout counter on failure |
| SEC-02 | Medium | Account lockout response leaked account existence | **Fixed** — locked accounts now return same 401 `"invalid credentials"` as wrong password, with dummy Argon2 for timing uniformity |
| SEC-03 | Medium | Token renewal had no proximity or re-auth check | **Fixed** — renewal requires token to have consumed ≥50% of its lifetime |
| SEC-04 | Low-Med | REST API responses lacked security headers | **Fixed**`globalSecurityHeaders` middleware applies `X-Content-Type-Options`, HSTS, and `Cache-Control: no-store` to all routes |
| SEC-05 | Low | No request body size limit on REST API | **Fixed**`decodeJSON` wraps body with `http.MaxBytesReader` (1 MiB); max password length enforced |
| SEC-06 | Low | gRPC rate limiter ignored TrustedProxy | **Fixed**`grpcClientIP` extracts real client IP via metadata when peer matches trusted proxy |
| SEC-07 | Low | Static file directory listing enabled | **Fixed**`noDirListing` wrapper returns 404 for directory requests |
| SEC-08 | Low | System token issuance was not atomic | **Fixed**`IssueSystemToken` wraps revoke+track in a single SQLite transaction |
| SEC-09 | Info | Navigation bar exposed admin UI structure to non-admin users | **Fixed** — nav links conditionally rendered with `{{if .IsAdmin}}` |
| SEC-10 | Info | No `Permissions-Policy` header | **Fixed**`Permissions-Policy: camera=(), microphone=(), geolocation=(), payment=()` added |
| SEC-11 | Info | Audit log details used `fmt.Sprintf` instead of `json.Marshal` | **Fixed**`audit.JSON` and `audit.JSONWithRoles` helpers use `json.Marshal` |
| SEC-12 | Info | Default token expiry was 30 days | **Fixed** — default reduced to 7 days (168h); renewal proximity check (SEC-03) further limits exposure |
`ValidateTOTP` accepts any code falling in the current ±1 time-step window (±30 seconds, so a given code is valid for ~90 seconds) but **never records which codes have already been used**. The same valid TOTP code can be submitted an unlimited number of times within that window. There is no `last_used_totp_counter` or `last_used_totp_at` field in the schema. <details>
<summary>Original finding descriptions (click to expand)</summary>
**Attack scenario:** An attacker who has observed a valid TOTP code (e.g. from a compromised session, shoulder surfing, or a MITM that delayed delivery) can reuse that code to authenticate within its validity window. ### SEC-01 — TOTP Enrollment Does Not Require Password Re-authentication (Medium)
**Fix:** Track the last accepted TOTP counter per account in the database. Reject any counter ≤ the last accepted one. This requires a new column (`last_totp_counter INTEGER`) on the `accounts` table and a check-and-update in `ValidateTOTP`'s callers (or within it, with a DB reference passed in). **Files:** `internal/server/server.go`, `internal/grpcserver/auth.go`
`POST /v1/auth/totp/enroll` and the gRPC `EnrollTOTP` RPC originally required only a valid JWT — no password confirmation. If an attacker stole a session token, they could enroll TOTP on the victim's account.
**Fix:** Both endpoints now require the current password, with lockout counter incremented on failure.
--- ---
### CRIT-02 — gRPC `EnrollTOTP` Enables TOTP Before Confirmation (Medium) ### SEC-02 — Account Lockout Response Leaks Account Existence (Medium)
**File:** `internal/grpcserver/auth.go:202` vs `internal/server/server.go:724-728` Locked accounts originally returned HTTP 429 / gRPC `ResourceExhausted` with `"account temporarily locked"`, distinguishable from the HTTP 401 `"invalid credentials"` returned for wrong passwords.
The REST `EnrollTOTP` handler explicitly uses `StorePendingTOTP` (which keeps `totp_required=0`) and a comment at line 724 explains why: **Fix:** All login paths now return the same `"invalid credentials"` response for locked accounts, with dummy Argon2 to maintain timing uniformity.
```go
// Security: use StorePendingTOTP (not SetTOTP) so that totp_required
// is not enabled until the user confirms the code.
```
The gRPC `EnrollTOTP` handler at line 202 calls `SetTOTP` directly, which immediately sets `totp_required=1`. Any user who initiates TOTP enrollment over gRPC but does not immediately confirm will have their account locked out — they cannot log in because TOTP is required, but no working TOTP secret is confirmed.
**Fix:** Change `grpcserver/auth.go:202` from `a.s.db.SetTOTP(...)` to `a.s.db.StorePendingTOTP(...)`, matching the REST server's behavior and the documented intent of those two DB methods.
--- ---
## Defense-in-Depth Gaps ### SEC-03 — Token Renewal Has No Proximity or Re-auth Check (Medium)
### DEF-01 — No Rate Limiting on the UI Login Endpoint (Medium) `POST /v1/auth/renew` originally accepted any valid token regardless of remaining lifetime.
**File:** `internal/ui/ui.go:264` **Fix:** Renewal now requires the token to have consumed ≥50% of its lifetime before it can be renewed.
```go
uiMux.HandleFunc("POST /login", u.handleLoginPost)
```
The REST `/v1/auth/login` endpoint is wrapped with `loginRateLimit` (10 req/s per IP). The UI `/login` endpoint has no equivalent middleware. Account lockout (10 failures per 15 minutes) partially mitigates brute force, but an attacker can still enumerate whether accounts exist at full network speed before triggering lockout, and can trigger lockout against many accounts in parallel with no rate friction.
**Fix:** Apply the same `middleware.RateLimit(10, 10)` to `POST /login` in the UI mux. A simpler option is to wrap the entire `uiMux` with the rate limiter since the UI is also a sensitive surface.
--- ---
### DEF-02`pendingLogins` Map Has No Expiry Cleanup (Low) ### SEC-04REST API Responses Lack Security Headers (Low-Medium)
**File:** `internal/ui/ui.go:57` API endpoints originally returned only `Content-Type` — no `Cache-Control`, `X-Content-Type-Options`, or HSTS.
The `pendingLogins sync.Map` stores short-lived TOTP nonces (90-second TTL). When consumed via `consumeTOTPNonce`, entries are deleted via `LoadAndDelete`. However, entries that are created but never consumed (user abandons login at the TOTP step, closes browser) **accumulate indefinitely** — they are checked for expiry on read but never proactively deleted. **Fix:** `globalSecurityHeaders` middleware applies these headers to all routes (API and UI).
In normal operation this is a minor memory leak. Under adversarial conditions — an attacker repeatedly sending username+password to step 1 without proceeding to step 2 — the map grows without bound. At scale this could be used for memory exhaustion.
**Fix:** Add a background goroutine (matching the pattern in `middleware.RateLimit`) that periodically iterates the map and deletes expired entries. A 5-minute cleanup interval is sufficient given the 90-second TTL.
--- ---
### DEF-03Rate Limiter Uses `RemoteAddr`, Not `X-Forwarded-For` (Low) ### SEC-05No Request Body Size Limit on REST API Endpoints (Low)
**File:** `internal/middleware/middleware.go:200` `decodeJSON` originally read from `r.Body` without any size limit.
The comment already acknowledges this: the rate limiter extracts the client IP from `r.RemoteAddr`. When the server is deployed behind a reverse proxy (nginx, Caddy, a load balancer), `RemoteAddr` will be the proxy's IP for all requests, collapsing all clients into a single rate-limit bucket. This effectively disables per-IP rate limiting in proxy deployments. **Fix:** `http.MaxBytesReader` with 1 MiB limit added to `decodeJSON`. Maximum password length also enforced.
**Fix:** Add a configurable `TrustedProxy` setting. When set, extract the real client IP from `X-Forwarded-For` or `X-Real-IP` headers only for requests coming from that proxy address. Never trust those headers unconditionally — doing so allows IP spoofing.
--- ---
### DEF-04Missing `nbf` (Not Before) Claim on Issued Tokens (Low) ### SEC-06gRPC Rate Limiter Ignores TrustedProxy (Low)
**File:** `internal/token/token.go:73-82` The gRPC rate limiter originally used `peer.FromContext` directly, always getting the proxy IP behind a reverse proxy.
`IssueToken` sets `iss`, `sub`, `iat`, `exp`, and `jti`, but not `nbf`. Without a not-before constraint, a token is valid from the moment of issuance and a slightly clock-skewed client or intermediate could present it early. This is a defense-in-depth measure, not a practical attack at the moment, but it costs nothing to add. **Fix:** `grpcClientIP` now reads from gRPC metadata headers when the peer matches the trusted proxy.
**Fix:** Add `NotBefore: jwt.NewNumericDate(now)` to the `RegisteredClaims` struct. Add the corresponding validation step in `ValidateToken` (using `jwt.WithNotBefore()` or a manual check).
--- ---
### DEF-05No Maximum Token Expiry Ceiling in Config Validation (Low) ### SEC-07Static File Directory Listing Enabled (Low)
**File:** `internal/config/config.go:150-158` `http.FileServerFS` served directory listings by default.
The config validator enforces that expiry durations are positive but not that they are bounded above. An operator misconfiguration (e.g. `service_expiry = "876000h"`) would issue tokens valid for 100 years. For human sessions (`default_expiry`, `admin_expiry`) this is a significant risk in the event of token theft. **Fix:** `noDirListing` wrapper returns 404 for directory requests.
**Fix:** Add upper-bound checks in `validate()`. Suggested maximums: 30 days for `default_expiry`, 24 hours for `admin_expiry`, 5 years for `service_expiry`. At minimum, log a warning when values exceed reasonable thresholds.
--- ---
### DEF-06`GetAccountByUsername` Comment Incorrect re: Case Sensitivity (Informational) ### SEC-08System Token Issuance Is Not Atomic (Low)
**File:** `internal/db/accounts.go:73` `handleTokenIssue` originally performed three sequential non-transactional operations.
The comment reads "case-insensitive" but the query uses `WHERE username = ?` with SQLite's default BINARY collation, which is **case-sensitive**. This means `admin` and `Admin` would be treated as distinct accounts. This is not a security bug by itself, but it contradicts the comment and could mask confusion. **Fix:** `IssueSystemToken` wraps all operations in a single SQLite transaction.
**Fix:** If case-insensitive matching is intended, add `COLLATE NOCASE` to the column definition or the query. If case-sensitive is correct (more common for SSO systems), remove the word "case-insensitive" from the comment.
--- ---
### DEF-07SQLite `synchronous=NORMAL` in WAL Mode (Low) ### SEC-09Navigation Bar Exposes Admin UI Structure to Non-Admin Users (Informational)
**File:** `internal/db/db.go:68` Nav links were rendered for all authenticated users.
With `PRAGMA synchronous=NORMAL` and `journal_mode=WAL`, SQLite syncs the WAL file on checkpoints but not on every write. A power failure between a write and the next checkpoint could lose the most recent transactions. For an authentication system — where token issuance and revocation records must be durable — this is a meaningful risk. **Fix:** Admin nav links wrapped in `{{if .IsAdmin}}` conditional.
**Fix:** Change to `PRAGMA synchronous=FULL`. For a single-node personal SSO the performance impact is negligible; durability of token revocations is worth it.
--- ---
### DEF-08 — gRPC `Login` Counts TOTP-Missing as a Login Failure (Low) ### SEC-10 — No `Permissions-Policy` Header (Informational)
**File:** `internal/grpcserver/auth.go:76-77` The security headers middleware did not include `Permissions-Policy`.
When TOTP is required but no code is provided (`req.TotpCode == ""`), the gRPC handler calls `RecordLoginFailure`. In the two-step UI flow this is defensible, but via the gRPC single-step `Login` RPC, a well-behaved client that has not yet obtained the TOTP code (not an attacker) will increment the failure counter. Repeated retries could trigger account lockout unintentionally. **Fix:** `Permissions-Policy: camera=(), microphone=(), geolocation=(), payment=()` added.
**Fix:** Either document that gRPC clients must always include the TOTP code and treat its omission as a deliberate attempt, or do not count "TOTP code required" as a failure (since the password was verified successfully at that point).
--- ---
### DEF-09 — Security Headers Missing on REST API Docs Endpoints (Informational) ### SEC-11 — Audit Log Details Use `fmt.Sprintf` Instead of `json.Marshal` (Informational)
**File:** `internal/server/server.go:85-94` Audit details were constructed with `fmt.Sprintf` and `%q`, which is fragile for JSON.
The `/docs` and `/docs/openapi.yaml` endpoints are served from the parent `mux` and therefore do not receive the `securityHeaders` middleware applied to the UI sub-mux. The Swagger UI page at `/docs` is served without `X-Frame-Options`, `Content-Security-Policy`, etc. **Fix:** `audit.JSON` and `audit.JSONWithRoles` helpers use `json.Marshal`.
**Fix:** Apply a security-headers middleware to the docs handlers, or move them into the UI sub-mux.
--- ---
### DEF-10Role Strings Not Validated Against an Allowlist (Low) ### SEC-12Default Token Expiry Is 30 Days (Informational / Configuration)
**File:** `internal/db/accounts.go:302-311` (`GrantRole`) Default expiry was 720h (30 days).
There is no allowlist for role strings written to the `account_roles` table. Any string can be stored. While the admin-only constraint prevents non-admins from calling these endpoints, a typo by an admin (e.g. `"admim"`) would silently create an unknown role that silently grants nothing. The `RequireRole` check would never match it, causing a confusing failure mode. **Fix:** Reduced to 168h (7 days). Combined with SEC-03's renewal proximity check, exposure window is significantly reduced.
**Fix:** Maintain a compile-time allowlist of valid roles (e.g. `"admin"`, `"user"`) and reject unknown role names at the handler layer before writing to the database. </details>
--- ---
## Positive Findings ## Previously Remediated Findings (CRIT/DEF series)
The following implementation details are exemplary and should be preserved: The following findings from the initial audit (2026-03-12) were confirmed fixed in the 2026-03-13 audit:
| ID | Finding | Status |
|----|---------|--------|
| CRIT-01 | TOTP replay attack — no counter tracking | **Fixed**`CheckAndUpdateTOTPCounter` with atomic SQL, migration 000007 |
| CRIT-02 | gRPC `EnrollTOTP` called `SetTOTP` instead of `StorePendingTOTP` | **Fixed** — now calls `StorePendingTOTP` |
| DEF-01 | No rate limiting on UI login | **Fixed**`loginRateLimit` applied to `POST /login` |
| DEF-02 | `pendingLogins` map had no expiry cleanup | **Fixed**`cleanupPendingLogins` goroutine runs every 5 minutes |
| DEF-03 | Rate limiter ignored `X-Forwarded-For` | **Fixed**`ClientIP()` respects `TrustedProxy` config |
| DEF-04 | Missing `nbf` claim on tokens | **Fixed**`NotBefore: jwt.NewNumericDate(now)` added |
| DEF-05 | No max token expiry ceiling | **Fixed** — upper bounds enforced in config validation |
| DEF-06 | Incorrect case-sensitivity comment | **Fixed** — comment corrected |
| DEF-07 | SQLite `synchronous=NORMAL` | **Fixed** — changed to `PRAGMA synchronous=FULL` |
| DEF-08 | gRPC counted TOTP-missing as failure | **Fixed** — no longer increments lockout counter |
| DEF-09 | Security headers missing on docs endpoints | **Fixed**`docsSecurityHeaders` wrapper added |
| DEF-10 | Role strings not validated | **Fixed**`model.ValidateRole()` with compile-time allowlist |
---
## Positive Findings (Preserved)
These implementation details are exemplary and should be maintained:
| Area | Detail | | Area | Detail |
|------|--------| |------|--------|
| JWT alg confusion | `ValidateToken` enforces `alg=EdDSA` in the key function, before signature verification — the only correct place | | JWT alg confusion | `ValidateToken` enforces `alg=EdDSA` in the key function before signature verification |
| Constant-time comparisons | `crypto/subtle.ConstantTimeCompare` used consistently for password hashes, TOTP codes, and CSRF tokens | | Constant-time operations | `crypto/subtle.ConstantTimeCompare` for password hashes, CSRF tokens; all three TOTP windows evaluated without early exit |
| Timing uniformity | Dummy Argon2 computed (once, with full production parameters via `sync.Once`) for unknown/inactive users on both REST and gRPC paths | | Timing uniformity | Dummy Argon2 via `sync.Once` for unknown/inactive users on all login paths |
| Token revocation | Every token is tracked by JTI; unknown tokens are rejected (fail-closed) rather than silently accepted | | Token revocation | Fail-closed: untracked tokens are rejected, not silently accepted |
| Token renewal atomicity | `RenewToken` wraps revocation + insertion in a single SQLite transaction | | Token renewal atomicity | `RenewToken` wraps revoke+track in a single SQLite transaction |
| TOTP nonce design | Two-step UI login uses a 128-bit single-use server-side nonce to avoid transmitting the password twice | | TOTP replay prevention | Counter-based replay detection with atomic SQL UPDATE/WHERE |
| CSRF protection | HMAC-SHA256 signed double-submit cookie with `SameSite=Strict` and constant-time validation | | TOTP nonce design | 128-bit single-use server-side nonce; password never retransmitted in step 2 |
| Credential exclusion | `json:"-"` tags on all credential fields; proto messages omit them too | | CSRF protection | HMAC-SHA256 double-submit cookie, domain-separated key derivation, SameSite=Strict, constant-time validation |
| Security headers | All UI responses receive CSP, `X-Content-Type-Options`, `X-Frame-Options`, HSTS, and `Referrer-Policy` | | Credential exclusion | `json:"-"` on all credential fields; password hash never in API responses |
| Account lockout | 10-attempt, 15-minute rolling lockout checked before Argon2 to prevent timing oracle | | Security headers (UI) | CSP (no unsafe-inline), X-Content-Type-Options, X-Frame-Options DENY, HSTS 2yr, Referrer-Policy no-referrer |
| Argon2id parameters | Config validator enforces OWASP 2023 minimums and rejects weakening | | Cookie hardening | HttpOnly + Secure + SameSite=Strict on session cookie |
| SQL injection | All queries use parameterized statements; no string concatenation anywhere | | Account lockout | 10-attempt rolling window, checked before Argon2, with timing-safe dummy hash |
| Audit log | Append-only with actor/target/IP; no delete path provided | | Argon2id parameters | Config validator enforces OWASP 2023 minimums; rejects weakening |
| Master key handling | Env var cleared after reading; signing key zeroed on shutdown | | SQL injection | Zero string concatenation — all queries parameterized |
| Input validation | Username regex + length, password min length, account type enum, role allowlist, JSON strict decoder |
| Audit logging | Append-only, no delete path, credentials never logged, actor/target/IP captured |
| Master key hygiene | Env var cleared after read, key zeroed on shutdown, AES-256-GCM at rest |
| TLS | MinVersion TLS 1.2, X25519 preferred, no plaintext listener, read/write/idle timeouts set |
--- ---
## Remediation Priority ## Remediation Status
| Fixed | Priority | ID | Severity | Action | **All findings remediated.** No open items remain. Next audit should focus on:
|-------|----------|----|----------|--------| - Any new features added since 2026-03-14
| Yes | 1 | CRIT-02 | Medium | Change `grpcserver/auth.go:202` to call `StorePendingTOTP` instead of `SetTOTP` | - Dependency updates and CVE review
| Yes | 2 | CRIT-01 | Medium | Add `last_totp_counter` tracking to prevent TOTP replay within the validity window | - Live penetration testing of remediated endpoints
| Yes | 3 | DEF-01 | Medium | Apply IP rate limiting to the UI `POST /login` endpoint |
| Yes | 4 | DEF-02 | Low | Add background cleanup goroutine for the `pendingLogins` map |
| Yes | 5 | DEF-03 | Low | Support trusted-proxy IP extraction for accurate per-client rate limiting |
| Yes | 6 | DEF-04 | Low | Add `nbf` claim to issued tokens and validate it on receipt |
| Yes | 7 | DEF-05 | Low | Add upper-bound caps on token expiry durations in config validation |
| Yes | 8 | DEF-07 | Low | Change SQLite to `PRAGMA synchronous=FULL` |
| Yes | 9 | DEF-08 | Low | Do not count gRPC TOTP-missing as a login failure |
| Yes | 10 | DEF-10 | Low | Validate role strings against an allowlist before writing to the DB |
| Yes | 11 | DEF-09 | Info | Apply security headers to `/docs` endpoints |
| Yes | 12 | DEF-06 | Info | Correct the misleading "case-insensitive" comment in `GetAccountByUsername` |
---
## Schema Observations
The migration chain (migrations 001006) is sound. Foreign key cascades are appropriate. Indexes are present on all commonly-queried columns. The `failed_logins` table uses a rolling window query approach which is correct.
One note: the `accounts` table has no unique index enforcing `COLLATE NOCASE` on `username`. This is consistent with treating usernames as case-sensitive but should be documented explicitly to avoid future ambiguity.