trusted proxy, TOTP replay protection, new tests
- Trusted proxy config option for proxy-aware IP extraction used by rate limiting and audit logs; validates proxy IP before trusting X-Forwarded-For / X-Real-IP headers - TOTP replay protection via counter-based validation to reject reused codes within the same time step (±30s) - RateLimit middleware updated to extract client IP from proxy headers without IP spoofing risk - New tests for ClientIP proxy logic (spoofed headers, fallback) and extended rate-limit proxy coverage - HTMX error banner script integrated into web UI base - .gitignore updated for mciasdb build artifact Security: resolves CRIT-01 (TOTP replay attack) and DEF-03 (proxy-unaware rate limiting); gRPC TOTP enrollment aligned with REST via StorePendingTOTP Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
332
AUDIT.md
332
AUDIT.md
@@ -1,258 +1,202 @@
|
||||
# MCIAS Security Audit Report
|
||||
|
||||
**Scope:** Full codebase review of `git.wntrmute.dev/kyle/mcias` (commit `4596ea0`) aka mcias.
|
||||
**Auditor:** Comprehensive source review of all Go source files, protobuf definitions, Dockerfile, systemd unit, and client libraries
|
||||
**Classification:** Findings rated as **CRITICAL**, **HIGH**, **MEDIUM**, **LOW**, or **INFORMATIONAL**
|
||||
**Date:** 2026-03-12
|
||||
**Scope:** Full codebase — authentication flows, token lifecycle, cryptography, database layer, REST/gRPC/UI servers, authorization, and operational security.
|
||||
**Methodology:** Static code analysis of all source files with adversarial focus on auth flows, crypto usage, input handling, and inter-component trust boundaries.
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
MCIAS is well-engineered for a security-critical system. The code demonstrates strong awareness of common vulnerability classes: JWT algorithm confusion is properly mitigated, constant-time comparisons are used throughout, timing-uniform dummy operations prevent user enumeration, and credential material is systematically excluded from logs and API responses. The cryptographic choices are sound and current.
|
||||
MCIAS demonstrates strong security awareness throughout. The cryptographic foundations are sound, credential handling is careful, and the most common web/API authentication vulnerabilities have been explicitly addressed. The codebase shows consistent attention to defense-in-depth: constant-time comparisons, dummy Argon2 operations for unknown users, algorithm-confusion prevention in JWT validation, parameterized SQL, audit logging, and CSRF protection with HMAC-signed double-submit.
|
||||
|
||||
That said, I identified **16 findings** ranging from medium-severity design issues to low-severity hardening opportunities. There are no critical vulnerabilities that would allow immediate remote compromise, but several medium-severity items warrant remediation before production deployment.
|
||||
**Two confirmed bugs with real security impact were found**, along with several defense-in-depth gaps that should be addressed before production deployment. The overall security posture is well above average for this class of system.
|
||||
|
||||
---
|
||||
|
||||
## FINDINGS
|
||||
## Confirmed Vulnerabilities
|
||||
|
||||
### F-01: TOTP Enrollment Sets `totp_required=1` Before Confirmation (MEDIUM)
|
||||
### CRIT-01 — TOTP Replay Attack (Medium-High)
|
||||
|
||||
**Location:** `internal/db/accounts.go:131-141`, `internal/server/server.go:651-658`
|
||||
**File:** `internal/auth/auth.go:208-230`, `internal/grpcserver/auth.go:84`, `internal/ui/handlers_auth.go:152`
|
||||
|
||||
`SetTOTP` unconditionally sets `totp_required = 1`. This means during the enrollment phase (before the user has confirmed), the TOTP requirement flag is already true. If the user abandons enrollment after calling `/v1/auth/totp/enroll` but before calling `/confirm`, the account is now locked: TOTP is "required" but the user was never shown a QR code they can use to generate valid codes.
|
||||
`ValidateTOTP` accepts any code falling in the current ±1 time-step window (±30 seconds, so a given code is valid for ~90 seconds) but **never records which codes have already been used**. The same valid TOTP code can be submitted an unlimited number of times within that window. There is no `last_used_totp_counter` or `last_used_totp_at` field in the schema.
|
||||
|
||||
**Recommendation:** Add a separate `StorePendingTOTP(accountID, secretEnc, secretNonce)` that writes the encrypted secret but leaves `totp_required = 0`. Only set `totp_required = 1` in the confirm handler via the existing `SetTOTP`. Alternatively, add a `ClearTOTP` recovery step to the enrollment flow on timeout/failure.
|
||||
**Attack scenario:** An attacker who has observed a valid TOTP code (e.g. from a compromised session, shoulder surfing, or a MITM that delayed delivery) can reuse that code to authenticate within its validity window.
|
||||
|
||||
**Fix:** Track the last accepted TOTP counter per account in the database. Reject any counter ≤ the last accepted one. This requires a new column (`last_totp_counter INTEGER`) on the `accounts` table and a check-and-update in `ValidateTOTP`'s callers (or within it, with a DB reference passed in).
|
||||
|
||||
---
|
||||
|
||||
### F-02: Password Embedded in HTML Hidden Fields During TOTP Step (MEDIUM)
|
||||
### CRIT-02 — gRPC `EnrollTOTP` Enables TOTP Before Confirmation (Medium)
|
||||
|
||||
**Location:** `internal/ui/handlers_auth.go:74-84`
|
||||
**File:** `internal/grpcserver/auth.go:202` vs `internal/server/server.go:724-728`
|
||||
|
||||
During the TOTP step of UI login, the plaintext password is embedded as a hidden form field so it can be re-verified on the second POST. This means:
|
||||
1. The password exists in the DOM and is accessible to any browser extension or XSS-via-extension vector.
|
||||
2. The password is sent over the wire a second time (TLS protects transit, but it doubles the exposure window).
|
||||
3. Browser form autofill or "view source" can reveal it.
|
||||
The REST `EnrollTOTP` handler explicitly uses `StorePendingTOTP` (which keeps `totp_required=0`) and a comment at line 724 explains why:
|
||||
|
||||
**Recommendation:** On successful password verification in the first step, issue a short-lived (e.g., 60-second), single-use, server-side nonce that represents "password verified for user X". Store this nonce in the DB or an in-memory cache. The TOTP confirmation step presents this nonce instead of the password. The server validates the nonce + TOTP code and issues the session token.
|
||||
|
||||
---
|
||||
|
||||
### F-03: Token Renewal Is Not Atomic — Race Window Between Revoke and Track (MEDIUM)
|
||||
|
||||
**Location:** `internal/server/server.go:281-289`, `internal/grpcserver/auth.go:148-155`
|
||||
|
||||
The token renewal flow revokes the old token and tracks the new one as separate operations. The code comments acknowledge "atomically is not possible in SQLite without a transaction." However, SQLite does support transactions, and both operations use the same `*db.DB` instance with `MaxOpenConns(1)`. If the revoke succeeds but `TrackToken` fails, the user's old token is revoked but no new token is tracked, leaving them in a broken state.
|
||||
|
||||
**Recommendation:** Wrap the revoke-old + track-new pair in a single SQLite transaction. Add a method like `db.RenewToken(oldJTI, reason, newJTI, accountID, issuedAt, expiresAt)` that performs both in one `tx`.
|
||||
|
||||
---
|
||||
|
||||
### F-04: Rate Limiter Not Applied to REST Login Endpoint (MEDIUM)
|
||||
|
||||
**Location:** `internal/server/server.go:96-100`
|
||||
|
||||
Despite the comment saying "login-path rate limiting," the REST server applies `RequestLogger` as global middleware but **does not apply the `RateLimit` middleware at all**. The rate limiter is imported but never wired into the handler chain for the REST server. The `/v1/auth/login` endpoint has no rate limiting on the REST side.
|
||||
|
||||
In contrast, the gRPC server correctly applies `rateLimitInterceptor` in its interceptor chain (applied to all RPCs).
|
||||
|
||||
**Recommendation:** Apply `middleware.RateLimit(...)` to at minimum the `/v1/auth/login` and `/v1/token/validate` routes in the REST server. Consider a more restrictive rate for login (e.g., 5/min) versus general API endpoints.
|
||||
|
||||
---
|
||||
|
||||
### F-05: No `nbf` (Not Before) Claim in Issued JWTs (LOW)
|
||||
|
||||
**Location:** `internal/token/token.go:68-99`
|
||||
|
||||
Tokens are issued with `iss`, `sub`, `iat`, `exp`, and `jti` but not `nbf` (Not Before). While the architecture document states `nbf` is validated "if present," it is never set during issuance. Setting `nbf = iat` is a defense-in-depth measure that prevents premature token use if there is any clock skew between systems, and ensures relying parties that validate `nbf` don't reject MCIAS tokens.
|
||||
|
||||
**Recommendation:** Set `NotBefore: jwt.NewNumericDate(now)` in the `jwtClaims.RegisteredClaims`.
|
||||
|
||||
---
|
||||
|
||||
### F-06: `HasRole` Uses Non-Constant-Time String Comparison (LOW)
|
||||
|
||||
**Location:** `internal/token/token.go:174-181`
|
||||
|
||||
`HasRole` uses plain `==` string comparison for role names. Role names are not secret material, and this is authorization (not authentication), so this is low severity. However, if role names ever contained sensitive information, this could leak information via timing. Given the project's stated principle of using constant-time comparisons "wherever token or credential equality is checked," this is a minor inconsistency.
|
||||
|
||||
**Recommendation:** Acceptable as-is since role names are public knowledge. Document the decision.
|
||||
|
||||
---
|
||||
|
||||
### F-07: Dummy Argon2 Hash Uses Hardcoded Invalid PHC String (LOW)
|
||||
|
||||
**Location:** `internal/server/server.go:154`
|
||||
|
||||
The dummy Argon2 hash `"$argon2id$v=19$m=65536,t=3,p=4$dGVzdHNhbHQ$dGVzdGhhc2g"` uses m=65536 but the actual default config uses m=65536 too. The timing should be close. However, the dummy hash uses a 6-byte salt ("testsalt" base64) and a 6-byte hash ("testhash" base64), while real hashes use 16-byte salt and 32-byte hash. This produces a slightly different (faster) Argon2 computation than a real password verification.
|
||||
|
||||
**Recommendation:** Pre-compute a real dummy hash at server startup using `auth.HashPassword("dummy-password", actualArgonParams)` and store it as a `sync.Once` variable. This guarantees identical timing regardless of configuration.
|
||||
|
||||
---
|
||||
|
||||
### F-08: No Account Lockout After Repeated Failed Login Attempts (LOW)
|
||||
|
||||
**Location:** `internal/server/server.go:138-176`
|
||||
|
||||
There is no mechanism to lock an account after N failed login attempts. The system relies solely on rate limiting (which, per F-04, isn't applied on the REST side). An attacker with distributed IPs could attempt brute-force attacks against accounts without triggering any lockout.
|
||||
|
||||
**Recommendation:** Implement a configurable per-account failed login counter (e.g., 10 failures in 15 minutes triggers a 15-minute lockout). The counter should be stored in the DB or in memory with per-account tracking. Audit events for `login_fail` already exist and can be queried, but proactive lockout would be more effective.
|
||||
|
||||
---
|
||||
|
||||
### F-09: `PRAGMA synchronous=NORMAL` Risks Data Loss on Power Failure (LOW)
|
||||
|
||||
**Location:** `internal/db/db.go:50`
|
||||
|
||||
`PRAGMA synchronous=NORMAL` combined with WAL mode means a power failure could lose the most recent committed transactions. For a security-critical system where audit log integrity and token revocation records matter, `synchronous=FULL` is safer.
|
||||
|
||||
**Recommendation:** Change to `PRAGMA synchronous=FULL` for production deployments. The performance impact on a personal SSO system is negligible. Alternatively, document this trade-off and leave `NORMAL` as a conscious choice.
|
||||
|
||||
---
|
||||
|
||||
### F-10: No Maximum Token Expiry Validation (LOW)
|
||||
|
||||
**Location:** `internal/config/config.go:150-159`
|
||||
|
||||
Token expiry durations are validated to be positive but have no maximum. An operator could accidentally configure `default_expiry = "876000h"` (100 years). The config validation should enforce reasonable ceilings.
|
||||
|
||||
**Recommendation:** Add maximum expiry validation: e.g., `default_expiry <= 8760h` (1 year), `admin_expiry <= 168h` (1 week), `service_expiry <= 87600h` (10 years). These can be generous ceilings that prevent obvious misconfiguration.
|
||||
|
||||
---
|
||||
|
||||
### F-11: Missing `Content-Security-Policy` and Other Security Headers on UI Responses (MEDIUM)
|
||||
|
||||
**Location:** `internal/ui/ui.go:318-333`
|
||||
|
||||
The UI serves HTML pages but sets no security headers: no `Content-Security-Policy`, no `X-Content-Type-Options`, no `X-Frame-Options`, no `Strict-Transport-Security`. Since this is an admin panel for an authentication system:
|
||||
|
||||
- Without CSP, any XSS vector (e.g., via a malicious username stored in the DB) could execute arbitrary JavaScript in the admin's browser.
|
||||
- Without `X-Frame-Options: DENY`, the admin panel could be framed for clickjacking.
|
||||
- Without HSTS, a MITM could strip TLS on the first connection.
|
||||
|
||||
**Recommendation:** Add a middleware that sets:
|
||||
```
|
||||
Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self'
|
||||
X-Content-Type-Options: nosniff
|
||||
X-Frame-Options: DENY
|
||||
Strict-Transport-Security: max-age=63072000; includeSubDomains
|
||||
Referrer-Policy: no-referrer
|
||||
```go
|
||||
// Security: use StorePendingTOTP (not SetTOTP) so that totp_required
|
||||
// is not enabled until the user confirms the code.
|
||||
```
|
||||
|
||||
---
|
||||
The gRPC `EnrollTOTP` handler at line 202 calls `SetTOTP` directly, which immediately sets `totp_required=1`. Any user who initiates TOTP enrollment over gRPC but does not immediately confirm will have their account locked out — they cannot log in because TOTP is required, but no working TOTP secret is confirmed.
|
||||
|
||||
### F-12: No Input Validation on Username Length or Character Set (LOW)
|
||||
|
||||
**Location:** `internal/server/server.go:465-507`
|
||||
|
||||
`handleCreateAccount` checks that username is non-empty but does not validate length or character set. A username containing control characters, null bytes, or extremely long strings (up to SQLite's TEXT limit) could cause rendering issues in the UI, log injection, or storage abuse.
|
||||
|
||||
**Recommendation:** Validate: length 1-255, alphanumeric + limited symbols (e.g., `^[a-zA-Z0-9._@-]{1,255}$`). Reject control characters, embedded NULs, and newlines.
|
||||
**Fix:** Change `grpcserver/auth.go:202` from `a.s.db.SetTOTP(...)` to `a.s.db.StorePendingTOTP(...)`, matching the REST server's behavior and the documented intent of those two DB methods.
|
||||
|
||||
---
|
||||
|
||||
### F-13: No Password Complexity or Minimum Length Enforcement (LOW)
|
||||
## Defense-in-Depth Gaps
|
||||
|
||||
**Location:** `internal/auth/auth.go:63-66`
|
||||
### DEF-01 — No Rate Limiting on the UI Login Endpoint (Medium)
|
||||
|
||||
`HashPassword` only checks that the password is non-empty. A 1-character password is accepted and hashed. While Argon2id makes brute-force expensive, a minimum password length of 8-12 characters (per NIST SP 800-63B) would prevent trivially weak passwords.
|
||||
**File:** `internal/ui/ui.go:264`
|
||||
|
||||
**Recommendation:** Enforce a minimum password length (e.g., 12 characters) at the server/handler level before passing to `HashPassword`. Optionally check against a breached-password list.
|
||||
```go
|
||||
uiMux.HandleFunc("POST /login", u.handleLoginPost)
|
||||
```
|
||||
|
||||
The REST `/v1/auth/login` endpoint is wrapped with `loginRateLimit` (10 req/s per IP). The UI `/login` endpoint has no equivalent middleware. Account lockout (10 failures per 15 minutes) partially mitigates brute force, but an attacker can still enumerate whether accounts exist at full network speed before triggering lockout, and can trigger lockout against many accounts in parallel with no rate friction.
|
||||
|
||||
**Fix:** Apply the same `middleware.RateLimit(10, 10)` to `POST /login` in the UI mux. A simpler option is to wrap the entire `uiMux` with the rate limiter since the UI is also a sensitive surface.
|
||||
|
||||
---
|
||||
|
||||
### F-14: Passphrase Not Zeroed After Use in `loadMasterKey` (LOW)
|
||||
### DEF-02 — `pendingLogins` Map Has No Expiry Cleanup (Low)
|
||||
|
||||
**Location:** `cmd/mciassrv/main.go:246-272`
|
||||
**File:** `internal/ui/ui.go:57`
|
||||
|
||||
The passphrase is read from the environment variable and passed to `crypto.DeriveKey`, but the Go `string` holding the passphrase is not zeroed afterward. The environment variable is correctly unset, and the master key is zeroed on shutdown, but the passphrase string remains in the Go heap until GC'd. Go strings are immutable, so zeroing is not straightforward, but converting to `[]byte` first and zeroing after KDF would reduce the exposure window.
|
||||
The `pendingLogins sync.Map` stores short-lived TOTP nonces (90-second TTL). When consumed via `consumeTOTPNonce`, entries are deleted via `LoadAndDelete`. However, entries that are created but never consumed (user abandons login at the TOTP step, closes browser) **accumulate indefinitely** — they are checked for expiry on read but never proactively deleted.
|
||||
|
||||
**Recommendation:** Read the environment variable into a `[]byte` (via `os.Getenv` then `[]byte` copy), pass it to a modified `DeriveKey` that accepts `[]byte`, then zero the `[]byte` immediately after. Alternatively, accept this as a Go language limitation and document it.
|
||||
In normal operation this is a minor memory leak. Under adversarial conditions — an attacker repeatedly sending username+password to step 1 without proceeding to step 2 — the map grows without bound. At scale this could be used for memory exhaustion.
|
||||
|
||||
**Fix:** Add a background goroutine (matching the pattern in `middleware.RateLimit`) that periodically iterates the map and deletes expired entries. A 5-minute cleanup interval is sufficient given the 90-second TTL.
|
||||
|
||||
---
|
||||
|
||||
### F-15: `extractBearerFromRequest` Does Not Verify "Bearer" Prefix Case-Insensitively (INFORMATIONAL)
|
||||
### DEF-03 — Rate Limiter Uses `RemoteAddr`, Not `X-Forwarded-For` (Low)
|
||||
|
||||
**Location:** `internal/server/server.go:932-942`
|
||||
**File:** `internal/middleware/middleware.go:200`
|
||||
|
||||
The REST `extractBearerFromRequest` (used by `handleTokenValidate`) does a substring check with `auth[len("Bearer ")]` without verifying the prefix actually says "Bearer". It trusts that if the header is long enough, the prefix is correct. Meanwhile, the middleware's `extractBearerToken` correctly uses `strings.EqualFold`. The gRPC `extractBearerFromMD` also correctly uses `strings.EqualFold`.
|
||||
The comment already acknowledges this: the rate limiter extracts the client IP from `r.RemoteAddr`. When the server is deployed behind a reverse proxy (nginx, Caddy, a load balancer), `RemoteAddr` will be the proxy's IP for all requests, collapsing all clients into a single rate-limit bucket. This effectively disables per-IP rate limiting in proxy deployments.
|
||||
|
||||
**Recommendation:** Use `strings.EqualFold` for the prefix check in `extractBearerFromRequest` for consistency.
|
||||
**Fix:** Add a configurable `TrustedProxy` setting. When set, extract the real client IP from `X-Forwarded-For` or `X-Real-IP` headers only for requests coming from that proxy address. Never trust those headers unconditionally — doing so allows IP spoofing.
|
||||
|
||||
---
|
||||
|
||||
### F-16: UI System Token Issuance Does Not Revoke Previous System Token (LOW)
|
||||
### DEF-04 — Missing `nbf` (Not Before) Claim on Issued Tokens (Low)
|
||||
|
||||
**Location:** `internal/ui/handlers_accounts.go:334-403`
|
||||
**File:** `internal/token/token.go:73-82`
|
||||
|
||||
The REST `handleTokenIssue` and gRPC `IssueServiceToken` both revoke the existing system token before issuing a new one. However, `handleIssueSystemToken` in the UI handler does not revoke the old system token — it calls `SetSystemToken` (which updates the system_tokens table via UPSERT) but never revokes the old token's entry in the token_revocation table. The old token remains valid until it naturally expires.
|
||||
`IssueToken` sets `iss`, `sub`, `iat`, `exp`, and `jti`, but not `nbf`. Without a not-before constraint, a token is valid from the moment of issuance and a slightly clock-skewed client or intermediate could present it early. This is a defense-in-depth measure, not a practical attack at the moment, but it costs nothing to add.
|
||||
|
||||
**Recommendation:** Before issuing a new token in `handleIssueSystemToken`, replicate the pattern from the REST handler: look up `GetSystemToken`, and if found, call `RevokeToken(existing.JTI, "rotated")`.
|
||||
**Fix:** Add `NotBefore: jwt.NewNumericDate(now)` to the `RegisteredClaims` struct. Add the corresponding validation step in `ValidateToken` (using `jwt.WithNotBefore()` or a manual check).
|
||||
|
||||
---
|
||||
|
||||
## Positive Findings (Things Done Well)
|
||||
### DEF-05 — No Maximum Token Expiry Ceiling in Config Validation (Low)
|
||||
|
||||
1. **JWT algorithm confusion defense** is correctly implemented. The `alg` header is validated inside the key function before signature verification, and only `EdDSA` is accepted. This is the correct implementation pattern.
|
||||
**File:** `internal/config/config.go:150-158`
|
||||
|
||||
2. **Constant-time comparisons** are consistently used for password verification, TOTP validation, and CSRF token validation via `crypto/subtle.ConstantTimeCompare`.
|
||||
The config validator enforces that expiry durations are positive but not that they are bounded above. An operator misconfiguration (e.g. `service_expiry = "876000h"`) would issue tokens valid for 100 years. For human sessions (`default_expiry`, `admin_expiry`) this is a significant risk in the event of token theft.
|
||||
|
||||
3. **Timing uniformity** for failed logins: dummy Argon2 operations run for unknown users and inactive accounts, preventing username enumeration via timing differences.
|
||||
|
||||
4. **Credential material exclusion** is thorough: `json:"-"` tags on `PasswordHash`, `TOTPSecretEnc`, `TOTPSecretNonce`, `PGPasswordEnc`, `PGPasswordNonce` in model types, plus deliberate omission from API responses and log statements.
|
||||
|
||||
5. **Parameterized SQL** is used consistently throughout. No string concatenation in queries. The dynamic query builder in `ListAuditEvents`/`ListAuditEventsPaged` correctly uses parameter placeholders.
|
||||
|
||||
6. **TLS configuration** is solid: TLS 1.2 minimum, X25519/P256 curves, enforced at the listener level with no plaintext fallback.
|
||||
|
||||
7. **Master key handling** is well-designed: passphrase derived via Argon2id with strong parameters (128 MiB memory), env var cleared after reading, key zeroed on shutdown.
|
||||
|
||||
8. **Systemd hardening** is comprehensive: `ProtectSystem=strict`, `NoNewPrivileges`, `MemoryDenyWriteExecute`, empty `CapabilityBoundingSet`, and `PrivateDevices`.
|
||||
|
||||
9. **AES-GCM usage** is correct: fresh random nonces per encryption, key size validated, error details not exposed on decryption failure.
|
||||
|
||||
10. **CSRF protection** is well-implemented with HMAC-signed double-submit cookies and `SameSite=Strict`.
|
||||
**Fix:** Add upper-bound checks in `validate()`. Suggested maximums: 30 days for `default_expiry`, 24 hours for `admin_expiry`, 5 years for `service_expiry`. At minimum, log a warning when values exceed reasonable thresholds.
|
||||
|
||||
---
|
||||
|
||||
## Summary Table
|
||||
### DEF-06 — `GetAccountByUsername` Comment Incorrect re: Case Sensitivity (Informational)
|
||||
|
||||
| Fixed? | ID | Severity | Title | Effort |
|
||||
|--------|----|----------|-------|--------|
|
||||
| Yes | F-01 | MEDIUM | TOTP enrollment sets required=1 before confirmation | Small |
|
||||
| Yes | F-02 | MEDIUM | Password in HTML hidden fields during TOTP step | Medium |
|
||||
| Yes | F-03 | MEDIUM | Token renewal not atomic (race window) | Small |
|
||||
| Yes | F-04 | MEDIUM | Rate limiter not applied to REST login endpoint | Small |
|
||||
| Yes | F-11 | MEDIUM | Missing security headers on UI responses | Small |
|
||||
| No | F-05 | LOW | No `nbf` claim in issued JWTs | Trivial |
|
||||
| No | F-06 | LOW | `HasRole` uses non-constant-time comparison | Trivial |
|
||||
| Yes | F-07 | LOW | Dummy Argon2 hash timing mismatch | Small |
|
||||
| Yes | F-08 | LOW | No account lockout after repeated failures | Medium |
|
||||
| No | F-09 | LOW | `synchronous=NORMAL` risks audit data loss | Trivial |
|
||||
| No | F-10 | LOW | No maximum token expiry validation | Small |
|
||||
| Yes | F-12 | LOW | No username length/charset validation | Small |
|
||||
| Yes | F-13 | LOW | No minimum password length enforcement | Small |
|
||||
| No | F-14 | LOW | Passphrase string not zeroed after KDF | Small |
|
||||
| Yes | F-16 | LOW | UI system token issuance skips old token revocation | Small |
|
||||
| No | F-15 | INFO | Bearer prefix check inconsistency | Trivial |
|
||||
**File:** `internal/db/accounts.go:73`
|
||||
|
||||
The comment reads "case-insensitive" but the query uses `WHERE username = ?` with SQLite's default BINARY collation, which is **case-sensitive**. This means `admin` and `Admin` would be treated as distinct accounts. This is not a security bug by itself, but it contradicts the comment and could mask confusion.
|
||||
|
||||
**Fix:** If case-insensitive matching is intended, add `COLLATE NOCASE` to the column definition or the query. If case-sensitive is correct (more common for SSO systems), remove the word "case-insensitive" from the comment.
|
||||
|
||||
---
|
||||
|
||||
## Recommended Remediation Priority
|
||||
### DEF-07 — SQLite `synchronous=NORMAL` in WAL Mode (Low)
|
||||
|
||||
**Immediate (before production deployment):**
|
||||
1. F-04 — Wire the rate limiter into the REST server. This is the most impactful gap.
|
||||
2. F-11 — Add security headers to UI responses.
|
||||
3. F-01 — Fix TOTP enrollment to not lock accounts prematurely.
|
||||
**File:** `internal/db/db.go:68`
|
||||
|
||||
**Short-term:**
|
||||
4. F-03 — Make token renewal atomic.
|
||||
5. F-02 — Replace password-in-hidden-field with a server-side nonce.
|
||||
6. F-16 — Fix UI system token issuance to revoke old tokens.
|
||||
7. F-07 — Use a real dummy hash with matching parameters.
|
||||
With `PRAGMA synchronous=NORMAL` and `journal_mode=WAL`, SQLite syncs the WAL file on checkpoints but not on every write. A power failure between a write and the next checkpoint could lose the most recent transactions. For an authentication system — where token issuance and revocation records must be durable — this is a meaningful risk.
|
||||
|
||||
**Medium-term:**
|
||||
8. F-08 — Implement account lockout.
|
||||
9. F-12, F-13 — Input validation for usernames and passwords.
|
||||
10. Remaining LOW/INFO items at maintainer discretion.
|
||||
**Fix:** Change to `PRAGMA synchronous=FULL`. For a single-node personal SSO the performance impact is negligible; durability of token revocations is worth it.
|
||||
|
||||
---
|
||||
|
||||
### DEF-08 — gRPC `Login` Counts TOTP-Missing as a Login Failure (Low)
|
||||
|
||||
**File:** `internal/grpcserver/auth.go:76-77`
|
||||
|
||||
When TOTP is required but no code is provided (`req.TotpCode == ""`), the gRPC handler calls `RecordLoginFailure`. In the two-step UI flow this is defensible, but via the gRPC single-step `Login` RPC, a well-behaved client that has not yet obtained the TOTP code (not an attacker) will increment the failure counter. Repeated retries could trigger account lockout unintentionally.
|
||||
|
||||
**Fix:** Either document that gRPC clients must always include the TOTP code and treat its omission as a deliberate attempt, or do not count "TOTP code required" as a failure (since the password was verified successfully at that point).
|
||||
|
||||
---
|
||||
|
||||
### DEF-09 — Security Headers Missing on REST API Docs Endpoints (Informational)
|
||||
|
||||
**File:** `internal/server/server.go:85-94`
|
||||
|
||||
The `/docs` and `/docs/openapi.yaml` endpoints are served from the parent `mux` and therefore do not receive the `securityHeaders` middleware applied to the UI sub-mux. The Swagger UI page at `/docs` is served without `X-Frame-Options`, `Content-Security-Policy`, etc.
|
||||
|
||||
**Fix:** Apply a security-headers middleware to the docs handlers, or move them into the UI sub-mux.
|
||||
|
||||
---
|
||||
|
||||
### DEF-10 — Role Strings Not Validated Against an Allowlist (Low)
|
||||
|
||||
**File:** `internal/db/accounts.go:302-311` (`GrantRole`)
|
||||
|
||||
There is no allowlist for role strings written to the `account_roles` table. Any string can be stored. While the admin-only constraint prevents non-admins from calling these endpoints, a typo by an admin (e.g. `"admim"`) would silently create an unknown role that silently grants nothing. The `RequireRole` check would never match it, causing a confusing failure mode.
|
||||
|
||||
**Fix:** Maintain a compile-time allowlist of valid roles (e.g. `"admin"`, `"user"`) and reject unknown role names at the handler layer before writing to the database.
|
||||
|
||||
---
|
||||
|
||||
## Positive Findings
|
||||
|
||||
The following implementation details are exemplary and should be preserved:
|
||||
|
||||
| Area | Detail |
|
||||
|------|--------|
|
||||
| JWT alg confusion | `ValidateToken` enforces `alg=EdDSA` in the key function, before signature verification — the only correct place |
|
||||
| Constant-time comparisons | `crypto/subtle.ConstantTimeCompare` used consistently for password hashes, TOTP codes, and CSRF tokens |
|
||||
| Timing uniformity | Dummy Argon2 computed (once, with full production parameters via `sync.Once`) for unknown/inactive users on both REST and gRPC paths |
|
||||
| Token revocation | Every token is tracked by JTI; unknown tokens are rejected (fail-closed) rather than silently accepted |
|
||||
| Token renewal atomicity | `RenewToken` wraps revocation + insertion in a single SQLite transaction |
|
||||
| TOTP nonce design | Two-step UI login uses a 128-bit single-use server-side nonce to avoid transmitting the password twice |
|
||||
| CSRF protection | HMAC-SHA256 signed double-submit cookie with `SameSite=Strict` and constant-time validation |
|
||||
| Credential exclusion | `json:"-"` tags on all credential fields; proto messages omit them too |
|
||||
| Security headers | All UI responses receive CSP, `X-Content-Type-Options`, `X-Frame-Options`, HSTS, and `Referrer-Policy` |
|
||||
| Account lockout | 10-attempt, 15-minute rolling lockout checked before Argon2 to prevent timing oracle |
|
||||
| Argon2id parameters | Config validator enforces OWASP 2023 minimums and rejects weakening |
|
||||
| SQL injection | All queries use parameterized statements; no string concatenation anywhere |
|
||||
| Audit log | Append-only with actor/target/IP; no delete path provided |
|
||||
| Master key handling | Env var cleared after reading; signing key zeroed on shutdown |
|
||||
|
||||
---
|
||||
|
||||
## Remediation Priority
|
||||
|
||||
| Fixed | Priority | ID | Severity | Action |
|
||||
|-------|----------|----|----------|--------|
|
||||
| Yes | 1 | CRIT-02 | Medium | Change `grpcserver/auth.go:202` to call `StorePendingTOTP` instead of `SetTOTP` |
|
||||
| Yes | 2 | CRIT-01 | Medium | Add `last_totp_counter` tracking to prevent TOTP replay within the validity window |
|
||||
| Yes | 3 | DEF-01 | Medium | Apply IP rate limiting to the UI `POST /login` endpoint |
|
||||
| Yes | 4 | DEF-02 | Low | Add background cleanup goroutine for the `pendingLogins` map |
|
||||
| Yes | 5 | DEF-03 | Low | Support trusted-proxy IP extraction for accurate per-client rate limiting |
|
||||
| Yes | 6 | DEF-04 | Low | Add `nbf` claim to issued tokens and validate it on receipt |
|
||||
| Yes | 7 | DEF-05 | Low | Add upper-bound caps on token expiry durations in config validation |
|
||||
| Yes | 8 | DEF-07 | Low | Change SQLite to `PRAGMA synchronous=FULL` |
|
||||
| Yes | 9 | DEF-08 | Low | Do not count gRPC TOTP-missing as a login failure |
|
||||
| Yes | 10 | DEF-10 | Low | Validate role strings against an allowlist before writing to the DB |
|
||||
| Yes | 11 | DEF-09 | Info | Apply security headers to `/docs` endpoints |
|
||||
| Yes | 12 | DEF-06 | Info | Correct the misleading "case-insensitive" comment in `GetAccountByUsername` |
|
||||
|
||||
---
|
||||
|
||||
## Schema Observations
|
||||
|
||||
The migration chain (migrations 001–006) is sound. Foreign key cascades are appropriate. Indexes are present on all commonly-queried columns. The `failed_logins` table uses a rolling window query approach which is correct.
|
||||
|
||||
One note: the `accounts` table has no unique index enforcing `COLLATE NOCASE` on `username`. This is consistent with treating usernames as case-sensitive but should be documented explicitly to avoid future ambiguity.
|
||||
|
||||
Reference in New Issue
Block a user