- Added failed login tracking for account lockout enforcement in `db` and `ui` layers; introduced `failed_logins` table to store attempts, window start, and attempt count. - Updated login checks in `grpcserver/auth.go` and `ui/handlers_auth.go` to reject requests if the account is locked. - Added immediate failure counter reset on successful login. - Implemented username length and character set validation (F-12) and minimum password length enforcement (F-13) in shared `validate` package. - Updated account creation and edit flows in `ui` and `grpcserver` layers to apply validation before hashing/processing. - Added comprehensive unit tests for lockout, validation, and related edge cases. - Updated `AUDIT.md` to mark F-08, F-12, and F-13 as fixed. - Updated `openapi.yaml` to reflect new validation and lockout behaviors. Security: Prevents brute-force attacks via lockout mechanism and strengthens defenses against weak and invalid input.
16 KiB
MCIAS Security Audit Report
Scope: Full codebase review of git.wntrmute.dev/kyle/mcias (commit 4596ea0) aka mcias.
Auditor: Comprehensive source review of all Go source files, protobuf definitions, Dockerfile, systemd unit, and client libraries
Classification: Findings rated as CRITICAL, HIGH, MEDIUM, LOW, or INFORMATIONAL
Executive Summary
MCIAS is well-engineered for a security-critical system. The code demonstrates strong awareness of common vulnerability classes: JWT algorithm confusion is properly mitigated, constant-time comparisons are used throughout, timing-uniform dummy operations prevent user enumeration, and credential material is systematically excluded from logs and API responses. The cryptographic choices are sound and current.
That said, I identified 16 findings ranging from medium-severity design issues to low-severity hardening opportunities. There are no critical vulnerabilities that would allow immediate remote compromise, but several medium-severity items warrant remediation before production deployment.
FINDINGS
F-01: TOTP Enrollment Sets totp_required=1 Before Confirmation (MEDIUM)
Location: internal/db/accounts.go:131-141, internal/server/server.go:651-658
SetTOTP unconditionally sets totp_required = 1. This means during the enrollment phase (before the user has confirmed), the TOTP requirement flag is already true. If the user abandons enrollment after calling /v1/auth/totp/enroll but before calling /confirm, the account is now locked: TOTP is "required" but the user was never shown a QR code they can use to generate valid codes.
Recommendation: Add a separate StorePendingTOTP(accountID, secretEnc, secretNonce) that writes the encrypted secret but leaves totp_required = 0. Only set totp_required = 1 in the confirm handler via the existing SetTOTP. Alternatively, add a ClearTOTP recovery step to the enrollment flow on timeout/failure.
F-02: Password Embedded in HTML Hidden Fields During TOTP Step (MEDIUM)
Location: internal/ui/handlers_auth.go:74-84
During the TOTP step of UI login, the plaintext password is embedded as a hidden form field so it can be re-verified on the second POST. This means:
- The password exists in the DOM and is accessible to any browser extension or XSS-via-extension vector.
- The password is sent over the wire a second time (TLS protects transit, but it doubles the exposure window).
- Browser form autofill or "view source" can reveal it.
Recommendation: On successful password verification in the first step, issue a short-lived (e.g., 60-second), single-use, server-side nonce that represents "password verified for user X". Store this nonce in the DB or an in-memory cache. The TOTP confirmation step presents this nonce instead of the password. The server validates the nonce + TOTP code and issues the session token.
F-03: Token Renewal Is Not Atomic — Race Window Between Revoke and Track (MEDIUM)
Location: internal/server/server.go:281-289, internal/grpcserver/auth.go:148-155
The token renewal flow revokes the old token and tracks the new one as separate operations. The code comments acknowledge "atomically is not possible in SQLite without a transaction." However, SQLite does support transactions, and both operations use the same *db.DB instance with MaxOpenConns(1). If the revoke succeeds but TrackToken fails, the user's old token is revoked but no new token is tracked, leaving them in a broken state.
Recommendation: Wrap the revoke-old + track-new pair in a single SQLite transaction. Add a method like db.RenewToken(oldJTI, reason, newJTI, accountID, issuedAt, expiresAt) that performs both in one tx.
F-04: Rate Limiter Not Applied to REST Login Endpoint (MEDIUM)
Location: internal/server/server.go:96-100
Despite the comment saying "login-path rate limiting," the REST server applies RequestLogger as global middleware but does not apply the RateLimit middleware at all. The rate limiter is imported but never wired into the handler chain for the REST server. The /v1/auth/login endpoint has no rate limiting on the REST side.
In contrast, the gRPC server correctly applies rateLimitInterceptor in its interceptor chain (applied to all RPCs).
Recommendation: Apply middleware.RateLimit(...) to at minimum the /v1/auth/login and /v1/token/validate routes in the REST server. Consider a more restrictive rate for login (e.g., 5/min) versus general API endpoints.
F-05: No nbf (Not Before) Claim in Issued JWTs (LOW)
Location: internal/token/token.go:68-99
Tokens are issued with iss, sub, iat, exp, and jti but not nbf (Not Before). While the architecture document states nbf is validated "if present," it is never set during issuance. Setting nbf = iat is a defense-in-depth measure that prevents premature token use if there is any clock skew between systems, and ensures relying parties that validate nbf don't reject MCIAS tokens.
Recommendation: Set NotBefore: jwt.NewNumericDate(now) in the jwtClaims.RegisteredClaims.
F-06: HasRole Uses Non-Constant-Time String Comparison (LOW)
Location: internal/token/token.go:174-181
HasRole uses plain == string comparison for role names. Role names are not secret material, and this is authorization (not authentication), so this is low severity. However, if role names ever contained sensitive information, this could leak information via timing. Given the project's stated principle of using constant-time comparisons "wherever token or credential equality is checked," this is a minor inconsistency.
Recommendation: Acceptable as-is since role names are public knowledge. Document the decision.
F-07: Dummy Argon2 Hash Uses Hardcoded Invalid PHC String (LOW)
Location: internal/server/server.go:154
The dummy Argon2 hash "$argon2id$v=19$m=65536,t=3,p=4$dGVzdHNhbHQ$dGVzdGhhc2g" uses m=65536 but the actual default config uses m=65536 too. The timing should be close. However, the dummy hash uses a 6-byte salt ("testsalt" base64) and a 6-byte hash ("testhash" base64), while real hashes use 16-byte salt and 32-byte hash. This produces a slightly different (faster) Argon2 computation than a real password verification.
Recommendation: Pre-compute a real dummy hash at server startup using auth.HashPassword("dummy-password", actualArgonParams) and store it as a sync.Once variable. This guarantees identical timing regardless of configuration.
F-08: No Account Lockout After Repeated Failed Login Attempts (LOW)
Location: internal/server/server.go:138-176
There is no mechanism to lock an account after N failed login attempts. The system relies solely on rate limiting (which, per F-04, isn't applied on the REST side). An attacker with distributed IPs could attempt brute-force attacks against accounts without triggering any lockout.
Recommendation: Implement a configurable per-account failed login counter (e.g., 10 failures in 15 minutes triggers a 15-minute lockout). The counter should be stored in the DB or in memory with per-account tracking. Audit events for login_fail already exist and can be queried, but proactive lockout would be more effective.
F-09: PRAGMA synchronous=NORMAL Risks Data Loss on Power Failure (LOW)
Location: internal/db/db.go:50
PRAGMA synchronous=NORMAL combined with WAL mode means a power failure could lose the most recent committed transactions. For a security-critical system where audit log integrity and token revocation records matter, synchronous=FULL is safer.
Recommendation: Change to PRAGMA synchronous=FULL for production deployments. The performance impact on a personal SSO system is negligible. Alternatively, document this trade-off and leave NORMAL as a conscious choice.
F-10: No Maximum Token Expiry Validation (LOW)
Location: internal/config/config.go:150-159
Token expiry durations are validated to be positive but have no maximum. An operator could accidentally configure default_expiry = "876000h" (100 years). The config validation should enforce reasonable ceilings.
Recommendation: Add maximum expiry validation: e.g., default_expiry <= 8760h (1 year), admin_expiry <= 168h (1 week), service_expiry <= 87600h (10 years). These can be generous ceilings that prevent obvious misconfiguration.
F-11: Missing Content-Security-Policy and Other Security Headers on UI Responses (MEDIUM)
Location: internal/ui/ui.go:318-333
The UI serves HTML pages but sets no security headers: no Content-Security-Policy, no X-Content-Type-Options, no X-Frame-Options, no Strict-Transport-Security. Since this is an admin panel for an authentication system:
- Without CSP, any XSS vector (e.g., via a malicious username stored in the DB) could execute arbitrary JavaScript in the admin's browser.
- Without
X-Frame-Options: DENY, the admin panel could be framed for clickjacking. - Without HSTS, a MITM could strip TLS on the first connection.
Recommendation: Add a middleware that sets:
Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self'
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Strict-Transport-Security: max-age=63072000; includeSubDomains
Referrer-Policy: no-referrer
F-12: No Input Validation on Username Length or Character Set (LOW)
Location: internal/server/server.go:465-507
handleCreateAccount checks that username is non-empty but does not validate length or character set. A username containing control characters, null bytes, or extremely long strings (up to SQLite's TEXT limit) could cause rendering issues in the UI, log injection, or storage abuse.
Recommendation: Validate: length 1-255, alphanumeric + limited symbols (e.g., ^[a-zA-Z0-9._@-]{1,255}$). Reject control characters, embedded NULs, and newlines.
F-13: No Password Complexity or Minimum Length Enforcement (LOW)
Location: internal/auth/auth.go:63-66
HashPassword only checks that the password is non-empty. A 1-character password is accepted and hashed. While Argon2id makes brute-force expensive, a minimum password length of 8-12 characters (per NIST SP 800-63B) would prevent trivially weak passwords.
Recommendation: Enforce a minimum password length (e.g., 12 characters) at the server/handler level before passing to HashPassword. Optionally check against a breached-password list.
F-14: Passphrase Not Zeroed After Use in loadMasterKey (LOW)
Location: cmd/mciassrv/main.go:246-272
The passphrase is read from the environment variable and passed to crypto.DeriveKey, but the Go string holding the passphrase is not zeroed afterward. The environment variable is correctly unset, and the master key is zeroed on shutdown, but the passphrase string remains in the Go heap until GC'd. Go strings are immutable, so zeroing is not straightforward, but converting to []byte first and zeroing after KDF would reduce the exposure window.
Recommendation: Read the environment variable into a []byte (via os.Getenv then []byte copy), pass it to a modified DeriveKey that accepts []byte, then zero the []byte immediately after. Alternatively, accept this as a Go language limitation and document it.
F-15: extractBearerFromRequest Does Not Verify "Bearer" Prefix Case-Insensitively (INFORMATIONAL)
Location: internal/server/server.go:932-942
The REST extractBearerFromRequest (used by handleTokenValidate) does a substring check with auth[len("Bearer ")] without verifying the prefix actually says "Bearer". It trusts that if the header is long enough, the prefix is correct. Meanwhile, the middleware's extractBearerToken correctly uses strings.EqualFold. The gRPC extractBearerFromMD also correctly uses strings.EqualFold.
Recommendation: Use strings.EqualFold for the prefix check in extractBearerFromRequest for consistency.
F-16: UI System Token Issuance Does Not Revoke Previous System Token (LOW)
Location: internal/ui/handlers_accounts.go:334-403
The REST handleTokenIssue and gRPC IssueServiceToken both revoke the existing system token before issuing a new one. However, handleIssueSystemToken in the UI handler does not revoke the old system token — it calls SetSystemToken (which updates the system_tokens table via UPSERT) but never revokes the old token's entry in the token_revocation table. The old token remains valid until it naturally expires.
Recommendation: Before issuing a new token in handleIssueSystemToken, replicate the pattern from the REST handler: look up GetSystemToken, and if found, call RevokeToken(existing.JTI, "rotated").
Positive Findings (Things Done Well)
-
JWT algorithm confusion defense is correctly implemented. The
algheader is validated inside the key function before signature verification, and onlyEdDSAis accepted. This is the correct implementation pattern. -
Constant-time comparisons are consistently used for password verification, TOTP validation, and CSRF token validation via
crypto/subtle.ConstantTimeCompare. -
Timing uniformity for failed logins: dummy Argon2 operations run for unknown users and inactive accounts, preventing username enumeration via timing differences.
-
Credential material exclusion is thorough:
json:"-"tags onPasswordHash,TOTPSecretEnc,TOTPSecretNonce,PGPasswordEnc,PGPasswordNoncein model types, plus deliberate omission from API responses and log statements. -
Parameterized SQL is used consistently throughout. No string concatenation in queries. The dynamic query builder in
ListAuditEvents/ListAuditEventsPagedcorrectly uses parameter placeholders. -
TLS configuration is solid: TLS 1.2 minimum, X25519/P256 curves, enforced at the listener level with no plaintext fallback.
-
Master key handling is well-designed: passphrase derived via Argon2id with strong parameters (128 MiB memory), env var cleared after reading, key zeroed on shutdown.
-
Systemd hardening is comprehensive:
ProtectSystem=strict,NoNewPrivileges,MemoryDenyWriteExecute, emptyCapabilityBoundingSet, andPrivateDevices. -
AES-GCM usage is correct: fresh random nonces per encryption, key size validated, error details not exposed on decryption failure.
-
CSRF protection is well-implemented with HMAC-signed double-submit cookies and
SameSite=Strict.
Summary Table
| Fixed? | ID | Severity | Title | Effort |
|---|---|---|---|---|
| Yes | F-01 | MEDIUM | TOTP enrollment sets required=1 before confirmation | Small |
| Yes | F-02 | MEDIUM | Password in HTML hidden fields during TOTP step | Medium |
| Yes | F-03 | MEDIUM | Token renewal not atomic (race window) | Small |
| Yes | F-04 | MEDIUM | Rate limiter not applied to REST login endpoint | Small |
| Yes | F-11 | MEDIUM | Missing security headers on UI responses | Small |
| No | F-05 | LOW | No nbf claim in issued JWTs |
Trivial |
| No | F-06 | LOW | HasRole uses non-constant-time comparison |
Trivial |
| Yes | F-07 | LOW | Dummy Argon2 hash timing mismatch | Small |
| Yes | F-08 | LOW | No account lockout after repeated failures | Medium |
| No | F-09 | LOW | synchronous=NORMAL risks audit data loss |
Trivial |
| No | F-10 | LOW | No maximum token expiry validation | Small |
| Yes | F-12 | LOW | No username length/charset validation | Small |
| Yes | F-13 | LOW | No minimum password length enforcement | Small |
| No | F-14 | LOW | Passphrase string not zeroed after KDF | Small |
| Yes | F-16 | LOW | UI system token issuance skips old token revocation | Small |
| No | F-15 | INFO | Bearer prefix check inconsistency | Trivial |
Recommended Remediation Priority
Immediate (before production deployment):
- F-04 — Wire the rate limiter into the REST server. This is the most impactful gap.
- F-11 — Add security headers to UI responses.
- F-01 — Fix TOTP enrollment to not lock accounts prematurely.
Short-term: 4. F-03 — Make token renewal atomic. 5. F-02 — Replace password-in-hidden-field with a server-side nonce. 6. F-16 — Fix UI system token issuance to revoke old tokens. 7. F-07 — Use a real dummy hash with matching parameters.
Medium-term: 8. F-08 — Implement account lockout. 9. F-12, F-13 — Input validation for usernames and passwords. 10. Remaining LOW/INFO items at maintainer discretion.