# MCR Architecture

Metacircular Container Registry: Technical Design Document

---

## 1. System Overview

MCR is an OCI Distribution Spec-compliant container registry for Metacircular Dynamics. It stores and serves container images for the platform's services, with MCP directing nodes to pull images from MCR. Authentication is delegated to MCIAS; all operations require a valid bearer token. MCR sits behind an mc-proxy instance for TLS routing.

### Components

```
┌──────────────────────────────────────────────┐
│              MCR Server (mcrsrv)             │
│                                              │
│  ┌────────────┐  ┌──────────┐  ┌──────────┐  │
│  │  OCI API   │  │   Auth   │  │  Policy  │  │
│  │  Handler   │  │ (MCIAS)  │  │  Engine  │  │
│  └─────┬──────┘  └────┬─────┘  └────┬─────┘  │
│        └──────────────┼─────────────┘        │
│                       │                      │
│         ┌─────────────▼────────────┐         │
│         │    SQLite (metadata)     │         │
│         └──────────────────────────┘         │
│         ┌──────────────────────────┐         │
│         │    Filesystem (blobs)    │         │
│         │    /srv/mcr/layers/      │         │
│         └──────────────────────────┘         │
│                                              │
│  ┌─────────────────┐   ┌──────────────────┐  │
│  │  REST listener  │   │  gRPC listener   │  │
│  │  (OCI + admin)  │   │  (admin)         │  │
│  │  :8443          │   │  :9443           │  │
│  └─────────────────┘   └──────────────────┘  │
└──────────────────────────────────────────────┘
       ▲               ▲               ▲
       │               │               │
  ┌────┴───┐     ┌─────┴─────┐     ┌───┴──────┐
  │ Docker │     │  mcrctl   │     │ mcr-web  │
  │ / OCI  │     │  (admin   │     │ (web UI) │
  │ client │     │   CLI)    │     │          │
  └────────┘     └───────────┘     └──────────┘
```

**mcrsrv**: The registry server. Exposes OCI Distribution endpoints and an admin REST API over HTTPS/TLS, plus a gRPC admin API. Handles blob storage, manifest management, and token-based authentication via MCIAS.

**mcr-web**: The web UI. Communicates with mcrsrv via gRPC. Provides repository/tag browsing and ACL policy management for administrators.

**mcrctl**: The admin CLI. Communicates with mcrsrv via REST or gRPC. Provides garbage collection, repository management, and policy management.

---

## 2. OCI Distribution Spec Compliance

MCR implements the OCI Distribution Specification for content discovery and content management. All OCI endpoints require authentication; there is no anonymous access.

### Supported Operations

| Category | Capability |
|----------|-----------|
| Content discovery | Repository catalog, tag listing |
| Pull | Manifest retrieval (by tag or digest), blob download |
| Push | Monolithic and chunked blob upload, manifest upload |
| Delete | Manifest deletion (by digest), blob deletion |

### Not Supported (v1)

| Feature | Rationale |
|---------|-----------|
| Multi-arch manifest lists | Not needed for single-platform deployment |
| Image signing / content trust | Deferred to future work |
| Cross-repository blob mounts | Complexity not justified at current scale |

### Content Addressing

All blobs and manifests are identified by their SHA-256 digest in the format `sha256:`. Digests are verified on upload: if the computed digest does not match the client-supplied digest, the upload is rejected.

Tags are mutable pointers to manifest digests. Pushing a manifest with an existing tag atomically updates the tag to point to the new digest.

---

## 3. Authentication

MCR delegates all authentication to MCIAS. No local user database.

### MCIAS Configuration

MCR registers with MCIAS as a service with the `env:restricted` tag. This means MCIAS denies login to `guest` and `viewer` accounts; only `admin` and `user` roles can authenticate to MCR.

```toml
[mcias]
server_url   = "https://mcias.metacircular.net:8443"
service_name = "mcr"
tags         = ["env:restricted"]
```

### OCI Token Authentication

OCI/Docker clients expect a specific auth handshake:

```
Client                          mcrsrv
  │                                │
  ├─ GET /v2/ ────────────────────▶│
  │◀─ 401 WWW-Authenticate:        │
  │   Bearer realm="/v2/token",    │
  │   service="mcr.metacircular.…" │
  │                                │
  ├─ GET /v2/token ───────────────▶│  (Basic auth: username:password)
  │                                ├─ Forward credentials to MCIAS
  │                                │  POST /v1/auth/login
  │                                ├─ MCIAS returns JWT
  │◀─ 200 {"token":"",             │
  │    "expires_in": N}            │
  │                                │
  ├─ GET /v2//manifests/… ────────▶│  (Authorization: Bearer )
  │                                ├─ Validate token via MCIAS
  │                                ├─ Check policy engine
  │◀─ 200 (manifest)               │
```

**Token endpoint** (`GET /v2/token`): Accepts HTTP Basic auth, forwards credentials to MCIAS `/v1/auth/login`, and returns the MCIAS JWT in the Docker-compatible token response format. The `scope` parameter is accepted but not used for token scoping; authorization is enforced per-request by the policy engine.

**Direct bearer tokens**: MCIAS service tokens and pre-authenticated JWTs are accepted directly on all OCI endpoints via the `Authorization: Bearer` header. This allows system accounts and CLI tools to skip the token endpoint.

**Token validation**: Every request validates the bearer token by calling MCIAS `ValidateToken()`. Results are cached by SHA-256 of the token with a 30-second TTL.

---

## 4. Authorization & Policy Engine

### Role Model

| Role | Access |
|------|--------|
| `admin` | Full access: push, pull, delete, catalog, policy management, GC |
| `user` | Full content access: push, pull, delete, catalog |
| System account | Default deny; requires explicit policy rule for any operation |

Admin detection is based solely on the MCIAS `admin` role. Human users with the `user` role have full content management access. System accounts have no implicit access and must be granted specific permissions via policy rules.

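
Before the policy engine sees a request, the bearer token has to be validated. The caching behaviour described under **Token validation** in §3 (results keyed by the SHA-256 of the token, 30-second TTL) could look roughly like the sketch below. It is illustrative only: `Claims`, `TokenCache`, `ErrInvalidToken`, and the `validate` callback are hypothetical names, and the callback stands in for the MCIAS `ValidateToken()` call rather than reproducing it.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
	"sync"
	"time"
)

// DefaultTTL mirrors the 30-second cache lifetime stated in the design.
const DefaultTTL = 30 * time.Second

// ErrInvalidToken is a hypothetical error for rejected tokens.
var ErrInvalidToken = errors.New("invalid token")

// Claims is a hypothetical stand-in for what MCIAS returns for a valid token.
type Claims struct {
	Subject string
	Roles   []string
}

type cacheEntry struct {
	claims  Claims
	expires time.Time
}

// TokenCache stores validation results keyed by the SHA-256 of the token,
// so raw token strings are never used as map keys.
type TokenCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[[sha256.Size]byte]cacheEntry
}

func NewTokenCache(ttl time.Duration) *TokenCache {
	return &TokenCache{ttl: ttl, entries: make(map[[sha256.Size]byte]cacheEntry)}
}

// Validate returns cached claims while they are fresh. Otherwise it calls
// validate (the remote check) and caches only successful results.
func (c *TokenCache) Validate(token string, validate func(string) (Claims, error)) (Claims, error) {
	key := sha256.Sum256([]byte(token))

	c.mu.Lock()
	entry, ok := c.entries[key]
	c.mu.Unlock()
	if ok && time.Now().Before(entry.expires) {
		return entry.claims, nil
	}

	claims, err := validate(token)
	if err != nil {
		return Claims{}, err
	}

	c.mu.Lock()
	c.entries[key] = cacheEntry{claims: claims, expires: time.Now().Add(c.ttl)}
	c.mu.Unlock()
	return claims, nil
}

func main() {
	remoteCalls := 0
	remote := func(token string) (Claims, error) {
		remoteCalls++
		if token == "" {
			return Claims{}, ErrInvalidToken
		}
		return Claims{Subject: "example-uuid", Roles: []string{"user"}}, nil
	}

	cache := NewTokenCache(DefaultTTL)
	for i := 0; i < 3; i++ {
		if _, err := cache.Validate("example-token", remote); err != nil {
			fmt.Println("validation failed:", err)
		}
	}
	fmt.Println("remote calls:", remoteCalls)
}
```

The sketch never evicts expired entries and does not coalesce concurrent lookups for the same token; a production cache would likely need both.
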
### Policy Engine

MCR implements a local policy engine for registry-specific access control, following the same architecture as MCIAS's policy engine (priority-based, deny-wins, default-deny). The engine is an in-process Go package (`internal/policy`) with no external dependencies.

#### Actions

| Action | Description |
|--------|-------------|
| `registry:version_check` | OCI version check (`GET /v2/`) |
| `registry:pull` | Read manifests, download blobs, list tags for a repository |
| `registry:push` | Upload blobs and push manifests/tags |
| `registry:delete` | Delete manifests and blobs |
| `registry:catalog` | List all repositories (`GET /v2/_catalog`) |
| `policy:manage` | Create, update, delete policy rules (admin only) |

#### Policy Input

```go
type PolicyInput struct {
    Subject     string   // MCIAS account UUID
    AccountType string   // "human" or "system"
    Roles       []string // roles from MCIAS JWT

    Action     Action
    Repository string // target repository name (e.g., "myapp");
                      // empty for global operations (catalog, health)
}
```

#### Rule Structure

```go
type Rule struct {
    ID          int64
    Priority    int    // lower = evaluated first
    Description string
    Effect      Effect // "allow" or "deny"

    // Principal conditions (all populated fields ANDed)
    Roles        []string // principal must hold at least one
    AccountTypes []string // "human", "system", or both
    SubjectUUID  string   // exact principal UUID

    // Action condition
    Actions []Action

    // Resource condition
    Repositories []string // repository name patterns (glob via path.Match).
                          // Examples:
                          //   "myapp"         (exact match)
                          //   "production/*"  (any repo one level under production/)
                          // Empty list = wildcard (matches all repositories).
}
```

#### Evaluation Algorithm

```
1. Merge built-in defaults with operator-defined rules
2. Sort by Priority ascending (stable)
3. Collect all matching rules
4. If any match has Effect=Deny  → return Deny (deny-wins)
5. If any match has Effect=Allow → return Allow
6. Return Deny (default-deny)
```

A rule matches when every populated field satisfies its condition:

| Field | Match condition |
|-------|----------------|
| `Roles` | Principal holds at least one of the listed roles |
| `AccountTypes` | Principal's account type is in the list |
| `SubjectUUID` | Principal UUID equals exactly |
| `Actions` | Request action is in the list |
| `Repositories` | Request repository matches at least one pattern; empty list is a wildcard (matches all repositories) |

Repository glob matching uses `path.Match` semantics: `*` matches any sequence of non-`/` characters within a single path segment. For example, `production/*` matches `production/myapp` but not `production/team/myapp`. An empty `Repositories` field is a wildcard (matches all repositories).

When `PolicyInput.Repository` is empty (global operations like catalog), only rules with an empty `Repositories` field match; a rule scoped to specific repositories does not apply to global operations.

#### Built-in Default Rules

```
Priority 0, Allow: roles=[admin], actions=
    -- admin wildcard

Priority 0, Allow: roles=[user], accountTypes=[human],
    actions=[registry:pull, registry:push, registry:delete, registry:catalog]
    -- human users have full content access

Priority 0, Allow: actions=[registry:version_check]
    -- /v2/ endpoint (always accessible to authenticated users)
```

System accounts have no built-in allow rules for catalog, push, pull, or delete. An operator must create explicit policy rules granting them access.

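
The matching table and evaluation algorithm above can be sketched in a few dozen lines of Go. This is an illustrative reading of the rules, not the `internal/policy` package itself: `Input`, `Rule`, `Matches`, and `Evaluate` are simplified, hypothetical versions of the types shown earlier, and actions are plain strings. The sketch treats an empty `Actions` list as unconstrained, following the "every populated field" wording, and it skips the priority sort because a collect-all, deny-wins evaluation reaches the same decision regardless of rule order.

```go
package main

import (
	"fmt"
	"path"
	"slices"
)

type Effect string

const (
	Allow Effect = "allow"
	Deny  Effect = "deny"
)

// Input and Rule are trimmed-down, hypothetical versions of PolicyInput and Rule.
type Input struct {
	Subject     string
	AccountType string
	Roles       []string
	Action      string
	Repository  string
}

type Rule struct {
	Effect       Effect
	Roles        []string
	AccountTypes []string
	SubjectUUID  string
	Actions      []string
	Repositories []string
}

// matchesRepo applies the repository condition, including the special case
// for global operations (empty repository in the input).
func matchesRepo(patterns []string, repo string) bool {
	if len(patterns) == 0 {
		return true // empty list is a wildcard
	}
	if repo == "" {
		return false // repo-scoped rules never apply to global operations
	}
	for _, p := range patterns {
		if ok, err := path.Match(p, repo); err == nil && ok {
			return true
		}
	}
	return false
}

// Matches reports whether every populated field of r is satisfied by in.
func Matches(r Rule, in Input) bool {
	holdsRole := func(role string) bool { return slices.Contains(in.Roles, role) }
	if len(r.Roles) > 0 && !slices.ContainsFunc(r.Roles, holdsRole) {
		return false
	}
	if len(r.AccountTypes) > 0 && !slices.Contains(r.AccountTypes, in.AccountType) {
		return false
	}
	if r.SubjectUUID != "" && r.SubjectUUID != in.Subject {
		return false
	}
	if len(r.Actions) > 0 && !slices.Contains(r.Actions, in.Action) {
		return false
	}
	return matchesRepo(r.Repositories, in.Repository)
}

// Evaluate implements deny-wins with default-deny over all matching rules.
func Evaluate(rules []Rule, in Input) Effect {
	allowed := false
	for _, r := range rules {
		if !Matches(r, in) {
			continue
		}
		if r.Effect == Deny {
			return Deny
		}
		allowed = true
	}
	if allowed {
		return Allow
	}
	return Deny
}

func main() {
	rules := []Rule{
		{Effect: Allow, Roles: []string{"admin"}},
		{Effect: Allow, SubjectUUID: "ci-uuid", Actions: []string{"registry:push"}, Repositories: []string{"production/*"}},
	}
	ci := Input{Subject: "ci-uuid", AccountType: "system", Action: "registry:push"}

	ci.Repository = "production/myapp"
	fmt.Println(ci.Repository, Evaluate(rules, ci))
	ci.Repository = "production/team/myapp"
	fmt.Println(ci.Repository, Evaluate(rules, ci))
}
```

The `ci-uuid` subject is a made-up placeholder. Malformed glob patterns are treated as non-matching here; rejecting them when a rule is created would be a reasonable alternative.
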
#### Example: CI System Push Access (Glob)

Grant a CI system account permission to push and pull from all repositories under the `production/` namespace:

```json
{
  "effect": "allow",
  "account_types": ["system"],
  "subject_uuid": "",
  "actions": ["registry:push", "registry:pull"],
  "repositories": ["production/*"],
  "priority": 50,
  "description": "CI system: push/pull to all production repos"
}
```

#### Example: Deploy Agent Pull-Only

Grant a deploy agent pull access to all repositories (empty `repositories` = wildcard), but deny delete globally:

```json
{
  "effect": "allow",
  "subject_uuid": "",
  "actions": ["registry:pull"],
  "repositories": [],
  "priority": 50,
  "description": "deploy-agent: pull from any repo"
}
```

```json
{
  "effect": "deny",
  "subject_uuid": "",
  "actions": ["registry:delete"],
  "priority": 10,
  "description": "deploy-agent may never delete images (deny-wins)"
}
```

#### Example: Exact Repo Access

Grant a specific system account access to exactly two named repositories:

```json
{
  "effect": "allow",
  "subject_uuid": "",
  "actions": ["registry:push", "registry:pull"],
  "repositories": ["myapp", "infra-proxy"],
  "priority": 50,
  "description": "svc-account: push/pull to myapp and infra-proxy only"
}
```

### Policy Management

Policy rules are managed via the admin REST/gRPC API and the web UI. Only users with the `admin` role can create, update, or delete policy rules.

---

## 5. Storage Design

MCR uses a split storage model: SQLite for metadata, filesystem for blobs.

### Blob Storage

Blobs (image layers and config objects) are stored as content-addressed files under `/srv/mcr/layers/`:

```
/srv/mcr/layers/
└── sha256/
    ├── ab/
    │   └── abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890
    ├── cd/
    │   └── cdef...
    └── ef/
        └── ef01...
```

The two-character hex prefix directory limits the number of files per directory. Blobs are written atomically: data is written to a temporary file in `/srv/mcr/uploads/`, then renamed into place after digest verification. The `uploads/` and `layers/` directories must reside on the same filesystem for `rename(2)` to be atomic.

### Upload Staging

In-progress blob uploads are stored in `/srv/mcr/uploads/`:

```
/srv/mcr/uploads/
└──
```

Each upload UUID corresponds to a row in the `uploads` table tracking the current byte offset. Completed uploads are renamed to the blob store; cancelled or expired uploads are cleaned up.

### Manifest Storage

Manifests are small JSON documents stored directly in the SQLite database (in the `manifests` table `content` column). This simplifies metadata queries and avoids filesystem overhead for small objects.

### Manifest Push Flow

When a client calls `PUT /v2//manifests/`:

```
1. Parse the manifest JSON. Reject if malformed or unsupported media type.
2. Compute the SHA-256 digest of the raw manifest bytes.
3. If the reference is a digest, verify it matches the computed digest.
   Reject with DIGEST_INVALID if mismatch.
4. Parse the manifest's layer and config descriptors.
5. Verify every referenced blob exists in the `blobs` table.
   Reject with MANIFEST_BLOB_UNKNOWN if any are missing.
6. Begin write transaction:
   a. Create the repository row if it does not exist (implicit creation).
   b. Insert or update the manifest row (repository_id, digest, content).
   c. Populate `manifest_blobs` join table for all referenced blobs.
   d. If the reference is a tag name, insert or update the tag row to
      point to the new manifest (atomic tag move).
7. Commit.
8. Return 201 Created with `Docker-Content-Digest: ` header.
```

This is the most complex write path in the system. The entire operation executes in a single SQLite transaction so a crash at any point leaves the database consistent.

### Data Directory

```
/srv/mcr/
├── mcr.toml          Configuration
├── mcr.db            SQLite database (metadata)
├── certs/            TLS certificates
├── layers/           Content-addressed blob storage
│   └── sha256/
├── uploads/          In-progress blob uploads
└── backups/          Database snapshots
```

---

## 6. API Surface

### Error Response Formats

OCI and admin endpoints use different error formats:

**OCI endpoints** (`/v2/...`) follow the OCI Distribution Spec error format:

```json
{"errors": [{"code": "MANIFEST_UNKNOWN", "message": "...", "detail": "..."}]}
```

Standard OCI error codes used by MCR:

| Code | HTTP Status | Trigger |
|------|-------------|---------|
| `UNAUTHORIZED` | 401 | Missing or invalid bearer token |
| `DENIED` | 403 | Policy engine denied the request |
| `NAME_UNKNOWN` | 404 | Repository does not exist |
| `MANIFEST_UNKNOWN` | 404 | Manifest not found (by tag or digest) |
| `BLOB_UNKNOWN` | 404 | Blob not found |
| `MANIFEST_BLOB_UNKNOWN` | 400 | Manifest references a blob not yet uploaded |
| `DIGEST_INVALID` | 400 | Computed digest does not match supplied digest |
| `MANIFEST_INVALID` | 400 | Malformed or unsupported manifest |
| `BLOB_UPLOAD_UNKNOWN` | 404 | Upload UUID not found or expired |
| `BLOB_UPLOAD_INVALID` | 400 | Chunked upload byte range error |
| `UNSUPPORTED` | 405 | Operation not supported (e.g., cross-repo mount) |

**Admin endpoints** (`/v1/...`) use the platform-standard format:

```json
{"error": "human-readable message"}
```

### OCI Distribution Endpoints

All OCI endpoints are prefixed with `/v2` and require authentication.

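
When one of these endpoints fails, it answers with the OCI error envelope from the Error Response Formats section. A small helper along the following lines could map an error code to its HTTP status and JSON body. This is a sketch under assumptions: `BuildOCIError`, `WriteOCIError`, and `statusForCode` are hypothetical names, the status mapping is copied from the error code table, and the fallback to 500 for unknown codes is an illustrative choice rather than documented behaviour.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ociError and ociErrorEnvelope mirror the OCI error JSON shape.
type ociError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
	Detail  string `json:"detail,omitempty"`
}

type ociErrorEnvelope struct {
	Errors []ociError `json:"errors"`
}

// statusForCode reproduces the code-to-status table from this section.
var statusForCode = map[string]int{
	"UNAUTHORIZED":          http.StatusUnauthorized,
	"DENIED":                http.StatusForbidden,
	"NAME_UNKNOWN":          http.StatusNotFound,
	"MANIFEST_UNKNOWN":      http.StatusNotFound,
	"BLOB_UNKNOWN":          http.StatusNotFound,
	"MANIFEST_BLOB_UNKNOWN": http.StatusBadRequest,
	"DIGEST_INVALID":        http.StatusBadRequest,
	"MANIFEST_INVALID":      http.StatusBadRequest,
	"BLOB_UPLOAD_UNKNOWN":   http.StatusNotFound,
	"BLOB_UPLOAD_INVALID":   http.StatusBadRequest,
	"UNSUPPORTED":           http.StatusMethodNotAllowed,
}

// BuildOCIError returns the HTTP status and JSON body for an OCI error code.
// Unknown codes fall back to 500 (an assumption made for this sketch).
func BuildOCIError(code, message string) (int, []byte) {
	status, ok := statusForCode[code]
	if !ok {
		status = http.StatusInternalServerError
	}
	// Marshalling a struct of plain strings cannot fail, so the error is ignored.
	body, _ := json.Marshal(ociErrorEnvelope{Errors: []ociError{{Code: code, Message: message}}})
	return status, body
}

// WriteOCIError sends the error to the client.
func WriteOCIError(w http.ResponseWriter, code, message string) {
	status, body := BuildOCIError(code, message)
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_, _ = w.Write(body)
}

func main() {
	rec := httptest.NewRecorder()
	WriteOCIError(rec, "MANIFEST_UNKNOWN", "manifest not found")
	fmt.Println(rec.Code, rec.Body.String())
}
```

Keeping the status lookup in one table makes it harder for individual handlers to drift from the documented mapping.
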
#### Version Check

| Method | Path | Description |
|--------|------|-------------|
| GET | `/v2/` | API version check; returns `{}` if authenticated |

#### Token

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v2/token` | Basic | Exchange credentials for bearer token via MCIAS |

#### Content Discovery

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v2/_catalog` | bearer | List repositories (paginated) |
| GET | `/v2//tags/list` | bearer | List tags for a repository (paginated) |

#### Manifests

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| HEAD | `/v2//manifests/` | bearer | Check manifest existence |
| GET | `/v2//manifests/` | bearer | Pull manifest by tag or digest |
| PUT | `/v2//manifests/` | bearer | Push manifest |
| DELETE | `/v2//manifests/` | bearer | Delete manifest (digest only) |

The manifest reference is either a tag name or a `sha256:...` digest.

All manifest responses (GET, HEAD, PUT) include the `Docker-Content-Digest` header with the manifest's SHA-256 digest and a `Content-Type` header with the manifest's media type.

#### Blobs

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| HEAD | `/v2//blobs/` | bearer | Check blob existence |
| GET | `/v2//blobs/` | bearer | Download blob |
| DELETE | `/v2//blobs/` | bearer | Delete blob |

#### Blob Uploads

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/v2//blobs/uploads/` | bearer | Initiate blob upload |
| GET | `/v2//blobs/uploads/` | bearer | Check upload progress |
| PATCH | `/v2//blobs/uploads/` | bearer | Upload chunk (chunked flow) |
| PUT | `/v2//blobs/uploads/?digest=` | bearer | Complete upload with digest verification |
| DELETE | `/v2//blobs/uploads/` | bearer | Cancel upload |

**Monolithic upload**: Client sends `POST` to initiate, then `PUT` with the entire blob body and `digest` query parameter in a single request.

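
The completing `PUT` is where digest verification and the atomic move into the content-addressed layout (§5) take place. The sketch below shows one way that step could work: stream the body to a temporary file in the uploads directory while hashing it, compare digests, then rename into `sha256/<two-char prefix>/<hex>`. All names (`BlobPath`, `CommitUpload`, `ErrDigestInvalid`, `demo`) are hypothetical, and the sketch omits `fsync` calls and upload-row bookkeeping that a real implementation would likely need.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

var ErrDigestInvalid = errors.New("DIGEST_INVALID")

// BlobPath maps a "sha256:" digest to its file under the layers directory,
// using the first two hex characters as the prefix directory.
func BlobPath(layersDir, digest string) (string, error) {
	const prefix = "sha256:"
	if !strings.HasPrefix(digest, prefix) {
		return "", ErrDigestInvalid
	}
	hexPart := strings.TrimPrefix(digest, prefix)
	raw, err := hex.DecodeString(hexPart)
	if err != nil || len(raw) != sha256.Size || hexPart != strings.ToLower(hexPart) {
		return "", ErrDigestInvalid
	}
	return filepath.Join(layersDir, "sha256", hexPart[:2], hexPart), nil
}

// CommitUpload writes src to a temp file in uploadsDir while hashing it, and
// renames the file into the layers tree only if the digest matches.
func CommitUpload(src io.Reader, uploadsDir, layersDir, wantDigest string) error {
	dst, err := BlobPath(layersDir, wantDigest)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(uploadsDir, "upload-*")
	if err != nil {
		return fmt.Errorf("create temp file: %w", err)
	}
	defer os.Remove(tmp.Name()) // harmless once the rename has succeeded

	hasher := sha256.New()
	_, copyErr := io.Copy(io.MultiWriter(tmp, hasher), src)
	closeErr := tmp.Close()
	if copyErr != nil {
		return fmt.Errorf("write blob: %w", copyErr)
	}
	if closeErr != nil {
		return fmt.Errorf("close temp file: %w", closeErr)
	}
	if got := "sha256:" + hex.EncodeToString(hasher.Sum(nil)); got != wantDigest {
		return ErrDigestInvalid
	}
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return fmt.Errorf("create prefix dir: %w", err)
	}
	return os.Rename(tmp.Name(), dst)
}

// demo commits one blob with a correct digest and one with a wrong digest.
func demo() error {
	root, err := os.MkdirTemp("", "mcr-demo-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)
	uploads, layers := filepath.Join(root, "uploads"), filepath.Join(root, "layers")
	if err := os.MkdirAll(uploads, 0o755); err != nil {
		return err
	}

	data := []byte("example layer bytes")
	sum := sha256.Sum256(data)
	digest := "sha256:" + hex.EncodeToString(sum[:])
	if err := CommitUpload(bytes.NewReader(data), uploads, layers, digest); err != nil {
		return err
	}
	dst, _ := BlobPath(layers, digest)
	if _, err := os.Stat(dst); err != nil {
		return err
	}
	// Different bytes under the same claimed digest must be rejected.
	err = CommitUpload(bytes.NewReader([]byte("tampered")), uploads, layers, digest)
	if !errors.Is(err, ErrDigestInvalid) {
		return fmt.Errorf("expected DIGEST_INVALID, got %v", err)
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("blob committed; mismatched digest rejected")
}
```

Because the temporary file and the destination sit under one root here, the rename is atomic, which is the same reason the design requires `uploads/` and `layers/` to share a filesystem.
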
**Chunked upload**: Client sends `POST` to initiate, then one or more `PATCH` requests with sequential byte ranges, then `PUT` with the final digest.

### Admin REST Endpoints

Admin endpoints use the `/v1` prefix and follow the platform API conventions.

#### Authentication

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/v1/auth/login` | none | Login via MCIAS (username/password) |
| POST | `/v1/auth/logout` | bearer | Revoke current token |

#### Health

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/health` | none | Health check |

#### Repository Management

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/repositories` | bearer | List repositories with metadata |
| GET | `/v1/repositories/{name}` | bearer | Repository detail (tags, size, manifest count) |
| DELETE | `/v1/repositories/{name}` | admin | Delete repository and all its manifests/tags |

Repository `{name}` may contain `/` (e.g., `production/myapp`). The chi router must use a wildcard catch-all segment for this parameter.

#### Garbage Collection

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/v1/gc` | admin | Trigger garbage collection (async) |
| GET | `/v1/gc/status` | admin | Check GC status (running, last result) |

#### Policy Management

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/policy/rules` | admin | List all policy rules |
| POST | `/v1/policy/rules` | admin | Create a policy rule |
| GET | `/v1/policy/rules/{id}` | admin | Get a single rule |
| PATCH | `/v1/policy/rules/{id}` | admin | Update rule |
| DELETE | `/v1/policy/rules/{id}` | admin | Delete rule |

#### Audit

| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/audit` | admin | List audit log events |

---

## 7. gRPC Admin Interface

The gRPC API provides the same admin capabilities as the REST admin API. OCI Distribution endpoints are REST-only (protocol requirement: OCI clients speak HTTP).

### Proto Package Layout

```
proto/
└── mcr/
    └── v1/
        ├── registry.proto    # Repository listing, GC
        ├── policy.proto      # Policy rule CRUD
        ├── audit.proto       # Audit log queries
        ├── admin.proto       # Health
        └── common.proto      # Shared message types

gen/
└── mcr/
    └── v1/                   # Generated Go stubs (committed)
```

### Service Definitions

| Service | RPCs |
|---------|------|
| `RegistryService` | `ListRepositories`, `GetRepository`, `DeleteRepository`, `GarbageCollect`, `GetGCStatus` |
| `PolicyService` | `ListPolicyRules`, `CreatePolicyRule`, `GetPolicyRule`, `UpdatePolicyRule`, `DeletePolicyRule` |
| `AuditService` | `ListAuditEvents` |
| `AdminService` | `Health` |

Auth endpoints (`/v1/auth/login`, `/v1/auth/logout`) are REST-only. Login requires HTTP Basic auth or form-encoded credentials, which do not map cleanly to gRPC unary RPCs. Clients that need programmatic auth use MCIAS directly and present the resulting bearer token to gRPC.

### Transport Security

Same TLS certificate and key as the REST server. TLS 1.3 minimum.

### Authentication

gRPC unary interceptor extracts the `authorization` metadata key, validates the MCIAS bearer token, and injects claims into the context. Same validation logic as the REST middleware.

### Interceptor Chain

```
[Request Logger] → [Auth Interceptor] → [Admin Interceptor] → [Handler]
```

- **Request Logger**: Logs method, peer IP, status code, duration.
- **Auth Interceptor**: Validates bearer JWT via MCIAS. `Health` bypasses auth.
- **Admin Interceptor**: Requires admin role for GC, policy, and delete operations.

---

## 8. Database Schema

SQLite 3, WAL mode, `PRAGMA foreign_keys = ON`, `PRAGMA busy_timeout = 5000`.

```sql
-- Schema version tracking
CREATE TABLE schema_migrations (
    version    INTEGER PRIMARY KEY,
    applied_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);

-- Container image repositories
CREATE TABLE repositories (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,   -- e.g., "myapp", "infra/proxy"
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- UNIQUE on name creates an implicit index; no explicit index needed.

-- Image manifests (content stored in DB; small JSON documents)
CREATE TABLE manifests (
    id            INTEGER PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
    digest        TEXT NOT NULL,       -- "sha256:"
    media_type    TEXT NOT NULL,       -- "application/vnd.oci.image.manifest.v1+json"
    content       BLOB NOT NULL,       -- manifest JSON
    size          INTEGER NOT NULL,    -- byte size of content
    created_at    TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    UNIQUE(repository_id, digest)
);

CREATE INDEX idx_manifests_repo   ON manifests (repository_id);
CREATE INDEX idx_manifests_digest ON manifests (digest);

-- Tags: mutable pointers from name → manifest
CREATE TABLE tags (
    id            INTEGER PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
    name          TEXT NOT NULL,       -- e.g., "latest", "v1.2.3"
    manifest_id   INTEGER NOT NULL REFERENCES manifests(id) ON DELETE CASCADE,
    updated_at    TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    UNIQUE(repository_id, name)
);

CREATE INDEX idx_tags_repo     ON tags (repository_id);
CREATE INDEX idx_tags_manifest ON tags (manifest_id);

-- Blob metadata (actual data on filesystem at /srv/mcr/layers/)
CREATE TABLE blobs (
    id         INTEGER PRIMARY KEY,
    digest     TEXT NOT NULL UNIQUE,   -- "sha256:"
    size       INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- UNIQUE on digest creates an implicit index; no explicit index needed.

-- Many-to-many: tracks which blobs are referenced by manifests in which repos.
-- A blob may be shared across repositories (content-addressed dedup).
-- Used by garbage collection to determine unreferenced blobs.
CREATE TABLE manifest_blobs (
    manifest_id INTEGER NOT NULL REFERENCES manifests(id) ON DELETE CASCADE,
    blob_id     INTEGER NOT NULL REFERENCES blobs(id),
    PRIMARY KEY (manifest_id, blob_id)
);

CREATE INDEX idx_manifest_blobs_blob ON manifest_blobs (blob_id);

-- In-progress blob uploads
CREATE TABLE uploads (
    id            INTEGER PRIMARY KEY,
    uuid          TEXT NOT NULL UNIQUE,
    repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
    byte_offset   INTEGER NOT NULL DEFAULT 0,
    created_at    TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);

-- Policy rules for registry access control
CREATE TABLE policy_rules (
    id          INTEGER PRIMARY KEY,
    priority    INTEGER NOT NULL DEFAULT 100,
    description TEXT NOT NULL,
    rule_json   TEXT NOT NULL,         -- JSON-encoded rule body
    enabled     INTEGER NOT NULL DEFAULT 1 CHECK (enabled IN (0,1)),
    created_by  TEXT,                  -- MCIAS account UUID
    created_at  TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    updated_at  TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);

-- Audit log (append-only)
CREATE TABLE audit_log (
    id         INTEGER PRIMARY KEY,
    event_time TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    event_type TEXT NOT NULL,
    actor_id   TEXT,                   -- MCIAS account UUID
    repository TEXT,                   -- repository name (if applicable)
    digest     TEXT,                   -- affected digest (if applicable)
    ip_address TEXT,
    details    TEXT                    -- JSON blob; never contains secrets
);

CREATE INDEX idx_audit_time  ON audit_log (event_time);
CREATE INDEX idx_audit_actor ON audit_log (actor_id);
CREATE INDEX idx_audit_event ON audit_log (event_type);
```

### Schema Notes

- Repositories are created implicitly on first push. No explicit creation step required.
- Repository names may contain `/` for organizational grouping (e.g., `production/myapp`) but are otherwise flat strings. No hierarchical enforcement.
- The `manifest_blobs` join table enables content-addressed deduplication: the same layer blob may be referenced by manifests in different repositories.
- Manifest deletion cascades to `manifest_blobs` rows and to any tags pointing at the deleted manifest (`ON DELETE CASCADE` on both `manifest_blobs.manifest_id` and `tags.manifest_id`). Blob files on the filesystem are not deleted; unreferenced blobs are reclaimed by garbage collection.
- Tag updates are atomic: pushing a manifest with an existing tag name updates the `manifest_id` in a single transaction.

---

## 9. Garbage Collection

Garbage collection removes unreferenced blobs, that is, blobs that are not referenced by any manifest. GC is a manual process triggered by an administrator.

### Algorithm

GC runs in two phases to maintain consistency across a crash:

```
Phase 1: Mark and sweep (database)
  1. Acquire registry-wide GC lock (blocks new blob uploads).
  2. Begin write transaction.
  3. Find all blob rows in `blobs` with no corresponding row in
     `manifest_blobs` (unreferenced blobs). Record their digests.
  4. Delete those rows from `blobs`.
  5. Commit.

Phase 2: File cleanup (filesystem)
  6. For each digest recorded in step 3:
     a. Delete the file from /srv/mcr/layers/sha256//.
  7. Remove empty prefix directories.
  8. Release GC lock.
```

**Crash safety**: If the process crashes after phase 1 but before phase 2 completes, orphaned files remain on disk with no matching DB row. These are harmless; they consume space but are never served. A subsequent GC run or a filesystem reconciliation command (`mcrctl gc --reconcile`) cleans them up by scanning the layers directory and deleting files with no `blobs` row.

### Trigger Methods

- **CLI**: `mcrctl gc`
- **REST**: `POST /v1/gc`
- **gRPC**: `RegistryService.GarbageCollect`

GC runs asynchronously. Status can be checked via `GET /v1/gc/status` or `mcrctl gc status`. Only one GC run may be active at a time.

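
The two-phase algorithm and the single-run rule can be illustrated with an in-memory model. In the sketch below, maps stand in for the `blobs` table, the `manifest_blobs` table, and the layers directory, and a mutex stands in for the registry-wide GC lock. `Registry`, `Collect`, `Reconcile`, and `ErrGCRunning` are hypothetical names; the real implementation would run phase 1 inside a SQLite transaction and phase 2 against the filesystem.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

var ErrGCRunning = errors.New("gc already running")

// Registry is an in-memory stand-in: Blobs mirrors the blobs table,
// ManifestBlobs mirrors manifest_blobs, Files mirrors the layers directory.
type Registry struct {
	gcLock        sync.Mutex
	Blobs         map[string]bool
	ManifestBlobs map[string][]string // manifest digest -> referenced blob digests
	Files         map[string]bool
}

// Collect runs both GC phases and returns the digests it removed.
func (r *Registry) Collect() ([]string, error) {
	if !r.gcLock.TryLock() {
		return nil, ErrGCRunning // only one GC run at a time
	}
	defer r.gcLock.Unlock()

	// Phase 1: find blob rows with no manifest_blobs reference, then delete them.
	referenced := make(map[string]bool)
	for _, blobs := range r.ManifestBlobs {
		for _, digest := range blobs {
			referenced[digest] = true
		}
	}
	var removed []string
	for digest := range r.Blobs {
		if !referenced[digest] {
			removed = append(removed, digest)
		}
	}
	sort.Strings(removed)
	for _, digest := range removed {
		delete(r.Blobs, digest)
	}

	// Phase 2: delete the corresponding files.
	for _, digest := range removed {
		delete(r.Files, digest)
	}
	return removed, nil
}

// Reconcile removes files that have no blobs row, which is the state a crash
// between the two phases would leave behind.
func (r *Registry) Reconcile() []string {
	var orphans []string
	for digest := range r.Files {
		if !r.Blobs[digest] {
			orphans = append(orphans, digest)
		}
	}
	sort.Strings(orphans)
	for _, digest := range orphans {
		delete(r.Files, digest)
	}
	return orphans
}

func main() {
	reg := &Registry{
		Blobs:         map[string]bool{"sha256:aa": true, "sha256:bb": true, "sha256:cc": true},
		ManifestBlobs: map[string][]string{"sha256:m1": {"sha256:aa", "sha256:bb"}},
		Files:         map[string]bool{"sha256:aa": true, "sha256:bb": true, "sha256:cc": true, "sha256:dd": true},
	}
	removed, err := reg.Collect()
	fmt.Println("collected:", removed, err)
	fmt.Println("reconciled:", reg.Reconcile())
}
```

The short digests are placeholders. In this model `sha256:dd` plays the role of a file orphaned by an earlier crash, which `Collect` does not see because it has no database row.
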
### Safety

GC acquires a registry-wide lock that **blocks all new blob uploads** until the lock is released at the end of file cleanup (step 8 of the algorithm). Ongoing uploads that started before the lock are allowed to complete before the lock is acquired. This is a stop-the-world approach, acceptable at the target scale (single developer, several dozen repos). Pulls are not blocked.

---

## 10. Configuration

TOML format. Environment variable overrides via `MCR_*`.

```toml
[server]
listen_addr      = ":8443"    # HTTPS (OCI + admin REST)
grpc_addr        = ":9443"    # gRPC admin API (optional; omit to disable)
tls_cert         = "/srv/mcr/certs/cert.pem"
tls_key          = "/srv/mcr/certs/key.pem"
read_timeout     = "30s"      # HTTP read timeout
write_timeout    = "0s"       # HTTP write timeout; 0 = disabled for large
                              # blob uploads (idle_timeout provides the safety net)
idle_timeout     = "120s"     # HTTP idle timeout
shutdown_timeout = "60s"      # Graceful shutdown drain period

[database]
path = "/srv/mcr/mcr.db"

[storage]
layers_path  = "/srv/mcr/layers"     # Blob storage root
uploads_path = "/srv/mcr/uploads"    # Upload staging directory
                                     # Must be on the same filesystem as layers_path

[mcias]
server_url   = "https://mcias.metacircular.net:8443"
ca_cert      = ""                    # Custom CA for MCIAS TLS
service_name = "mcr"
tags         = ["env:restricted"]

[web]
listen_addr = "127.0.0.1:8080"       # Web UI listen address
grpc_addr   = "127.0.0.1:9443"       # mcrsrv gRPC address for the web UI to connect to
ca_cert     = ""                     # CA cert for verifying mcrsrv gRPC TLS

[log]
level = "info"                       # debug, info, warn, error
```

### Validation

Required fields are validated at startup. The server refuses to start if any are missing or if TLS certificate paths are invalid. `storage.uploads_path` and `storage.layers_path` must resolve to the same filesystem (verified at startup via `os.Stat` device ID comparison).

### Timeout Notes

The HTTP `write_timeout` is disabled (0) by default because blob uploads can transfer hundreds of megabytes over slow connections. The `idle_timeout` serves as the safety net for stale connections. Operators may set a non-zero `write_timeout` if all clients are on fast local networks.

---

## 11. Web UI

### Technology

Go `html/template` + htmx, embedded via `//go:embed`. The web UI is a separate binary (`mcr-web`) that communicates with mcrsrv via gRPC.

### Pages

| Path | Description |
|------|-------------|
| `/login` | MCIAS login form |
| `/` | Dashboard (repository count, total size, recent pushes) |
| `/repositories` | Repository list with tag counts and sizes |
| `/repositories/{name}` | Repository detail: tags, manifests, layer list |
| `/repositories/{name}/manifests/{digest}` | Manifest detail: layers, config, size |
| `/policies` | Policy rule management (admin only): create, edit, delete |
| `/audit` | Audit log viewer (admin only) |

### Security

- CSRF protection via signed double-submit cookies on all mutating requests.
- Session cookie: `HttpOnly`, `Secure`, `SameSite=Strict`.
- All user input escaped by `html/template`.

---

## 12. CLI Tools

### mcrsrv

The registry server. Cobra subcommands:

| Command | Description |
|---------|-------------|
| `server` | Start the registry server |
| `init` | First-time setup (create directories, example config) |
| `snapshot` | Database backup via `VACUUM INTO` |

### mcr-web

The web UI server. Communicates with mcrsrv via gRPC.

| Command | Description |
|---------|-------------|
| `server` | Start the web UI server |

### mcrctl

Admin CLI. Communicates with mcrsrv via REST or gRPC.

| Command | Description |
|---------|-------------|
| `status` | Query server health |
| `repo list` | List repositories |
| `repo delete ` | Delete a repository |
| `gc` | Trigger garbage collection |
| `gc status` | Check GC status |
| `policy list` | List policy rules |
| `policy create` | Create a policy rule |
| `policy update ` | Update a policy rule |
| `policy delete ` | Delete a policy rule |
| `audit tail [--n N]` | Print recent audit events |
| `snapshot` | Trigger database backup via `VACUUM INTO` |

---

## 13. Package Structure

```
mcr/
├── cmd/
│   ├── mcrsrv/          # Server binary: OCI + REST + gRPC
│   ├── mcr-web/         # Web UI binary
│   └── mcrctl/          # Admin CLI
├── internal/
│   ├── auth/            # MCIAS integration: token validation, 30s cache
│   ├── config/          # TOML config loading and validation
│   ├── db/              # SQLite: migrations, CRUD for all tables
│   ├── oci/             # OCI Distribution Spec handler: manifests, blobs, uploads
│   ├── policy/          # Registry policy engine: rules, evaluation, defaults
│   ├── server/          # REST API: admin routes, middleware, chi router
│   ├── grpcserver/      # gRPC admin API: interceptors, service handlers
│   ├── webserver/       # Web UI: template routes, htmx handlers
│   ├── storage/         # Blob filesystem operations: write, read, delete, GC
│   └── gc/              # Garbage collection: mark, sweep, locking
├── proto/mcr/
│   └── v1/              # Protobuf definitions
├── gen/mcr/
│   └── v1/              # Generated gRPC code (never edit by hand)
├── web/
│   ├── embed.go         # //go:embed directive
│   ├── templates/       # HTML templates
│   └── static/          # CSS, htmx
├── deploy/
│   ├── docker/          # Docker Compose
│   ├── examples/        # Example config files
│   ├── scripts/         # Install script
│   └── systemd/         # systemd units and timers
└── docs/                # Internal documentation
```

---

## 14. Deployment

### Binary

Single static binary per component, built with `CGO_ENABLED=0`.

### Container

Multi-stage Docker build:

1. **Builder**: `golang:1.25-alpine`, static compilation with `-trimpath -ldflags="-s -w"`.
2. **Runtime**: `alpine:3.21`, non-root user (`mcr`), ports 8443/9443.

The `/srv/mcr/` directory is a mounted volume containing the database, blob storage, and configuration.

### systemd

| File | Purpose |
|------|---------|
| `mcr.service` | Registry server |
| `mcr-web.service` | Web UI |
| `mcr-backup.service` | Oneshot database backup |
| `mcr-backup.timer` | Daily backup timer (02:00 UTC, 5-minute jitter) |

Standard security hardening per engineering standards (`NoNewPrivileges=true`, `ProtectSystem=strict`, `ReadWritePaths=/srv/mcr`, etc.).

### Backup

Database backup via `VACUUM INTO` captures metadata only (repositories, manifests, tags, blobs table, policy rules, audit log). **Blob data on the filesystem is not included.** A complete backup requires both:

1. `mcrsrv snapshot` (or `mcrctl snapshot`): SQLite database backup.
2. Filesystem-level copy of `/srv/mcr/layers/`: blob data.

The database and blob directory must be backed up together. A database backup without the corresponding blob directory is usable (metadata is intact; missing blobs return 404 on pull) but incomplete. A blob directory without the database is useless (no way to map digests to repositories).

### Graceful Shutdown

On `SIGINT` or `SIGTERM`:

1. Stop accepting new connections.
2. Drain in-flight requests (including ongoing uploads) up to `shutdown_timeout` (default 60s).
3. Force-close remaining connections.
4. Close database.
5. Exit.

---

## 15. Audit Events

| Event | Trigger |
|-------|---------|
| `manifest_pushed` | Manifest uploaded (includes repo, tag, digest) |
| `manifest_pulled` | Manifest downloaded |
| `manifest_deleted` | Manifest deleted |
| `blob_uploaded` | Blob upload completed |
| `blob_deleted` | Blob deleted |
| `repo_deleted` | Repository deleted (admin) |
| `gc_started` | Garbage collection started |
| `gc_completed` | Garbage collection finished (includes blobs removed, bytes freed) |
| `policy_rule_created` | Policy rule created |
| `policy_rule_updated` | Policy rule updated |
| `policy_rule_deleted` | Policy rule deleted |
| `policy_deny` | Policy engine denied a request |
| `login_ok` | Successful authentication |
| `login_fail` | Failed authentication |

The audit log is append-only. It never contains credentials or token values.

---

## 16. Error Handling and Logging

- All errors are wrapped with `fmt.Errorf("context: %w", err)`.
- Structured logging uses `log/slog` (or goutils wrapper).
- Log levels: DEBUG (dev only), INFO (normal ops), WARN (recoverable), ERROR (unexpected failures).
- OCI operations (push, pull, delete) are logged at INFO with: `{event, repository, reference, digest, account_uuid, ip, duration}`.
- **Never log:** bearer tokens, passwords, blob content, MCIAS credentials.
- OCI endpoint errors use the OCI error format (see §6). Admin endpoint errors use the platform `{"error": "..."}` format.

---

## 17. Security Model

### Threat Mitigations

| Threat | Mitigation |
|--------|------------|
| Unauthenticated access | All content and admin endpoints require an MCIAS bearer token (only `/v2/token`, `/v1/auth/login`, and `/v1/health` are reachable without one); `env:restricted` tag blocks guest/viewer login |
| Unauthorized push/delete | Policy engine enforces per-principal, per-repository access; default-deny for system accounts |
| Digest mismatch (supply chain) | All uploads verified against client-supplied digest; rejected if mismatch |
| Blob corruption | Content-addressed storage; digests verified on write. Periodic integrity scrub via `mcrctl scrub` (future) |
| Upload resource exhaustion | Stale uploads expire and are cleaned up; GC reclaims orphaned data |
| Information leakage | Error responses follow OCI spec format; no internal details exposed |

### Security Invariants

1. Every API request (OCI and admin) requires a valid MCIAS bearer token, except the credential-exchange and health endpoints listed in §6 (`/v2/token`, `/v1/auth/login`, `/v1/health`).
2. Token validation results are cached for at most 30 seconds.
3. System accounts have no implicit access; explicit policy rules are required.
4. Blob digests are verified on upload; mismatched digests are rejected. Reads trust the content-addressed path (digest is the filename).
5. Manifest deletion by tag is not supported; only deletion by digest is (OCI spec).
6. The audit log never contains credentials, tokens, or blob content.
7. TLS 1.3 minimum on all listeners. No fallback.

---

## 18. Future Work

| Item | Description |
|------|-------------|
| **Image signing / content trust** | Cosign or Notary v2 integration for image verification |
| **Multi-arch manifests** | OCI image index support for multi-platform images |
| **Cross-repo blob mounts** | `POST /v2//blobs/uploads/?mount=&from=` for efficient cross-repo copies |
| **MCP integration** | Wire MCR into the Metacircular Control Plane for automated image deployment |
| **Upload expiry** | Automatic cleanup of stale uploads after configurable TTL |
| **Repository size quotas** | Per-repository storage limits |
| **Webhook notifications** | Push events to external systems on manifest push/delete |
| **Integrity scrub** | `mcrctl scrub`: verify blob digests on disk match their filenames, report corruption |
| **Metrics** | Prometheus-compatible metrics: push/pull counts, storage usage, request latency |