MCR Architecture

Metacircular Container Registry — Technical Design Document


1. System Overview

MCR is an OCI Distribution Spec-compliant container registry for Metacircular Dynamics. It stores and serves container images for the platform's services, with MCP directing nodes to pull images from MCR. Authentication is delegated to MCIAS; all operations require a valid bearer token. MCR sits behind an mc-proxy instance for TLS routing.

Components

                    ┌──────────────────────────────────────────────┐
                    │                  MCR Server (mcrsrv)         │
                    │                                              │
                    │  ┌────────────┐  ┌──────────┐  ┌──────────┐ │
                    │  │ OCI API    │  │ Auth     │  │ Policy   │ │
                    │  │ Handler    │  │ (MCIAS)  │  │ Engine   │ │
                    │  └─────┬──────┘  └────┬─────┘  └────┬─────┘ │
                    │        └──────────────┼─────────────┘       │
                    │                       │                     │
                    │         ┌─────────────▼────────────┐        │
                    │         │   SQLite (metadata)      │        │
                    │         └──────────────────────────┘        │
                    │         ┌──────────────────────────┐        │
                    │         │   Filesystem (blobs)     │        │
                    │         │   /srv/mcr/layers/       │        │
                    │         └──────────────────────────┘        │
                    │                                              │
                    │  ┌─────────────────┐  ┌──────────────────┐  │
                    │  │ REST listener   │  │ gRPC listener    │  │
                    │  │ (OCI + admin)   │  │ (admin)          │  │
                    │  │ :8443           │  │ :9443            │  │
                    │  └─────────────────┘  └──────────────────┘  │
                    └──────────────────────────────────────────────┘
                         ▲            ▲             ▲
                         │            │             │
                    ┌────┴───┐  ┌─────┴─────┐  ┌───┴──────┐
                    │ Docker │  │  mcrctl   │  │ mcr-web  │
                    │ / OCI  │  │ (admin    │  │ (web UI) │
                    │ client │  │   CLI)    │  │          │
                    └────────┘  └───────────┘  └──────────┘

mcrsrv — The registry server. Exposes OCI Distribution endpoints and an admin REST API over HTTPS/TLS, plus a gRPC admin API. Handles blob storage, manifest management, and token-based authentication via MCIAS.

mcr-web — The web UI. Communicates with mcrsrv via gRPC. Provides repository/tag browsing and ACL policy management for administrators.

mcrctl — The admin CLI. Communicates with mcrsrv via REST or gRPC. Provides garbage collection, repository management, and policy management.


2. OCI Distribution Spec Compliance

MCR implements the OCI Distribution Specification for content discovery and content management. All OCI endpoints require authentication — there is no anonymous access.

Supported Operations

| Category | Capability |
| --- | --- |
| Content discovery | Repository catalog, tag listing |
| Pull | Manifest retrieval (by tag or digest), blob download |
| Push | Monolithic and chunked blob upload, manifest upload |
| Delete | Manifest deletion (by digest), blob deletion |

Not Supported (v1)

| Feature | Rationale |
| --- | --- |
| Multi-arch manifest lists | Not needed for single-platform deployment |
| Image signing / content trust | Deferred to future work |
| Cross-repository blob mounts | Complexity not justified at current scale |

Content Addressing

All blobs and manifests are identified by their SHA-256 digest in the format sha256:<hex>. Digests are verified on upload — if the computed digest does not match the client-supplied digest, the upload is rejected.
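
The check described above can be sketched in a few lines of Go. This is a minimal illustration, not MCR's code: the function names `computeDigest` and `verifyDigest` are hypothetical, and the sketch hashes an in-memory byte slice, whereas a registry would more likely stream the upload through a hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// computeDigest returns the digest of data in the "sha256:<hex>" form.
func computeDigest(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// verifyDigest rejects an upload whose computed digest differs from the
// digest supplied by the client.
func verifyDigest(data []byte, supplied string) error {
	if got := computeDigest(data); got != supplied {
		return fmt.Errorf("digest mismatch: computed %s, client supplied %s", got, supplied)
	}
	return nil
}

func main() {
	blob := []byte("hello")
	d := computeDigest(blob)
	fmt.Println(d)
	fmt.Println(verifyDigest(blob, d))                      // <nil>
	fmt.Println(verifyDigest([]byte("tampered"), d) != nil) // true
}
```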

Tags are mutable pointers to manifest digests. Pushing a manifest with an existing tag atomically updates the tag to point to the new digest.


3. Authentication

MCR delegates all authentication to MCIAS. No local user database.

MCIAS Configuration

MCR registers with MCIAS as a service with the env:restricted tag. This means MCIAS denies login to guest and viewer accounts — only admin and user roles can authenticate to MCR.

[mcias]
server_url   = "https://mcias.metacircular.net:8443"
service_name = "mcr"
tags         = ["env:restricted"]

OCI Token Authentication

OCI/Docker clients expect a specific auth handshake:

Client                          mcrsrv
  │                                │
  ├─ GET /v2/ ────────────────────▶│
  │◀─ 401 WWW-Authenticate:       │
  │   Bearer realm="/v2/token",    │
  │   service="mcr.metacircular.…" │
  │                                │
  ├─ GET /v2/token ───────────────▶│  (Basic auth: username:password)
  │                                ├─ Forward credentials to MCIAS
  │                                │  POST /v1/auth/login
  │                                ├─ MCIAS returns JWT
  │◀─ 200 {"token":"<jwt>",       │
  │        "expires_in": N}        │
  │                                │
  ├─ GET /v2/<name>/manifests/… ──▶│  (Authorization: Bearer <jwt>)
  │                                ├─ Validate token via MCIAS
  │                                ├─ Check policy engine
  │◀─ 200 (manifest)              │

Token endpoint (GET /v2/token): Accepts HTTP Basic auth, forwards credentials to MCIAS /v1/auth/login, and returns the MCIAS JWT in the Docker-compatible token response format. The scope parameter is accepted but not used for token scoping — authorization is enforced per-request by the policy engine.
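
A rough sketch of the challenge and token exchange, using only the Go standard library. Everything named here is an assumption made for illustration: `loginFunc` stands in for the MCIAS login call, `demoLogin` is a hard-coded stub, the realm and service values are placeholders, and token validation, the policy check, and OCI-format error bodies are left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// loginFunc stands in for the MCIAS POST /v1/auth/login call (hypothetical signature).
type loginFunc func(user, pass string) (jwt string, expiresIn int, err error)

// tokenResponse is the Docker-compatible token response body.
type tokenResponse struct {
	Token     string `json:"token"`
	ExpiresIn int    `json:"expires_in"`
}

func newHandler(realm, service string, login loginFunc) http.Handler {
	mux := http.NewServeMux()
	// Token endpoint: Basic credentials in, JWT out.
	mux.HandleFunc("/v2/token", func(w http.ResponseWriter, r *http.Request) {
		user, pass, ok := r.BasicAuth()
		if !ok {
			http.Error(w, "basic auth required", http.StatusUnauthorized)
			return
		}
		jwt, exp, err := login(user, pass)
		if err != nil {
			http.Error(w, "invalid credentials", http.StatusUnauthorized)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(tokenResponse{Token: jwt, ExpiresIn: exp})
	})
	// Every other /v2/ path: challenge requests that carry no bearer token.
	mux.HandleFunc("/v2/", func(w http.ResponseWriter, r *http.Request) {
		if !strings.HasPrefix(r.Header.Get("Authorization"), "Bearer ") {
			w.Header().Set("WWW-Authenticate",
				fmt.Sprintf(`Bearer realm=%q,service=%q`, realm, service))
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		// Token validation and the policy check are omitted in this sketch.
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, "{}")
	})
	return mux
}

// demoLogin accepts a single hard-coded account.
func demoLogin(user, pass string) (string, int, error) {
	if user == "alice" && pass == "secret" {
		return "jwt-abc", 3600, nil
	}
	return "", 0, errors.New("login failed")
}

// call runs one in-memory request; an empty user or bearer means "not sent".
func call(h http.Handler, path, user, pass, bearer string) *httptest.ResponseRecorder {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if user != "" {
		req.SetBasicAuth(user, pass)
	}
	if bearer != "" {
		req.Header.Set("Authorization", "Bearer "+bearer)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec
}

func main() {
	h := newHandler("https://registry.example/v2/token", "registry.example", demoLogin)
	fmt.Println(call(h, "/v2/", "", "", "").Header().Get("WWW-Authenticate"))
	fmt.Print(call(h, "/v2/token", "alice", "secret", "").Body.String())
	fmt.Println(call(h, "/v2/", "", "", "jwt-abc").Code)
}
```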

Direct bearer tokens: MCIAS service tokens and pre-authenticated JWTs are accepted directly on all OCI endpoints via the Authorization: Bearer header. This allows system accounts and CLI tools to skip the token endpoint.

Token validation: Every request validates the bearer token by calling MCIAS ValidateToken(). Results are cached by SHA-256 of the token with a 30-second TTL.
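
One way such a cache could look, as a sketch under stated assumptions: `Claims`, `TokenCache`, and the injected `validate` function are hypothetical names, only successful validations are cached, and expired entries are not evicted. The clock is injected so the 30-second TTL can be exercised without sleeping.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Claims is a hypothetical subset of what MCIAS returns for a valid token.
type Claims struct {
	Subject string
	Roles   []string
}

type entry struct {
	claims  Claims
	expires time.Time
}

// TokenCache memoizes successful validations, keyed by the SHA-256 of the
// token so raw token values are never used as map keys.
type TokenCache struct {
	mu       sync.Mutex
	ttl      time.Duration
	now      func() time.Time
	validate func(token string) (Claims, error) // stand-in for the MCIAS call
	entries  map[[sha256.Size]byte]entry
}

func NewTokenCache(ttl time.Duration, now func() time.Time,
	validate func(string) (Claims, error)) *TokenCache {
	return &TokenCache{ttl: ttl, now: now, validate: validate,
		entries: make(map[[sha256.Size]byte]entry)}
}

func (c *TokenCache) Validate(token string) (Claims, error) {
	key := sha256.Sum256([]byte(token))
	c.mu.Lock()
	e, ok := c.entries[key]
	c.mu.Unlock()
	if ok && c.now().Before(e.expires) {
		return e.claims, nil // fresh cache hit
	}
	claims, err := c.validate(token)
	if err != nil {
		return Claims{}, err // failures are not cached in this sketch
	}
	c.mu.Lock()
	c.entries[key] = entry{claims: claims, expires: c.now().Add(c.ttl)}
	c.mu.Unlock()
	return claims, nil
}

// fakeClock lets the demo advance time without sleeping.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time       { return f.t }
func (f *fakeClock) AdvanceSeconds(n int) { f.t = f.t.Add(time.Duration(n) * time.Second) }

// newDemoCache wires a 30-second cache to a stub validator that counts calls.
func newDemoCache(clock *fakeClock, calls *int) *TokenCache {
	return NewTokenCache(30*time.Second, clock.Now, func(tok string) (Claims, error) {
		*calls++
		if tok == "" {
			return Claims{}, errors.New("invalid token")
		}
		return Claims{Subject: "demo-user", Roles: []string{"user"}}, nil
	})
}

func main() {
	clock, calls := &fakeClock{}, 0
	cache := newDemoCache(clock, &calls)
	cache.Validate("tok")
	cache.Validate("tok") // served from the cache
	fmt.Println(calls)    // 1
	clock.AdvanceSeconds(31)
	cache.Validate("tok") // entry expired, validator called again
	fmt.Println(calls)    // 2
}
```

A consequence of any TTL cache of this kind is that a revoked token may still be accepted until its entry expires, here for up to 30 seconds.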


4. Authorization & Policy Engine

Role Model

| Role | Access |
| --- | --- |
| admin | Full access: push, pull, delete, catalog, policy management, GC |
| user | Full content access: push, pull, delete, catalog |
| System account | Default deny; requires explicit policy rule for any operation |

Admin detection is based solely on the MCIAS admin role. Human users with the user role have full content management access. System accounts have no implicit access and must be granted specific permissions via policy rules.

Policy Engine

MCR implements a local policy engine for registry-specific access control, following the same architecture as MCIAS's policy engine (priority-based, deny-wins, default-deny). The engine is an in-process Go package (internal/policy) with no external dependencies.

Actions

| Action | Description |
| --- | --- |
| registry:version_check | OCI version check (`GET /v2/`) |
| registry:pull | Read manifests, download blobs, list tags for a repository |
| registry:push | Upload blobs and push manifests/tags |
| registry:delete | Delete manifests and blobs |
| registry:catalog | List all repositories (`GET /v2/_catalog`) |
| policy:manage | Create, update, delete policy rules (admin only) |

Policy Input

type PolicyInput struct {
    Subject     string   // MCIAS account UUID
    AccountType string   // "human" or "system"
    Roles       []string // roles from MCIAS JWT

    Action     Action
    Repository string   // target repository name (e.g., "myapp");
                        // empty for global operations (e.g., catalog, version check)
}

Rule Structure

type Rule struct {
    ID           int64
    Priority     int      // lower = evaluated first
    Description  string
    Effect       Effect   // "allow" or "deny"

    // Principal conditions (all populated fields ANDed)
    Roles        []string // principal must hold at least one
    AccountTypes []string // "human", "system", or both
    SubjectUUID  string   // exact principal UUID

    // Action condition
    Actions      []Action

    // Resource condition
    Repositories []string // repository name patterns (glob via path.Match).
                          // Examples:
                          //   "myapp"         — exact match
                          //   "production/*"  — any repo one level under production/
                          // Empty list = wildcard (matches all repositories).
}

Evaluation Algorithm

1. Merge built-in defaults with operator-defined rules
2. Sort by Priority ascending (stable)
3. Collect all matching rules
4. If any match has Effect=Deny → return Deny (deny-wins)
5. If any match has Effect=Allow → return Allow
6. Return Deny (default-deny)

A rule matches when every populated field satisfies its condition:

| Field | Match condition |
| --- | --- |
| Roles | Principal holds at least one of the listed roles |
| AccountTypes | Principal's account type is in the list |
| SubjectUUID | Principal UUID equals exactly |
| Actions | Request action is in the list |
| Repositories | Request repository matches at least one pattern; empty list is a wildcard (matches all repositories) |

Repository glob matching uses path.Match semantics: * matches any sequence of non-/ characters within a single path segment. For example, production/* matches production/myapp but not production/team/myapp. An empty Repositories field is a wildcard (matches all repositories).

When PolicyInput.Repository is empty (global operations like catalog), only rules with an empty Repositories field match — a rule scoped to specific repositories does not apply to global operations.
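
The evaluation steps and matching rules above can be sketched as follows. This is an illustrative reading of the text, not the internal/policy package: the types are trimmed copies of the ones shown earlier, and the function names `matches` and `Evaluate` are assumptions. Because deny-wins considers every matching rule, the priority sort does not change the outcome here; it is kept only to mirror the listed steps.

```go
package main

import (
	"fmt"
	"path"
	"slices"
	"sort"
)

type Effect string
type Action string

const (
	Allow Effect = "allow"
	Deny  Effect = "deny"
)

type PolicyInput struct {
	Subject, AccountType string
	Roles                []string
	Action               Action
	Repository           string
}

type Rule struct {
	Priority     int
	Effect       Effect
	Roles        []string
	AccountTypes []string
	SubjectUUID  string
	Actions      []Action
	Repositories []string
}

// matches applies every populated field; unpopulated fields are wildcards.
func matches(r Rule, in PolicyInput) bool {
	holdsRole := func(role string) bool { return slices.Contains(r.Roles, role) }
	if len(r.Roles) > 0 && !slices.ContainsFunc(in.Roles, holdsRole) {
		return false
	}
	if len(r.AccountTypes) > 0 && !slices.Contains(r.AccountTypes, in.AccountType) {
		return false
	}
	if r.SubjectUUID != "" && r.SubjectUUID != in.Subject {
		return false
	}
	if len(r.Actions) > 0 && !slices.Contains(r.Actions, in.Action) {
		return false
	}
	if len(r.Repositories) == 0 {
		return true
	}
	// Repository-scoped rules never apply to global operations.
	return in.Repository != "" && slices.ContainsFunc(r.Repositories, func(pat string) bool {
		ok, err := path.Match(pat, in.Repository)
		return err == nil && ok
	})
}

// Evaluate merges, sorts, collects matches, then applies deny-wins and default-deny.
func Evaluate(defaults, operator []Rule, in PolicyInput) Effect {
	rules := append(append([]Rule{}, defaults...), operator...)
	sort.SliceStable(rules, func(i, j int) bool { return rules[i].Priority < rules[j].Priority })
	allowed := false
	for _, r := range rules {
		if !matches(r, in) {
			continue
		}
		if r.Effect == Deny {
			return Deny
		}
		allowed = true
	}
	if allowed {
		return Allow
	}
	return Deny
}

// builtinDefaults mirrors the three built-in rules (all at priority 0).
var builtinDefaults = []Rule{
	{Effect: Allow, Roles: []string{"admin"}},
	{Effect: Allow, Roles: []string{"user"}, AccountTypes: []string{"human"},
		Actions: []Action{"registry:pull", "registry:push", "registry:delete", "registry:catalog"}},
	{Effect: Allow, Actions: []Action{"registry:version_check"}},
}

func main() {
	ci := []Rule{{Priority: 50, Effect: Allow, AccountTypes: []string{"system"}, SubjectUUID: "ci-uuid",
		Actions: []Action{"registry:push", "registry:pull"}, Repositories: []string{"production/*"}}}
	in := PolicyInput{Subject: "ci-uuid", AccountType: "system", Action: "registry:push"}
	for _, repo := range []string{"production/myapp", "production/team/myapp", "staging/myapp"} {
		in.Repository = repo
		fmt.Println(repo, Evaluate(builtinDefaults, ci, in))
	}
}
```

With the CI rule shown in `main`, only production/myapp is allowed; the other two repositories fall through to default-deny.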

Built-in Default Rules

Priority 0, Allow: roles=[admin], actions=<all>
    — admin wildcard

Priority 0, Allow: roles=[user], accountTypes=[human],
    actions=[registry:pull, registry:push, registry:delete, registry:catalog]
    — human users have full content access

Priority 0, Allow: actions=[registry:version_check]
    — /v2/ endpoint (always accessible to authenticated users)

System accounts have no built-in allow rules for catalog, push, pull, or delete. An operator must create explicit policy rules granting them access.

Example: CI System Push Access (Glob)

Grant a CI system account permission to push and pull from all repositories under the production/ namespace:

{
  "effect": "allow",
  "account_types": ["system"],
  "subject_uuid": "<ci-account-uuid>",
  "actions": ["registry:push", "registry:pull"],
  "repositories": ["production/*"],
  "priority": 50,
  "description": "CI system: push/pull to all production repos"
}

Example: Deploy Agent Pull-Only

Grant a deploy agent pull access to all repositories (empty repositories = wildcard), but deny delete globally:

{
  "effect": "allow",
  "subject_uuid": "<deploy-agent-uuid>",
  "actions": ["registry:pull"],
  "repositories": [],
  "priority": 50,
  "description": "deploy-agent: pull from any repo"
}
{
  "effect": "deny",
  "subject_uuid": "<deploy-agent-uuid>",
  "actions": ["registry:delete"],
  "priority": 10,
  "description": "deploy-agent may never delete images (deny-wins)"
}

Example: Exact Repo Access

Grant a specific system account access to exactly two named repositories:

{
  "effect": "allow",
  "subject_uuid": "<svc-account-uuid>",
  "actions": ["registry:push", "registry:pull"],
  "repositories": ["myapp", "infra-proxy"],
  "priority": 50,
  "description": "svc-account: push/pull to myapp and infra-proxy only"
}

Policy Management

Policy rules are managed via the admin REST/gRPC API and the web UI. Only users with the admin role can create, update, or delete policy rules.


5. Storage Design

MCR uses a split storage model: SQLite for metadata, filesystem for blobs.

Blob Storage

Blobs (image layers and config objects) are stored as content-addressed files under /srv/mcr/layers/:

/srv/mcr/layers/
└── sha256/
    ├── ab/
    │   └── abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890
    ├── cd/
    │   └── cdef...
    └── ef/
        └── ef01...

The two-character hex prefix directory limits the number of files per directory. Blobs are written atomically: data is written to a temporary file in /srv/mcr/uploads/, then renamed into place after digest verification. The uploads/ and layers/ directories must reside on the same filesystem, because rename(2) is atomic only within a single filesystem and fails with EXDEV across filesystems.
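
The layout and the stage-then-rename write can be sketched like this. The names `blobPath` and `storeBlob` are hypothetical, the data is held in memory rather than streamed, and the sketch does not fsync the file or its directory, which a durable implementation would probably want.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// blobPath returns <layersDir>/sha256/<first two hex chars>/<hex>.
// hexDigest must be the 64-character hex form without the "sha256:" prefix.
func blobPath(layersDir, hexDigest string) string {
	return filepath.Join(layersDir, "sha256", hexDigest[:2], hexDigest)
}

// storeBlob stages data in uploadsDir, verifies its digest, and renames the
// staged file into the blob store. The rename only works within one filesystem.
func storeBlob(layersDir, uploadsDir string, data []byte, wantHex string) (string, error) {
	tmp, err := os.CreateTemp(uploadsDir, "upload-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the file has been renamed
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	if got := hex.EncodeToString(sum[:]); got != wantHex {
		return "", fmt.Errorf("digest mismatch: computed %s, want %s", got, wantHex)
	}
	dst := blobPath(layersDir, wantHex)
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return "", err
	}
	if err := os.Rename(tmp.Name(), dst); err != nil {
		return "", err
	}
	return dst, nil
}

// demo stores one blob under a throwaway root and returns its path relative
// to the layers directory.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "mcr-demo-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	layers, uploads := filepath.Join(root, "layers"), filepath.Join(root, "uploads")
	if err := os.MkdirAll(uploads, 0o755); err != nil {
		return "", err
	}
	data := []byte("hello")
	sum := sha256.Sum256(data)
	dst, err := storeBlob(layers, uploads, data, hex.EncodeToString(sum[:]))
	if err != nil {
		return "", err
	}
	rel, err := filepath.Rel(layers, dst)
	return filepath.ToSlash(rel), err
}

func main() {
	fmt.Println(demo())
}
```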

Upload Staging

In-progress blob uploads are stored in /srv/mcr/uploads/:

/srv/mcr/uploads/
└── <upload-uuid>

Each upload UUID corresponds to a row in the uploads table tracking the current byte offset. Completed uploads are renamed to the blob store; cancelled or expired uploads are cleaned up.

Manifest Storage

Manifests are small JSON documents stored directly in the SQLite database (in the manifests table content column). This simplifies metadata queries and avoids filesystem overhead for small objects.

Manifest Push Flow

When a client calls PUT /v2/<name>/manifests/<reference>:

1. Parse the manifest JSON. Reject if malformed or unsupported media type.
2. Compute the SHA-256 digest of the raw manifest bytes.
3. If <reference> is a digest, verify it matches the computed digest.
   Reject with DIGEST_INVALID if mismatch.
4. Parse the manifest's layer and config descriptors.
5. Verify every referenced blob exists in the `blobs` table.
   Reject with MANIFEST_BLOB_UNKNOWN if any are missing.
6. Begin write transaction:
   a. Create the repository row if it does not exist (implicit creation).
   b. Insert or update the manifest row (repository_id, digest, content).
   c. Populate `manifest_blobs` join table for all referenced blobs.
   d. If <reference> is a tag name, insert or update the tag row
      to point to the new manifest (atomic tag move).
7. Commit.
8. Return 201 Created with `Docker-Content-Digest: <digest>` header.

This is the most complex write path in the system. Steps 1 through 5 are read-only validation; all database writes (step 6) execute in a single SQLite transaction, so a crash at any point leaves the database consistent.
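
The validation half of the flow (steps 1 through 5) can be sketched without a database. The names below are hypothetical: `checkManifest` is an illustrative function, `blobExists` stands in for a lookup in the blobs table, and accepting only the OCI image manifest media type is an assumption of the sketch, not a statement about which types MCR supports.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

const ociManifestType = "application/vnd.oci.image.manifest.v1+json"

type descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

type manifest struct {
	MediaType string       `json:"mediaType"`
	Config    descriptor   `json:"config"`
	Layers    []descriptor `json:"layers"`
}

// ociError carries one of the OCI error codes used by MCR.
type ociError struct{ Code, Message string }

func (e *ociError) Error() string { return e.Code + ": " + e.Message }

// checkManifest covers steps 1-5 of the push flow. It returns the manifest
// digest and the blob digests that step 6c would record in manifest_blobs.
func checkManifest(raw []byte, reference string, blobExists func(string) bool) (string, []string, error) {
	var m manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", nil, &ociError{"MANIFEST_INVALID", err.Error()}
	}
	if m.MediaType != ociManifestType {
		return "", nil, &ociError{"MANIFEST_INVALID", "unsupported media type " + m.MediaType}
	}
	sum := sha256.Sum256(raw) // digest of the raw bytes, not of re-encoded JSON
	digest := "sha256:" + hex.EncodeToString(sum[:])
	if strings.HasPrefix(reference, "sha256:") && reference != digest {
		return "", nil, &ociError{"DIGEST_INVALID", "reference does not match computed digest"}
	}
	refs := []string{m.Config.Digest}
	for _, l := range m.Layers {
		refs = append(refs, l.Digest)
	}
	for _, d := range refs {
		if !blobExists(d) {
			return "", nil, &ociError{"MANIFEST_BLOB_UNKNOWN", "blob not found: " + d}
		}
	}
	return digest, refs, nil
}

func main() {
	raw := []byte(`{"schemaVersion":2,"mediaType":"application/vnd.oci.image.manifest.v1+json",` +
		`"config":{"digest":"sha256:aaa","size":1},"layers":[{"digest":"sha256:bbb","size":2}]}`)
	known := map[string]bool{"sha256:aaa": true, "sha256:bbb": true}
	exists := func(d string) bool { return known[d] }

	digest, refs, err := checkManifest(raw, "latest", exists)
	fmt.Println(digest, refs, err)

	delete(known, "sha256:bbb") // simulate a layer that was never uploaded
	_, _, err = checkManifest(raw, "latest", exists)
	fmt.Println(err)
}
```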

Data Directory

/srv/mcr/
├── mcr.toml              Configuration
├── mcr.db                SQLite database (metadata)
├── certs/                TLS certificates
├── layers/               Content-addressed blob storage
│   └── sha256/
├── uploads/              In-progress blob uploads
└── backups/              Database snapshots

6. API Surface

Error Response Formats

OCI and admin endpoints use different error formats:

OCI endpoints (/v2/...) follow the OCI Distribution Spec error format:

{"errors": [{"code": "MANIFEST_UNKNOWN", "message": "...", "detail": "..."}]}

Standard OCI error codes used by MCR:

| Code | HTTP Status | Trigger |
| --- | --- | --- |
| UNAUTHORIZED | 401 | Missing or invalid bearer token |
| DENIED | 403 | Policy engine denied the request |
| NAME_UNKNOWN | 404 | Repository does not exist |
| MANIFEST_UNKNOWN | 404 | Manifest not found (by tag or digest) |
| BLOB_UNKNOWN | 404 | Blob not found |
| MANIFEST_BLOB_UNKNOWN | 400 | Manifest references a blob not yet uploaded |
| DIGEST_INVALID | 400 | Computed digest does not match supplied digest |
| MANIFEST_INVALID | 400 | Malformed or unsupported manifest |
| BLOB_UPLOAD_UNKNOWN | 404 | Upload UUID not found or expired |
| BLOB_UPLOAD_INVALID | 400 | Chunked upload byte range error |
| UNSUPPORTED | 405 | Operation not supported (e.g., cross-repo mount) |

Admin endpoints (/v1/...) use the platform-standard format:

{"error": "human-readable message"}
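
The two formats could be produced by a pair of small helpers along these lines. The helper names (`writeOCIError`, `writeAdminError`, and the `render*` demo functions) are hypothetical; only the JSON shapes come from the examples above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type ociError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
	Detail  any    `json:"detail,omitempty"`
}

type ociErrorBody struct {
	Errors []ociError `json:"errors"`
}

type adminErrorBody struct {
	Error string `json:"error"`
}

func writeJSON(w http.ResponseWriter, status int, v any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(v) // Encode appends a trailing newline
}

// writeOCIError is meant for /v2/... handlers.
func writeOCIError(w http.ResponseWriter, status int, code, msg string) {
	writeJSON(w, status, ociErrorBody{Errors: []ociError{{Code: code, Message: msg}}})
}

// writeAdminError is meant for /v1/... handlers.
func writeAdminError(w http.ResponseWriter, status int, msg string) {
	writeJSON(w, status, adminErrorBody{Error: msg})
}

// renderOCI captures what writeOCIError would send, for demonstration.
func renderOCI(status int, code, msg string) (int, string) {
	rec := httptest.NewRecorder()
	writeOCIError(rec, status, code, msg)
	return rec.Code, rec.Body.String()
}

// renderAdmin captures what writeAdminError would send, for demonstration.
func renderAdmin(status int, msg string) (int, string) {
	rec := httptest.NewRecorder()
	writeAdminError(rec, status, msg)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := renderOCI(http.StatusNotFound, "MANIFEST_UNKNOWN", "manifest not found")
	fmt.Print(status, " ", body)
	status, body = renderAdmin(http.StatusNotFound, "rule not found")
	fmt.Print(status, " ", body)
}
```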

OCI Distribution Endpoints

All OCI endpoints are prefixed with /v2 and require authentication.

Version Check

| Method | Path | Description |
| --- | --- | --- |
| GET | `/v2/` | API version check; returns `{}` if authenticated |

Token

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | `/v2/token` | Basic | Exchange credentials for bearer token via MCIAS |

Content Discovery

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | `/v2/_catalog` | bearer | List repositories (paginated) |
| GET | `/v2/<name>/tags/list` | bearer | List tags for a repository (paginated) |

Manifests

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| HEAD | `/v2/<name>/manifests/<reference>` | bearer | Check manifest existence |
| GET | `/v2/<name>/manifests/<reference>` | bearer | Pull manifest by tag or digest |
| PUT | `/v2/<name>/manifests/<reference>` | bearer | Push manifest |
| DELETE | `/v2/<name>/manifests/<digest>` | bearer | Delete manifest (digest only) |

<reference> is either a tag name or a sha256:... digest.

All manifest responses (GET, HEAD, PUT) include the Docker-Content-Digest header with the manifest's SHA-256 digest and a Content-Type header with the manifest's media type.

Blobs

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| HEAD | `/v2/<name>/blobs/<digest>` | bearer | Check blob existence |
| GET | `/v2/<name>/blobs/<digest>` | bearer | Download blob |
| DELETE | `/v2/<name>/blobs/<digest>` | bearer | Delete blob |

Blob Uploads

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| POST | `/v2/<name>/blobs/uploads/` | bearer | Initiate blob upload |
| GET | `/v2/<name>/blobs/uploads/<uuid>` | bearer | Check upload progress |
| PATCH | `/v2/<name>/blobs/uploads/<uuid>` | bearer | Upload chunk (chunked flow) |
| PUT | `/v2/<name>/blobs/uploads/<uuid>?digest=<digest>` | bearer | Complete upload with digest verification |
| DELETE | `/v2/<name>/blobs/uploads/<uuid>` | bearer | Cancel upload |

Monolithic upload: Client sends POST to initiate, then PUT with the entire blob body and digest query parameter in a single request.

Chunked upload: Client sends POST to initiate, then one or more PATCH requests with sequential byte ranges, then PUT with the final digest.
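
The "sequential byte ranges" requirement amounts to checking each PATCH against the offset stored for the upload. The sketch below assumes the OCI Distribution form of the Content-Range header, an inclusive `<start>-<end>` pair with no unit prefix; the function name `nextOffset` is hypothetical, and any error it returns would map to BLOB_UPLOAD_INVALID.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// rangeRE matches the "<start>-<end>" form assumed for Content-Range.
var rangeRE = regexp.MustCompile(`^([0-9]+)-([0-9]+)$`)

// nextOffset validates one PATCH chunk against the offset stored for the
// upload and returns the offset to store afterwards. On error the stored
// offset is returned unchanged.
func nextOffset(current int64, contentRange string, bodyLen int64) (int64, error) {
	m := rangeRE.FindStringSubmatch(contentRange)
	if m == nil {
		return current, fmt.Errorf("malformed Content-Range %q", contentRange)
	}
	start, err := strconv.ParseInt(m[1], 10, 64)
	if err != nil {
		return current, err
	}
	end, err := strconv.ParseInt(m[2], 10, 64)
	if err != nil {
		return current, err
	}
	if start != current {
		return current, fmt.Errorf("chunk starts at %d, expected %d", start, current)
	}
	if end < start || end-start+1 != bodyLen {
		return current, fmt.Errorf("range %s does not describe a %d-byte body", contentRange, bodyLen)
	}
	return end + 1, nil
}

func main() {
	offset := int64(0)
	chunks := []struct {
		rng string
		n   int64
	}{{"0-1023", 1024}, {"1024-1535", 512}, {"0-1023", 1024}}
	for _, c := range chunks {
		next, err := nextOffset(offset, c.rng, c.n)
		fmt.Println(c.rng, next, err)
		offset = next
	}
}
```

The third chunk in `main` replays the first range and is rejected, leaving the stored offset at 1536.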

Admin REST Endpoints

Admin endpoints use the /v1 prefix and follow the platform API conventions.

Authentication

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| POST | `/v1/auth/login` | none | Login via MCIAS (username/password) |
| POST | `/v1/auth/logout` | bearer | Revoke current token |

Health

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | `/v1/health` | none | Health check |

Repository Management

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | `/v1/repositories` | bearer | List repositories with metadata |
| GET | `/v1/repositories/{name}` | bearer | Repository detail (tags, size, manifest count) |
| DELETE | `/v1/repositories/{name}` | admin | Delete repository and all its manifests/tags |

Repository {name} may contain / (e.g., production/myapp). The chi router must use a wildcard catch-all segment for this parameter.

Garbage Collection

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| POST | `/v1/gc` | admin | Trigger garbage collection (async) |
| GET | `/v1/gc/status` | admin | Check GC status (running, last result) |

Policy Management

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | `/v1/policy/rules` | admin | List all policy rules |
| POST | `/v1/policy/rules` | admin | Create a policy rule |
| GET | `/v1/policy/rules/{id}` | admin | Get a single rule |
| PATCH | `/v1/policy/rules/{id}` | admin | Update rule |
| DELETE | `/v1/policy/rules/{id}` | admin | Delete rule |

Audit

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | `/v1/audit` | admin | List audit log events |

7. gRPC Admin Interface

The gRPC API provides the same admin capabilities as the REST admin API. OCI Distribution endpoints are REST-only (protocol requirement — OCI clients speak HTTP).

Proto Package Layout

proto/
└── mcr/
    └── v1/
        ├── registry.proto     # Repository listing, GC
        ├── policy.proto       # Policy rule CRUD
        ├── audit.proto        # Audit log queries
        ├── admin.proto        # Health
        └── common.proto       # Shared message types
gen/
└── mcr/
    └── v1/                    # Generated Go stubs (committed)

Service Definitions

| Service | RPCs |
| --- | --- |
| RegistryService | ListRepositories, GetRepository, DeleteRepository, GarbageCollect, GetGCStatus |
| PolicyService | ListPolicyRules, CreatePolicyRule, GetPolicyRule, UpdatePolicyRule, DeletePolicyRule |
| AuditService | ListAuditEvents |
| AdminService | Health |

Auth endpoints (/v1/auth/login, /v1/auth/logout) are REST-only. Login requires HTTP Basic auth or form-encoded credentials, which do not map cleanly to gRPC unary RPCs. Clients that need programmatic auth use MCIAS directly and present the resulting bearer token to gRPC.

Transport Security

Same TLS certificate and key as the REST server. TLS 1.3 minimum.

Authentication

gRPC unary interceptor extracts the authorization metadata key, validates the MCIAS bearer token, and injects claims into the context. Same validation logic as the REST middleware.

Interceptor Chain

[Request Logger] → [Auth Interceptor] → [Admin Interceptor] → [Handler]
  • Request Logger: Logs method, peer IP, status code, duration.
  • Auth Interceptor: Validates bearer JWT via MCIAS. Health bypasses auth.
  • Admin Interceptor: Requires admin role for GC, policy, and delete operations.

8. Database Schema

SQLite 3, WAL mode, PRAGMA foreign_keys = ON, PRAGMA busy_timeout = 5000.

-- Schema version tracking
CREATE TABLE schema_migrations (
    version    INTEGER PRIMARY KEY,
    applied_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);

-- Container image repositories
CREATE TABLE repositories (
    id         INTEGER PRIMARY KEY,
    name       TEXT    NOT NULL UNIQUE,  -- e.g., "myapp", "infra/proxy"
    created_at TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- UNIQUE on name creates an implicit index; no explicit index needed.

-- Image manifests (content stored in DB — small JSON documents)
CREATE TABLE manifests (
    id            INTEGER PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
    digest        TEXT    NOT NULL,  -- "sha256:<hex>"
    media_type    TEXT    NOT NULL,  -- "application/vnd.oci.image.manifest.v1+json"
    content       BLOB    NOT NULL,  -- manifest JSON
    size          INTEGER NOT NULL,  -- byte size of content
    created_at    TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    UNIQUE(repository_id, digest)
);

CREATE INDEX idx_manifests_repo   ON manifests (repository_id);
CREATE INDEX idx_manifests_digest ON manifests (digest);

-- Tags: mutable pointers from name → manifest
CREATE TABLE tags (
    id            INTEGER PRIMARY KEY,
    repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
    name          TEXT    NOT NULL,  -- e.g., "latest", "v1.2.3"
    manifest_id   INTEGER NOT NULL REFERENCES manifests(id) ON DELETE CASCADE,
    updated_at    TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    UNIQUE(repository_id, name)
);

CREATE INDEX idx_tags_repo     ON tags (repository_id);
CREATE INDEX idx_tags_manifest ON tags (manifest_id);

-- Blob metadata (actual data on filesystem at /srv/mcr/layers/)
CREATE TABLE blobs (
    id         INTEGER PRIMARY KEY,
    digest     TEXT    NOT NULL UNIQUE,  -- "sha256:<hex>"
    size       INTEGER NOT NULL,
    created_at TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- UNIQUE on digest creates an implicit index; no explicit index needed.

-- Many-to-many: tracks which blobs are referenced by manifests in which repos.
-- A blob may be shared across repositories (content-addressed dedup).
-- Used by garbage collection to determine unreferenced blobs.
CREATE TABLE manifest_blobs (
    manifest_id INTEGER NOT NULL REFERENCES manifests(id) ON DELETE CASCADE,
    blob_id     INTEGER NOT NULL REFERENCES blobs(id),
    PRIMARY KEY (manifest_id, blob_id)
);

CREATE INDEX idx_manifest_blobs_blob ON manifest_blobs (blob_id);

-- In-progress blob uploads
CREATE TABLE uploads (
    id            INTEGER PRIMARY KEY,
    uuid          TEXT    NOT NULL UNIQUE,
    repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
    byte_offset   INTEGER NOT NULL DEFAULT 0,
    created_at    TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);

-- Policy rules for registry access control
CREATE TABLE policy_rules (
    id          INTEGER PRIMARY KEY,
    priority    INTEGER NOT NULL DEFAULT 100,
    description TEXT    NOT NULL,
    rule_json   TEXT    NOT NULL,  -- JSON-encoded rule body
    enabled     INTEGER NOT NULL DEFAULT 1 CHECK (enabled IN (0,1)),
    created_by  TEXT,              -- MCIAS account UUID
    created_at  TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    updated_at  TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);

-- Audit log — append-only
CREATE TABLE audit_log (
    id          INTEGER PRIMARY KEY,
    event_time  TEXT    NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
    event_type  TEXT    NOT NULL,
    actor_id    TEXT,       -- MCIAS account UUID
    repository  TEXT,       -- repository name (if applicable)
    digest      TEXT,       -- affected digest (if applicable)
    ip_address  TEXT,
    details     TEXT        -- JSON blob; never contains secrets
);

CREATE INDEX idx_audit_time  ON audit_log (event_time);
CREATE INDEX idx_audit_actor ON audit_log (actor_id);
CREATE INDEX idx_audit_event ON audit_log (event_type);

Schema Notes

  • Repositories are created implicitly on first push. No explicit creation step required.
  • Repository names may contain / for organizational grouping (e.g., production/myapp) but are otherwise flat strings. No hierarchical enforcement.
  • The manifest_blobs join table enables content-addressed deduplication: the same layer blob may be referenced by manifests in different repositories.
  • Manifest deletion cascades to manifest_blobs rows and to any tags pointing at the deleted manifest (ON DELETE CASCADE on both manifest_blobs.manifest_id and tags.manifest_id). Blob files on the filesystem are not deleted — unreferenced blobs are reclaimed by garbage collection.
  • Tag updates are atomic: pushing a manifest with an existing tag name updates the manifest_id in a single transaction.

9. Garbage Collection

Garbage collection removes unreferenced blobs — blobs that are not referenced by any manifest. GC is a manual process triggered by an administrator.

Algorithm

GC runs in two phases to maintain consistency across a crash:

Phase 1 — Mark and sweep (database)
1. Acquire registry-wide GC lock (blocks new blob uploads).
2. Begin write transaction.
3. Find all blob rows in `blobs` with no corresponding row in
   `manifest_blobs` (unreferenced blobs). Record their digests.
4. Delete those rows from `blobs`.
5. Commit.

Phase 2 — File cleanup (filesystem)
6. For each digest recorded in step 3:
   a. Delete the file at /srv/mcr/layers/sha256/<prefix>/<hex>.
7. Remove empty prefix directories.
8. Release GC lock.

Crash safety: If the process crashes after phase 1 but before phase 2 completes, orphaned files remain on disk with no matching DB row. These are harmless — they consume space but are never served. A subsequent GC run or a filesystem reconciliation command (mcrctl gc --reconcile) cleans them up by scanning the layers directory and deleting files with no blobs row.
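
A sketch of the two pieces of logic involved, with in-memory stand-ins for the database tables. The function names `unreferenced` and `reconcile` are hypothetical, blob file names are taken to be the bare hex digest as in the layout shown earlier, and removal of empty prefix directories is left out.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// unreferenced returns the blob digests that no manifest references
// (phase 1, step 3), using in-memory stand-ins for the two tables.
func unreferenced(blobs []string, manifestBlobs map[string][]string) []string {
	used := make(map[string]bool)
	for _, digests := range manifestBlobs {
		for _, d := range digests {
			used[d] = true
		}
	}
	var out []string
	for _, d := range blobs {
		if !used[d] {
			out = append(out, d)
		}
	}
	sort.Strings(out)
	return out
}

// reconcile removes files under <layersDir>/sha256 whose name (the hex
// digest) has no row in known, and returns the names it removed.
func reconcile(layersDir string, known map[string]bool) ([]string, error) {
	var removed []string
	err := filepath.WalkDir(filepath.Join(layersDir, "sha256"),
		func(p string, d fs.DirEntry, err error) error {
			if err != nil {
				return err
			}
			if d.IsDir() || known[d.Name()] {
				return nil
			}
			if err := os.Remove(p); err != nil {
				return err
			}
			removed = append(removed, d.Name())
			return nil
		})
	return removed, err
}

// demoReconcile builds a throwaway layers directory holding one known and
// one orphaned file, then reconciles it.
func demoReconcile() ([]string, error) {
	root, err := os.MkdirTemp("", "mcr-gc-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, name := range []string{"aa11", "bb22"} {
		dir := filepath.Join(root, "sha256", name[:2])
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	return reconcile(root, map[string]bool{"aa11": true})
}

func main() {
	fmt.Println(unreferenced([]string{"aa11", "bb22"}, map[string][]string{"manifest-1": {"aa11"}}))
	fmt.Println(demoReconcile())
}
```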

Trigger Methods

  • CLI: mcrctl gc
  • REST: POST /v1/gc
  • gRPC: RegistryService.GarbageCollect

GC runs asynchronously. Status can be checked via GET /v1/gc/status or mcrctl gc status. Only one GC run may be active at a time.

Safety

GC acquires a registry-wide lock that blocks all new blob uploads until the lock is released at the end of phase 2 (step 8). Uploads already in progress when GC is requested are allowed to complete before the lock is acquired. This is a stop-the-world approach, acceptable at the target scale (single developer, several dozen repos). Pulls are not blocked.
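
These semantics line up closely with Go's sync.RWMutex: uploads hold the read side, GC takes the write side, and a waiting writer blocks new readers while existing readers finish. The sketch below is one possible shape, not MCR's gc package; `gcLock`, `BeginUpload`, and `RunGC` are hypothetical names, and it blocks new uploads rather than rejecting them, which is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// gcLock is the registry-wide lock: uploads share it, GC holds it exclusively.
type gcLock struct{ mu sync.RWMutex }

// BeginUpload blocks while GC is running or waiting, then returns a function
// the caller must invoke when the upload finishes.
func (l *gcLock) BeginUpload() (end func()) {
	l.mu.RLock()
	return l.mu.RUnlock
}

// RunGC waits for in-flight uploads to finish, then runs fn while new
// uploads are blocked.
func (l *gcLock) RunGC(fn func()) {
	l.mu.Lock()
	defer l.mu.Unlock()
	fn()
}

func main() {
	var l gcLock
	end := l.BeginUpload()
	done := make(chan struct{})
	go func() {
		l.RunGC(func() { fmt.Println("gc: running") })
		close(done)
	}()
	fmt.Println("upload: finishing")
	end() // GC can only start after this call
	<-done
}
```

Pulls never touch the lock, so they proceed regardless of GC state.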


10. Configuration

TOML format. Environment variable overrides via MCR_*.

[server]
listen_addr      = ":8443"           # HTTPS (OCI + admin REST)
grpc_addr        = ":9443"           # gRPC admin API (optional; omit to disable)
tls_cert         = "/srv/mcr/certs/cert.pem"
tls_key          = "/srv/mcr/certs/key.pem"
read_timeout     = "30s"             # HTTP read timeout
write_timeout    = "0s"              # HTTP write timeout; 0 = disabled for large
                                     # blob uploads (idle_timeout provides the safety net)
idle_timeout     = "120s"            # HTTP idle timeout
shutdown_timeout = "60s"             # Graceful shutdown drain period

[database]
path = "/srv/mcr/mcr.db"

[storage]
layers_path  = "/srv/mcr/layers"    # Blob storage root
uploads_path = "/srv/mcr/uploads"   # Upload staging directory
                                    # Must be on the same filesystem as layers_path

[mcias]
server_url   = "https://mcias.metacircular.net:8443"
ca_cert      = ""                   # Custom CA for MCIAS TLS
service_name = "mcr"
tags         = ["env:restricted"]

[web]
listen_addr = "127.0.0.1:8080"      # Web UI listen address
grpc_addr   = "127.0.0.1:9443"      # mcrsrv gRPC address for the web UI to connect to
ca_cert     = ""                     # CA cert for verifying mcrsrv gRPC TLS

[log]
level = "info"                       # debug, info, warn, error

Validation

Required fields are validated at startup. The server refuses to start if any are missing or if TLS certificate paths are invalid. storage.uploads_path and storage.layers_path must resolve to the same filesystem (verified at startup via os.Stat device ID comparison).
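
The device ID comparison might look like the following on Unix-like systems. This is a sketch: `deviceID` and `sameFilesystem` are hypothetical names, and the code relies on os.FileInfo.Sys() returning *syscall.Stat_t, so it does not build on Windows.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// deviceID returns the ID of the device holding path. Unix-only: it relies
// on os.FileInfo.Sys() returning *syscall.Stat_t.
func deviceID(path string) (uint64, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, fmt.Errorf("no device information available for %s", path)
	}
	return uint64(st.Dev), nil
}

// sameFilesystem reports whether a and b live on the same device.
func sameFilesystem(a, b string) (bool, error) {
	da, err := deviceID(a)
	if err != nil {
		return false, err
	}
	db, err := deviceID(b)
	if err != nil {
		return false, err
	}
	return da == db, nil
}

// demo creates sibling layers/ and uploads/ directories under a throwaway
// root and compares their devices.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "mcr-fs-")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)
	layers, uploads := filepath.Join(root, "layers"), filepath.Join(root, "uploads")
	for _, dir := range []string{layers, uploads} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return false, err
		}
	}
	return sameFilesystem(layers, uploads)
}

func main() {
	fmt.Println(demo())
}
```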

Timeout Notes

The HTTP write_timeout is disabled (0) by default because blob uploads can transfer hundreds of megabytes over slow connections. The idle_timeout serves as the safety net for stale connections. Operators may set a non-zero write_timeout if all clients are on fast local networks.


11. Web UI

Technology

Go html/template + htmx, embedded via //go:embed. The web UI is a separate binary (mcr-web) that communicates with mcrsrv via gRPC.

Pages

| Path | Description |
| --- | --- |
| `/login` | MCIAS login form |
| `/` | Dashboard (repository count, total size, recent pushes) |
| `/repositories` | Repository list with tag counts and sizes |
| `/repositories/{name}` | Repository detail: tags, manifests, layer list |
| `/repositories/{name}/manifests/{digest}` | Manifest detail: layers, config, size |
| `/policies` | Policy rule management (admin only): create, edit, delete |
| `/audit` | Audit log viewer (admin only) |

Security

  • CSRF protection via signed double-submit cookies on all mutating requests.
  • Session cookie: HttpOnly, Secure, SameSite=Strict.
  • All user input escaped by html/template.
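
One common shape for a signed double-submit token is an HMAC over a random nonce. The sketch below is an assumption about how that could be done, not a description of mcr-web: the function names are hypothetical, and binding the signature to a session ID is a design choice of the sketch that the text above does not state.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
	"strings"
)

// sign returns the base64url HMAC-SHA256 of the session ID and nonce.
func sign(key []byte, sessionID, nonce string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(sessionID + "." + nonce))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// newCSRFToken returns "<nonce>.<signature>". The same value is set as a
// cookie and embedded in each form (or sent in a request header).
func newCSRFToken(key []byte, sessionID string) (string, error) {
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	nonce := base64.RawURLEncoding.EncodeToString(raw)
	return nonce + "." + sign(key, sessionID, nonce), nil
}

// verifyCSRF checks that the cookie and submitted values agree and that the
// signature was produced with this key for this session.
func verifyCSRF(key []byte, sessionID, cookieToken, submittedToken string) bool {
	if subtle.ConstantTimeCompare([]byte(cookieToken), []byte(submittedToken)) != 1 {
		return false
	}
	nonce, sig, ok := strings.Cut(cookieToken, ".")
	if !ok {
		return false
	}
	return hmac.Equal([]byte(sig), []byte(sign(key, sessionID, nonce)))
}

func main() {
	key := []byte("demo-key-not-for-production")
	tok, err := newCSRFToken(key, "session-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(verifyCSRF(key, "session-1", tok, tok))      // true
	fmt.Println(verifyCSRF(key, "session-1", tok, "forged")) // false
	fmt.Println(verifyCSRF(key, "session-2", tok, tok))      // false
}
```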

12. CLI Tools

mcrsrv

The registry server. Cobra subcommands:

| Command | Description |
| --- | --- |
| server | Start the registry server |
| init | First-time setup (create directories, example config) |
| snapshot | Database backup via VACUUM INTO |

mcr-web

The web UI server. Communicates with mcrsrv via gRPC.

| Command | Description |
| --- | --- |
| server | Start the web UI server |

mcrctl

Admin CLI. Communicates with mcrsrv via REST or gRPC.

| Command | Description |
| --- | --- |
| status | Query server health |
| repo list | List repositories |
| repo delete `<name>` | Delete a repository |
| gc | Trigger garbage collection |
| gc status | Check GC status |
| policy list | List policy rules |
| policy create | Create a policy rule |
| policy update `<id>` | Update a policy rule |
| policy delete `<id>` | Delete a policy rule |
| audit tail [--n N] | Print recent audit events |
| snapshot | Trigger database backup via VACUUM INTO |

13. Package Structure

mcr/
├── cmd/
│   ├── mcrsrv/           # Server binary: OCI + REST + gRPC
│   ├── mcr-web/          # Web UI binary
│   └── mcrctl/           # Admin CLI
├── internal/
│   ├── auth/             # MCIAS integration: token validation, 30s cache
│   ├── config/           # TOML config loading and validation
│   ├── db/               # SQLite: migrations, CRUD for all tables
│   ├── oci/              # OCI Distribution Spec handler: manifests, blobs, uploads
│   ├── policy/           # Registry policy engine: rules, evaluation, defaults
│   ├── server/           # REST API: admin routes, middleware, chi router
│   ├── grpcserver/       # gRPC admin API: interceptors, service handlers
│   ├── webserver/        # Web UI: template routes, htmx handlers
│   ├── storage/          # Blob filesystem operations: write, read, delete, GC
│   └── gc/               # Garbage collection: mark, sweep, locking
├── proto/mcr/
│   └── v1/               # Protobuf definitions
├── gen/mcr/
│   └── v1/               # Generated gRPC code (never edit by hand)
├── web/
│   ├── embed.go          # //go:embed directive
│   ├── templates/        # HTML templates
│   └── static/           # CSS, htmx
├── deploy/
│   ├── docker/           # Docker Compose
│   ├── examples/         # Example config files
│   ├── scripts/          # Install script
│   └── systemd/          # systemd units and timers
└── docs/                 # Internal documentation

14. Deployment

Binary

Single static binary per component, built with CGO_ENABLED=0.

Container

Multi-stage Docker build:

  1. Builder: golang:1.25-alpine, static compilation with -trimpath -ldflags="-s -w".
  2. Runtime: alpine:3.21, non-root user (mcr), ports 8443/9443.

The /srv/mcr/ directory is a mounted volume containing the database, blob storage, and configuration.

systemd

File                Purpose
mcr.service         Registry server
mcr-web.service     Web UI
mcr-backup.service  Oneshot database backup
mcr-backup.timer    Daily backup timer (02:00 UTC, 5-minute jitter)

All units apply the security hardening required by the engineering standards (NoNewPrivileges=true, ProtectSystem=strict, ReadWritePaths=/srv/mcr, etc.).

Backup

Database backup via VACUUM INTO captures metadata only (repositories, manifests, tags, blobs table, policy rules, audit log). Blob data on the filesystem is not included. A complete backup requires both:

  1. mcrsrv snapshot (or mcrctl snapshot) — SQLite database backup.
  2. Filesystem-level copy of /srv/mcr/layers/ — blob data.

The database and blob directory must be backed up together. A database backup without the corresponding blob directory is usable (metadata is intact; missing blobs return 404 on pull) but incomplete. A blob directory without the database is useless (no way to map digests to repositories).
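
One way to confirm that a database snapshot and a blob copy belong together is to check that every digest recorded in the metadata has a file in the copied blob directory. The sketch below assumes a flat layout in which each blob is stored as a file named after its digest directly under the layers directory; the real on-disk layout may differ. The name missingBlobs and the digest values are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log"
	"os"
	"path/filepath"
)

// missingBlobs returns the digests that the metadata references but that have
// no file under layersDir. The flat "<layersDir>/<digest>" layout is an
// assumption made for this sketch.
func missingBlobs(layersDir string, digests []string) ([]string, error) {
	var missing []string
	for _, d := range digests {
		_, err := os.Stat(filepath.Join(layersDir, d))
		switch {
		case err == nil:
			// Blob file is present.
		case errors.Is(err, fs.ErrNotExist):
			missing = append(missing, d)
		default:
			return nil, fmt.Errorf("stat blob %s: %w", d, err)
		}
	}
	return missing, nil
}

// run builds a throwaway blob directory holding one of two referenced blobs
// and checks that the other one is reported as missing.
func run() error {
	dir, err := os.MkdirTemp("", "mcr-layers-")
	if err != nil {
		return fmt.Errorf("create temp dir: %w", err)
	}
	defer os.RemoveAll(dir)

	if err := os.WriteFile(filepath.Join(dir, "aaaa"), []byte("layer"), 0o600); err != nil {
		return fmt.Errorf("write blob: %w", err)
	}
	// In a real check the digest list would come from the blobs table of the snapshot.
	missing, err := missingBlobs(dir, []string{"aaaa", "bbbb"})
	if err != nil {
		return err
	}
	if len(missing) != 1 || missing[0] != "bbbb" {
		return fmt.Errorf("unexpected result: %v", missing)
	}
	fmt.Println("missing blobs:", missing)
	return nil
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}
```

A non-empty result corresponds to the "usable but incomplete" case described above: pulls of the listed digests would return 404 after a restore.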

Graceful Shutdown

On SIGINT or SIGTERM:

  1. Stop accepting new connections.
  2. Drain in-flight requests (including ongoing uploads) up to shutdown_timeout (default 60s).
  3. Force-close remaining connections.
  4. Close database.
  5. Exit.
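
The shutdown sequence above maps fairly directly onto the standard library's http.Server. The following is a minimal sketch, not the actual mcrsrv code: the helper names shutdown and serve, the fakeDB type, and the 200 ms demo timeout are assumptions made for illustration, and only the HTTP listener is covered (a gRPC listener would need its own drain step).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// fakeDB stands in for the SQLite handle (hypothetical; any io.Closer works).
type fakeDB struct{ closed bool }

func (d *fakeDB) Close() error { d.closed = true; return nil }

// shutdown stops accepting connections and drains in-flight requests (steps 1-2),
// force-closes what is left if the timeout expires (step 3), then closes the
// database (step 4).
func shutdown(srv *http.Server, db io.Closer, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var errs []error
	if err := srv.Shutdown(ctx); err != nil {
		errs = append(errs, fmt.Errorf("drain: %w", err))
		if cerr := srv.Close(); cerr != nil {
			errs = append(errs, fmt.Errorf("force close: %w", cerr))
		}
	}
	if err := db.Close(); err != nil {
		errs = append(errs, fmt.Errorf("close database: %w", err))
	}
	return errors.Join(errs...)
}

// serve runs srv on ln until ctx is cancelled, then performs the shutdown sequence.
func serve(ctx context.Context, srv *http.Server, ln net.Listener, db io.Closer, timeout time.Duration) error {
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	select {
	case err := <-serveErr:
		// Serve failed before any signal arrived; still release the database.
		return errors.Join(fmt.Errorf("serve: %w", err), db.Close())
	case <-ctx.Done():
		return shutdown(srv, db, timeout)
	}
}

func run() error {
	// Cancelled on SIGINT or SIGTERM.
	sigCtx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()
	// Demo only: also cancel after 200 ms so the example ends without a signal.
	ctx, cancel := context.WithTimeout(sigCtx, 200*time.Millisecond)
	defer cancel()

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return fmt.Errorf("listen: %w", err)
	}
	db := &fakeDB{}
	srv := &http.Server{Handler: http.NotFoundHandler()}
	if err := serve(ctx, srv, ln, db, 60*time.Second); err != nil {
		return err
	}
	if !db.closed {
		return errors.New("database was not closed")
	}
	fmt.Println("shutdown complete")
	return nil
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}
```

Closing the database last means draining uploads can still commit their metadata before the process exits (step 5).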

15. Audit Events

Event                Trigger
manifest_pushed      Manifest uploaded (includes repo, tag, digest)
manifest_pulled      Manifest downloaded
manifest_deleted     Manifest deleted
blob_uploaded        Blob upload completed
blob_deleted         Blob deleted
repo_deleted         Repository deleted (admin)
gc_started           Garbage collection started
gc_completed         Garbage collection finished (includes blobs removed, bytes freed)
policy_rule_created  Policy rule created
policy_rule_updated  Policy rule updated
policy_rule_deleted  Policy rule deleted
policy_deny          Policy engine denied a request
login_ok             Successful authentication
login_fail           Failed authentication

The audit log is append-only. It never contains credentials or token values.


16. Error Handling and Logging

  • All errors are wrapped with fmt.Errorf("context: %w", err).
  • Structured logging uses log/slog (or the goutils wrapper).
  • Log levels: DEBUG (dev only), INFO (normal ops), WARN (recoverable), ERROR (unexpected failures).
  • OCI operations (push, pull, delete) are logged at INFO with: {event, repository, reference, digest, account_uuid, ip, duration}.
  • Never log: bearer tokens, passwords, blob content, MCIAS credentials.
  • OCI endpoint errors use the OCI error format (see §6). Admin endpoint errors use the platform {"error": "..."} format.
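
The conventions above can be illustrated with a short sketch: an error wrapped with %w that remains matchable via errors.Is, an INFO log line carrying the listed fields, and an admin error body in the {"error": "..."} shape. The names errBlobNotFound, openBlob, logOCIOp, and writeAdminError, along with all sample values, are hypothetical and not taken from the MCR code base.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// errBlobNotFound is a hypothetical sentinel error.
var errBlobNotFound = errors.New("blob not found")

// openBlob always fails here; it shows wrapping with context while keeping the cause.
func openBlob(digest string) error {
	return fmt.Errorf("open blob %s: %w", digest, errBlobNotFound)
}

// logOCIOp emits one INFO record with the fields listed above.
// Tokens, passwords, and blob content are deliberately not parameters.
func logOCIOp(l *slog.Logger, event, repo, ref, digest, account, ip string, d time.Duration) {
	l.Info("oci operation",
		slog.String("event", event),
		slog.String("repository", repo),
		slog.String("reference", ref),
		slog.String("digest", digest),
		slog.String("account_uuid", account),
		slog.String("ip", ip),
		slog.Duration("duration", d),
	)
}

// writeAdminError writes the platform {"error": "..."} body used by admin endpoints.
func writeAdminError(w http.ResponseWriter, status int, msg string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(map[string]string{"error": msg})
}

func run() error {
	if err := openBlob("sha256:abc"); !errors.Is(err, errBlobNotFound) {
		return fmt.Errorf("wrapped error lost its cause: %v", err)
	}

	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logOCIOp(logger, "manifest_pulled", "platform/api", "v1", "sha256:abc",
		"hypothetical-uuid", "192.0.2.10", 12*time.Millisecond)
	if !strings.Contains(buf.String(), `"event":"manifest_pulled"`) {
		return fmt.Errorf("log record missing event field: %s", buf.String())
	}

	rec := httptest.NewRecorder()
	writeAdminError(rec, http.StatusNotFound, "repository not found")
	if got := strings.TrimSpace(rec.Body.String()); got != `{"error":"repository not found"}` {
		return fmt.Errorf("unexpected admin error body: %s", got)
	}
	fmt.Print(buf.String())
	return nil
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}
```

Passing only the allowed fields into the logging helper is one simple way to keep tokens and credentials out of log output by construction.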

17. Security Model

Threat Mitigations

Threat                          Mitigation
Unauthenticated access          All endpoints require MCIAS bearer token; env:restricted tag blocks guest/viewer login
Unauthorized push/delete        Policy engine enforces per-principal, per-repository access; default-deny for system accounts
Digest mismatch (supply chain)  All uploads verified against client-supplied digest; rejected if mismatch
Blob corruption                 Content-addressed storage; digests verified on write. Periodic integrity scrub via mcrctl scrub (future)
Upload resource exhaustion      Stale uploads expire and are cleaned up; GC reclaims orphaned data
Information leakage             Error responses follow OCI spec format; no internal details exposed
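
The digest-mismatch mitigation can be sketched as hashing the upload while it is streamed to storage and comparing the result with the client-supplied digest. This is an illustrative sketch only: verifyUpload and errDigestMismatch are hypothetical names, only sha256 digests are handled, and a real implementation would likely write to a temporary file and publish it only after the check passes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"log"
	"strings"
)

// errDigestMismatch is a hypothetical sentinel for rejected uploads.
var errDigestMismatch = errors.New("digest mismatch")

// verifyUpload copies body to dst while hashing it, then compares the computed
// SHA-256 with the client-supplied digest in "sha256:<hex>" form.
func verifyUpload(dst io.Writer, body io.Reader, want string) error {
	wantHex, ok := strings.CutPrefix(want, "sha256:")
	if !ok {
		return fmt.Errorf("unsupported digest %q", want)
	}
	h := sha256.New()
	// MultiWriter hashes exactly the bytes that are written to storage.
	if _, err := io.Copy(io.MultiWriter(dst, h), body); err != nil {
		return fmt.Errorf("copy upload: %w", err)
	}
	got := hex.EncodeToString(h.Sum(nil))
	if got != wantHex {
		return fmt.Errorf("%w: got sha256:%s, want %s", errDigestMismatch, got, want)
	}
	return nil
}

func run() error {
	payload := "layer bytes"
	sum := sha256.Sum256([]byte(payload))
	good := "sha256:" + hex.EncodeToString(sum[:])

	// Matching digest: the upload is accepted.
	if err := verifyUpload(io.Discard, strings.NewReader(payload), good); err != nil {
		return fmt.Errorf("valid upload rejected: %w", err)
	}

	// Wrong digest: the upload is rejected with errDigestMismatch.
	bad := "sha256:" + strings.Repeat("0", 64)
	err := verifyUpload(io.Discard, strings.NewReader(payload), bad)
	if !errors.Is(err, errDigestMismatch) {
		return fmt.Errorf("expected digest mismatch, got %v", err)
	}
	fmt.Println("rejected:", err)
	return nil
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}
```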

Security Invariants

  1. Every API request (OCI and admin) requires a valid MCIAS bearer token.
  2. Token validation results are cached for at most 30 seconds.
  3. System accounts have no implicit access — explicit policy rules required.
  4. Blob digests are verified on upload; mismatched digests are rejected. Reads trust the content-addressed path (digest is the filename).
  5. Manifest deletion by tag is not supported — only by digest (OCI spec).
  6. The audit log never contains credentials, tokens, or blob content.
  7. TLS 1.3 minimum on all listeners. No fallback.
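
Invariant 2 bounds how long a revoked token can remain usable. A minimal sketch of such a bounded cache follows; the types tokenCache and claims, the injected clock, and the choice to key entries by a SHA-256 hash of the token (so raw tokens are not held as map keys) are assumptions made for illustration, not a description of the internal/auth package.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
	"log"
	"sync"
	"time"
)

// maxTokenCacheTTL is the upper bound stated in invariant 2.
const maxTokenCacheTTL = 30 * time.Second

// claims is a hypothetical stand-in for whatever MCIAS returns on validation.
type claims struct{ AccountUUID string }

type cacheEntry struct {
	claims  claims
	expires time.Time
}

// tokenCache stores validation results keyed by a hash of the token.
type tokenCache struct {
	mu      sync.Mutex
	entries map[[sha256.Size]byte]cacheEntry
	now     func() time.Time // injectable clock, useful for tests
}

func newTokenCache(now func() time.Time) *tokenCache {
	return &tokenCache{entries: make(map[[sha256.Size]byte]cacheEntry), now: now}
}

// put records a successful validation; it expires after maxTokenCacheTTL.
func (c *tokenCache) put(token string, cl claims) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[sha256.Sum256([]byte(token))] = cacheEntry{
		claims:  cl,
		expires: c.now().Add(maxTokenCacheTTL),
	}
}

// get returns cached claims unless the entry is absent or has expired.
func (c *tokenCache) get(token string) (claims, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := sha256.Sum256([]byte(token))
	e, ok := c.entries[key]
	if !ok {
		return claims{}, false
	}
	if !c.now().Before(e.expires) {
		delete(c.entries, key) // expired: the caller must revalidate with MCIAS
		return claims{}, false
	}
	return e.claims, true
}

func run() error {
	clock := time.Unix(0, 0)
	cache := newTokenCache(func() time.Time { return clock })
	cache.put("example-token", claims{AccountUUID: "hypothetical-uuid"})

	clock = clock.Add(29 * time.Second)
	if _, ok := cache.get("example-token"); !ok {
		return errors.New("entry should still be cached after 29s")
	}
	clock = clock.Add(1 * time.Second)
	if _, ok := cache.get("example-token"); ok {
		return errors.New("entry should have expired at 30s")
	}
	fmt.Println("cache entries expire after", maxTokenCacheTTL)
	return nil
}

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}
```

On a cache miss the caller would validate the token against MCIAS and call put only on success, so failed validations are never cached.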

18. Future Work

Item                           Description
Image signing / content trust  Cosign or Notary v2 integration for image verification
Multi-arch manifests           OCI image index support for multi-platform images
Cross-repo blob mounts         POST /v2/<name>/blobs/uploads/?mount=<digest>&from=<other> for efficient cross-repo copies
MCP integration                Wire MCR into the Metacircular Control Plane for automated image deployment
Upload expiry                  Automatic cleanup of stale uploads after configurable TTL
Repository size quotas         Per-repository storage limits
Webhook notifications          Push events to external systems on manifest push/delete
Integrity scrub                mcrctl scrub — verify blob digests on disk match their filenames, report corruption
Metrics                        Prometheus-compatible metrics: push/pull counts, storage usage, request latency