# MCR Architecture
Metacircular Container Registry — Technical Design Document
---
## 1. System Overview
MCR is an OCI Distribution Spec-compliant container registry for Metacircular
Dynamics. It stores and serves container images for the platform's services,
with MCP directing nodes to pull images from MCR. Authentication is delegated
to MCIAS; all operations require a valid bearer token. MCR sits behind an
mc-proxy instance for TLS routing.
### Components
```
┌──────────────────────────────────────────────┐
│ MCR Server (mcrsrv) │
│ │
│ ┌────────────┐ ┌──────────┐ ┌──────────┐ │
│ │ OCI API │ │ Auth │ │ Policy │ │
│ │ Handler │ │ (MCIAS) │ │ Engine │ │
│ └─────┬──────┘ └────┬─────┘ └────┬─────┘ │
│ └──────────────┼─────────────┘ │
│ │ │
│ ┌─────────────▼────────────┐ │
│ │ SQLite (metadata) │ │
│ └──────────────────────────┘ │
│ ┌─────────────────────────┐ │
│ │ Filesystem (blobs) │ │
│ │ /srv/mcr/layers/ │ │
│ └──────────────────────────┘ │
│ │
│ ┌─────────────────┐ ┌──────────────────┐ │
│ │ REST listener │ │ gRPC listener │ │
│ │ (OCI + admin) │ │ (admin) │ │
│ │ :8443 │ │ :9443 │ │
│ └─────────────────┘ └──────────────────┘ │
└──────────────────────────────────────────────┘
▲ ▲ ▲
│ │ │
┌────┴───┐ ┌─────┴─────┐ ┌───┴──────┐
│ Docker │ │ mcrctl │ │ mcr-web │
│ / OCI │ │ (admin │ │ (web UI) │
│ client │ │ CLI) │ │ │
└────────┘ └───────────┘ └──────────┘
```
**mcrsrv** — The registry server. Exposes OCI Distribution endpoints and
an admin REST API over HTTPS/TLS, plus a gRPC admin API. Handles blob
storage, manifest management, and token-based authentication via MCIAS.
**mcr-web** — The web UI. Communicates with mcrsrv via gRPC. Provides
repository/tag browsing and ACL policy management for administrators.
**mcrctl** — The admin CLI. Communicates with mcrsrv via REST or gRPC.
Provides garbage collection, repository management, and policy management.
---
## 2. OCI Distribution Spec Compliance
MCR implements the OCI Distribution Specification for content discovery and
content management. All OCI endpoints require authentication — there is no
anonymous access.
### Supported Operations
| Category | Capability |
|----------|-----------|
| Content discovery | Repository catalog, tag listing |
| Pull | Manifest retrieval (by tag or digest), blob download |
| Push | Monolithic and chunked blob upload, manifest upload |
| Delete | Manifest deletion (by digest), blob deletion |
### Not Supported (v1)
| Feature | Rationale |
|---------|-----------|
| Multi-arch manifest lists | Not needed for single-platform deployment |
| Image signing / content trust | Deferred to future work |
| Cross-repository blob mounts | Complexity not justified at current scale |
### Content Addressing
All blobs and manifests are identified by their SHA-256 digest in the format
`sha256:<hex>`. Digests are verified on upload — if the computed digest does
not match the client-supplied digest, the upload is rejected.
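
The digest check described above can be sketched as follows. This is a minimal illustration, not MCR's actual code: the function names `computeDigest` and `verifyDigest` are hypothetical, and a real server would hash the upload as a stream rather than hold the whole blob in memory.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// computeDigest returns the SHA-256 digest of data in "sha256:<hex>" form.
func computeDigest(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// verifyDigest returns an error when the client-supplied digest differs
// from the digest computed over the received bytes.
func verifyDigest(data []byte, supplied string) error {
	if got := computeDigest(data); got != supplied {
		return fmt.Errorf("digest mismatch: computed %s, supplied %s", got, supplied)
	}
	return nil
}

func main() {
	blob := []byte("hello")
	digest := computeDigest(blob)
	fmt.Println(digest)
	fmt.Println(verifyDigest(blob, digest) == nil)      // matching digest is accepted
	fmt.Println(verifyDigest(blob, "sha256:00") != nil) // mismatch is rejected
}
```
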
Tags are mutable pointers to manifest digests. Pushing a manifest with an
existing tag atomically updates the tag to point to the new digest.
---
## 3. Authentication
MCR delegates all authentication to MCIAS and keeps no local user database.
### MCIAS Configuration
MCR registers with MCIAS as a service with the `env:restricted` tag. This
means MCIAS denies login to `guest` and `viewer` accounts — only `admin`
and `user` roles can authenticate to MCR.
```toml
[mcias]
server_url = "https://mcias.metacircular.net:8443"
service_name = "mcr"
tags = ["env:restricted"]
```
### OCI Token Authentication
OCI/Docker clients expect a specific auth handshake:
```
Client mcrsrv
│ │
├─ GET /v2/ ────────────────────▶│
│◀─ 401 WWW-Authenticate: │
│ Bearer realm="/v2/token", │
│ service="mcr.metacircular.…" │
│ │
├─ GET /v2/token ───────────────▶│ (Basic auth: username:password)
│ ├─ Forward credentials to MCIAS
│ │ POST /v1/auth/login
│ ├─ MCIAS returns JWT
│◀─ 200 {"token":"<jwt>", │
│ "expires_in": N} │
│ │
├─ GET /v2/<name>/manifests/… ──▶│ (Authorization: Bearer <jwt>)
│ ├─ Validate token via MCIAS
│ ├─ Check policy engine
│◀─ 200 (manifest) │
```
**Token endpoint** (`GET /v2/token`): Accepts HTTP Basic auth, forwards
credentials to MCIAS `/v1/auth/login`, and returns the MCIAS JWT in the
Docker-compatible token response format. The `scope` parameter is accepted
but not used for token scoping — authorization is enforced per-request by
the policy engine.
**Direct bearer tokens**: MCIAS service tokens and pre-authenticated JWTs
are accepted directly on all OCI endpoints via the `Authorization: Bearer`
header. This allows system accounts and CLI tools to skip the token endpoint.
**Token validation**: Every request validates the bearer token by calling
MCIAS `ValidateToken()`. Results are cached for 30 seconds, keyed by the
SHA-256 hash of the token, so repeated requests with the same token do not
each trigger an MCIAS round trip.
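
A validation cache of this kind might look like the sketch below. Everything except the SHA-256 key and the 30-second TTL is an assumption made for illustration: `Claims`, `tokenCache`, and the `validate` callback (standing in for the MCIAS call) are hypothetical names, and expired entries are never evicted here.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
	"time"
)

// defaultTTL mirrors the 30-second TTL stated in the text.
const defaultTTL = 30 * time.Second

// Claims is a hypothetical stand-in for the data MCIAS returns for a token.
type Claims struct {
	Subject string
	Roles   []string
}

type cacheEntry struct {
	claims  Claims
	expires time.Time
}

// tokenCache memoizes successful validations, keyed by the SHA-256 of the
// token so that raw token strings are not retained as map keys.
type tokenCache struct {
	mu       sync.Mutex
	ttl      time.Duration
	entries  map[[sha256.Size]byte]cacheEntry
	validate func(token string) (Claims, error) // stands in for the MCIAS call
}

func newTokenCache(ttl time.Duration, validate func(string) (Claims, error)) *tokenCache {
	return &tokenCache{
		ttl:      ttl,
		entries:  make(map[[sha256.Size]byte]cacheEntry),
		validate: validate,
	}
}

// Validate returns cached claims while the entry is fresh and otherwise
// calls the remote validator. Failures are not cached in this sketch.
func (c *tokenCache) Validate(token string) (Claims, error) {
	key := sha256.Sum256([]byte(token))
	c.mu.Lock()
	entry, ok := c.entries[key]
	c.mu.Unlock()
	if ok && time.Now().Before(entry.expires) {
		return entry.claims, nil
	}
	claims, err := c.validate(token)
	if err != nil {
		return Claims{}, err
	}
	c.mu.Lock()
	c.entries[key] = cacheEntry{claims: claims, expires: time.Now().Add(c.ttl)}
	c.mu.Unlock()
	return claims, nil
}

func main() {
	calls := 0
	cache := newTokenCache(defaultTTL, func(string) (Claims, error) {
		calls++
		return Claims{Subject: "example-uuid", Roles: []string{"user"}}, nil
	})
	cache.Validate("token-a")
	cache.Validate("token-a") // served from the cache
	fmt.Println("remote validations:", calls)
}
```

One consequence of any such cache is that a token revoked in MCIAS can remain usable until its cache entry expires.
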
---
## 4. Authorization & Policy Engine
### Role Model
| Role | Access |
|------|--------|
| `admin` | Full access: push, pull, delete, catalog, policy management, GC |
| `user` | Full content access: push, pull, delete, catalog |
| System account | Default deny; requires explicit policy rule for any operation |
Admin detection is based solely on the MCIAS `admin` role. Human users with
the `user` role have full content management access. System accounts have no
implicit access and must be granted specific permissions via policy rules.
### Policy Engine
MCR implements a local policy engine for registry-specific access control,
following the same architecture as MCIAS's policy engine (priority-based,
deny-wins, default-deny). The engine is an in-process Go package
(`internal/policy`) with no external dependencies.
#### Actions
| Action | Description |
|--------|-------------|
| `registry:version_check` | OCI version check (`GET /v2/`) |
| `registry:pull` | Read manifests, download blobs, list tags for a repository |
| `registry:push` | Upload blobs and push manifests/tags |
| `registry:delete` | Delete manifests and blobs |
| `registry:catalog` | List all repositories (`GET /v2/_catalog`) |
| `policy:manage` | Create, update, delete policy rules (admin only) |
#### Policy Input
```go
type PolicyInput struct {
	Subject     string   // MCIAS account UUID
	AccountType string   // "human" or "system"
	Roles       []string // roles from MCIAS JWT
	Action      Action
	Repository  string // target repository name (e.g., "myapp");
	// empty for global operations (catalog, version check)
}
```
#### Rule Structure
```go
type Rule struct {
	ID          int64
	Priority    int // lower = evaluated first
	Description string
	Effect      Effect // "allow" or "deny"

	// Principal conditions (all populated fields ANDed)
	Roles        []string // principal must hold at least one
	AccountTypes []string // "human", "system", or both
	SubjectUUID  string   // exact principal UUID

	// Action condition
	Actions []Action

	// Resource condition
	Repositories []string // repository name patterns (glob via path.Match).
	// Examples:
	//   "myapp"        : exact match
	//   "production/*" : any repo one level under production/
	// Empty list = wildcard (matches all repositories).
}
```
#### Evaluation Algorithm
```
1. Merge built-in defaults with operator-defined rules
2. Sort by Priority ascending (stable)
3. Collect all matching rules
4. If any match has Effect=Deny → return Deny (deny-wins)
5. If any match has Effect=Allow → return Allow
6. Return Deny (default-deny)
```
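The six steps above can be sketched in Go as follows. This is a reduced illustration rather than the `internal/policy` package itself: the `Rule` and `Input` types here carry only roles and actions, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

type Effect string

const (
	Allow Effect = "allow"
	Deny  Effect = "deny"
)

// Rule is a reduced form of the rule structure: only roles and actions.
type Rule struct {
	Priority int
	Effect   Effect
	Roles    []string // empty = any role
	Actions  []string // empty = any action
}

type Input struct {
	Roles  []string
	Action string
}

func contains(list []string, v string) bool {
	for _, item := range list {
		if item == v {
			return true
		}
	}
	return false
}

// matches reports whether every populated field of r is satisfied by in.
func (r Rule) matches(in Input) bool {
	if len(r.Actions) > 0 && !contains(r.Actions, in.Action) {
		return false
	}
	if len(r.Roles) == 0 {
		return true
	}
	for _, role := range in.Roles {
		if contains(r.Roles, role) {
			return true
		}
	}
	return false
}

// Evaluate follows the six numbered steps of the algorithm.
func Evaluate(defaults, operator []Rule, in Input) Effect {
	rules := append(append([]Rule{}, defaults...), operator...) // step 1
	sort.SliceStable(rules, func(i, j int) bool {               // step 2
		return rules[i].Priority < rules[j].Priority
	})
	allowed := false
	for _, r := range rules { // step 3
		if !r.matches(in) {
			continue
		}
		if r.Effect == Deny {
			return Deny // step 4: a single matching deny decides the outcome
		}
		allowed = true
	}
	if allowed {
		return Allow // step 5
	}
	return Deny // step 6
}

func main() {
	defaults := []Rule{{Priority: 0, Effect: Allow, Roles: []string{"admin"}}}
	operator := []Rule{{Priority: 10, Effect: Deny, Actions: []string{"registry:delete"}}}
	fmt.Println(Evaluate(defaults, operator, Input{Roles: []string{"admin"}, Action: "registry:pull"}))
	fmt.Println(Evaluate(defaults, operator, Input{Roles: []string{"admin"}, Action: "registry:delete"}))
	fmt.Println(Evaluate(defaults, operator, Input{Roles: []string{"user"}, Action: "registry:pull"}))
}
```

In this sketch the sort only fixes the order in which rules are visited. Because any matching deny wins, the decision itself does not depend on priority.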
A rule matches when every populated field satisfies its condition:
| Field | Match condition |
|-------|----------------|
| `Roles` | Principal holds at least one of the listed roles |
| `AccountTypes` | Principal's account type is in the list |
| `SubjectUUID` | Principal UUID equals exactly |
| `Actions` | Request action is in the list |
| `Repositories` | Request repository matches at least one pattern; empty list is a wildcard (matches all repositories) |
Repository glob matching uses `path.Match` semantics: `*` matches any
sequence of non-`/` characters within a single path segment. For example,
`production/*` matches `production/myapp` but not `production/team/myapp`.
An empty `Repositories` field is a wildcard (matches all repositories).
When `PolicyInput.Repository` is empty (global operations like catalog),
only rules with an empty `Repositories` field match — a rule scoped to
specific repositories does not apply to global operations.
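
The repository condition, including the special case for global operations, can be expressed with the standard library's `path.Match`. The function name `repoMatches` is hypothetical, and treating a malformed pattern as a non-match is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"path"
)

// repoMatches applies the repository condition of a rule to a request.
func repoMatches(patterns []string, repo string) bool {
	if len(patterns) == 0 {
		return true // empty list is a wildcard
	}
	if repo == "" {
		return false // global operations only match unscoped rules
	}
	for _, p := range patterns {
		// A malformed pattern yields an error and is treated as a non-match.
		if ok, err := path.Match(p, repo); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	prod := []string{"production/*"}
	fmt.Println(repoMatches(prod, "production/myapp"))      // one level under production/
	fmt.Println(repoMatches(prod, "production/team/myapp")) // two levels: "*" does not cross "/"
	fmt.Println(repoMatches(prod, ""))                      // global operation, scoped rule
	fmt.Println(repoMatches(nil, ""))                       // global operation, unscoped rule
}
```
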
#### Built-in Default Rules
```
Priority 0, Allow: roles=[admin], actions=<all>
— admin wildcard
Priority 0, Allow: roles=[user], accountTypes=[human],
actions=[registry:pull, registry:push, registry:delete, registry:catalog]
— human users have full content access
Priority 0, Allow: actions=[registry:version_check]
— /v2/ endpoint (always accessible to authenticated users)
```
System accounts have no built-in allow rules for catalog, push, pull, or
delete. An operator must create explicit policy rules granting them access.
#### Example: CI System Push Access (Glob)
Grant a CI system account permission to push and pull from all repositories
under the `production/` namespace:
```json
{
  "effect": "allow",
  "account_types": ["system"],
  "subject_uuid": "<ci-account-uuid>",
  "actions": ["registry:push", "registry:pull"],
  "repositories": ["production/*"],
  "priority": 50,
  "description": "CI system: push/pull to all production repos"
}
```
#### Example: Deploy Agent Pull-Only
Grant a deploy agent pull access to all repositories (empty `repositories`
= wildcard), but deny delete globally:
```json
{
  "effect": "allow",
  "subject_uuid": "<deploy-agent-uuid>",
  "actions": ["registry:pull"],
  "repositories": [],
  "priority": 50,
  "description": "deploy-agent: pull from any repo"
}
```
```json
{
  "effect": "deny",
  "subject_uuid": "<deploy-agent-uuid>",
  "actions": ["registry:delete"],
  "priority": 10,
  "description": "deploy-agent may never delete images (deny-wins)"
}
```
#### Example: Exact Repo Access
Grant a specific system account access to exactly two named repositories:
```json
{
  "effect": "allow",
  "subject_uuid": "<svc-account-uuid>",
  "actions": ["registry:push", "registry:pull"],
  "repositories": ["myapp", "infra-proxy"],
  "priority": 50,
  "description": "svc-account: push/pull to myapp and infra-proxy only"
}
```
### Policy Management
Policy rules are managed via the admin REST/gRPC API and the web UI. Only
users with the `admin` role can create, update, or delete policy rules.
---
## 5. Storage Design
MCR uses a split storage model: SQLite for metadata, filesystem for blobs.
### Blob Storage
Blobs (image layers and config objects) are stored as content-addressed files
under `/srv/mcr/layers/`:
```
/srv/mcr/layers/
└── sha256/
    ├── ab/
    │   └── abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890
    ├── cd/
    │   └── cdef...
    └── ef/
        └── ef01...
The two-character hex prefix directory limits the number of files per
directory. Blobs are written atomically: data is written to a temporary file
in `/srv/mcr/uploads/`, then renamed into place after digest verification.
The `uploads/` and `layers/` directories must reside on the same filesystem
for `rename(2)` to be atomic.
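
The path layout and the verify-then-rename step might be sketched as below. The function names (`blobPath`, `commitUpload`, `demo`) are hypothetical and the demo runs against a temporary directory, not `/srv/mcr`. Rejecting non-hex digests before building a path is an added precaution in this sketch, not something the text above specifies.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// blobPath maps "sha256:<hex>" to <root>/sha256/<first two hex chars>/<hex>.
func blobPath(layersRoot, digest string) (string, error) {
	hexPart := strings.TrimPrefix(digest, "sha256:")
	if hexPart == digest || len(hexPart) != sha256.Size*2 {
		return "", fmt.Errorf("unsupported digest %q", digest)
	}
	if _, err := hex.DecodeString(hexPart); err != nil {
		return "", fmt.Errorf("digest %q is not hex: %w", digest, err)
	}
	return filepath.Join(layersRoot, "sha256", hexPart[:2], hexPart), nil
}

// commitUpload hashes the staged file, compares the result with want, and
// renames the file into the blob store only when the digests agree.
func commitUpload(staged, layersRoot, want string) error {
	f, err := os.Open(staged)
	if err != nil {
		return err
	}
	h := sha256.New()
	_, err = io.Copy(h, f)
	f.Close()
	if err != nil {
		return err
	}
	if got := "sha256:" + hex.EncodeToString(h.Sum(nil)); got != want {
		return fmt.Errorf("digest mismatch: computed %s", got)
	}
	dst, err := blobPath(layersRoot, want)
	if err != nil {
		return err
	}
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return err
	}
	return os.Rename(staged, dst) // atomic only within one filesystem
}

// demo stages a small blob in a temporary directory and commits it.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "mcr-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	uploads, layers := filepath.Join(root, "uploads"), filepath.Join(root, "layers")
	if err := os.MkdirAll(uploads, 0o755); err != nil {
		return "", err
	}
	staged := filepath.Join(uploads, "example-upload")
	data := []byte("layer bytes")
	if err := os.WriteFile(staged, data, 0o644); err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	digest := "sha256:" + hex.EncodeToString(sum[:])
	if err := commitUpload(staged, layers, digest); err != nil {
		return "", err
	}
	dst, _ := blobPath(layers, digest)
	_, err = os.Stat(dst)
	return digest, err
}

func main() {
	fmt.Println(demo())
}
```
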
### Upload Staging
In-progress blob uploads are stored in `/srv/mcr/uploads/`:
```
/srv/mcr/uploads/
└── <upload-uuid>
```
Each upload UUID corresponds to a row in the `uploads` table tracking the
current byte offset. Completed uploads are renamed to the blob store;
cancelled or expired uploads are cleaned up.
### Manifest Storage
Manifests are small JSON documents stored directly in the SQLite database
(in the `manifests` table `content` column). This simplifies metadata queries
and avoids filesystem overhead for small objects.
### Manifest Push Flow
When a client calls `PUT /v2/<name>/manifests/<reference>`:
```
1. Parse the manifest JSON. Reject if malformed or unsupported media type.
2. Compute the SHA-256 digest of the raw manifest bytes.
3. If <reference> is a digest, verify it matches the computed digest.
Reject with DIGEST_INVALID if mismatch.
4. Parse the manifest's layer and config descriptors.
5. Verify every referenced blob exists in the `blobs` table.
Reject with MANIFEST_BLOB_UNKNOWN if any are missing.
6. Begin write transaction:
a. Create the repository row if it does not exist (implicit creation).
b. Insert or update the manifest row (repository_id, digest, content).
c. Populate `manifest_blobs` join table for all referenced blobs.
d. If <reference> is a tag name, insert or update the tag row
to point to the new manifest (atomic tag move).
7. Commit.
8. Return 201 Created with `Docker-Content-Digest: <digest>` header.
```
This is the most complex write path in the system. Validation (steps 1-5)
happens before any write; all database writes (step 6) then execute in a
single SQLite transaction, so a crash at any point leaves the database
consistent.
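
Steps 1, 2, 4 and 5 of the flow can be sketched without a database. The struct fields follow the OCI image manifest layout (`config` and `layers` descriptors); the function names are hypothetical, a map stands in for the `blobs` table, and the media type check from step 1 is omitted because the set of accepted media types is not listed here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

type descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

type manifest struct {
	SchemaVersion int          `json:"schemaVersion"`
	MediaType     string       `json:"mediaType"`
	Config        descriptor   `json:"config"`
	Layers        []descriptor `json:"layers"`
}

// inspectManifest parses the manifest, computes the digest over the raw
// bytes (not over re-encoded JSON), and lists every referenced blob digest.
func inspectManifest(raw []byte) (string, []string, error) {
	var m manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", nil, fmt.Errorf("MANIFEST_INVALID: %w", err)
	}
	sum := sha256.Sum256(raw)
	refs := []string{m.Config.Digest}
	for _, layer := range m.Layers {
		refs = append(refs, layer.Digest)
	}
	return "sha256:" + hex.EncodeToString(sum[:]), refs, nil
}

// missingBlobs returns the referenced digests absent from known, which
// stands in for a lookup against the blobs table.
func missingBlobs(refs []string, known map[string]bool) []string {
	var missing []string
	for _, d := range refs {
		if !known[d] {
			missing = append(missing, d)
		}
	}
	return missing
}

func main() {
	// Shortened placeholder digests, for illustration only.
	raw := []byte(`{"schemaVersion":2,"config":{"digest":"sha256:aaa"},"layers":[{"digest":"sha256:bbb"}]}`)
	digest, refs, err := inspectManifest(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(digest)
	fmt.Println("MANIFEST_BLOB_UNKNOWN for:", missingBlobs(refs, map[string]bool{"sha256:aaa": true}))
}
```
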
### Data Directory
```
/srv/mcr/
├── mcr.toml      Configuration
├── mcr.db        SQLite database (metadata)
├── certs/        TLS certificates
├── layers/       Content-addressed blob storage
│   └── sha256/
├── uploads/      In-progress blob uploads
└── backups/      Database snapshots
```
---
## 6. API Surface
### Error Response Formats
OCI and admin endpoints use different error formats:
**OCI endpoints** (`/v2/...`) follow the OCI Distribution Spec error format:
```json
{"errors": [{"code": "MANIFEST_UNKNOWN", "message": "...", "detail": "..."}]}
```
Standard OCI error codes used by MCR:
| Code | HTTP Status | Trigger |
|------|-------------|---------|
| `UNAUTHORIZED` | 401 | Missing or invalid bearer token |
| `DENIED` | 403 | Policy engine denied the request |
| `NAME_UNKNOWN` | 404 | Repository does not exist |
| `MANIFEST_UNKNOWN` | 404 | Manifest not found (by tag or digest) |
| `BLOB_UNKNOWN` | 404 | Blob not found |
| `MANIFEST_BLOB_UNKNOWN` | 400 | Manifest references a blob not yet uploaded |
| `DIGEST_INVALID` | 400 | Computed digest does not match supplied digest |
| `MANIFEST_INVALID` | 400 | Malformed or unsupported manifest |
| `BLOB_UPLOAD_UNKNOWN` | 404 | Upload UUID not found or expired |
| `BLOB_UPLOAD_INVALID` | 400 | Chunked upload byte range error |
| `UNSUPPORTED` | 405 | Operation not supported (e.g., cross-repo mount) |
**Admin endpoints** (`/v1/...`) use the platform-standard format:
```json
{"error": "human-readable message"}
```
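As an illustration of the two formats, the sketch below builds both response bodies with `encoding/json`. The helper names and the partial `ociStatus` map are hypothetical; the codes and statuses are taken from the table above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type ociError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
	Detail  string `json:"detail,omitempty"`
}

// ociStatus holds a few rows of the error code table.
var ociStatus = map[string]int{
	"UNAUTHORIZED":     http.StatusUnauthorized,
	"DENIED":           http.StatusForbidden,
	"MANIFEST_UNKNOWN": http.StatusNotFound,
	"DIGEST_INVALID":   http.StatusBadRequest,
}

// ociErrorBody renders the OCI Distribution error envelope.
func ociErrorBody(code, message string) []byte {
	body, _ := json.Marshal(struct {
		Errors []ociError `json:"errors"`
	}{Errors: []ociError{{Code: code, Message: message}}})
	return body
}

// adminErrorBody renders the platform-standard admin error format.
func adminErrorBody(message string) []byte {
	body, _ := json.Marshal(map[string]string{"error": message})
	return body
}

// writeOCIError sends the envelope with the status mapped from the code.
func writeOCIError(w http.ResponseWriter, code, message string) {
	status, ok := ociStatus[code]
	if !ok {
		status = http.StatusInternalServerError
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	w.Write(ociErrorBody(code, message))
}

func main() {
	rec := httptest.NewRecorder()
	writeOCIError(rec, "MANIFEST_UNKNOWN", "manifest not found")
	fmt.Println(rec.Code, rec.Body.String())
	fmt.Println(string(adminErrorBody("repository not found")))
}
```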
### OCI Distribution Endpoints
All OCI endpoints are prefixed with `/v2` and require authentication.
#### Version Check
| Method | Path | Description |
|--------|------|-------------|
| GET | `/v2/` | API version check; returns `{}` if authenticated |
#### Token
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v2/token` | Basic | Exchange credentials for bearer token via MCIAS |
#### Content Discovery
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v2/_catalog` | bearer | List repositories (paginated) |
| GET | `/v2/<name>/tags/list` | bearer | List tags for a repository (paginated) |
#### Manifests
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| HEAD | `/v2/<name>/manifests/<reference>` | bearer | Check manifest existence |
| GET | `/v2/<name>/manifests/<reference>` | bearer | Pull manifest by tag or digest |
| PUT | `/v2/<name>/manifests/<reference>` | bearer | Push manifest |
| DELETE | `/v2/<name>/manifests/<digest>` | bearer | Delete manifest (digest only) |
`<reference>` is either a tag name or a `sha256:...` digest.
All manifest responses (GET, HEAD, PUT) include the `Docker-Content-Digest`
header with the manifest's SHA-256 digest and a `Content-Type` header with
the manifest's media type.
#### Blobs
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| HEAD | `/v2/<name>/blobs/<digest>` | bearer | Check blob existence |
| GET | `/v2/<name>/blobs/<digest>` | bearer | Download blob |
| DELETE | `/v2/<name>/blobs/<digest>` | bearer | Delete blob |
#### Blob Uploads
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/v2/<name>/blobs/uploads/` | bearer | Initiate blob upload |
| GET | `/v2/<name>/blobs/uploads/<uuid>` | bearer | Check upload progress |
| PATCH | `/v2/<name>/blobs/uploads/<uuid>` | bearer | Upload chunk (chunked flow) |
| PUT | `/v2/<name>/blobs/uploads/<uuid>?digest=<digest>` | bearer | Complete upload with digest verification |
| DELETE | `/v2/<name>/blobs/uploads/<uuid>` | bearer | Cancel upload |
**Monolithic upload**: Client sends `POST` to initiate, then `PUT` with the
entire blob body and `digest` query parameter in a single request.
**Chunked upload**: Client sends `POST` to initiate, then one or more `PATCH`
requests with sequential byte ranges, then `PUT` with the final digest.
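
For the chunked flow, the server has to check each `PATCH` against the byte offset it has recorded for the upload. The sketch below assumes the chunk range arrives as `<start>-<end>` (inclusive on both ends), which is the `Content-Range` form used by the OCI Distribution Spec; how MCR parses the header is not stated here, and `nextOffset` and `errRange` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var errRange = errors.New("BLOB_UPLOAD_INVALID: chunk range not acceptable")

// nextOffset validates one chunk against the upload's current byte offset
// and returns the offset to store after the chunk has been appended.
func nextOffset(current int64, contentRange string, bodyLen int64) (int64, error) {
	startStr, endStr, ok := strings.Cut(contentRange, "-")
	if !ok {
		return current, errRange
	}
	start, err1 := strconv.ParseInt(startStr, 10, 64)
	end, err2 := strconv.ParseInt(endStr, 10, 64)
	if err1 != nil || err2 != nil {
		return current, errRange
	}
	// Chunks must be sequential and the range must agree with the body size.
	if start != current || end < start || end-start+1 != bodyLen {
		return current, errRange
	}
	return end + 1, nil
}

func main() {
	offset, err := nextOffset(0, "0-1023", 1024)
	fmt.Println(offset, err)
	offset, err = nextOffset(offset, "1024-2047", 1024)
	fmt.Println(offset, err)
	_, err = nextOffset(offset, "0-1023", 1024) // replayed chunk
	fmt.Println(err != nil)
}
```
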
### Admin REST Endpoints
Admin endpoints use the `/v1` prefix and follow the platform API conventions.
#### Authentication
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/v1/auth/login` | none | Login via MCIAS (username/password) |
| POST | `/v1/auth/logout` | bearer | Revoke current token |
#### Health
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/health` | none | Health check |
#### Repository Management
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/repositories` | bearer | List repositories with metadata |
| GET | `/v1/repositories/{name}` | bearer | Repository detail (tags, size, manifest count) |
| DELETE | `/v1/repositories/{name}` | admin | Delete repository and all its manifests/tags |
Repository `{name}` may contain `/` (e.g., `production/myapp`). The chi
router must use a wildcard catch-all segment for this parameter.
#### Garbage Collection
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | `/v1/gc` | admin | Trigger garbage collection (async) |
| GET | `/v1/gc/status` | admin | Check GC status (running, last result) |
#### Policy Management
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/policy/rules` | admin | List all policy rules |
| POST | `/v1/policy/rules` | admin | Create a policy rule |
| GET | `/v1/policy/rules/{id}` | admin | Get a single rule |
| PATCH | `/v1/policy/rules/{id}` | admin | Update rule |
| DELETE | `/v1/policy/rules/{id}` | admin | Delete rule |
#### Audit
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| GET | `/v1/audit` | admin | List audit log events |
---
## 7. gRPC Admin Interface
The gRPC API provides the same admin capabilities as the REST admin API. OCI
Distribution endpoints are REST-only (protocol requirement — OCI clients
speak HTTP).
### Proto Package Layout
```
proto/
└── mcr/
    └── v1/
        ├── registry.proto   # Repository listing, GC
        ├── policy.proto     # Policy rule CRUD
        ├── audit.proto      # Audit log queries
        ├── admin.proto      # Health
        └── common.proto     # Shared message types
gen/
└── mcr/
    └── v1/                  # Generated Go stubs (committed)
```
### Service Definitions
| Service | RPCs |
|---------|------|
| `RegistryService` | `ListRepositories`, `GetRepository`, `DeleteRepository`, `GarbageCollect`, `GetGCStatus` |
| `PolicyService` | `ListPolicyRules`, `CreatePolicyRule`, `GetPolicyRule`, `UpdatePolicyRule`, `DeletePolicyRule` |
| `AuditService` | `ListAuditEvents` |
| `AdminService` | `Health` |
Auth endpoints (`/v1/auth/login`, `/v1/auth/logout`) are REST-only. Login
requires HTTP Basic auth or form-encoded credentials, which do not map
cleanly to gRPC unary RPCs. Clients that need programmatic auth use MCIAS
directly and present the resulting bearer token to gRPC.
### Transport Security
Same TLS certificate and key as the REST server. TLS 1.3 minimum.
### Authentication
A gRPC unary interceptor extracts the `authorization` metadata key, validates
the MCIAS bearer token, and injects claims into the context. It uses the same
validation logic as the REST middleware.
### Interceptor Chain
```
[Request Logger] → [Auth Interceptor] → [Admin Interceptor] → [Handler]
```
- **Request Logger**: Logs method, peer IP, status code, duration.
- **Auth Interceptor**: Validates bearer JWT via MCIAS. `Health` bypasses auth.
- **Admin Interceptor**: Requires the admin role for GC, policy, audit, and delete operations, matching the `admin`-only REST endpoints.
---
## 8. Database Schema
SQLite 3, WAL mode, `PRAGMA foreign_keys = ON`, `PRAGMA busy_timeout = 5000`.
```sql
-- Schema version tracking
CREATE TABLE schema_migrations (
version INTEGER PRIMARY KEY,
applied_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- Container image repositories
CREATE TABLE repositories (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL UNIQUE, -- e.g., "myapp", "infra/proxy"
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- UNIQUE on name creates an implicit index; no explicit index needed.
-- Image manifests (content stored in DB — small JSON documents)
CREATE TABLE manifests (
id INTEGER PRIMARY KEY,
repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
digest TEXT NOT NULL, -- "sha256:<hex>"
media_type TEXT NOT NULL, -- "application/vnd.oci.image.manifest.v1+json"
content BLOB NOT NULL, -- manifest JSON
size INTEGER NOT NULL, -- byte size of content
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
UNIQUE(repository_id, digest)
);
CREATE INDEX idx_manifests_repo ON manifests (repository_id);
CREATE INDEX idx_manifests_digest ON manifests (digest);
-- Tags: mutable pointers from name → manifest
CREATE TABLE tags (
id INTEGER PRIMARY KEY,
repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
name TEXT NOT NULL, -- e.g., "latest", "v1.2.3"
manifest_id INTEGER NOT NULL REFERENCES manifests(id) ON DELETE CASCADE,
updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
UNIQUE(repository_id, name)
);
CREATE INDEX idx_tags_repo ON tags (repository_id);
CREATE INDEX idx_tags_manifest ON tags (manifest_id);
-- Blob metadata (actual data on filesystem at /srv/mcr/layers/)
CREATE TABLE blobs (
id INTEGER PRIMARY KEY,
digest TEXT NOT NULL UNIQUE, -- "sha256:<hex>"
size INTEGER NOT NULL,
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- UNIQUE on digest creates an implicit index; no explicit index needed.
-- Many-to-many: tracks which blobs are referenced by manifests in which repos.
-- A blob may be shared across repositories (content-addressed dedup).
-- Used by garbage collection to determine unreferenced blobs.
CREATE TABLE manifest_blobs (
manifest_id INTEGER NOT NULL REFERENCES manifests(id) ON DELETE CASCADE,
blob_id INTEGER NOT NULL REFERENCES blobs(id),
PRIMARY KEY (manifest_id, blob_id)
);
CREATE INDEX idx_manifest_blobs_blob ON manifest_blobs (blob_id);
-- In-progress blob uploads
CREATE TABLE uploads (
id INTEGER PRIMARY KEY,
uuid TEXT NOT NULL UNIQUE,
repository_id INTEGER NOT NULL REFERENCES repositories(id) ON DELETE CASCADE,
byte_offset INTEGER NOT NULL DEFAULT 0,
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- Policy rules for registry access control
CREATE TABLE policy_rules (
id INTEGER PRIMARY KEY,
priority INTEGER NOT NULL DEFAULT 100,
description TEXT NOT NULL,
rule_json TEXT NOT NULL, -- JSON-encoded rule body
enabled INTEGER NOT NULL DEFAULT 1 CHECK (enabled IN (0,1)),
created_by TEXT, -- MCIAS account UUID
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now'))
);
-- Audit log — append-only
CREATE TABLE audit_log (
id INTEGER PRIMARY KEY,
event_time TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ','now')),
event_type TEXT NOT NULL,
actor_id TEXT, -- MCIAS account UUID
repository TEXT, -- repository name (if applicable)
digest TEXT, -- affected digest (if applicable)
ip_address TEXT,
details TEXT -- JSON blob; never contains secrets
);
CREATE INDEX idx_audit_time ON audit_log (event_time);
CREATE INDEX idx_audit_actor ON audit_log (actor_id);
CREATE INDEX idx_audit_event ON audit_log (event_type);
```
### Schema Notes
- Repositories are created implicitly on first push. No explicit creation
step required.
- Repository names may contain `/` for organizational grouping (e.g.,
`production/myapp`) but are otherwise flat strings. No hierarchical
enforcement.
- The `manifest_blobs` join table enables content-addressed deduplication:
the same layer blob may be referenced by manifests in different
repositories.
- Manifest deletion cascades to `manifest_blobs` rows and to any tags
pointing at the deleted manifest (`ON DELETE CASCADE` on both
`manifest_blobs.manifest_id` and `tags.manifest_id`). Blob files on
the filesystem are not deleted — unreferenced blobs are reclaimed by
garbage collection.
- Tag updates are atomic: pushing a manifest with an existing tag name
updates the `manifest_id` in a single transaction.
---
## 9. Garbage Collection
Garbage collection removes unreferenced blobs — blobs that are not
referenced by any manifest. GC is a manual process triggered by an
administrator.
### Algorithm
GC runs in two phases to maintain consistency across a crash:
```
Phase 1 — Mark and sweep (database)
1. Acquire registry-wide GC lock (blocks new blob uploads).
2. Begin write transaction.
3. Find all blob rows in `blobs` with no corresponding row in
`manifest_blobs` (unreferenced blobs). Record their digests.
4. Delete those rows from `blobs`.
5. Commit.
Phase 2 — File cleanup (filesystem)
6. For each digest recorded in step 3:
a. Delete the file from /srv/mcr/layers/sha256/<prefix>/<digest>.
7. Remove empty prefix directories.
8. Release GC lock.
```
**Crash safety**: If the process crashes after phase 1 but before phase 2
completes, orphaned files remain on disk with no matching DB row. These are
harmless — they consume space but are never served. A subsequent GC run or
a filesystem reconciliation command (`mcrctl gc --reconcile`) cleans them
up by scanning the layers directory and deleting files with no `blobs` row.
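
The reconciliation pass described above (scan the layers directory, delete files with no `blobs` row) could be sketched as follows. The `reconcile` function and its `known` map, which stands in for a query against the `blobs` table, are hypothetical; the demo uses a temporary directory and shortened file names.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// reconcile walks <layersRoot>/sha256 and removes every file whose digest
// is not present in known. It returns the digests it removed.
func reconcile(layersRoot string, known map[string]bool) ([]string, error) {
	var removed []string
	root := filepath.Join(layersRoot, "sha256")
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() {
			return nil
		}
		digest := "sha256:" + d.Name()
		if known[digest] {
			return nil // still has a blobs row: keep
		}
		if err := os.Remove(p); err != nil {
			return err
		}
		removed = append(removed, digest)
		return nil
	})
	return removed, err
}

// demo creates two blob files, one of them orphaned, and reconciles.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "mcr-gc-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, name := range []string{"ab12", "cd34"} {
		dir := filepath.Join(root, "sha256", name[:2])
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	return reconcile(root, map[string]bool{"sha256:ab12": true})
}

func main() {
	fmt.Println(demo())
}
```
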
### Trigger Methods
- **CLI**: `mcrctl gc`
- **REST**: `POST /v1/gc`
- **gRPC**: `RegistryService.GarbageCollect`
GC runs asynchronously. Status can be checked via `GET /v1/gc/status` or
`mcrctl gc status`. Only one GC run may be active at a time.
### Safety
GC acquires a registry-wide lock that **blocks all new blob uploads** from
step 1 until the lock is released in step 8, that is, through both the
database phase and the file cleanup. Uploads that started before GC was
triggered are allowed to complete before the lock is acquired. This is a
stop-the-world approach, acceptable at the target scale (single developer,
several dozen repos). Pulls are not blocked.
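
One way to get these semantics in-process is a `sync.RWMutex`: uploads take the read side, GC takes the write side. This is only a sketch under that assumption; `gcGate` is a hypothetical type, it guards individual requests rather than multi-request chunked upload sessions, and a second concurrent GC would wait here instead of being rejected.

```go
package main

import (
	"fmt"
	"sync"
)

// gcGate serializes garbage collection against blob uploads.
type gcGate struct{ mu sync.RWMutex }

// beginUpload blocks while GC holds the lock and returns a function to
// call when the upload request finishes.
func (g *gcGate) beginUpload() (done func()) {
	g.mu.RLock()
	return g.mu.RUnlock
}

// runGC waits for uploads already in progress, then runs sweep while new
// uploads are held back.
func (g *gcGate) runGC(sweep func()) {
	g.mu.Lock()
	defer g.mu.Unlock()
	sweep()
}

func main() {
	var g gcGate

	done := g.beginUpload() // an upload that started before GC
	done()                  // it completes, so GC can proceed

	g.runGC(func() {
		// While GC runs, a new upload cannot take the read lock.
		fmt.Println("upload admitted during GC:", g.mu.TryRLock())
	})

	admitted := g.mu.TryRLock()
	if admitted {
		g.mu.RUnlock()
	}
	fmt.Println("upload admitted after GC:", admitted)
}
```
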
---
## 10. Configuration
TOML format. Environment variable overrides via `MCR_*`.
```toml
[server]
listen_addr = ":8443" # HTTPS (OCI + admin REST)
grpc_addr = ":9443" # gRPC admin API (optional; omit to disable)
tls_cert = "/srv/mcr/certs/cert.pem"
tls_key = "/srv/mcr/certs/key.pem"
read_timeout = "30s" # HTTP read timeout
write_timeout = "0s" # HTTP write timeout; 0 = disabled for large
# blob uploads (idle_timeout provides the safety net)
idle_timeout = "120s" # HTTP idle timeout
shutdown_timeout = "60s" # Graceful shutdown drain period
[database]
path = "/srv/mcr/mcr.db"
[storage]
layers_path = "/srv/mcr/layers" # Blob storage root
uploads_path = "/srv/mcr/uploads" # Upload staging directory
# Must be on the same filesystem as layers_path
[mcias]
server_url = "https://mcias.metacircular.net:8443"
ca_cert = "" # Custom CA for MCIAS TLS
service_name = "mcr"
tags = ["env:restricted"]
[web]
listen_addr = "127.0.0.1:8080" # Web UI listen address
grpc_addr = "127.0.0.1:9443" # mcrsrv gRPC address for the web UI to connect to
ca_cert = "" # CA cert for verifying mcrsrv gRPC TLS
[log]
level = "info" # debug, info, warn, error
```
### Validation
Required fields are validated at startup. The server refuses to start if
any are missing or if TLS certificate paths are invalid. `storage.uploads_path`
and `storage.layers_path` must resolve to the same filesystem (verified at
startup via `os.Stat` device ID comparison).
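
A device ID comparison of this kind might look like the sketch below. It relies on `syscall.Stat_t`, so it compiles only on Unix-like systems; the function name `sameFilesystem` is hypothetical and the demo compares two subdirectories of a temporary directory.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// sameFilesystem reports whether two existing paths live on the same
// device, which is the precondition for an atomic rename between them.
func sameFilesystem(a, b string) (bool, error) {
	infoA, err := os.Stat(a)
	if err != nil {
		return false, err
	}
	infoB, err := os.Stat(b)
	if err != nil {
		return false, err
	}
	statA, okA := infoA.Sys().(*syscall.Stat_t)
	statB, okB := infoB.Sys().(*syscall.Stat_t)
	if !okA || !okB {
		return false, errors.New("device IDs are not available on this platform")
	}
	return statA.Dev == statB.Dev, nil
}

// demo creates layers/ and uploads/ side by side and compares them.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "mcr-fs-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)
	layers, uploads := filepath.Join(root, "layers"), filepath.Join(root, "uploads")
	for _, dir := range []string{layers, uploads} {
		if err := os.Mkdir(dir, 0o755); err != nil {
			return false, err
		}
	}
	return sameFilesystem(layers, uploads)
}

func main() {
	fmt.Println(demo())
}
```
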
### Timeout Notes
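The HTTP `write_timeout` is disabled (0) by default because blob transfers
can move hundreds of megabytes over slow connections, and a fixed deadline
would cut off large pulls and slow uploads before the response is written.
The `idle_timeout` serves as the safety net for stale connections. Operators
may set a non-zero `write_timeout` if all clients are on fast local networks.
If `read_timeout` is mapped directly onto Go's `http.Server.ReadTimeout`, it
covers the request body as well as the headers, so the 30s default also
bounds how long a single upload request may take.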
---
## 11. Web UI
### Technology
Go `html/template` + htmx, embedded via `//go:embed`. The web UI is a
separate binary (`mcr-web`) that communicates with mcrsrv via gRPC.
### Pages
| Path | Description |
|------|-------------|
| `/login` | MCIAS login form |
| `/` | Dashboard (repository count, total size, recent pushes) |
| `/repositories` | Repository list with tag counts and sizes |
| `/repositories/{name}` | Repository detail: tags, manifests, layer list |
| `/repositories/{name}/manifests/{digest}` | Manifest detail: layers, config, size |
| `/policies` | Policy rule management (admin only): create, edit, delete |
| `/audit` | Audit log viewer (admin only) |
### Security
- CSRF protection via signed double-submit cookies on all mutating requests.
- Session cookie: `HttpOnly`, `Secure`, `SameSite=Strict`.
- All user input escaped by `html/template`.
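
A signed double-submit token can be built from an HMAC over a random nonce. The construction below is one common variant, shown as an assumption-laden sketch: the token layout (`<nonce>.<signature>`), the function names, and the choice of HMAC-SHA256 are not taken from the mcr-web source.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"strings"
)

// sign returns the hex-encoded HMAC-SHA256 of nonce under key.
func sign(key []byte, nonce string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(nonce))
	return hex.EncodeToString(mac.Sum(nil))
}

// newCSRFToken returns "<nonce>.<signature>", to be set both as a cookie
// and as a hidden form field or request header.
func newCSRFToken(key []byte) (string, error) {
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	nonce := hex.EncodeToString(raw)
	return nonce + "." + sign(key, nonce), nil
}

// validCSRF checks that the submitted value equals the cookie value and
// that the cookie carries a valid signature for this server's key.
func validCSRF(key []byte, cookieValue, submitted string) bool {
	if subtle.ConstantTimeCompare([]byte(cookieValue), []byte(submitted)) != 1 {
		return false
	}
	nonce, sig, ok := strings.Cut(cookieValue, ".")
	if !ok {
		return false
	}
	return hmac.Equal([]byte(sig), []byte(sign(key, nonce)))
}

func main() {
	key := []byte("example-server-secret")
	token, err := newCSRFToken(key)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(validCSRF(key, token, token))                  // cookie and form agree
	fmt.Println(validCSRF(key, token, "forged"))               // form value differs
	fmt.Println(validCSRF([]byte("other-key"), token, token)) // signature from another key
}
```
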
---
## 12. CLI Tools
### mcrsrv
The registry server. Cobra subcommands:
| Command | Description |
|---------|-------------|
| `server` | Start the registry server |
| `init` | First-time setup (create directories, example config) |
| `snapshot` | Database backup via `VACUUM INTO` |
### mcr-web
The web UI server. Communicates with mcrsrv via gRPC.
| Command | Description |
|---------|-------------|
| `server` | Start the web UI server |
### mcrctl
Admin CLI. Communicates with mcrsrv via REST or gRPC.
| Command | Description |
|---------|-------------|
| `status` | Query server health |
| `repo list` | List repositories |
| `repo delete <name>` | Delete a repository |
| `gc` | Trigger garbage collection |
| `gc status` | Check GC status |
| `policy list` | List policy rules |
| `policy create` | Create a policy rule |
| `policy update <id>` | Update a policy rule |
| `policy delete <id>` | Delete a policy rule |
| `audit tail [--n N]` | Print recent audit events |
| `snapshot` | Trigger database backup via `VACUUM INTO` |
---
## 13. Package Structure
```
mcr/
├── cmd/
│   ├── mcrsrv/         # Server binary: OCI + REST + gRPC
│   ├── mcr-web/        # Web UI binary
│   └── mcrctl/         # Admin CLI
├── internal/
│   ├── auth/           # MCIAS integration: token validation, 30s cache
│   ├── config/         # TOML config loading and validation
│   ├── db/             # SQLite: migrations, CRUD for all tables
│   ├── oci/            # OCI Distribution Spec handler: manifests, blobs, uploads
│   ├── policy/         # Registry policy engine: rules, evaluation, defaults
│   ├── server/         # REST API: admin routes, middleware, chi router
│   ├── grpcserver/     # gRPC admin API: interceptors, service handlers
│   ├── webserver/      # Web UI: template routes, htmx handlers
│   ├── storage/        # Blob filesystem operations: write, read, delete, GC
│   └── gc/             # Garbage collection: mark, sweep, locking
├── proto/mcr/
│   └── v1/             # Protobuf definitions
├── gen/mcr/
│   └── v1/             # Generated gRPC code (never edit by hand)
├── web/
│   ├── embed.go        # //go:embed directive
│   ├── templates/      # HTML templates
│   └── static/         # CSS, htmx
├── deploy/
│   ├── docker/         # Docker Compose
│   ├── examples/       # Example config files
│   ├── scripts/        # Install script
│   └── systemd/        # systemd units and timers
└── docs/               # Internal documentation
```
---
## 14. Deployment
### Binary
Single static binary per component, built with `CGO_ENABLED=0`.
### Container
Multi-stage Docker build:
1. **Builder**: `golang:1.25-alpine`, static compilation with
`-trimpath -ldflags="-s -w"`.
2. **Runtime**: `alpine:3.21`, non-root user (`mcr`), ports 8443/9443.
The `/srv/mcr/` directory is a mounted volume containing the database,
blob storage, and configuration.
### systemd
| File | Purpose |
|------|---------|
| `mcr.service` | Registry server |
| `mcr-web.service` | Web UI |
| `mcr-backup.service` | Oneshot database backup |
| `mcr-backup.timer` | Daily backup timer (02:00 UTC, 5-minute jitter) |
Standard security hardening per engineering standards
(`NoNewPrivileges=true`, `ProtectSystem=strict`,
`ReadWritePaths=/srv/mcr`, etc.).
### Backup
Database backup via `VACUUM INTO` captures metadata only (repositories,
manifests, tags, blobs table, policy rules, audit log). **Blob data on the
filesystem is not included.** A complete backup requires both:
1. `mcrsrv snapshot` (or `mcrctl snapshot`) — SQLite database backup.
2. Filesystem-level copy of `/srv/mcr/layers/` — blob data.
The database and blob directory must be backed up together. A database
backup without the corresponding blob directory is usable (metadata is
intact; missing blobs return 404 on pull) but incomplete. A blob directory
without the database is useless (no way to map digests to repositories).
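A restore drill can confirm that a database snapshot and a blob copy belong together. The following is a minimal sketch, not MCR's implementation: `listBlobFiles` and `missingBlobs` are hypothetical helpers, and it assumes each blob file is named after its digest in the same form the database stores it. The directory nesting under `/srv/mcr/layers/` is not assumed.

```go
package main

import (
	"io/fs"
	"path/filepath"
	"sort"
)

// listBlobFiles returns the base name of every regular file under root.
// Assumption: each blob file is named after its digest; any directory
// nesting below root is ignored.
func listBlobFiles(root string) ([]string, error) {
	var names []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			names = append(names, d.Name())
		}
		return nil
	})
	return names, err
}

// missingBlobs returns the digests referenced by the database snapshot that
// have no matching file in the blob copy, sorted for stable output.
func missingBlobs(referenced, onDisk []string) []string {
	present := make(map[string]struct{}, len(onDisk))
	for _, name := range onDisk {
		present[name] = struct{}{}
	}
	var missing []string
	for _, digest := range referenced {
		if _, ok := present[digest]; !ok {
			missing = append(missing, digest)
		}
	}
	sort.Strings(missing)
	return missing
}
```

An empty result means every digest in the snapshot has a blob on disk. Any digest returned is one that would yield a 404 on pull after restoring this pair.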
### Graceful Shutdown
On `SIGINT` or `SIGTERM`:
1. Stop accepting new connections.
2. Drain in-flight requests (including ongoing uploads) up to
`shutdown_timeout` (default 60s).
3. Force-close remaining connections.
4. Close database.
5. Exit.
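The sequence above maps closely onto the standard library's `http.Server`. The sketch below covers a single HTTP listener and assumes the gRPC listener is drained separately. `shutdown`, `serveUntilSignal`, `exampleShutdown`, and the `closeDB` hook are illustrative names, not MCR's actual functions.

```go
package main

import (
	"context"
	"errors"
	"net"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// shutdown performs steps 1-4 for one listener.
func shutdown(srv *http.Server, timeout time.Duration, closeDB func() error) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// Steps 1-2: Shutdown closes the listeners, then waits for in-flight
	// requests to finish or for the timeout to elapse.
	err := srv.Shutdown(ctx)
	if errors.Is(err, context.DeadlineExceeded) {
		// Step 3: the drain window is over; drop whatever is still open.
		err = srv.Close()
	}
	// Step 4: close the database only after request handling has stopped.
	return errors.Join(err, closeDB())
}

// serveUntilSignal serves on ln until SIGINT or SIGTERM arrives, then runs
// the shutdown sequence. Step 5 (exit) is left to the caller.
func serveUntilSignal(srv *http.Server, ln net.Listener, timeout time.Duration, closeDB func() error) error {
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	select {
	case err := <-serveErr:
		// The listener failed before any signal arrived.
		return errors.Join(err, closeDB())
	case <-ctx.Done():
		return shutdown(srv, timeout, closeDB)
	}
}

// exampleShutdown runs the sequence against an idle server and reports
// whether the database hook was called.
func exampleShutdown() (dbClosed bool, err error) {
	err = shutdown(&http.Server{}, time.Second, func() error {
		dbClosed = true
		return nil
	})
	return dbClosed, err
}
```

Closing the database last matters here: a handler that is still draining an upload may need to commit metadata, so the hook runs only after `Shutdown` or `Close` has returned.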
---
## 15. Audit Events
| Event | Trigger |
|-------|---------|
| `manifest_pushed` | Manifest uploaded (includes repo, tag, digest) |
| `manifest_pulled` | Manifest downloaded |
| `manifest_deleted` | Manifest deleted |
| `blob_uploaded` | Blob upload completed |
| `blob_deleted` | Blob deleted |
| `repo_deleted` | Repository deleted (admin) |
| `gc_started` | Garbage collection started |
| `gc_completed` | Garbage collection finished (includes blobs removed, bytes freed) |
| `policy_rule_created` | Policy rule created |
| `policy_rule_updated` | Policy rule updated |
| `policy_rule_deleted` | Policy rule deleted |
| `policy_deny` | Policy engine denied a request |
| `login_ok` | Successful authentication |
| `login_fail` | Failed authentication |
The audit log is append-only. It never contains credentials or token values.
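One way to make both properties hard to violate in code is to give the audit type no update or delete method and to reject detail keys that could carry secrets. This is an illustrative sketch only: `AuditRecord`, `AuditLog`, and `forbiddenKeys` are hypothetical, the in-memory slice stands in for the SQLite table, and a key blocklist is a crude guard rather than a guarantee.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// A subset of the event names from the table above.
const (
	EventManifestPushed = "manifest_pushed"
	EventPolicyDeny     = "policy_deny"
	EventLoginFail      = "login_fail"
)

// AuditRecord is a hypothetical row shape; the real schema lives in the
// database section of this document.
type AuditRecord struct {
	Time    time.Time
	Event   string
	Actor   string            // account UUID, never a token
	Details map[string]string // e.g. repo, tag, digest
}

// forbiddenKeys lists detail keys that would carry secrets (illustrative).
var forbiddenKeys = map[string]bool{
	"token":         true,
	"password":      true,
	"authorization": true,
}

// AuditLog exposes Append and Len only. The absence of update and delete
// methods is what makes it append-only from the caller's point of view.
type AuditLog struct {
	mu      sync.Mutex
	records []AuditRecord
}

// Append stores one event, refusing details that look like credentials.
func (l *AuditLog) Append(event, actor string, details map[string]string) error {
	for k := range details {
		if forbiddenKeys[k] {
			return fmt.Errorf("audit: detail key %q may carry a credential", k)
		}
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	l.records = append(l.records, AuditRecord{
		Time:    time.Now().UTC(),
		Event:   event,
		Actor:   actor,
		Details: details,
	})
	return nil
}

// Len returns the number of stored records.
func (l *AuditLog) Len() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return len(l.records)
}
```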
---
## 16. Error Handling and Logging
- All errors are wrapped with `fmt.Errorf("context: %w", err)`.
- Structured logging uses `log/slog` (or the goutils wrapper).
- Log levels: DEBUG (dev only), INFO (normal ops), WARN (recoverable),
ERROR (unexpected failures).
- OCI operations (push, pull, delete) are logged at INFO with:
`{event, repository, reference, digest, account_uuid, ip, duration}`.
- **Never log:** bearer tokens, passwords, blob content, MCIAS credentials.
- OCI endpoint errors use the OCI error format (see §6). Admin endpoint
errors use the platform `{"error": "..."}` format.
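The logging and wrapping rules above might look like this with `log/slog`. It is a sketch under assumptions: `opFields`, `logOp`, `openBlob`, and the sample values are hypothetical, and the JSON handler is only one possible output format.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log/slog"
	"time"
)

// opFields mirrors the field set listed above for OCI operations.
type opFields struct {
	Event       string
	Repository  string
	Reference   string
	Digest      string
	AccountUUID string
	IP          string
	Duration    time.Duration
}

// newLogger writes JSON records to w at INFO and above.
func newLogger(w io.Writer) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, &slog.HandlerOptions{Level: slog.LevelInfo}))
}

// logOp emits one INFO record. No token or credential field exists on
// opFields, so none can be logged by accident.
func logOp(l *slog.Logger, f opFields) {
	l.Info("oci operation",
		slog.String("event", f.Event),
		slog.String("repository", f.Repository),
		slog.String("reference", f.Reference),
		slog.String("digest", f.Digest),
		slog.String("account_uuid", f.AccountUUID),
		slog.String("ip", f.IP),
		slog.Duration("duration", f.Duration),
	)
}

var errBlobMissing = errors.New("blob not found")

// openBlob is a stand-in that always fails, to show wrapping with %w.
func openBlob(digest string) error {
	return fmt.Errorf("open blob %s: %w", digest, errBlobMissing)
}

// isMissing shows that the wrapped cause is still matchable by callers.
func isMissing(err error) bool {
	return errors.Is(err, errBlobMissing)
}

// exampleLog logs one push and returns the decoded JSON record.
func exampleLog() (map[string]any, error) {
	var buf bytes.Buffer
	logOp(newLogger(&buf), opFields{
		Event:       "manifest_pushed",
		Repository:  "app/web",
		Reference:   "v1",
		Digest:      "sha256:abc",
		AccountUUID: "acct-1",
		IP:          "192.0.2.10",
		Duration:    42 * time.Millisecond,
	})
	var rec map[string]any
	err := json.Unmarshal(buf.Bytes(), &rec)
	return rec, err
}
```

Wrapping with `%w` adds context to the message while keeping the original error in the chain, so a handler can still use `errors.Is` to choose the right response code.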
---
## 17. Security Model
### Threat Mitigations
| Threat | Mitigation |
|--------|------------|
| Unauthenticated access | All endpoints require MCIAS bearer token; `env:restricted` tag blocks guest/viewer login |
| Unauthorized push/delete | Policy engine enforces per-principal, per-repository access; default-deny for system accounts |
| Digest mismatch (supply chain) | All uploads verified against client-supplied digest; rejected if mismatch |
| Blob corruption | Content-addressed storage; digests verified on write. Periodic integrity scrub via `mcrctl scrub` (future) |
| Upload resource exhaustion | Stale uploads are cleaned up and GC reclaims orphaned data; automatic expiry after a configurable TTL is future work (§18) |
| Information leakage | Error responses follow OCI spec format; no internal details exposed |
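The digest check in the table above comes down to hashing the uploaded bytes and comparing the result with what the client declared. The sketch below assumes SHA-256 digests in the `sha256:<hex>` form. `verifyDigest` and `verifyBytes` are illustrative names, not MCR's actual functions.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// verifyDigest hashes r and compares the result with the client-supplied
// digest, which must be in the "sha256:<hex>" form.
func verifyDigest(r io.Reader, want string) error {
	hexWant, ok := strings.CutPrefix(want, "sha256:")
	if !ok {
		return fmt.Errorf("unsupported digest %q", want)
	}
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return fmt.Errorf("hash upload: %w", err)
	}
	got := hex.EncodeToString(h.Sum(nil))
	if got != hexWant {
		return fmt.Errorf("digest mismatch: got sha256:%s, want %s", got, want)
	}
	return nil
}

// verifyBytes is a convenience wrapper for in-memory content.
func verifyBytes(b []byte, want string) error {
	return verifyDigest(bytes.NewReader(b), want)
}
```

A server would more likely hash while streaming to disk (for example with `io.MultiWriter`) than read the upload twice. The comparison step is the same either way.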
### Security Invariants
1. Every API request (OCI and admin) requires a valid MCIAS bearer token.
2. Token validation results are cached for at most 30 seconds.
3. System accounts have no implicit access — explicit policy rules required.
4. Blob digests are verified on upload; mismatched digests are rejected.
Reads trust the content-addressed path (digest is the filename).
5. Manifest deletion by tag is not supported — only by digest (OCI spec).
6. The audit log never contains credentials, tokens, or blob content.
7. TLS 1.3 minimum on all listeners. No fallback.
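Invariants 2 and 7 can be stated directly in code. The following sketch is one possible shape, not the `internal/auth` implementation. `tokenCache`, `listenerTLS`, and `exampleCache` are hypothetical names, and keying the cache by a SHA-256 hash of the token (so the raw token is not retained in the map) is an assumption of this example.

```go
package main

import (
	"crypto/sha256"
	"crypto/tls"
	"sync"
	"time"
)

// maxTokenCacheTTL is the upper bound from invariant 2.
const maxTokenCacheTTL = 30 * time.Second

type cacheEntry struct {
	accountUUID string
	expires     time.Time
}

// tokenCache remembers successful validations for a bounded time.
type tokenCache struct {
	mu      sync.Mutex
	entries map[[32]byte]cacheEntry
	now     func() time.Time // injectable clock, useful in tests
}

func newTokenCache(now func() time.Time) *tokenCache {
	return &tokenCache{entries: make(map[[32]byte]cacheEntry), now: now}
}

// put records a validated token under the hash of its value.
func (c *tokenCache) put(token, accountUUID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[sha256.Sum256([]byte(token))] = cacheEntry{
		accountUUID: accountUUID,
		expires:     c.now().Add(maxTokenCacheTTL),
	}
}

// get returns the cached account only while the entry is younger than the
// TTL. Expired entries are removed so the caller revalidates with MCIAS.
func (c *tokenCache) get(token string) (string, bool) {
	key := sha256.Sum256([]byte(token))
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[key]
	if !ok {
		return "", false
	}
	if !c.now().Before(e.expires) {
		delete(c.entries, key)
		return "", false
	}
	return e.accountUUID, true
}

// listenerTLS returns a config that refuses anything below TLS 1.3.
func listenerTLS(cert tls.Certificate) *tls.Config {
	return &tls.Config{
		MinVersion:   tls.VersionTLS13,
		Certificates: []tls.Certificate{cert},
	}
}

// exampleCache reports lookups 29s and 30s after insertion.
func exampleCache() (hitAt29s, hitAt30s bool) {
	clock := time.Unix(0, 0)
	c := newTokenCache(func() time.Time { return clock })
	c.put("token-a", "acct-1")
	clock = clock.Add(29 * time.Second)
	_, hitAt29s = c.get("token-a")
	clock = clock.Add(time.Second)
	_, hitAt30s = c.get("token-a")
	return hitAt29s, hitAt30s
}
```

A real cache would probably also cap each entry at the token's own expiry, so a token that expires in 10 seconds is not served from cache for 30.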
---
## 18. Future Work
| Item | Description |
|------|-------------|
| **Image signing / content trust** | Cosign or Notary v2 integration for image verification |
| **Multi-arch manifests** | OCI image index support for multi-platform images |
| **Cross-repo blob mounts** | `POST /v2/<name>/blobs/uploads/?mount=<digest>&from=<other>` for efficient cross-repo copies |
| **MCP integration** | Wire MCR into the Metacircular Control Plane for automated image deployment |
| **Upload expiry** | Automatic cleanup of stale uploads after configurable TTL |
| **Repository size quotas** | Per-repository storage limits |
| **Webhook notifications** | Push events to external systems on manifest push/delete |
| **Integrity scrub** | `mcrctl scrub` — verify blob digests on disk match their filenames, report corruption |
| **Metrics** | Prometheus-compatible metrics: push/pull counts, storage usage, request latency |