Phase 9: two-phase garbage collection engine
GC engine (internal/gc/): Collector.Run() implements the two-phase algorithm: Phase 1 finds unreferenced blobs and deletes DB rows in a single transaction, Phase 2 deletes blob files from storage. Registry-wide mutex blocks concurrent GC runs. Collector.Reconcile() scans filesystem for orphaned files with no DB row (crash recovery).

Wired into admin_gc.go: POST /v1/gc now launches the real collector in a goroutine with gc_started/gc_completed audit events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
49
PROGRESS.md
@@ -6,7 +6,7 @@ See `PROJECT_PLAN.md` for the implementation roadmap and

## Current State

-**Phase:** 7 complete, ready for Phase 9
+**Phase:** 9 complete, ready for Phase 10
**Last updated:** 2026-03-19

### Completed
@@ -20,6 +20,7 @@ See `PROJECT_PLAN.md` for the implementation roadmap and
- Phase 6: OCI push path (all 3 steps)
- Phase 7: OCI delete path (all 2 steps)
- Phase 8: Admin REST API (all 5 steps)
- Phase 9: Garbage collection (all 2 steps)
- `ARCHITECTURE.md` — Full design specification (18 sections)
- `CLAUDE.md` — AI development guidance
- `PROJECT_PLAN.md` — Implementation plan (14 phases, 40+ steps)
@@ -27,13 +28,55 @@ See `PROJECT_PLAN.md` for the implementation roadmap and

### Next Steps

-1. Phase 9 (garbage collection)
-2. Phase 10 (gRPC admin API)
+1. Phase 10 (gRPC admin API)
+2. Phase 11 (CLI tool) and Phase 12 (web UI)

---

## Log

### 2026-03-19 — Phase 9: Garbage collection

**Task:** Implement the two-phase GC algorithm for removing unreferenced
blobs per ARCHITECTURE.md §9.

**Changes:**

Step 9.1 — GC engine (`internal/gc/`):
- `gc.go`: `Collector` struct with `sync.Mutex` for registry-wide lock;
  `New(db, storage)` constructor; `Run(ctx)` executes two-phase algorithm
  (Phase 1: find unreferenced blobs + delete rows in transaction;
  Phase 2: delete files from storage); `Reconcile(ctx)` scans filesystem
  for orphaned files with no DB row (crash recovery); `TryLock()` for
  concurrent GC rejection
- `errors.go`: `ErrGCRunning` sentinel
- `DB` interface: `FindAndDeleteUnreferencedBlobs()`, `BlobExistsByDigest()`
- `Storage` interface: `Delete()`, `ListBlobDigests()`
- `db/gc.go`: `FindAndDeleteUnreferencedBlobs()`: LEFT JOIN blobs to
  manifest_blobs, finds unreferenced, deletes rows in single transaction;
  `BlobExistsByDigest()`
- `storage/list.go`: `ListBlobDigests()`: scans sha256 prefix dirs

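
The two-phase `Run` described in Step 9.1 could look roughly like the sketch below. It is not the code from `internal/gc/`: the method signatures, the return value of `Run`, and the in-memory `memDB` / `memStorage` stand-ins are all assumptions made for illustration. Only the identifier names (`Collector`, `New`, `Run`, `ErrGCRunning`, `FindAndDeleteUnreferencedBlobs`, `Delete`) come from the log entry. `sync.Mutex.TryLock` requires Go 1.18 or later.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"sync"
)

// ErrGCRunning is returned when a collection is already in progress.
var ErrGCRunning = errors.New("gc: collection already running")

// DB and Storage are cut down to the calls Run needs. The method names come
// from the log entry; the signatures are assumptions.
type DB interface {
	FindAndDeleteUnreferencedBlobs(ctx context.Context) ([]string, error)
}

type Storage interface {
	Delete(ctx context.Context, digest string) error
}

// Collector serializes GC runs with a registry-wide mutex.
type Collector struct {
	mu      sync.Mutex
	db      DB
	storage Storage
}

func New(db DB, storage Storage) *Collector {
	return &Collector{db: db, storage: storage}
}

// Run performs one collection and returns the digests whose files were deleted.
func (c *Collector) Run(ctx context.Context) ([]string, error) {
	if !c.mu.TryLock() { // reject instead of queueing behind a running GC
		return nil, ErrGCRunning
	}
	defer c.mu.Unlock()

	// Phase 1: drop the DB rows of unreferenced blobs in one transaction.
	digests, err := c.db.FindAndDeleteUnreferencedBlobs(ctx)
	if err != nil {
		return nil, fmt.Errorf("gc phase 1: %w", err)
	}

	// Phase 2: delete the files. A crash here leaves files with no DB row,
	// which is the case Reconcile is described as cleaning up.
	removed := make([]string, 0, len(digests))
	for _, d := range digests {
		if err := c.storage.Delete(ctx, d); err != nil {
			return removed, fmt.Errorf("gc phase 2: delete %s: %w", d, err)
		}
		removed = append(removed, d)
	}
	return removed, nil
}

// memDB maps digest to the number of manifests referencing it (stand-in).
type memDB map[string]int

func (m memDB) FindAndDeleteUnreferencedBlobs(context.Context) ([]string, error) {
	var out []string
	for d, refs := range m {
		if refs == 0 {
			out = append(out, d)
			delete(m, d)
		}
	}
	sort.Strings(out)
	return out, nil
}

// memStorage is an in-memory stand-in for blob files on disk.
type memStorage map[string]bool

func (s memStorage) Delete(_ context.Context, digest string) error {
	delete(s, digest)
	return nil
}

func main() {
	db := memDB{"sha256:aaa": 1, "sha256:bbb": 0}
	st := memStorage{"sha256:aaa": true, "sha256:bbb": true}
	c := New(db, st)

	removed, err := c.Run(context.Background())
	fmt.Println(removed, err) // [sha256:bbb] <nil>

	c.mu.Lock() // simulate a run that is already in progress
	_, err = c.Run(context.Background())
	fmt.Println(errors.Is(err, ErrGCRunning)) // true
	c.mu.Unlock()
}
```
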
Step 9.2 — Wire GC into server:
- `server/admin_gc.go`: updated `GCState` to hold `*gc.Collector` and
  `AuditFunc`; `AdminTriggerGCHandler` now launches `collector.Run()`
  in a goroutine, tracks result, writes `gc_started`/`gc_completed`
  audit events

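
As a rough illustration of the wiring in Step 9.2, the sketch below shows a handler that starts the collector in a goroutine and emits the two audit events. It is written under several assumptions that the log does not confirm: the `AuditFunc` signature, the `(int, error)` return of `Run`, the 202 status code, the fields of `GCState` beyond the collector and audit hook, and the hypothetical `runner`, `fakeCollector` and `triggerOnce` helpers. The background run uses a detached context because the request context is cancelled as soon as the handler returns.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// AuditFunc records one audit event. The signature is an assumption.
type AuditFunc func(event, detail string)

// runner is a hypothetical stand-in for the part of *gc.Collector used here.
type runner interface {
	Run(ctx context.Context) (removed int, err error)
}

// GCState holds what the handler needs plus the outcome of the latest run.
type GCState struct {
	Collector runner
	Audit     AuditFunc

	mu          sync.Mutex
	lastRemoved int
	lastErr     error
	wg          sync.WaitGroup // lets callers wait for background runs
}

// AdminTriggerGCHandler starts a collection in the background and returns at once.
func AdminTriggerGCHandler(s *GCState) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		s.Audit("gc_started", "")
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			// Detached context so the run outlives the HTTP request.
			removed, err := s.Collector.Run(context.Background())
			s.mu.Lock()
			s.lastRemoved, s.lastErr = removed, err
			s.mu.Unlock()
			s.Audit("gc_completed", fmt.Sprintf("removed=%d err=%v", removed, err))
		}()
		w.WriteHeader(http.StatusAccepted) // 202 is an assumed status code
	}
}

// fakeCollector pretends to remove a fixed number of blobs.
type fakeCollector struct{ removed int }

func (f fakeCollector) Run(context.Context) (int, error) { return f.removed, nil }

// triggerOnce sends one POST /v1/gc and waits for the background run to end.
func triggerOnce(removed int) (status int, events []string) {
	state := &GCState{
		Collector: fakeCollector{removed: removed},
		Audit:     func(event, _ string) { events = append(events, event) },
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/v1/gc", nil)
	AdminTriggerGCHandler(state).ServeHTTP(rec, req)
	state.wg.Wait()
	return rec.Code, events
}

func main() {
	status, events := triggerOnce(3)
	fmt.Println(status, events) // 202 [gc_started gc_completed]
}
```
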
**Verification:**
- `make all` passes: vet clean, lint 0 issues, all tests passing,
  all 3 binaries built
- GC engine tests (6 new): removes unreferenced blobs (verify both DB
  rows and storage files deleted, referenced blobs preserved), does not
  remove referenced blobs, concurrent GC rejected (ErrGCRunning), empty
  registry (no-op), reconcile cleans orphaned files, reconcile empty
  storage
- DB GC tests (3 new): FindAndDeleteUnreferencedBlobs (unreferenced
  removed, referenced preserved), no unreferenced returns nil,
  BlobExistsByDigest (found + not found)

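
The two reconcile cases in the list above (orphaned files cleaned, empty storage) can be pictured with the sketch below. It assumes that `Reconcile` lists the digests in storage, checks each against the DB, and deletes files that have no row. The free-function form, the signatures, and the `memDB` / `memStorage` stand-ins are assumptions for illustration, not the project's test code. The sketch also ignores any interaction with uploads that are in flight while it runs.

```go
package main

import (
	"context"
	"fmt"
	"sort"
)

// Method names come from the log entry; the signatures are assumptions.
type DB interface {
	BlobExistsByDigest(ctx context.Context, digest string) (bool, error)
}

type Storage interface {
	ListBlobDigests(ctx context.Context) ([]string, error)
	Delete(ctx context.Context, digest string) error
}

// Reconcile deletes files in storage that have no DB row and returns their digests.
func Reconcile(ctx context.Context, db DB, st Storage) ([]string, error) {
	digests, err := st.ListBlobDigests(ctx)
	if err != nil {
		return nil, fmt.Errorf("reconcile: list: %w", err)
	}
	var orphans []string
	for _, d := range digests {
		exists, err := db.BlobExistsByDigest(ctx, d)
		if err != nil {
			return orphans, fmt.Errorf("reconcile: lookup %s: %w", d, err)
		}
		if exists {
			continue // still tracked in the DB, keep the file
		}
		if err := st.Delete(ctx, d); err != nil {
			return orphans, fmt.Errorf("reconcile: delete %s: %w", d, err)
		}
		orphans = append(orphans, d)
	}
	return orphans, nil
}

// memDB is an in-memory stand-in for the blobs table.
type memDB map[string]bool

func (m memDB) BlobExistsByDigest(_ context.Context, digest string) (bool, error) {
	return m[digest], nil
}

// memStorage is an in-memory stand-in for blob files on disk.
type memStorage map[string]bool

func (s memStorage) ListBlobDigests(context.Context) ([]string, error) {
	out := make([]string, 0, len(s))
	for d := range s {
		out = append(out, d)
	}
	sort.Strings(out)
	return out, nil
}

func (s memStorage) Delete(_ context.Context, digest string) error {
	delete(s, digest)
	return nil
}

func main() {
	ctx := context.Background()

	// Case 1: one file has no DB row, for example after a crash between phases.
	db := memDB{"sha256:aaa": true}
	st := memStorage{"sha256:aaa": true, "sha256:orphan": true}
	orphans, err := Reconcile(ctx, db, st)
	fmt.Println(orphans, err, len(st)) // [sha256:orphan] <nil> 1

	// Case 2: empty storage is a no-op.
	orphans, err = Reconcile(ctx, memDB{}, memStorage{})
	fmt.Println(len(orphans), err) // 0 <nil>
}
```
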
---

### 2026-03-19 — Phase 7: OCI delete path

**Task:** Implement manifest and blob deletion per OCI Distribution Spec.