Kyle Isom 0578dbcb02 Update MCP service definition to convention-driven format
Drop uses_mcdsl, explicit image URL, network, restart. Use
service-level version, route declarations, and derived defaults
per PLATFORM_EVOLUTION.md conventions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 00:27:06 -07:00


# mcdoc Architecture

Metacircular Documentation Server — Technical Design Document

---

## 1. Overview
mcdoc is a public documentation server for the Metacircular platform. It
fetches markdown files from Gitea, renders them to HTML, and serves a
navigable read-only documentation site. No authentication is required.

mcdoc removes the need to check out repositories or use a local tool to
read platform documentation. It is the web counterpart to browsing docs
locally — anyone with the URL can read the current documentation for any
Metacircular service.
### Goals
- Serve rendered markdown documentation for all Metacircular services
- Public access, no authentication
- Refresh automatically when documentation changes (Gitea webhooks)
- Fit the platform conventions (Go, htmx, MCP-deployed, mc-proxy-routed)
### Non-Goals
- Editing or authoring (docs are authored in git)
- Versioned documentation (tagged snapshots, historical diffs)
- Serving non-markdown content (images, PDFs)
- Search (not in v1; may be added later)
---
## 2. Architecture
```
                                 ┌────────────────────────┐
                                 │     Gitea (deimos)     │
                                 │    git.wntrmute.dev    │
                                 │                        │
                                 │  mc/mcr                │
                                 │  mc/mcp                │
                                 │  mc/metacrypt          │
                                 │  ...                   │
                                 └──────────┬─────────────┘
                                 fetch (boot + webhook)
Reader ──── HTTPS ────► mc-proxy ────► mcdoc (rift)
                        :443 (L7)      ┌──────────────────┐
                        docs.          │                  │
                        metacircular.  │  Content cache   │
                        net            │  (in-memory)     │
                                       │                  │
                                       │  Rendered HTML   │
                                       │  per repo/file   │
                                       │                  │
                                       └──────────────────┘
```
mcdoc runs on rift as a single container. mc-proxy terminates TLS on
`docs.metacircular.net` and forwards to mcdoc's HTTP listener.
Internally, mcdoc is also reachable at `mcdoc.svc.mcp.metacircular.net`
for Gitea webhook delivery and internal access.

---
## 3. Content Model
### Source
mcdoc fetches markdown files from public repositories in the `mc`
organization on Gitea (`git.wntrmute.dev`). The repos are public, so no
authentication token is needed.
### Discovery
On boot, mcdoc fetches the list of repositories in the `mc` org via the
Gitea API, then for each repo:
1. Fetch the file tree (recursive) from the default branch.
2. Filter to `*.md` files.
3. Exclude files matching configurable patterns (default: `vendor/`,
`.claude/`, `node_modules/`).
4. Fetch each markdown file's raw content.
5. Render to HTML via goldmark (GitHub Flavored Markdown).
6. Store the rendered HTML in the in-memory cache, keyed by
`repo/filepath`.
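Steps 2 and 3 above amount to a path filter over the file tree. The sketch below is one way to express it. The function names (`filterDocs`, `isExcluded`) and the matching rule (a pattern matches as a directory prefix anywhere in the path) are assumptions, since the design only says the patterns are configurable.

```go
package main

import (
	"fmt"
	"strings"
)

// filterDocs keeps *.md paths and drops any path under an excluded directory.
func filterDocs(paths, exclude []string) []string {
	var docs []string
	for _, p := range paths {
		if !strings.HasSuffix(p, ".md") {
			continue
		}
		if isExcluded(p, exclude) {
			continue
		}
		docs = append(docs, p)
	}
	return docs
}

// isExcluded reports whether p starts with a pattern or contains it as a
// nested directory (assumed matching rule, not taken from the design).
func isExcluded(p string, exclude []string) bool {
	for _, pat := range exclude {
		if strings.HasPrefix(p, pat) || strings.Contains(p, "/"+pat) {
			return true
		}
	}
	return false
}

func main() {
	tree := []string{
		"README.md",
		"docs/bootstrap.md",
		"vendor/github.com/x/y/README.md",
		".claude/notes.md",
		"web/node_modules/pkg/README.md",
		"main.go",
	}
	exclude := []string{"vendor/", ".claude/", "node_modules/"}
	for _, p := range filterDocs(tree, exclude) {
		fmt.Println(p) // prints README.md, then docs/bootstrap.md
	}
}
```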
### Navigation Structure
The site is organized by repository, then by file path within the repo:
```
/                       → index: list of repos with descriptions
/mcr/                   → repo index: list of docs in mcr
/mcr/ARCHITECTURE       → rendered ARCHITECTURE.md
/mcr/RUNBOOK            → rendered RUNBOOK.md
/mcp/ARCHITECTURE       → rendered mcp ARCHITECTURE.md
/mcp/docs/bootstrap     → rendered mcp docs/bootstrap.md
```
The `.md` extension is stripped from URLs. Directory structure within
a repo is preserved.
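The mapping between a file in a repo and its URL can be sketched as a pair of small helpers. `docURL` and `cacheKey` are hypothetical names. The sketch assumes the cache key is the `repo/filepath` string described under Discovery.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// docURL maps a repo name and a file path inside it to the site URL,
// stripping the .md extension and preserving directories.
func docURL(repo, filePath string) string {
	return "/" + path.Join(repo, strings.TrimSuffix(filePath, ".md"))
}

// cacheKey is the inverse: a document URL path back to "repo/filepath".
func cacheKey(urlPath string) string {
	return strings.Trim(urlPath, "/") + ".md"
}

func main() {
	fmt.Println(docURL("mcr", "ARCHITECTURE.md"))   // /mcr/ARCHITECTURE
	fmt.Println(docURL("mcp", "docs/bootstrap.md")) // /mcp/docs/bootstrap
	fmt.Println(cacheKey("/mcp/docs/bootstrap"))    // mcp/docs/bootstrap.md
}
```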
### Ordering
The repo index page lists documents in a fixed priority order for
well-known filenames, followed by alphabetical:
1. README
2. ARCHITECTURE
3. RUNBOOK
4. CLAUDE
5. (all others, alphabetical)

The top-level index lists repos alphabetically with the first line of
each repo's Gitea description as a subtitle.
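The repo index ordering can be sketched as a two-level comparison: priority rank first, then name. This is a minimal sketch with hypothetical names (`priority`, `rank`, `sortDocs`). It assumes the well-known names match only at the repo root and that the alphabetical fallback ignores case, neither of which the design specifies.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// priority holds the fixed order for well-known document names.
var priority = map[string]int{"README": 0, "ARCHITECTURE": 1, "RUNBOOK": 2, "CLAUDE": 3}

// rank returns the priority of a well-known name, or a shared lowest
// rank for everything else.
func rank(name string) int {
	if r, ok := priority[name]; ok {
		return r
	}
	return len(priority)
}

// sortDocs orders names (already stripped of .md) by rank, then
// case-insensitively by name (an assumed tie-break).
func sortDocs(names []string) {
	sort.SliceStable(names, func(i, j int) bool {
		ri, rj := rank(names[i]), rank(names[j])
		if ri != rj {
			return ri < rj
		}
		return strings.ToLower(names[i]) < strings.ToLower(names[j])
	})
}

func main() {
	names := []string{"docs/bootstrap", "RUNBOOK", "CHANGELOG", "README", "CLAUDE", "ARCHITECTURE"}
	sortDocs(names)
	// prints [README ARCHITECTURE RUNBOOK CLAUDE CHANGELOG docs/bootstrap]
	fmt.Println(names)
}
```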
---
## 4. Caching and Refresh
### Boot Fetch
On startup, mcdoc fetches all content from Gitea. This is the
cold-start path. With ~8 repos and ~50 markdown files total, the full
fetch should take 2-3 seconds. Fetches are parallelized per repo, with
concurrency bounded (`max_concurrency`) to avoid overwhelming Gitea.

mcdoc serves documentation only after the initial fetch completes. During
boot, it returns HTTP 503 with a "loading" page.
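A bounded, per-repo parallel fetch can be sketched with a buffered channel used as a semaphore. `fetchAll` and its `fetch` callback are hypothetical. The real Gitea client call is stubbed out, and `limit` stands in for the configured concurrency bound.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch once per repo with at most limit calls in flight
// and returns the errors keyed by repo name.
func fetchAll(repos []string, limit int, fetch func(repo string) error) map[string]error {
	if limit < 1 {
		limit = 1 // an unbuffered semaphore would block forever
	}
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		errs = make(map[string]error)
		sem  = make(chan struct{}, limit)
	)
	for _, repo := range repos {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit fetches are running
		go func(repo string) {
			defer wg.Done()
			defer func() { <-sem }()
			if err := fetch(repo); err != nil {
				mu.Lock()
				errs[repo] = err
				mu.Unlock()
			}
		}(repo)
	}
	wg.Wait()
	return errs
}

func main() {
	repos := []string{"mcr", "mcp", "metacrypt"}
	errs := fetchAll(repos, 4, func(repo string) error {
		if repo == "metacrypt" {
			return fmt.Errorf("simulated timeout")
		}
		return nil
	})
	fmt.Println(len(errs), errs["metacrypt"]) // 1 simulated timeout
}
```

Collecting errors per repo rather than aborting on the first one matches the resilience goal of serving whatever could be fetched.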
### Webhook Refresh
Gitea sends a push webhook to `mcdoc.svc.mcp.metacircular.net/webhook`
on every push to a configured repo's default branch. mcdoc re-fetches
that repo's file tree and content, re-renders, and atomically swaps the
cache entries for that repo. In-flight requests to other repos are
unaffected.

The webhook endpoint validates the Gitea webhook secret (shared secret,
HMAC-SHA256 signature in `X-Gitea-Signature` header).
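Signature validation can be sketched with the standard library alone. The sketch assumes the header value is the hex-encoded HMAC-SHA256 of the raw request body; that encoding should be confirmed against the Gitea webhook documentation. `validSignature` and `sign` are hypothetical helper names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// validSignature reports whether sigHex is the hex-encoded HMAC-SHA256 of
// body under secret. hmac.Equal compares in constant time.
func validSignature(secret, body []byte, sigHex string) bool {
	sig, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(sig, mac.Sum(nil))
}

// sign produces a header value for the demo; in production the sender
// (Gitea) computes it.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret")
	body := []byte(`{"ref":"refs/heads/master"}`)
	header := sign(secret, body)

	fmt.Println(validSignature(secret, body, header))               // true
	fmt.Println(validSignature(secret, []byte("tampered"), header)) // false
	fmt.Println(validSignature(secret, body, "not-hex"))            // false
}
```

The handler needs to read the raw body before JSON decoding, since the signature covers the exact bytes that were sent.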
### Fallback Poll
As a safety net, mcdoc polls Gitea for changes every 15 minutes. This
catches missed webhooks (network blips, Gitea restarts). The poll
checks each repo's latest commit SHA against the cached version and
only re-fetches repos that have changed.
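The poll comparison reduces to a map diff between cached and current commit SHAs. `changedRepos` is a hypothetical helper; the sketch treats a repo that is missing from the cache as changed and does not handle repos that were deleted upstream.

```go
package main

import (
	"fmt"
	"sort"
)

// changedRepos returns, in sorted order, the repos whose latest commit SHA
// differs from the cached one. Repos absent from cached count as changed.
func changedRepos(cached, latest map[string]string) []string {
	var changed []string
	for repo, sha := range latest {
		if cached[repo] != sha {
			changed = append(changed, repo)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	cached := map[string]string{"mcr": "aaa111", "mcp": "bbb222"}
	latest := map[string]string{"mcr": "aaa111", "mcp": "ccc333", "metacrypt": "ddd444"}
	fmt.Println(changedRepos(cached, latest)) // [mcp metacrypt]
}
```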
### Resilience
- **Gitea unreachable at boot**: mcdoc starts, serves a "docs
unavailable, retrying" page, and retries the fetch every 30 seconds
until it succeeds.
- **Gitea unreachable after boot**: stale cache continues serving.
Readers see the last-known-good content. Poll/webhook failures are
logged but do not affect availability.
- **Single file fetch failure**: skip the file, log a warning, serve
the rest of the repo's docs. Retry on next poll cycle.
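The boot-time behaviour can be sketched as a readiness gate in front of the normal handlers. `ready`, `gate`, `docs`, and `status` are hypothetical names; the sketch assumes a single process-wide flag that the boot fetch sets once the cache is populated.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// ready flips to true once the first full fetch has populated the cache.
var ready atomic.Bool

// docs stands in for the real documentation handlers.
var docs http.Handler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "ok")
})

// gate answers 503 until ready is set, then defers to next.
func gate(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !ready.Load() {
			http.Error(w, "docs unavailable, retrying", http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// status issues a GET against h and returns the response code.
func status(h http.Handler, target string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	h := gate(docs)
	fmt.Println(status(h, "/health")) // 503 before the first fetch completes
	ready.Store(true)                 // the boot fetch would do this on success
	fmt.Println(status(h, "/health")) // 200 afterwards
}
```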
---
## 5. Rendering
### Markdown
goldmark with the following extensions:
- GitHub Flavored Markdown (tables, strikethrough, autolinks, task lists)
- Syntax highlighting for fenced code blocks (chroma)
- Heading anchors (linkable `#section-name` fragments)
- Table of contents generation (extracted from headings, rendered as a
sidebar or top-of-page nav)
### HTML Output
Each markdown file is rendered into a page template with:
- **Header**: site title, repo name, breadcrumb navigation
- **Sidebar**: document list for the current repo (persistent nav)
- **Content**: rendered markdown
- **Footer**: last-updated timestamp (from Gitea commit metadata)

The page template uses htmx for navigation — clicking a doc link swaps
the content pane without a full page reload, keeping the sidebar state.
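On the server side, the content swap comes down to checking the `HX-Request` header and skipping the layout when it is present. The sketch below uses a stand-in template and hypothetical names (`page`, `serveDoc`, `render`); the real templates live under `web/templates/`.

```go
package main

import (
	"fmt"
	"html/template"
	"net/http"
	"net/http/httptest"
)

// page is a stand-in for the full layout (header, sidebar, footer).
var page = template.Must(template.New("page").Parse(
	`<html><body><nav>sidebar</nav><main id="content">{{.}}</main></body></html>`))

// serveDoc writes only the fragment for htmx requests and the full layout
// otherwise. content is assumed to be already-rendered, trusted HTML.
func serveDoc(content template.HTML) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		if r.Header.Get("HX-Request") != "" {
			fmt.Fprint(w, content)
			return
		}
		if err := page.Execute(w, content); err != nil {
			http.Error(w, "render error", http.StatusInternalServerError)
		}
	}
}

// render runs the handler once and returns the response body.
func render(h http.Handler, htmx bool) string {
	req := httptest.NewRequest(http.MethodGet, "/mcr/ARCHITECTURE", nil)
	if htmx {
		req.Header.Set("HX-Request", "true")
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	h := serveDoc("<h1>mcr</h1>")
	fmt.Println(render(h, true))  // fragment only
	fmt.Println(render(h, false)) // fragment wrapped in the layout
}
```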
### Styling
Clean, readable typography optimized for long-form technical documents.
The design should prioritize readability:
- Serif or readable sans-serif body text
- Generous line height and margins
- Constrained content width (~70ch)
- Syntax-highlighted code blocks with a muted theme
- Responsive layout (readable on mobile)
---
## 6. Configuration
TOML configuration at `/srv/mcdoc/mcdoc.toml`:
```toml
[server]
listen_addr = ":8080"
[gitea]
url = "https://git.wntrmute.dev"
org = "mc"
webhook_secret = "..."
poll_interval = "15m"
fetch_timeout = "30s"
max_concurrency = 4
[gitea.exclude_paths]
patterns = ["vendor/", ".claude/", "node_modules/", ".junie/"]
[gitea.exclude_repos]
names = []
[log]
level = "info"
```
Environment variable overrides follow platform convention: `MCDOC_*`
(e.g., `MCDOC_GITEA_WEBHOOK_SECRET`).

---
## 7. Deployment
### Container
Single binary, single container. Multi-stage Docker build per platform
convention (`golang:alpine` builder, `alpine` runtime).

mcdoc listens on a single HTTP port. mc-proxy handles TLS termination
and routes `docs.metacircular.net` to mcdoc's listener.
### MCP Service Definition
```toml
name = "mcdoc"
node = "rift"
version = "v0.1.0"
[build.images]
mcdoc = "Dockerfile"
[[components]]
name = "mcdoc"
[[components.routes]]
port = 443
mode = "l7"
hostname = "docs.metacircular.net"
```
Port assignment is pending MCP support for automatic port allocation
and mc-proxy route registration (see `PLATFORM_EVOLUTION.md`). Until
then, a manually assigned port will be used.
### mc-proxy Routes
```toml
# On :443 listener
[[listeners.routes]]
hostname = "docs.metacircular.net"
backend = "127.0.0.1:<port>"
mode = "l7"
tls_cert = "/srv/mc-proxy/certs/docs.pem"
tls_key = "/srv/mc-proxy/certs/docs.key"
backend_tls = false
[[listeners.routes]]
hostname = "mcdoc.svc.mcp.metacircular.net"
backend = "127.0.0.1:<port>"
mode = "l7"
tls_cert = "/srv/mc-proxy/certs/mcdoc-svc.pem"
tls_key = "/srv/mc-proxy/certs/mcdoc-svc.key"
backend_tls = false
```
Note: `backend_tls = false` — mcdoc is plain HTTP behind mc-proxy.
This is safe because mc-proxy and mcdoc are on the same host. TLS is
terminated at mc-proxy.
### DNS
| Record | Value |
|--------|-------|
| `docs.metacircular.net` | Public DNS → rift's public IP |
| `mcdoc.svc.mcp.metacircular.net` | Internal DNS (MCNS) → rift |
---
## 8. Package Structure
```
cmd/mcdoc/          CLI entry point (cobra: server subcommand)
internal/
  config/           TOML config loading and validation
  gitea/            Gitea API client (list repos, fetch trees, fetch files)
  cache/            In-memory content cache (atomic swap per repo)
  render/           goldmark rendering pipeline
  server/           HTTP server, chi routes, htmx handlers
web/
  templates/        Go html/template files (index, repo, doc, error)
  static/           CSS, favicon
```
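The per-repo atomic swap noted for `cache/` could look like the sketch below. The type and method names (`Cache`, `SwapRepo`, `Get`) are hypothetical, and a read-write mutex is only one possible mechanism. The idea is that a refresh builds a repo's new document map off to the side and installs it in a single step, so readers never see a half-updated repo.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache maps repo name -> (file path -> rendered HTML).
type Cache struct {
	mu    sync.RWMutex
	repos map[string]map[string]string
}

// NewCache returns an empty cache.
func NewCache() *Cache {
	return &Cache{repos: make(map[string]map[string]string)}
}

// SwapRepo replaces all entries for repo in one step. Callers must not
// mutate docs after handing it over.
func (c *Cache) SwapRepo(repo string, docs map[string]string) {
	c.mu.Lock()
	c.repos[repo] = docs
	c.mu.Unlock()
}

// Get returns the rendered HTML for repo/path, if cached.
func (c *Cache) Get(repo, path string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	html, ok := c.repos[repo][path]
	return html, ok
}

func main() {
	c := NewCache()
	c.SwapRepo("mcr", map[string]string{"ARCHITECTURE.md": "<h1>old</h1>"})
	c.SwapRepo("mcr", map[string]string{"RUNBOOK.md": "<h1>new</h1>"})

	_, ok := c.Get("mcr", "ARCHITECTURE.md")
	fmt.Println(ok) // false: the old entry went away with the swap
	html, _ := c.Get("mcr", "RUNBOOK.md")
	fmt.Println(html) // <h1>new</h1>
}
```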
---
## 9. Routes
| Method | Path | Description |
|--------|------|-------------|
| GET | `/` | Top-level index (repo list) |
| GET | `/{repo}/` | Repo index (doc list) |
| GET | `/{repo}/{path...}` | Rendered document |
| POST | `/webhook` | Gitea push webhook receiver |
| GET | `/health` | Health check (200 if cache is populated, 503 if not) |

htmx partial responses: when `HX-Request` header is present, return
only the content fragment (no surrounding layout). This enables
client-side navigation without full page reloads.

---
## 10. Future Work
| Item | Description |
|------|-------------|
| **Search** | Full-text search across all docs (bleve or similar) |
| **Cross-linking** | Resolve relative markdown links across repos |
| **Mermaid/diagrams** | Render mermaid fenced blocks as SVG |
| **Dark mode** | Theme toggle (light/dark) |
| **Pinned versions** | Serve docs at a specific git tag |