Add architecture and project plan documentation

ARCHITECTURE.md covers the system design: exod backend, single Kotlin
desktop app (Obsidian-style), layered architecture, data flow, CAS blob
store, cross-pillar integration, and key design decisions.

PROJECT_PLAN.md defines six implementation phases from foundation through
remote access, with concrete deliverables per phase.

CLAUDE.md updated to reference both documents and reflect the single-app
UI decision with unified search.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:18:54 -07:00
parent ea32237279
commit 83f4b327f3
3 changed files with 347 additions and 2 deletions

ARCHITECTURE.md Normal file

@@ -0,0 +1,205 @@
# Architecture
Technical reference for kExocortex — a personal knowledge management system that combines an **artifact repository** (source documents) with a **knowledge graph** (notes and ideas) into a unified, searchable exocortex.
Core formula: **artifacts + notes + graph structure = exocortex**
## Tech Stack
| Role | Technology |
|------|-----------|
| Backend server, CLI tools | Go |
| Desktop application | Kotlin |
| Metadata storage | SQLite |
| Blob storage | Content-addressable store (SHA256) |
| Client-server communication | gRPC / Protobuf |
| Remote blob backup | Minio |
| Secure remote access | Tailscale |
## System Components
```
┌──────────────────┐       gRPC       ┌──────────────────────────┐
│  Kotlin Desktop  │◄════════════════►│                          │
│ (all UI facets)  │                  │                          │
└──────────────────┘                  │           exod           │
                                      │       (Go daemon)        │
┌──────────────────┐       gRPC       │                          │
│    CLI tools     │◄════════════════►│  sole owner of all data  │
│  (Go binaries)   │                  │                          │
└──────────────────┘                  └─────┬──────┬──────┬──────┘
                                            │      │      │
                              ┌─────────────┘      │      └─────────────┐
                              │                    │                    │
                        ┌─────▼──────┐      ┌──────▼───────┐     ┌──────▼──────┐
                        │   SQLite   │      │  Local Blob  │     │    Minio    │
                        │  Database  │      │  Store (CAS) │     │  (remote)   │
                        └────────────┘      └──────────────┘     └─────────────┘

Remote access:

┌────────┐   HTTPS   ┌─────────────────────┐  Tailscale  ┌──────┐
│ Mobile │──────────►│    Reverse Proxy    │────────────►│ exod │
│ Device │           │  (TLS + basic auth) │             │      │
└────────┘           └─────────────────────┘             └──────┘
```
Three runtime components exist:
- **exod** — The Go backend daemon. Sole owner of the SQLite database and blob store. All reads and writes go through exod. No client accesses storage directly.
- **Kotlin desktop app** — A single application for both artifact management and knowledge graph interaction. Obsidian-style layout: tree/outline sidebar for navigation, contextual main panel, graph visualization, and unified search with selector prefixes. Connects to exod via gRPC.
- **CLI tools** — Go binaries for scripting, bulk operations, and administrative tasks. Also connect via gRPC.
## Layered Architecture
### Layer 1: Storage
Two storage mechanisms, separated by purpose:
**SQLite database** stores all metadata — everything that needs to be queried, filtered, or joined. This includes artifact headers, citations, tags, categories, publisher info, snapshot records, blob registry entries, and knowledge graph facts. A single unified database is used (rather than split databases) so that tags and categories are shared across both pillars.
**Content-addressable blob store** stores the actual artifact content (PDFs, images, web snapshots, etc.) on the local filesystem. Files are addressed by the SHA256 hash of their contents, stored in a hierarchical directory layout. This separation exists because blobs are large, opaque, and benefit from deduplication, while SQLite is not suited for large binary storage.
Together, the database and blob store form a single logical unit that must stay consistent.
### Layer 2: Domain Model
Three Go packages implement the data model:
**`core`** — Shared types used by both pillars:
- `Header` (ID, Type, Created, Modified, Categories, Tags, Meta)
- `Metadata` (map of string keys to typed `Value` structs)
- UUID generation
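
A minimal sketch of how the shared `core` types listed above might look in Go. The field types, the shape of `Value`, and the helper names `NewUUID` and `NewHeader` are assumptions for illustration; the canonical definitions live in the spec referenced later in this document.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// Value is a typed metadata value. The Type/Contents split is an assumption.
type Value struct {
	Type     string
	Contents string
}

// Metadata maps string keys to typed values.
type Metadata map[string]Value

// Header carries the fields named in the list above; field types are assumed.
type Header struct {
	ID         string
	Type       string
	Created    time.Time
	Modified   time.Time
	Categories []string
	Tags       []string
	Meta       Metadata
}

// NewUUID returns a random (version 4) UUID in canonical text form.
func NewUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewHeader builds a header with a fresh ID and matching timestamps.
func NewHeader(kind string) (Header, error) {
	id, err := NewUUID()
	if err != nil {
		return Header{}, err
	}
	now := time.Now().UTC()
	return Header{ID: id, Type: kind, Created: now, Modified: now, Meta: Metadata{}}, nil
}

func main() {
	h, err := NewHeader("artifact")
	if err != nil {
		panic(err)
	}
	h.Tags = append(h.Tags, "reading-list")
	h.Meta["source"] = Value{Type: "string", Contents: "manual entry"}
	fmt.Println(h.ID, h.Type, h.Tags)
}
```

Generating IDs inside the shared package keeps both pillars on the same identifier format, which the polymorphic metadata and tag tables depend on.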
**`artifacts`** — The artifact repository pillar. Key relationship chain:
```
Artifact ──► Snapshot(s) ──► Blob(s)
   │              │
   ▼              ▼
Citation      Citation (can override parent)
```
An Artifact has a type (Article, Book, URL, Paper, Video, Image, etc.), a history of Snapshots keyed by datetime, and a top-level Citation. Each Snapshot can have its own Citation that overrides or extends the artifact-level one (e.g., a specific edition of a book). Each Snapshot contains Blobs keyed by MIME type.
See `docs/KExocortex/Spec/Artifacts.md` for canonical type definitions.
**`kg`** — The knowledge graph pillar:
- **Node** — An entity in the graph, containing Cells
- **Cell** — A content unit within a note (markdown, code, etc.), inspired by Quiver's cell-based structure
- **Fact** — An entity-attribute-value tuple with a transaction timestamp and retraction flag, based on the protobuf model in `docs/KExocortex/KnowledgeGraph/Tuple.md`
Nodes are conceptually `Node = Note | ArtifactLink` — they can be original analysis or references to artifacts.
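
To illustrate how a retraction flag interacts with transaction timestamps, here is a hedged sketch that replays a fact log and keeps only the assertions that have not been withdrawn. The field names, the integer transaction counter, and the `CurrentFacts` helper are assumptions; the actual model is the protobuf definition referenced above.

```go
package main

import (
	"fmt"
	"sort"
)

// Fact is an entity-attribute-value tuple (field names are assumptions).
type Fact struct {
	Entity    string
	Attribute string
	Value     string
	Tx        int64 // transaction timestamp, modelled here as a counter
	Retracted bool  // true if this entry withdraws an earlier assertion
}

// CurrentFacts replays the log in transaction order and returns the facts
// that are still asserted, ordered by the transaction that asserted them.
func CurrentFacts(log []Fact) []Fact {
	ordered := append([]Fact(nil), log...)
	sort.SliceStable(ordered, func(i, j int) bool { return ordered[i].Tx < ordered[j].Tx })

	type key struct{ e, a, v string }
	live := map[key]Fact{}
	for _, f := range ordered {
		k := key{f.Entity, f.Attribute, f.Value}
		if f.Retracted {
			delete(live, k) // a retraction cancels the matching assertion
		} else {
			live[k] = f
		}
	}

	out := make([]Fact, 0, len(live))
	for _, f := range live {
		out = append(out, f)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Tx < out[j].Tx })
	return out
}

func main() {
	log := []Fact{
		{Entity: "node-1", Attribute: "title", Value: "Draft", Tx: 1},
		{Entity: "node-1", Attribute: "cites", Value: "artifact-9", Tx: 2},
		{Entity: "node-1", Attribute: "title", Value: "Draft", Tx: 3, Retracted: true},
		{Entity: "node-1", Attribute: "title", Value: "Final", Tx: 4},
	}
	for _, f := range CurrentFacts(log) {
		fmt.Println(f.Entity, f.Attribute, f.Value)
	}
}
```

Because nothing is overwritten, the full log still records that the title was once "Draft", while the current view shows only "Final" and the citation link.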
### Layer 3: Service
The `exod` gRPC server is the exclusive gateway to all data:
- Manages transaction boundaries (begin, commit/rollback)
- Handles blob lifecycle (hash content, write to CAS, register in SQLite, queue for Minio sync)
- Runs the Minio sync queue for asynchronous backup replication
- Exposes gRPC endpoints defined in `.proto` files for all CRUD operations on both pillars
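
The asynchronous sync queue mentioned above could be shaped roughly as follows. This is a sketch under stated assumptions: `Uploader` is a hypothetical stand-in for a real Minio client call, and `SyncQueue`, `NewSyncQueue`, and `ErrUploadFailed` are illustrative names, not part of the design documents. It uses one background worker and a fixed retry count.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrUploadFailed is a placeholder error for the demo uploader.
var ErrUploadFailed = errors.New("upload failed")

// Uploader copies one blob to remote storage (hypothetical interface).
type Uploader func(blobID string) error

// SyncQueue replicates blobs in the background with a single worker.
type SyncQueue struct {
	jobs   chan string
	wg     sync.WaitGroup
	failed []string // written only by the worker, read after Close
}

// NewSyncQueue starts the worker. Each blob is tried up to maxAttempts times.
func NewSyncQueue(upload Uploader, maxAttempts, buffer int) *SyncQueue {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	q := &SyncQueue{jobs: make(chan string, buffer)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for id := range q.jobs {
			var err error
			for attempt := 0; attempt < maxAttempts; attempt++ {
				if err = upload(id); err == nil {
					break
				}
			}
			if err != nil {
				q.failed = append(q.failed, id)
			}
		}
	}()
	return q
}

// Enqueue schedules a blob for replication without waiting for the upload.
func (q *SyncQueue) Enqueue(blobID string) { q.jobs <- blobID }

// Close stops accepting work, waits for the worker to drain the queue,
// and returns the IDs that still failed after all attempts.
func (q *SyncQueue) Close() []string {
	close(q.jobs)
	q.wg.Wait()
	return q.failed
}

func main() {
	calls := map[string]int{}
	flaky := func(id string) error {
		calls[id]++
		if calls[id] == 1 {
			return ErrUploadFailed // first attempt always fails in this demo
		}
		return nil
	}
	q := NewSyncQueue(flaky, 3, 8)
	q.Enqueue("blob-a")
	q.Enqueue("blob-b")
	fmt.Println("failed:", q.Close(), "attempts:", calls)
}
```

A production queue would likely persist pending IDs so that a restart does not lose them; this sketch keeps everything in memory.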
### Layer 4: Presentation
A single Kotlin desktop application handles both artifact management and knowledge graph interaction, following the Obsidian model. CLI tools provide a scriptable alternative.
#### Desktop Application Layout
```
┌─────────────────────────────────────────────────────────────┐
│  [Command Palette: Ctrl+Shift+A]     [Search: Ctrl+F]       │
├──────────────┬──────────────────────────────────────────────┤
│              │                                              │
│  Sidebar     │  Main Panel                                  │
│              │                                              │
│  Tree/       │  Contextual view based on selection:         │
│  Outline     │  • Note editor (cell-based)                  │
│  View        │  • Artifact detail (citation, snapshots)     │
│              │  • Search results                            │
│              │  • Catalog (items needing attention)         │
│              │                                              │
├──────────────┴──────────────────────────────────────────────┤
│  [Graph View toggle]                                        │
└─────────────────────────────────────────────────────────────┘
```
**Sidebar** — Tree/outline view of the knowledge graph hierarchy as primary navigation. Artifacts appear under their linked nodes; unlinked artifacts appear in a dedicated section. Collapsible, like Obsidian's file explorer.
**Main panel** — Changes contextually:
- **Note view**: Cell-based editor (markdown, code blocks). Associated artifacts listed inline. Dendron-style Ctrl+L for note creation.
- **Artifact view**: Citation details, snapshot history, blob preview (PDF, HTML). Tag/category editing. Link to nodes.
- **Search view**: Unified results from both pillars. Selector prefixes for precision: `artifact:`, `note:`, `cell:`, `tag:`, `author:`, `doi:`.
- **Catalog view**: Surfaces untagged, uncategorized, or unlinked artifacts needing attention.
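
The selector prefixes in the search view suggest a small parsing step before a query is dispatched. The sketch below shows one way to split a query into a selector and a term; it is written in Go for consistency with the backend, and the `Query` type and `ParseQuery` function are hypothetical names. Input without a recognized prefix is treated as an unscoped search across both pillars.

```go
package main

import (
	"fmt"
	"strings"
)

// selectors lists the prefixes named in the search view description.
var selectors = map[string]bool{
	"artifact": true, "note": true, "cell": true,
	"tag": true, "author": true, "doi": true,
}

// Query is a parsed search request. An empty Selector means "search everything".
type Query struct {
	Selector string
	Term     string
}

// ParseQuery splits "selector:term" input. Unknown prefixes (for example the
// "https" in a URL) are left as part of the search term.
func ParseQuery(input string) Query {
	input = strings.TrimSpace(input)
	if prefix, rest, ok := strings.Cut(input, ":"); ok {
		if name := strings.ToLower(prefix); selectors[name] {
			return Query{Selector: name, Term: strings.TrimSpace(rest)}
		}
	}
	return Query{Term: input}
}

func main() {
	for _, in := range []string{"tag:distributed-systems", "doi:10.1000/xyz", "https://example.com", "raft consensus"} {
		q := ParseQuery(in)
		fmt.Printf("%q -> selector=%q term=%q\n", in, q.Selector, q.Term)
	}
}
```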
**Graph view** — Secondary visualization available as a toggle or separate pane, showing nodes and their connections (like Obsidian's graph view). Useful for exploration and discovering clusters.
**Command palette** — Ctrl+Shift+A (IntelliJ-style) for quick actions: create note, import artifact, search, switch views, manage tags.
#### CLI Tools
Go binaries connecting to exod via gRPC for automation, bulk operations, and scripting. Commands: `import`, `tag`, `cat`, `search`.
## Data Flow
### Importing an Artifact
1. Client sends artifact metadata (citation, tags, categories) and blob data to exod via gRPC
2. exod begins a database transaction
3. Tags and categories are created if they don't exist (idempotent upsert)
4. Publisher is resolved (lookup by name+address, create if missing)
5. Citation is stored with publisher FK and author records
6. Artifact header is stored with citation FK
7. For each snapshot: store snapshot record, then for each blob: compute SHA256, write file to CAS directory, insert blob record
8. History entries are recorded linking artifact to snapshots by datetime
9. Transaction commits
10. Blobs are queued for Minio sync
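
The transaction boundary in the steps above (begin, run each step, commit or roll back, and only then queue blobs) can be sketched as a small helper. The `Tx` interface, `WithTx`, and the in-memory `recordingTx` are assumptions made for illustration; `*sql.Tx` from `database/sql` has the same `Commit` and `Rollback` methods, so a real implementation could plug in there.

```go
package main

import (
	"errors"
	"fmt"
)

var errStepFailed = errors.New("step failed")

// Tx is the part of a database transaction this sketch needs.
type Tx interface {
	Commit() error
	Rollback() error
}

// WithTx runs the steps in order inside one transaction. The first failing
// step triggers a rollback; the transaction commits only if all succeed.
func WithTx(begin func() (Tx, error), steps ...func(Tx) error) error {
	tx, err := begin()
	if err != nil {
		return err
	}
	for _, step := range steps {
		if err := step(tx); err != nil {
			if rbErr := tx.Rollback(); rbErr != nil {
				return fmt.Errorf("%w (rollback also failed: %v)", err, rbErr)
			}
			return err
		}
	}
	return tx.Commit()
}

// recordingTx is an in-memory stand-in that records what happened.
type recordingTx struct {
	log        []string
	committed  bool
	rolledBack bool
}

func (t *recordingTx) Commit() error   { t.committed = true; return nil }
func (t *recordingTx) Rollback() error { t.rolledBack = true; return nil }

// step returns a named import step that optionally fails.
func step(name string, fail bool) func(Tx) error {
	return func(tx Tx) error {
		if fail {
			return fmt.Errorf("%s: %w", name, errStepFailed)
		}
		rec := tx.(*recordingTx)
		rec.log = append(rec.log, name)
		return nil
	}
}

func main() {
	tx := &recordingTx{}
	err := WithTx(func() (Tx, error) { return tx, nil },
		step("upsert tags and categories", false),
		step("resolve publisher", false),
		step("store citation", false),
		step("store artifact header", false),
		step("store snapshots and blobs", false),
		step("record history", false),
	)
	if err == nil {
		// Blobs are queued for remote sync only after a successful commit.
		fmt.Println("committed:", tx.committed, "steps:", len(tx.log))
	}
}
```

One point this sketch leaves open is that blob files written to the CAS directory are not covered by the database rollback, so a failed import may leave orphaned files that need separate cleanup.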
### Querying by Tag
1. Client sends a tag string to exod
2. Tag name is resolved to its UUID via the `tags` table
3. The `artifact_tags` junction table is queried for matching artifact IDs
4. Full artifact headers are hydrated (citation, publisher, tags, categories, metadata)
5. Results are returned; blob data is not fetched until explicitly requested
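
The lookup sequence above (name to tag ID, tag ID to artifact IDs, artifact IDs to headers) is sketched below with in-memory maps standing in for the `tags` and `artifact_tags` tables. The `Store` and `Header` shapes and the sample data are hypothetical; a real implementation would issue SQL joins instead.

```go
package main

import (
	"fmt"
	"sort"
)

// Header is a reduced stand-in for a hydrated artifact header.
type Header struct {
	ID    string
	Title string
}

// Store mimics the tables involved in a tag query.
type Store struct {
	tagIDs       map[string]string   // tags: name -> tag ID
	artifactTags map[string][]string // artifact_tags: tag ID -> artifact IDs
	headers      map[string]Header   // artifact ID -> header
}

// ByTag resolves the tag name, follows the junction mapping, and hydrates
// headers. Blob contents are never touched on this path.
func (s *Store) ByTag(name string) []Header {
	tagID, ok := s.tagIDs[name]
	if !ok {
		return nil // unknown tag: no results
	}
	ids := s.artifactTags[tagID]
	out := make([]Header, 0, len(ids))
	for _, id := range ids {
		if h, found := s.headers[id]; found {
			out = append(out, h)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Title < out[j].Title })
	return out
}

// sampleStore returns a small data set for the demo.
func sampleStore() *Store {
	return &Store{
		tagIDs:       map[string]string{"consensus": "t1", "storage": "t2"},
		artifactTags: map[string][]string{"t1": {"a2", "a1"}, "t2": {"a3"}},
		headers: map[string]Header{
			"a1": {ID: "a1", Title: "Paxos notes"},
			"a2": {ID: "a2", Title: "Raft paper"},
			"a3": {ID: "a3", Title: "LSM trees"},
		},
	}
}

func main() {
	for _, h := range sampleStore().ByTag("consensus") {
		fmt.Println(h.ID, h.Title)
	}
}
```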
### Creating a Knowledge Graph Note
1. Client sends node metadata and cell contents
2. exod creates a Node with a UUID
3. Cells are stored with their content type (markdown, code, etc.)
4. Facts are recorded as EAV tuples linking the node to attributes, other nodes, and artifacts
5. Tags from the note content are cross-referenced with the shared tag pool
## Content-Addressable Store
- **Addressing**: SHA256 hash of blob contents, rendered as a hex string
- **Directory layout**: Hash split into 4-character segments as nested directories (e.g., `a1b2c3d4...` → `a1b2/c3d4/.../a1b2c3d4...`)
- **Deduplication**: Identical content from different snapshots shares the same blob — same hash, same file
- **Registry**: The `blobs` table in SQLite stores `(snapshot_id, blob_id, format)` where `blob_id` is the SHA256 hash
- **Backup**: Minio sync queue replicates blobs to remote S3-compatible storage asynchronously
- **Retrieval**: An optional HTTP endpoint (`GET /artifacts/blob/{id}`) may be added for direct blob access
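
A minimal sketch of the addressing, directory layout, and deduplication rules above. It assumes that every 4-character segment of the digest becomes a directory level and that the full digest is the file name; the list above elides the middle of the path, so the exact depth is an assumption. `Digest`, `Segments`, `BlobPath`, `Put`, and `storeTwice` are illustrative names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// Digest returns the SHA256 hash of content as a lowercase hex string.
func Digest(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// Segments cuts a hex digest into 4-character pieces.
func Segments(digest string) []string {
	var segs []string
	for i := 0; i+4 <= len(digest); i += 4 {
		segs = append(segs, digest[i:i+4])
	}
	return segs
}

// BlobPath nests the segments as directories under root and uses the full
// digest as the file name.
func BlobPath(root, digest string) string {
	parts := append([]string{root}, Segments(digest)...)
	return filepath.Join(append(parts, digest)...)
}

// Put stores content and reports whether a new file was created. Identical
// content maps to the same path, so a second write is skipped.
func Put(root string, content []byte) (string, bool, error) {
	digest := Digest(content)
	path := BlobPath(root, digest)
	_, err := os.Stat(path)
	if err == nil {
		return digest, false, nil // already stored: deduplicated
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return "", false, err
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return "", false, err
	}
	if err := os.WriteFile(path, content, 0o644); err != nil {
		return "", false, err
	}
	return digest, true, nil
}

// storeTwice writes the same content twice into a scratch directory and
// reports whether each call created a new file.
func storeTwice(content []byte) (first, second bool, err error) {
	root, err := os.MkdirTemp("", "cas-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)
	if _, first, err = Put(root, content); err != nil {
		return false, false, err
	}
	_, second, err = Put(root, content)
	return first, second, err
}

func main() {
	first, second, err := storeTwice([]byte("example snapshot bytes"))
	if err != nil {
		panic(err)
	}
	fmt.Println("first write created file:", first)
	fmt.Println("second write created file:", second)
}
```

The `blob_id` recorded in the `blobs` table would be the value returned by `Digest`, so the path can always be recomputed from the registry entry.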
## Cross-Pillar Integration
The architectural core that makes kExocortex more than the sum of its parts:
- **Shared taxonomy**: Tags and categories exist in a single pool used by both artifacts and knowledge graph nodes. This enables cross-pillar queries: "show me everything tagged X."
- **Node-to-artifact links**: Knowledge graph nodes can reference artifacts by ID, so the graph contains both original analysis and source material references.
- **Shared metadata**: The polymorphic `metadata` table uses the owner's UUID as a foreign key, attaching key-value metadata to any object in either pillar.
- **Cell-artifact bridging**: A Cell within a note can embed references to artifacts, linking prose analysis directly to source material.
## Network & Access
- **Local-first**: exod, the database, and the blob store all live on the local filesystem. Full functionality requires no network.
- **Tailscale reverse proxy**: For remote/mobile access. TLS and HTTP basic auth terminate at the proxy, not at exod.
- **Minio backup**: Blob replication to remote S3-compatible storage, managed by an async sync queue in exod. This is a backup/restore mechanism, not a primary access path.
## Key Design Decisions
| Decision | Alternative | Rationale |
|----------|-------------|-----------|
| Single unified SQLite database | Split databases per pillar | Shared tag/category pool, single transaction scope, simpler backup. Because exod is the only process that opens the database, clients never contend for SQLite locks. |
| Content-addressable blob store | Store blobs in SQLite | Blobs can be arbitrarily large (PDFs, videos). CAS provides deduplication. SQLite isn't designed for large binary storage. |
| gRPC / Protobuf | REST / JSON | Typed contracts, efficient binary serialization, bidirectional streaming for future use (e.g., upload progress). |
| Kotlin desktop app | Web frontend | Desktop-native performance for large document collections. Offline-capable. No browser dependency. |
| SQLite | PostgreSQL | Zero ops cost, single-file backup, embedded in server process. Single-user system doesn't need concurrent write scaling. |


@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 **kExocortex** is a personal knowledge management system — an "exocortex" for capturing, organizing, and retrieving knowledge. It combines two pillars: an **artifact repository** (for storing source documents like PDFs, papers, webpages) and a **knowledge graph** (for linking notes and ideas).
-The project is in active design and early implementation. The design docs in `docs/` are the primary working material.
+The project is in active design and early implementation. See `ARCHITECTURE.md` for the technical system design and `PROJECT_PLAN.md` for the phased implementation plan.
 ## Repository Structure
@@ -38,7 +38,7 @@ The system design calls for:
 3. Local blob store (content-addressable)
 4. Remote Minio backup for blobs
 5. Reverse-proxy frontend over Tailscale for remote/mobile access
-6. Kotlin desktop apps covering four UI facets: query, exploration, presentation, and update
+6. Single Kotlin desktop app (Obsidian-style layout) with tree sidebar, contextual main panel, graph view, and unified search with selector prefixes
 ## Git Remote

PROJECT_PLAN.md Normal file

@@ -0,0 +1,140 @@
# Project Plan
Implementation plan for kExocortex, organized into phases with concrete deliverables.
## Current State
**What exists:**
- Comprehensive design documentation in `docs/KExocortex/` covering the artifact data model, knowledge graph design, system architecture, UI considerations, and taxonomy
- Three archived implementations in `ark/` (Go v1, Go v2, Java) that validated the artifact repository data model, SQLite schema, and content-addressable blob store
- A proven database schema (11 tables) and Go domain types for the artifact pillar
- Protobuf sketches for the knowledge graph EAV tuple model
**What doesn't exist yet:**
- Active codebase (all code is archived)
- The `exod` gRPC server
- Knowledge graph implementation beyond stubs
- Kotlin desktop application
- Minio sync queue
- Protobuf/gRPC service definitions
## Phase 1: Foundation
Establish the Go project structure, shared types, and database infrastructure.
**Deliverables:**
- Go module (`go.mod`) with project structure
- `core` package: `Header`, `Metadata`, `Value` types, UUID generation
- SQLite migration framework and initial schema (ported from `ark/go-v2/schema/artifacts.sql`)
- Database access layer: connection management, transaction helpers (`StartTX`/`EndTX` pattern)
- Configuration: paths for database, blob store, Minio endpoint
**Key references:**
- `docs/KExocortex/Spec/Artifacts.md` — Header and Metadata type definitions
- `ark/go-v2/types/common/common.go` — Proven shared type implementations
- `ark/go-v2/types/artifacts/db.go` — Proven database access patterns
- `ark/go-v2/schema/artifacts.sql` — Proven schema
## Phase 2: Artifact Repository
Build the artifact pillar — the most mature and validated part of the design.
**Deliverables:**
- `artifacts` package: `Artifact`, `Snapshot`, `Blob`, `Citation`, `Publisher` types with `Get`/`Store` methods
- Tag and category management (shared pool, idempotent upserts)
- Content-addressable blob store (SHA256 hashing, hierarchical directory layout, read/write)
- YAML import for bootstrapping from existing artifact files
- Protobuf message definitions for all artifact types
- gRPC service: create/get/update/delete artifacts, store/retrieve blobs, manage tags and categories
**Key references:**
- `docs/KExocortex/Spec/Artifacts.md` — Canonical type definitions
- `ark/go-v2/types/artifacts/*.go` — Proven implementations of all artifact types
- `ark/go-v2/cmd/exo-repo/cmd/import.go` — Proven import flow
## Phase 3: CLI Tools
Build command-line tools that connect to exod via gRPC for scripting and administrative use.
**Deliverables:**
- `exo` CLI binary using Cobra (or similar)
- Commands: `import` (YAML artifacts), `tag` (add/list/delete), `cat` (add/list/delete), `search` (by tag, category, title, DOI)
- `exod` server binary with startup, shutdown, and configuration
**Key references:**
- `ark/go-v2/cmd/exo-repo/cmd/*.go` — Proven command structure (import, tags, cat)
## Phase 4: Knowledge Graph
Build the knowledge graph pillar — the less mature component requiring more design work.
**Deliverables:**
- `kg` package: `Node`, `Cell`, `Fact` types
- Database schema additions for knowledge graph tables (nodes, cells, facts, graph edges) in the unified SQLite database
- EAV tuple storage with transaction timestamps and retraction support
- Node-to-artifact linking (cross-pillar references)
- Cell content types (markdown, code, etc.)
- gRPC service: create/get/update nodes, add cells, record facts, traverse graph
- CLI commands for node creation and graph queries
**Key references:**
- `docs/KExocortex/KnowledgeGraph/Tuple.md` — EAV/Fact protobuf model
- `docs/KExocortex/KnowledgeGraph.md` — Graph structure design
- `docs/KExocortex/Taxonomy.md` — Note naming conventions (C2 wiki style)
- `docs/KExocortex/Elements.md` — Note and structure definitions
- `ark/go-v2/types/kg/` — Type stubs (Node, Cell)
## Phase 5: Desktop Application
Single Kotlin desktop app for both artifact management and knowledge graph interaction. Obsidian-style layout: tree/outline sidebar, contextual main panel, graph visualization, unified search.
**Deliverables (incremental):**
1. **App shell and sidebar** — gRPC client connecting to exod. Tree/outline sidebar showing knowledge graph hierarchy and an unlinked-artifacts section. Basic navigation.
2. **Artifact views** — Artifact detail panel (citation, snapshot history, blob preview). Import flow (file or URL → citation form → tags/categories). Catalog view for untagged/unlinked artifacts needing attention.
3. **Note editor** — Cell-based editor (markdown, code blocks). Ctrl+L note creation. Inline display of associated artifacts.
4. **Unified search** — Single search bar across both pillars. Selector prefixes for precision (`artifact:`, `note:`, `cell:`, `tag:`, `author:`, `doi:`). Fuzzy matching for partial recall.
5. **Graph view** — Visual node graph (toggle or separate pane, Obsidian-style). Exploration by traversing connections and discovering clusters.
6. **Command palette** — Ctrl+Shift+A for quick actions: create note, import artifact, search, switch views, manage tags.
7. **Presentation/export** — Export curated notes with associated artifacts to HTML or PDF.
**Key references:**
- `docs/KExocortex/UI.md` — Interaction patterns to adopt (IntelliJ action menu, Dendron Ctrl+L)
- `docs/KExocortex/Elements.md` — Interface definitions (query, exploration, presentation, update)
- `docs/KExocortex/About.md` — Litmus test: Camerata article retrieval with readable snapshot
- `docs/KExocortex/Taxonomy.md` — C2 wiki style node naming for sidebar hierarchy
## Phase 6: Remote Access & Backup
Enable remote capture and blob backup.
**Deliverables:**
- Minio sync queue in exod: async blob replication, retry on failure, restore from remote
- Tailscale reverse proxy configuration with TLS and HTTP basic auth
- Quick-capture endpoint: accept URL or document from mobile, stash in artifact repository for later categorization
- Cataloging view: list artifacts needing tags or node attachment
**Key references:**
- `docs/KExocortex/Spec.md` — Remote access architecture, mobile reading use case
- `docs/KExocortex/RDD/2022/02/23.md` — Original web server goal for URL/PDF stashing
- `docs/KExocortex/Agents.md` — Future agent integration via Tailscale
## Phase Dependencies
```
Phase 1: Foundation
Phase 2: Artifact Repository ──► Phase 3: CLI Tools
Phase 4: Knowledge Graph
Phase 5: Desktop Application
Phase 6: Remote Access & Backup
```
Phases 2 and 3 can overlap — CLI commands can be built as gRPC endpoints come online. Phase 5 can begin its Update facet once Phase 2 is complete, with remaining facets built as Phase 4 delivers.