ARCHITECTURE.md covers the system design: exod backend, single Kotlin desktop app (Obsidian-style), layered architecture, data flow, CAS blob store, cross-pillar integration, and key design decisions. PROJECT_PLAN.md defines six implementation phases from foundation through remote access, with concrete deliverables per phase. CLAUDE.md updated to reference both documents and reflect the single-app UI decision with unified search. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Architecture

Technical reference for kExocortex, a personal knowledge management system that combines an **artifact repository** (source documents) with a **knowledge graph** (notes and ideas) into a unified, searchable exocortex.

Core formula: **artifacts + notes + graph structure = exocortex**

## Tech Stack

| Role | Technology |
|------|-----------|
| Backend server, CLI tools | Go |
| Desktop application | Kotlin |
| Metadata storage | SQLite |
| Blob storage | Content-addressable store (SHA256) |
| Client-server communication | gRPC / Protobuf |
| Remote blob backup | Minio |
| Secure remote access | Tailscale |

## System Components

```
┌──────────────────┐       gRPC       ┌──────────────────────────┐
│  Kotlin Desktop  │◄════════════════►│                          │
│ (all UI facets)  │                  │                          │
└──────────────────┘                  │           exod           │
                                      │       (Go daemon)        │
┌──────────────────┐       gRPC       │                          │
│  CLI tools       │◄════════════════►│  sole owner of all data  │
│  (Go binaries)   │                  │                          │
└──────────────────┘                  └─────┬──────┬──────┬──────┘
                                            │      │      │
                              ┌─────────────┘      │      └─────────────┐
                              │                    │                    │
                        ┌─────▼──────┐      ┌──────▼───────┐     ┌──────▼──────┐
                        │  SQLite    │      │  Local Blob  │     │   Minio     │
                        │  Database  │      │  Store (CAS) │     │  (remote)   │
                        └────────────┘      └──────────────┘     └─────────────┘

Remote access:
┌────────┐   HTTPS   ┌─────────────────────┐  Tailscale  ┌──────┐
│ Mobile │──────────►│    Reverse Proxy    │────────────►│ exod │
│ Device │           │ (TLS + basic auth)  │             │      │
└────────┘           └─────────────────────┘             └──────┘
```

Three runtime components exist:

- **exod**: The Go backend daemon. Sole owner of the SQLite database and blob store. All reads and writes go through exod. No client accesses storage directly.
- **Kotlin desktop app**: A single application for both artifact management and knowledge graph interaction. Obsidian-style layout: tree/outline sidebar for navigation, contextual main panel, graph visualization, and unified search with selector prefixes. Connects to exod via gRPC.
- **CLI tools**: Go binaries for scripting, bulk operations, and administrative tasks. Also connect via gRPC.

## Layered Architecture

### Layer 1: Storage

Two storage mechanisms, separated by purpose:

**SQLite database** stores all metadata: everything that needs to be queried, filtered, or joined. This includes artifact headers, citations, tags, categories, publisher info, snapshot records, blob registry entries, and knowledge graph facts. A single unified database is used (rather than split databases) so that tags and categories are shared across both pillars.

**Content-addressable blob store** stores the actual artifact content (PDFs, images, web snapshots, etc.) on the local filesystem. Files are addressed by the SHA256 hash of their contents, stored in a hierarchical directory layout. This separation exists because blobs are large, opaque, and benefit from deduplication, while SQLite is not suited for large binary storage.

Together, the database and blob store form a single logical unit that must stay consistent.

### Layer 2: Domain Model

Three Go packages implement the data model:

**`core`**: Shared types used by both pillars.

- `Header` (ID, Type, Created, Modified, Categories, Tags, Meta)
- `Metadata` (map of string keys to typed `Value` structs)
- UUID generation

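
The shared types listed above could look roughly like the following. This is a minimal sketch, not the project's actual definitions: the Go type of every field, the two fields of `Value`, and the `NewUUID` and `NewHeader` helpers are assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// Value is a typed metadata value. The two string fields are an assumption;
// the document only says that values are typed.
type Value struct {
	Contents string
	Type     string
}

// Metadata maps string keys to typed values.
type Metadata map[string]Value

// Header carries the fields listed for the core package. The Go type chosen
// for each field is an assumption.
type Header struct {
	ID         string
	Type       string
	Created    time.Time
	Modified   time.Time
	Categories []string
	Tags       []string
	Meta       Metadata
}

// NewUUID returns a random (version 4) UUID in canonical 8-4-4-4-12 form.
func NewUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewHeader builds a header of the given type with a fresh ID and timestamps.
func NewHeader(kind string) (Header, error) {
	id, err := NewUUID()
	if err != nil {
		return Header{}, err
	}
	now := time.Now().UTC()
	return Header{ID: id, Type: kind, Created: now, Modified: now, Meta: Metadata{}}, nil
}

func main() {
	h, err := NewHeader("artifact")
	if err != nil {
		panic(err)
	}
	h.Tags = append(h.Tags, "databases")
	h.Meta["source"] = Value{Contents: "manual import", Type: "string"}
	fmt.Println(len(h.ID), h.Type, h.Tags)
}
```

Because both pillars embed the same `Header`, tags, categories, and metadata can be handled by one code path regardless of whether the owner is an artifact or a node.
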
**`artifacts`**: The artifact repository pillar. Key relationship chain:

```
Artifact ──► Snapshot(s) ──► Blob(s)
   │              │
   ▼              ▼
Citation     Citation (can override parent)
```

An Artifact has a type (Article, Book, URL, Paper, Video, Image, etc.), a history of Snapshots keyed by datetime, and a top-level Citation. Each Snapshot can have its own Citation that overrides or extends the artifact-level one (e.g., a specific edition of a book). Each Snapshot contains Blobs keyed by MIME type.

See `docs/KExocortex/Spec/Artifacts.md` for canonical type definitions.

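
To make the override relationship concrete, here is a hedged sketch of the Artifact, Snapshot, Blob chain. The field names, the `Citation` fields, and the "non-empty field wins" merge rule in `EffectiveCitation` are assumptions for illustration; the canonical definitions are the ones in the spec referenced above.

```go
package main

import (
	"fmt"
	"time"
)

// Citation fields are illustrative only.
type Citation struct {
	Title   string
	Authors []string
	Edition string
}

// Blob points at content held in the blob store.
type Blob struct {
	ID     string // SHA256 hex digest of the content
	Format string // MIME type
}

// Snapshot is one capture of an artifact. A nil Citation means "inherit".
type Snapshot struct {
	Citation *Citation
	Blobs    map[string]Blob // keyed by MIME type
}

// Artifact has a type, a top-level citation, and a history keyed by datetime.
type Artifact struct {
	Type     string
	Citation Citation
	History  map[time.Time]*Snapshot
}

// EffectiveCitation overlays the snapshot's non-empty citation fields on the
// artifact-level citation. It reports false if no snapshot exists at that time.
func (a *Artifact) EffectiveCitation(at time.Time) (Citation, bool) {
	snap, ok := a.History[at]
	if !ok {
		return Citation{}, false
	}
	c := a.Citation
	if o := snap.Citation; o != nil {
		if o.Title != "" {
			c.Title = o.Title
		}
		if len(o.Authors) > 0 {
			c.Authors = o.Authors
		}
		if o.Edition != "" {
			c.Edition = o.Edition
		}
	}
	return c, true
}

// exampleBook returns a book with two snapshots; the second names an edition.
func exampleBook() (*Artifact, time.Time, time.Time) {
	first := time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC)
	second := time.Date(2023, 6, 1, 0, 0, 0, 0, time.UTC)
	pdf := Blob{ID: "placeholder-hash", Format: "application/pdf"}
	book := &Artifact{
		Type:     "Book",
		Citation: Citation{Title: "Example Book", Authors: []string{"A. Author"}},
		History: map[time.Time]*Snapshot{
			first:  {Blobs: map[string]Blob{pdf.Format: pdf}},
			second: {Citation: &Citation{Edition: "2nd"}, Blobs: map[string]Blob{pdf.Format: pdf}},
		},
	}
	return book, first, second
}

func main() {
	book, first, second := exampleBook()
	c1, _ := book.EffectiveCitation(first)
	c2, _ := book.EffectiveCitation(second)
	fmt.Printf("%q %q\n", c1.Edition, c2.Edition)
}
```
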
**`kg`**: The knowledge graph pillar.

- **Node**: An entity in the graph, containing Cells
- **Cell**: A content unit within a note (markdown, code, etc.), inspired by Quiver's cell-based structure
- **Fact**: An entity-attribute-value tuple with a transaction timestamp and retraction flag, based on the protobuf model in `docs/KExocortex/KnowledgeGraph/Tuple.md`

Nodes are conceptually `Node = Note | ArtifactLink`: a node is either original analysis or a reference to an artifact.

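
One way to read the Fact model is as an append-only log: the current state of an attribute is whatever remains after replaying assertions and retractions in transaction order. The sketch below illustrates that reading. The field names and the `Current` helper are hypothetical and are not taken from the protobuf model.

```go
package main

import (
	"fmt"
	"sort"
)

// Fact is an entity-attribute-value tuple with a transaction timestamp and a
// retraction flag. Field names and types are assumptions.
type Fact struct {
	Entity    string
	Attribute string
	Value     string
	Tx        int64 // transaction timestamp, e.g. Unix nanoseconds
	Retracted bool
}

// Current replays facts in transaction order and returns the values still
// asserted for (entity, attribute), sorted for stable output.
func Current(facts []Fact, entity, attribute string) []string {
	ordered := make([]Fact, len(facts))
	copy(ordered, facts)
	sort.SliceStable(ordered, func(i, j int) bool { return ordered[i].Tx < ordered[j].Tx })

	live := map[string]bool{}
	for _, f := range ordered {
		if f.Entity != entity || f.Attribute != attribute {
			continue
		}
		if f.Retracted {
			delete(live, f.Value) // a retraction cancels an earlier assertion
		} else {
			live[f.Value] = true
		}
	}

	values := make([]string, 0, len(live))
	for v := range live {
		values = append(values, v)
	}
	sort.Strings(values)
	return values
}

func main() {
	log := []Fact{
		{Entity: "node-1", Attribute: "links-to", Value: "artifact-9", Tx: 1},
		{Entity: "node-1", Attribute: "links-to", Value: "node-2", Tx: 2},
		{Entity: "node-1", Attribute: "links-to", Value: "artifact-9", Tx: 3, Retracted: true},
	}
	fmt.Println(Current(log, "node-1", "links-to")) // prints [node-2]
}
```

Nothing is deleted from the log in this model, so earlier states remain recoverable by replaying only the facts up to a chosen transaction timestamp.
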
### Layer 3: Service

The `exod` gRPC server is the exclusive gateway to all data:

- Manages transaction boundaries (begin, commit/rollback)
- Handles blob lifecycle (hash content, write to CAS, register in SQLite, queue for Minio sync)
- Runs the Minio sync queue for asynchronous backup replication
- Exposes gRPC endpoints defined in `.proto` files for all CRUD operations on both pillars

### Layer 4: Presentation

A single Kotlin desktop application handles both artifact management and knowledge graph interaction, following the Obsidian model. CLI tools provide a scriptable alternative.

#### Desktop Application Layout

```
┌─────────────────────────────────────────────────────────────┐
│  [Command Palette: Ctrl+Shift+A]     [Search: Ctrl+F]       │
├──────────────┬──────────────────────────────────────────────┤
│              │                                              │
│  Sidebar     │  Main Panel                                  │
│              │                                              │
│  Tree/       │  Contextual view based on selection:         │
│  Outline     │  • Note editor (cell-based)                  │
│  View        │  • Artifact detail (citation, snapshots)     │
│              │  • Search results                            │
│              │  • Catalog (items needing attention)         │
│              │                                              │
├──────────────┴──────────────────────────────────────────────┤
│  [Graph View toggle]                                        │
└─────────────────────────────────────────────────────────────┘
```

**Sidebar**: Tree/outline view of the knowledge graph hierarchy as primary navigation. Artifacts appear under their linked nodes; unlinked artifacts appear in a dedicated section. Collapsible, like Obsidian's file explorer.

**Main panel**: Changes contextually.

- **Note view**: Cell-based editor (markdown, code blocks). Associated artifacts listed inline. Dendron-style Ctrl+L for note creation.
- **Artifact view**: Citation details, snapshot history, blob preview (PDF, HTML). Tag/category editing. Link to nodes.
- **Search view**: Unified results from both pillars. Selector prefixes for precision: `artifact:`, `note:`, `cell:`, `tag:`, `author:`, `doi:`.
- **Catalog view**: Surfaces untagged, uncategorized, or unlinked artifacts needing attention.

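
The selector prefixes in the search view suggest a small query grammar. The sketch below shows one possible parsing: whitespace-separated tokens, where a token of the form `prefix:value` with a known prefix becomes a filter and everything else is a free-text term. The `Query` type, the `ParseQuery` function, and the tokenization rules are assumptions; the document only lists the prefixes.

```go
package main

import (
	"fmt"
	"strings"
)

// selectors are the prefixes named for the search view.
var selectors = map[string]bool{
	"artifact": true, "note": true, "cell": true,
	"tag": true, "author": true, "doi": true,
}

// Query is a parsed search: filters grouped by selector, plus free-text terms.
type Query struct {
	Filters map[string][]string
	Terms   []string
}

// ParseQuery splits input on whitespace. Tokens with an unknown prefix or an
// empty value are kept as ordinary search terms.
func ParseQuery(input string) Query {
	q := Query{Filters: map[string][]string{}}
	for _, tok := range strings.Fields(input) {
		prefix, value, found := strings.Cut(tok, ":")
		if found && selectors[prefix] && value != "" {
			q.Filters[prefix] = append(q.Filters[prefix], value)
			continue
		}
		q.Terms = append(q.Terms, tok)
	}
	return q
}

func main() {
	q := ParseQuery("tag:databases author:codd relational model")
	fmt.Println(q.Filters["tag"], q.Filters["author"], q.Terms)
}
```

A parser along these lines could live in either the client or exod. Keeping it in exod would let the desktop app and the CLI `search` command share one grammar, though the document does not say where it belongs.
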
**Graph view**: Secondary visualization available as a toggle or separate pane, showing nodes and their connections (like Obsidian's graph view). Useful for exploration and discovering clusters.

**Command palette**: Ctrl+Shift+A (IntelliJ-style) for quick actions: create note, import artifact, search, switch views, manage tags.

#### CLI Tools

Go binaries connecting to exod via gRPC for automation, bulk operations, and scripting. Commands: `import`, `tag`, `cat`, `search`.

## Data Flow

### Importing an Artifact

1. Client sends artifact metadata (citation, tags, categories) and blob data to exod via gRPC
2. exod begins a database transaction
3. Tags and categories are created if they don't exist (idempotent upsert)
4. Publisher is resolved (lookup by name+address, create if missing)
5. Citation is stored with publisher FK and author records
6. Artifact header is stored with citation FK
7. For each snapshot: store snapshot record, then for each blob: compute SHA256, write file to CAS directory, insert blob record
8. History entries are recorded linking artifact to snapshots by datetime
9. Transaction commits
10. Blobs are queued for Minio sync

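
The ordering in the import steps matters: every database write happens inside one transaction, and blobs are queued for Minio only after the commit succeeds. The following sketch captures just that control flow. The `Tx` interface, the `Step` type, `RunImport`, and the fake transaction are hypothetical stand-ins, not exod's actual API.

```go
package main

import "fmt"

// Tx is a hypothetical stand-in for a database transaction handle.
type Tx interface {
	Commit() error
	Rollback() error
}

// Step is one unit of import work executed inside the transaction
// (upsert tags, resolve publisher, store citation, and so on).
type Step func(Tx) error

// RunImport runs steps in order inside tx. The first failure rolls the
// transaction back and skips afterCommit. afterCommit (for example, queueing
// blobs for Minio sync) runs only once the commit has succeeded.
func RunImport(tx Tx, steps []Step, afterCommit func()) error {
	for i, step := range steps {
		if err := step(tx); err != nil {
			if rbErr := tx.Rollback(); rbErr != nil {
				return fmt.Errorf("step %d: %v (rollback also failed: %v)", i+1, err, rbErr)
			}
			return fmt.Errorf("step %d: %w", i+1, err)
		}
	}
	if err := tx.Commit(); err != nil {
		return fmt.Errorf("commit: %w", err)
	}
	afterCommit()
	return nil
}

// fakeTx records which terminal call was made. It exists only for the demo.
type fakeTx struct{ Committed, RolledBack bool }

func (t *fakeTx) Commit() error   { t.Committed = true; return nil }
func (t *fakeTx) Rollback() error { t.RolledBack = true; return nil }

// okStep and failingStep simulate a successful and a failed import step.
func okStep(Tx) error      { return nil }
func failingStep(Tx) error { return fmt.Errorf("publisher lookup failed") }

func main() {
	queued := false
	tx := &fakeTx{}
	err := RunImport(tx, []Step{okStep, failingStep, okStep}, func() { queued = true })
	fmt.Println(err, tx.RolledBack, queued)
}
```

One consequence worth noting under this reading: a blob file written to the CAS directory in step 7 is not undone by a database rollback. Because blob paths are derived from content, a retry would write the same file to the same path, so such leftovers should be harmless.
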
### Querying by Tag

1. Client sends a tag string to exod
2. Tag name is resolved to its UUID via the `tags` table
3. The `artifact_tags` junction table is queried for matching artifact IDs
4. Full artifact headers are hydrated (citation, publisher, tags, categories, metadata)
5. Results are returned; blob data is not fetched until explicitly requested

### Creating a Knowledge Graph Note

1. Client sends node metadata and cell contents
2. exod creates a Node with a UUID
3. Cells are stored with their content type (markdown, code, etc.)
4. Facts are recorded as EAV tuples linking the node to attributes, other nodes, and artifacts
5. Tags from the note content are cross-referenced with the shared tag pool

## Content-Addressable Store

- **Addressing**: SHA256 hash of blob contents, rendered as a hex string
- **Directory layout**: Hash split into 4-character segments as nested directories (e.g., `a1b2c3d4...` → `a1b2/c3d4/.../a1b2c3d4...`)
- **Deduplication**: Identical content from different snapshots shares the same blob (same hash, same file)
- **Registry**: The `blobs` table in SQLite stores `(snapshot_id, blob_id, format)` where `blob_id` is the SHA256 hash
- **Backup**: Minio sync queue replicates blobs to remote S3-compatible storage asynchronously
- **Retrieval**: An optional HTTP endpoint (`GET /artifacts/blob/{id}`) may be added for direct blob access

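
The addressing, layout, and deduplication rules above can be sketched with the Go standard library alone. The function names (`BlobID`, `BlobPath`, `Put`) are hypothetical, and two details are assumptions: that every 4-character segment of the hash becomes a directory level, and that the file itself is named by the full hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// BlobID returns the SHA256 digest of data as a lowercase hex string.
func BlobID(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// BlobPath maps a hex blob ID to root/<seg>/<seg>/.../<id>, where each
// segment is four characters of the ID (an assumed reading of the layout).
func BlobPath(root, id string) string {
	parts := []string{root}
	for i := 0; i+4 <= len(id); i += 4 {
		parts = append(parts, id[i:i+4])
	}
	parts = append(parts, id)
	return filepath.Join(parts...)
}

// Put stores data under its content address and returns the blob ID.
// If a blob with the same ID already exists, nothing is written.
func Put(root string, data []byte) (string, error) {
	id := BlobID(data)
	path := BlobPath(root, id)
	if _, err := os.Stat(path); err == nil {
		return id, nil // identical content is already stored
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return "", err
	}
	// Write to a temporary name, then rename, so readers never see a
	// partially written blob.
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return "", err
	}
	if err := os.Rename(tmp, path); err != nil {
		return "", err
	}
	return id, nil
}

func main() {
	root, err := os.MkdirTemp("", "cas-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	first, err := Put(root, []byte("hello"))
	if err != nil {
		panic(err)
	}
	second, err := Put(root, []byte("hello"))
	if err != nil {
		panic(err)
	}
	fmt.Println(first == second) // same content, same blob ID
	fmt.Println(BlobPath("store", first))
}
```

Since the path is a pure function of the content, deduplication needs no lookup table: the existence check on the derived path is enough.
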
## Cross-Pillar Integration

Four mechanisms connect the two pillars and make kExocortex more than the sum of its parts:

- **Shared taxonomy**: Tags and categories exist in a single pool used by both artifacts and knowledge graph nodes. This enables cross-pillar queries: "show me everything tagged X."
- **Node-to-artifact links**: Knowledge graph nodes can reference artifacts by ID, so the graph contains both original analysis and source material references.
- **Shared metadata**: The polymorphic `metadata` table uses the owner's UUID as a foreign key, attaching key-value metadata to any object in either pillar.
- **Cell-artifact bridging**: A Cell within a note can embed references to artifacts, linking prose analysis directly to source material.

## Network & Access

- **Local-first**: exod, the database, and the blob store all live on the local filesystem. Full functionality requires no network.
- **Tailscale reverse proxy**: For remote/mobile access. TLS and HTTP basic auth terminate at the proxy, not at exod.
- **Minio backup**: Blob replication to remote S3-compatible storage, managed by an async sync queue in exod. This is a backup/restore mechanism, not a primary access path.

## Key Design Decisions

| Decision | Alternative | Rationale |
|----------|-------------|-----------|
| Single unified SQLite database | Split databases per pillar | Shared tag/category pool, single transaction scope, simpler backup. Because exod is the only process that opens the database, SQLite locking concerns between clients do not arise. |
| Content-addressable blob store | Store blobs in SQLite | Blobs can be arbitrarily large (PDFs, videos). CAS provides deduplication. SQLite isn't designed for large binary storage. |
| gRPC / Protobuf | REST / JSON | Typed contracts, efficient binary serialization, bidirectional streaming for future use (e.g., upload progress). |
| Kotlin desktop app | Web frontend | Desktop-native performance for large document collections. Offline-capable. No browser dependency. |
| SQLite | PostgreSQL | Zero ops cost, single-file backup, embedded in server process. Single-user system doesn't need concurrent write scaling. |