Restart with Claude

2026-03-20 11:54:56 -07:00
parent 02a6356158
commit ea32237279
27 changed files with 1400 additions and 83 deletions
--- a/docs/exocortex.org
+++ b/docs/exocortex.org
@@ -0,0 +1,275 @@
+#+TITLE: On Exocortices
+#+AUTHOR: Kyle Isom
+
+* Document history
+  + [2021-02-10 Wed] first draft
+
+* Background
+
+  An exocortex is [[https://en.wiktionary.org/wiki/exocortex]["a
+  hypothetical artificial information-processing system that would
+  augment a brain's biological cognitive processes."]] I have made
+  many attempts at building my own, including
+
+  + A web-based wiki (including my own custom solution, [[https://github.com/jgm/gitit][gitit]],
+    [[https://www.mediawiki.org/wiki/MediaWiki][MediaWiki]], and others)
+  + Org-mode based notes, including my current =notes/notes.org= system
+    (with subdirectories for other things such as book notes)
+  + [[https://evernote.com/][Evernote]] / [[https://www.notion.so/][Notion.so]]
+  + The [[https://happenapps.com/][Quiver MacOS app]]
+  + Experimenting in building custom exocortex software (e.g. kortex)
+  + A daily weblog (e.g. the old ai6ua.net site) and gemlog to
+    summarize important knowledge gained that day.
+
+  Each of these has its own shortcomings, and none quite matches
+  my expectations or desires. An exocortex must be a personalized
+  system adapted to its user to maximise knowledge capture.
+
+  Succinctly put, the goal of an exocortex is to collect artifacts and
+  notes (including daily notes), organize them, and allow for written
+  summaries of current snapshots of my knowledge. Put another way,
+  /artifacts + notes + graph structure = exocortex/. Note that a folder
+  hierarchy is a tree, which is a form of directed graph. Symlinks
+  inside a folder act as edges to notes outside of that folder,
+  refining the graph structure.
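+
+  As a minimal sketch of the folder-as-graph idea, the following
+  Python walks a notes directory and records one edge per directory
+  entry, with symlinks resolved to the note they point at. The name
+  =build_graph= and the choice of relative paths as node labels are
+  assumptions made for illustration; they are not part of any
+  existing tool.

```python
import os
from pathlib import Path


def build_graph(root: Path) -> dict[str, set[str]]:
    """Map each directory (relative to root) to the nodes it points to.

    Hypothetical helper: ordinary entries become tree edges, and a
    symlink becomes an edge to its resolved target.
    """
    root = root.resolve()
    graph: dict[str, set[str]] = {}
    # os.walk does not descend into symlinked directories by default,
    # so a symlink is only ever recorded as an edge, never traversed.
    for dirpath, dirnames, filenames in os.walk(root):
        parent = Path(dirpath)
        edges = graph.setdefault(parent.relative_to(root).as_posix(), set())
        for name in dirnames + filenames:
            child = parent / name
            target = child.resolve() if child.is_symlink() else child
            try:
                label = target.relative_to(root).as_posix()
            except ValueError:
                # The symlink points outside the notes tree; keep the
                # absolute path as the node label.
                label = target.as_posix()
            edges.add(label)
    return graph
```

+
+  With a =math/= folder containing a symlink to
+  =crypto/notes.org=, both =math= and =crypto= end up with an edge to
+  the same note, which is the cross-link the symlink was meant to
+  express.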
+
+  This writeup is an attempt at characterising and exploring the
+  exocortex problem space to capture my goals, serve as a foundation
+  for the construction of such a system, and, through discussion of
+  the problem space, tease out the structure of the problem to
+  discover a closer approximation to the idealized reality of an
+  exocortex system.
+
+* The elements of exocortices
+
+  The elements of an exocortex, briefly touched on above and expanded
+  below, include
+
+  + artifacts,
+  + the artifact repository,
+  + notes,
+  + structure,
+  + a query interface,
+  + an exploratory interface,
+  + a presentation interface,
+  + an update interface, and
+  + locality.
+
+** Artifacts
+
+   An artifact is any object that is not a textual writeup by me that
+   should be referenceable as part of the exocortex. A copy of a paper
+   from arXiv might serve as an artifact. Importantly, artifacts must
+   be locally available. They serve as a snapshot of some source of
+   knowledge, and should not be subject to link decay, future
+   pay-walling (or loss of access to a pay-walled system), or loss of
+   connectivity. An artifact should be timestamped: when was it
+   captured? When was the artifact created upstream? An artifact must
+   also have some associated upstream information --- how did it come
+   to be in the repository?
+
+** The artifact repository
+
+   An artifact may be relevant to more than one field of interest;
+   accordingly, all artifacts should exist in a central
+   repository. This repository should support artifact histories
+   (e.g. collecting updates to artifacts, where the history is
+   important in capturing a historical view of knowledge), multiple
+   formats (a book may exist in PDF, EPUB, or other formats), and a
+   mechanism for exploring, finding, and updating docs. The repository
+   must capture relevant metadata about each artifact.
+
+** Notes
+
+   A note is a written summary of a certain field. It should be in
+   some rich-text format that supports linking as well as basic
+   formatting. The ideal text format appears to be the org-mode format
+   given its rich formatting and ability to transition fluidly between
+   outline and full document; however, this may not be the final, most
+   effective format. A note is the distillation of artifacts into an
+   understandable form, providing avenues to discover specifics that
+   may need to be held in working memory only briefly.
+
+** Structure
+
+   A structured format allows for fast and efficient knowledge
+   lookups. It grants the researcher a starting place with a set of
+   rules governing where and how things may be found. It imposes order
+   over chaos such that relevant kernels of knowledge may be retrieved
+   and examined in an expedient manner. The metaphor that humans seem
+   to adapt to most readily is a graph structure, particularly one
+   that is generally hierarchical in nature.
+
+** A query interface
+
+   The exocortex and the artifact repository both require a query
+   interface; they may be part of the same UI. A query UI allows a
+   researcher to pose questions of the exocortex, directly looking for
+   specific knowledge.
+
+   The four interfaces (query, exploration, presentation, and update)
+   may all be facets of the same interface, and they may benefit from
+   a cohesive and unified interface; however, it is important that all
+   of these use cases are considered and supported.
+
+** An exploratory interface
+
+   The exploratory interface allows a researcher to meander through
+   the knowledge store, exploring topics and potentially identifying
+   new areas to push the knowledge sphere out further.
+
+** A presentation interface
+
+   The presentation interface allows a set of notes to be shared with
+   others; it should be possible to include some or all artifacts
+   associated with these notes. For example, it may not be appropriate
+   to share a copy of a book with the presentation, but it may be
+   appropriate to share a copy of some of the supporting papers.
+
+** An update interface
+
+   The update interface is where knowledge is added to the exocortex,
+   whether through capturing an artifact or writing notes.
+
+** Locality
+
+   An exocortex must be localized to the user, with the full
+   repository available offline. Quick input or scratch-pad notes
+   might be available remotely, but realistically, the cost of cloud
+   storage and the transfer sizes mean that having the full exocortex
+   available remotely is unlikely. Instead, a hybrid model that
+   combines remotely available quick capture with a full exocortex on
+   a local system is probably the best solution.
+
+* Exploring the problem space
+
+  In order to map out the structure of an exocortex, it's useful to
+  review the alternatives I have tried. For each one, I consider what
+  worked and what didn't, to clarify what an effective exocortex
+  looks like.
+
+** Git-backed wikis and plaintext folders
+
+   At a high level, wikis like Gitit and folders of plain-text
+   (including org-mode) data are roughly equivalent; the differences
+   lie primarily in how they are presented. Neither approach works
+   well for indexing or organizing artifacts, although some additions
+   are possible, like a scanner that adds notes to a SQLite database
+   (for improved search performance).
+
+   Using a folder of org-mode notes is probably one of the better
+   note-taking interfaces that I have found; however, there is no
+   notion of an artifact repository without considerable manual work.
+
+   The main downsides to this approach are the lack of good query and
+   exploration UIs, along with the lack of a useful artifact
+   repository. The upsides are good update and presentation
+   interfaces.
+
+** Evernote and Notion
+
+   Evernote and Notion both provide a unified, searchable interface
+   across multiple machines. Evernote in particular has a usable
+   artifact repository, although it does not record information about
+   upstream sources or other metadata about the object, and it has no
+   notion of multiple formats or history.
+
+   Evernote is a paid service, and neither is particularly extensible
+   to a user's needs. Exploring the exocortex is difficult, as there's
+   no notion of an entry point. Presenting nodes is met with some
+   success, albeit limited.
+
+** Quiver
+
+   Quiver is an excellent note-taking application; however, it is
+   MacOS-only. It does have some ability to import web pages, but in
+   general it lacks any idea of an artifact repository. The ability to
+   intersperse different cell types is good.
+
+** Jupyter notebooks
+
+   Jupyter notebooks provide an excellent interface for interspersing
+   computational ideas with prose; there is no notion of an artifact
+   repository, however. Linking notebooks isn't supported, and there
+   is no overall structure besides manual hyperlinking and a directory
+   structure.
+
+* The artifact repository
+
+  The artifact repository is one of the two pillars of the exocortex;
+  it stores the "first hand" sources of knowledge.
+
+** The central index
+
+  The first part of an artifact repository is a central index that
+  provides
+
+  + references and linking to artifacts,
+  + a "blob" store that contains the artifacts, and
+  + some management interface that allows adding and editing metadata
+    as well as adding artifacts.
+
+  An artifact entry in the index contains, at a minimum,
+
+  + An artifact identifier
+  + Authorship information
+
+  The artifact identifier is used to associate all related artifacts
+  (e.g. previous revisions, different formats, etc.).
+
+** Artifacts
+
+   An artifact consists of multiple components:
+
+   + A primary metadata entry that organizes artifacts
+   + Pointers to artifact "blobs"
+   + A historical record of changed blobs
+
+   The metadata header for an artifact should contain, at a minimum,
+   fields for
+
+   + Artifact identifier
+   + A list of revisions
+
+   Each artifact can have zero or more blobs associated. For example,
+   a physical book reference might not have a blob associated; an
+   ebook might have multiple blobs corresponding to different formats;
+   and a webpage snapshot may have multiple blobs representing
+   revisions to the page.
+
+   A blob header stores
+
+   + The artifact identifier
+   + The date retrieved or stored
+   + The date of the artifact itself
+   + The source
+   + Blob type information (e.g. a MIME type)
+   + A list of categories
+   + A list of tags
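+
+   The two headers can be sketched as plain Python dataclasses. This
+   is only one possible reading of the field lists above: the class
+   and field names are hypothetical, and modelling the "list of
+   revisions" as a list of blob headers is an assumption rather than
+   a settled design.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class BlobHeader:
    """Metadata for one stored blob (names are illustrative only)."""
    artifact_id: str                    # the artifact identifier
    stored_at: datetime                 # the date retrieved or stored
    artifact_date: Optional[datetime]   # the date of the artifact itself
    source: str                         # where the blob came from
    media_type: str                     # blob type information, e.g. a MIME type
    categories: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)


@dataclass
class ArtifactHeader:
    """Primary metadata entry tying an artifact's blobs together."""
    artifact_id: str
    # Assumption: each revision is represented by its blob header.
    revisions: list[BlobHeader] = field(default_factory=list)

    def latest(self) -> Optional[BlobHeader]:
        """Return the most recently stored blob header, if any."""
        return max(self.revisions, key=lambda b: b.stored_at, default=None)
```

+
+   An artifact with no blobs, such as a physical book reference, is
+   simply an =ArtifactHeader= with an empty =revisions= list, for
+   which =latest()= returns =None=.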
+
+  The headers should probably be stored in a database of some kind;
+  SQLite is a good choice for the first iteration. Blobs themselves
+  will need to be stored on disk, probably under a path derived from
+  a hash of the blob contents, as in a [[https://en.wikipedia.org/wiki/Content-addressable_storage][content-addressable store]]
+  (CAS).
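+
+  A rough sketch of how these two pieces might fit together is
+  below, using only the Python standard library. The table layout,
+  the =put_blob= and =blob_path= names, the choice of SHA-256, and
+  the two-character directory fan-out are all assumptions made for
+  illustration; categories and tags are left out to keep it short.

```python
import hashlib
import sqlite3
from pathlib import Path

# Hypothetical schema: one row per (blob, artifact) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS blobs (
    digest      TEXT NOT NULL,
    artifact_id TEXT NOT NULL,
    stored_at   TEXT NOT NULL,
    source      TEXT NOT NULL,
    media_type  TEXT NOT NULL,
    PRIMARY KEY (digest, artifact_id)
);
"""


def blob_path(store: Path, digest: str) -> Path:
    """Derive the on-disk location of a blob from its content hash."""
    return store / digest[:2] / digest[2:]


def put_blob(db: sqlite3.Connection, store: Path, artifact_id: str,
             data: bytes, source: str, media_type: str,
             stored_at: str) -> str:
    """Write data into the store and record its header; return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    path = blob_path(store, digest)
    if not path.exists():
        # Identical contents hash to the same path, so each distinct
        # blob is written to disk only once.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    db.execute(
        "INSERT OR IGNORE INTO blobs VALUES (?, ?, ?, ?, ?)",
        (digest, artifact_id, stored_at, source, media_type),
    )
    db.commit()
    return digest
```

+
+  Because the path depends only on the contents, storing the same
+  blob twice is harmless, and a blob can be checked for corruption
+  by re-hashing it and comparing against its own file name.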
+
+* The exocortex
+
+  The exocortex consists of a graph database that links notes. At a
+  broad level, it should probably start with a root node that points
+  to broad fields. The update interface should allow manipulation of
+  nodes as graph nodes in addition to allowing for adding and editing
+  notes. A node might be thought of as =type node = Note |
+  ArtifactLink=. That is, a note can link to other notes or to
+  artifacts. A proper node title is the sum of the paths. For example,
+  consider the following structure:
+
+
+
+
+* Next steps
+
+  A first step is to start constructing an artifact repository. Once
+  this is in place, a suitable graph database (for example, [[https://github.com/cayleygraph/cayley][cayley]])
+  should be identified, and an exocortex core developed. User
+  interfaces will necessarily be developed alongside these systems.