Restart with Claude
275
docs/exocortex.org
Normal file
@@ -0,0 +1,275 @@
#+TITLE: On Exocortices
#+AUTHOR: Kyle Isom

* Document history
+ [2021-02-10 Wed] first draft

* Background

An exocortex is [[https://en.wiktionary.org/wiki/exocortex]["a
hypothetical artificial information-processing system that would
augment a brain's biological cognitive processes."]] I have made
many attempts at building my own, including

+ A web-based wiki (including my own custom solution, [[https://github.com/jgm/gitit][gitit]],
  [[https://www.mediawiki.org/wiki/MediaWiki][MediaWiki]], and others)
+ Org-mode based notes, including my current =notes/notes.org= system
  (with subdirectories for other things such as book notes)
+ [[https://evernote.com/][Evernote]] / [[https://www.notion.so/][Notion.so]]
+ The [[https://happenapps.com/][Quiver macOS app]]
+ Experiments in building custom exocortex software (e.g. kortex)
+ A daily weblog (e.g. the old ai6ua.net site) and gemlog to
  summarize important knowledge gained that day.

Each of these has its own shortcomings, and none quite matches my
expectations or desires. An exocortex must be a personalized
system adapted to its user to maximise knowledge capture.

Succinctly put, the goal of an exocortex is to collect artifacts and
notes (including daily notes), organize them, and allow for written
summaries of current snapshots of my knowledge. Put another way,
/artifacts + notes + graph structure = exocortex/. Note that a folder
hierarchy is a tree, which is a form of directed graph. Symlinks
inside a folder act as edges to notes outside of that folder,
refining the graph structure.
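
The folder-as-graph idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name =build_graph=, the =demo= helper, and the example folder names are hypothetical, and the sketch treats every directory as a node whose edges are its children, with symlinks resolved so that they point at the real note.

```python
import os
import tempfile
from pathlib import Path


def build_graph(root: Path) -> dict[str, set[str]]:
    """Map each directory (relative to root) to the entries it points at.

    Ordinary children are tree edges. A symlink becomes an edge to its
    resolved target, which may live outside the containing folder.
    """
    root = root.resolve()
    edges: dict[str, set[str]] = {}
    # os.walk does not descend into symlinked directories by default,
    # so links are recorded as edges rather than traversed.
    for dirpath, dirnames, filenames in os.walk(root):
        node = os.path.relpath(dirpath, root)
        targets = edges.setdefault(node, set())
        for name in dirnames + filenames:
            entry = os.path.join(dirpath, name)
            if os.path.islink(entry):
                # Follow the link so the edge points at the real note.
                entry = os.path.realpath(entry)
            targets.add(os.path.relpath(entry, root))
    return edges


def demo() -> dict[str, set[str]]:
    """Build a tiny hypothetical notes tree and return its graph."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "crypto").mkdir()
        (root / "math").mkdir()
        note = root / "math" / "number-theory.org"
        note.write_text("* Number theory\n", encoding="utf-8")
        # The crypto folder links to a note that lives under math/.
        (root / "crypto" / "number-theory.org").symlink_to(note)
        return build_graph(root)
```

In the demo, both the =crypto= and =math= nodes end up with an edge to the same note, which is the refinement that symlinks add on top of the plain tree.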

This writeup is an attempt at characterising and exploring the
exocortex problem space to capture my goals, serve as a foundation
for the construction of such a system, and, through discussion of
the problem space, tease out the structure of the problem to
discover a closer approximation to the idealized reality of an
exocortex system.

* The elements of exocortices

The elements of an exocortex, briefly touched on above and expanded
below, include

+ artifacts,
+ the artifact repository,
+ notes,
+ structure,
+ a query interface,
+ an exploratory interface,
+ a presentation interface,
+ an update interface, and
+ locality.

** Artifacts

An artifact is any object that is not a textual writeup by me that
should be referenceable as part of the exocortex. A copy of a paper
from arXiv might serve as an artifact. Importantly, artifacts must
be locally available. They serve as a snapshot of some source of
knowledge, and should not be subject to link decay, future
pay-walling (or loss of access to a pay-walled system), or loss of
connectivity. An artifact should be timestamped: when was it
captured? When was the artifact created upstream? An artifact must
also have some associated upstream information: how did it come
to be in the repository?

** The artifact repository

An artifact may be relevant to more than one field of interest;
accordingly, all artifacts should exist in a central
repository. This repository should support artifact histories
(e.g. collecting updates to artifacts, where the history is
important in capturing a historical view of knowledge), multiple
formats (a book may exist in PDF, EPUB, or other formats), and a
mechanism for exploring, finding, and updating artifacts. The
repository must capture relevant metadata about each artifact.

** Notes

A note is a written summary of a certain field. It should be in
some rich-text format that supports linking as well as basic
formatting. The ideal text format appears to be the org-mode format
given its rich formatting and ability to transition fluidly between
outline and full document; however, this may not be the final, most
effective format. A note is the distillation of artifacts into an
understandable form, providing avenues to discover specifics that
may need to be held in working memory only briefly.

** Structure

A structured format allows for fast and efficient knowledge
lookups. It grants the researcher a starting place with a set of
rules governing where and how things may be found. It imposes order
over chaos such that relevant kernels of knowledge may be retrieved
and examined in an expedient manner. The metaphor that humans seem
to adapt to the most readily is a graph structure, particularly
one that is generally hierarchical in nature.

** A query interface

The exocortex and the artifact repository both require a query
interface; they may be part of the same UI. A query UI allows a
researcher to pose questions of the exocortex, directly looking for
specific knowledge.

The four interfaces (query, exploration, presentation, and update)
may all be facets of the same interface, and they may benefit from
a cohesive and unified interface; however, it is important that all
of these use cases are considered and supported.

** An exploratory interface

The exploratory interface allows a researcher to meander through
the knowledge store, exploring topics and potentially identifying
new areas to push the knowledge sphere out further.

** A presentation interface

The presentation interface allows a set of notes to be shared with
others; it should be possible to include some or all artifacts
associated with these notes. For example, it may not be appropriate
to share a copy of a book with the presentation, but it may be
appropriate to share a copy of some of the supporting papers.

** An update interface

The update interface is where knowledge is added to the exocortex,
whether through capturing an artifact or writing notes.

** Locality

An exocortex must be localized to the user, with the full
repository available offline. Quick input or scratch pad notes
might be available remotely, but realistically, the cost of cloud
storage and the transfer sizes mean that having the full exocortex
available remotely is unlikely. Instead, a hybrid model that
combines quick captures of knowledge, available remotely, with a
full exocortex on a local system is probably the best solution.

* Exploring the problem space

In order to map out the structure of an exocortex, it's useful to
review what has worked and what hasn't. For each alternative
presented, I will consider what worked and what didn't to clarify
what an effective exocortex looks like.

** Git-backed wikis and plaintext folders

At a high level, wikis like Gitit and folders of plain-text
(including org-mode) data are roughly equivalent; the differences
lie primarily in how they are presented. Neither approach works
well for indexing or organizing artifacts, although some
approaches can help, such as a scanner that adds notes to a SQLite
database (for improved search performance).
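
As a rough idea of what such a scanner might look like, here is a minimal Python sketch using the standard =sqlite3= module. The table layout and the function names =index_notes= and =search= are my own assumptions for illustration; the search is a plain case-sensitive substring match rather than a real full-text index.

```python
import sqlite3
from pathlib import Path


def index_notes(db: sqlite3.Connection, notes_dir: Path) -> int:
    """Scan notes_dir for .org files and (re)load them into SQLite.

    Returns the number of notes indexed.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "path TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    count = 0
    for path in sorted(notes_dir.rglob("*.org")):
        body = path.read_text(encoding="utf-8")
        # INSERT OR REPLACE keeps a rescan of the same folder idempotent.
        db.execute(
            "INSERT OR REPLACE INTO notes (path, body) VALUES (?, ?)",
            (path.relative_to(notes_dir).as_posix(), body),
        )
        count += 1
    db.commit()
    return count


def search(db: sqlite3.Connection, term: str) -> list[str]:
    """Return paths of notes whose body contains term, sorted by path."""
    rows = db.execute(
        "SELECT path FROM notes WHERE instr(body, ?) > 0 ORDER BY path",
        (term,),
    )
    return [row[0] for row in rows]
```

Calling =index_notes(sqlite3.connect("notes.db"), Path("notes"))= followed by =search(db, "lattice")= would be the typical use; the file and folder names here are placeholders.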

Using a folder of org-mode notes is probably one of the better
note-taking interfaces that I have found; however, there is no
notion of an artifact repository without considerable manual work.

The main downsides to this approach are the lack of good query and
exploration UIs, along with the lack of a useful artifact
repository. The upsides are good update and presentation
interfaces.

** Evernote and Notion

Evernote (and also Notion) provides a unified, searchable interface
across multiple machines. Evernote in particular has a usable
artifact repository, although information about upstream sources
isn't available, nor is metadata about the object or any idea of
multiple formats and history.

Evernote is a paid service, and neither is particularly extensible
to a user's needs. Exploring the exocortex is difficult, as there's
no notion of an entry point. Presenting nodes is met with some
success, albeit limited.

** Quiver

Quiver is an excellent note-taking application; however, it is
macOS-only. It does have some ability to import web pages, but in
general it lacks any idea of an artifact repository. The ability to
intersperse different cell types is good.

** Jupyter notebooks

Jupyter notebooks provide an excellent interface for interspersing
computational ideas with prose; there is no notion of an artifact
repository, however. Linking notebooks isn't supported, and there
is no overall structure besides manual hyperlinking and a directory
structure.

* The artifact repository

The artifact repository is one of the two pillars of the exocortex;
it stores the "first hand" sources of knowledge.

** The central index

The first part of an artifact repository is a central index that
provides

+ references and linking to artifacts,
+ a "blob" store that contains the artifacts, and
+ some management interface that allows adding and editing metadata
  as well as adding artifacts.

An artifact entry in the index contains, at a minimum,

+ An artifact identifier
+ Authorship information

The artifact identifier is used to associate all related artifacts
(e.g. previous revisions, different formats, etc.).
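
A minimal index along these lines could be sketched with SQLite from Python. Everything concrete here is an assumption for the sake of the sketch: the table and column names, the use of random UUIDs as artifact identifiers, and the helper names =open_index=, =add_artifact=, and =authors_of=.

```python
import sqlite3
import uuid

# Hypothetical schema: one row per artifact, plus one row per author.
SCHEMA = """
CREATE TABLE IF NOT EXISTS artifacts (
    id TEXT PRIMARY KEY              -- artifact identifier
);
CREATE TABLE IF NOT EXISTS authors (
    artifact_id TEXT NOT NULL REFERENCES artifacts(id),
    name        TEXT NOT NULL,       -- authorship information
    PRIMARY KEY (artifact_id, name)
);
"""


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and if needed create) the index database."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_artifact(db: sqlite3.Connection, authors: list[str]) -> str:
    """Register a new artifact and return its identifier."""
    artifact_id = str(uuid.uuid4())
    db.execute("INSERT INTO artifacts (id) VALUES (?)", (artifact_id,))
    db.executemany(
        "INSERT INTO authors (artifact_id, name) VALUES (?, ?)",
        [(artifact_id, name) for name in authors],
    )
    db.commit()
    return artifact_id


def authors_of(db: sqlite3.Connection, artifact_id: str) -> list[str]:
    """Return the authors recorded for an artifact, sorted by name."""
    rows = db.execute(
        "SELECT name FROM authors WHERE artifact_id = ? ORDER BY name",
        (artifact_id,),
    )
    return [row[0] for row in rows]
```

Because revisions and alternate formats would all carry the same identifier, anything that references an artifact only needs to store that one value.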

** Artifacts

An artifact consists of multiple components:

+ A primary metadata entry that organizes artifacts
+ Pointers to artifact "blobs"
+ A historical record of changed blobs

The metadata header for an artifact should contain, at a minimum,
fields for

+ Artifact identifier
+ A list of revisions

Each artifact can have zero or more blobs associated. For example,
a physical book reference might not have a blob associated; an
ebook might have multiple blobs corresponding to different formats;
and a webpage snapshot may have multiple blobs representing
revisions to the page.

A blob header stores

+ The artifact identifier
+ The date retrieved or stored
+ The date of the artifact itself
+ The source
+ Blob type information (e.g. a MIME type)
+ A list of categories
+ A list of tags

The headers should probably be stored in a database of some kind;
SQLite is a good candidate for the first iteration. Blobs themselves
will need to be stored on disk, probably in a layout derived from a
hash of the blob contents, such as in a [[https://en.wikipedia.org/wiki/Content-addressable_storage][content-addressable store]]
(CAS).
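
To make the blob side concrete, here is a small Python sketch of a header record and a content-addressed store. The choices in it are assumptions, not decisions: SHA-256 as the hash, a two-character fan-out directory, the =blob_id= field on the header, and the class names =BlobHeader= and =BlobStore=.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Optional


@dataclass
class BlobHeader:
    """The header fields listed above, plus a pointer to the blob."""

    artifact_id: str
    retrieved: datetime                # date retrieved or stored
    artifact_date: Optional[datetime]  # date of the artifact itself
    source: str
    mime_type: str
    categories: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    blob_id: str = ""                  # assumption: hex SHA-256 of contents


class BlobStore:
    """Store blobs on disk under a path derived from their hash."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _path(self, blob_id: str) -> Path:
        # Fan out on the first two hex digits to keep directories small.
        return self.root / blob_id[:2] / blob_id[2:]

    def put(self, data: bytes) -> str:
        """Write data if it is not already stored and return its ID."""
        blob_id = hashlib.sha256(data).hexdigest()
        path = self._path(blob_id)
        if not path.exists():
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(data)
        return blob_id

    def get(self, blob_id: str) -> bytes:
        """Read a blob back, checking that it still matches its ID."""
        data = self._path(blob_id).read_bytes()
        if hashlib.sha256(data).hexdigest() != blob_id:
            raise ValueError(f"blob {blob_id} is corrupt")
        return data
```

One consequence of addressing by content is that storing the same bytes twice is free, and a changed webpage snapshot naturally becomes a new blob under the same artifact identifier.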

* The exocortex

The exocortex consists of a graph database that links notes. At a
broad level, it should probably start with a root node that points
to broad fields. The update interface should allow manipulation of
nodes as graph nodes in addition to allowing for adding and editing
notes. A node might be thought of as =type node = Note | ArtifactLink=.
That is, a note can link to other notes or to
artifacts. A proper node title is the sum of the paths. For example,
consider the following structure:
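
As an illustration only, here is a hypothetical structure sketched in Python. The =Note= and =ArtifactLink= names come from the type above; the field names, the example node names, and the =titles= helper are assumptions. The sketch reads "the sum of the paths" as the node names along the path from the root joined with a slash, which is one possible interpretation, and it assumes the structure is a tree (no cycles).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


@dataclass
class ArtifactLink:
    """A leaf that points into the artifact repository."""

    artifact_id: str


@dataclass
class Note:
    """A note that may link to other notes or to artifacts."""

    name: str
    children: list[Node] = field(default_factory=list)


# type node = Note | ArtifactLink
Node = Union[Note, ArtifactLink]


def titles(node: Note, prefix: tuple[str, ...] = ()) -> list[str]:
    """List the full title of every note reachable from node.

    Artifact links are skipped: they have no title of their own here.
    """
    path = prefix + (node.name,)
    found = ["/".join(path)]
    for child in node.children:
        if isinstance(child, Note):
            found.extend(titles(child, path))
    return found


# A made-up tree: a root pointing at two broad fields.
root = Note("root", [
    Note("cryptography", [
        Note("elliptic-curves", [ArtifactLink("artifact-0001")]),
    ]),
    Note("mathematics"),
])
```

Under this reading, the note on elliptic curves would be titled =root/cryptography/elliptic-curves=, and a note linked from two parents would have one title per path.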

* Next steps

A first step is to start constructing an artifact repository. Once
this is in place, a suitable graph database (for example, [[https://github.com/cayleygraph/cayley][cayley]])
should be identified, and an exocortex core developed. User
interfaces will necessarily be developed alongside these systems.