Title: On Exocortices

*This is from a gemlog post written on 2021-02-10.*

This is a rough draft of some thoughts about exocortices that have been
simmering in the back of my mind lately. The catalyst for writing it
was reading Stephen Wolfram's (with all caveats that come with reading
his posts) entry "Seeking the Productive Life: Some Details of My
Personal Infrastructure".

## Background

An exocortex is "a hypothetical artificial information-processing
system that would augment a brain's biological cognitive processes." I
have made many attempts at building my own, including

* A web-based wiki (including my own custom solution, gitit,
  MediaWiki, and others).
* Org-mode based notes, including my current notes/notes.org system
  (with subdirectories for other things such as book notes)
* Evernote / Notion
* The Quiver macOS app
* Experimenting in building custom exocortex software (e.g. kortex)
* A daily weblog (e.g. the old ai6ua.net site) and gemlog to summarize
  important knowledge gained that day.

Each of these has its own shortcomings that don't quite match up
with my expectations or desires. An exocortex must be a personalized
system adapted to its user to maximise knowledge capture.

Succinctly put, the goal of an
[exocortex](https://en.wiktionary.org/wiki/exocortex) is to collect
artifacts and notes (including daily notes), organize them, and allow
for written summaries of current snapshots of my knowledge. Put
another way, "artifacts + notes + graph structure = exocortex". Note
that a folder hierarchy is a tree, which is a form of directed
graph. Symlinks inside a folder act as edges to notes outside of that
folder, refining the graph structure.

This writeup is an attempt at characterising and exploring the
exocortex problem space to capture my goals, serve as a foundation for
the construction of such a system, and, through discussion of the
problem space, tease out the structure of the problem to discover a
closer approximation to the idealized reality of an exocortex system.
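
The remark in this section that a folder hierarchy is a tree, with symlinks acting as extra edges, can be made concrete with a small sketch. This is only an illustration under assumptions: the function name `folder_graph`, the edge labels `"tree"` and `"link"`, and the example directory names are hypothetical and are not part of any existing tool.

```python
import os
import tempfile


def folder_graph(root):
    """Return a sorted list of (parent, child, kind) edges under root.

    kind is "tree" for ordinary containment in a folder and "link" for a
    symlink, in which case child is the path the symlink resolves to.
    All paths are relative to root.
    """
    root = os.path.realpath(root)
    edges = []
    # os.walk does not descend into symlinked directories by default,
    # so each symlink contributes exactly one edge.
    for dirpath, dirnames, filenames in os.walk(root):
        parent = os.path.relpath(dirpath, root)
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.path.relpath(os.path.realpath(path), root)
                edges.append((parent, target, "link"))
            else:
                edges.append((parent, os.path.relpath(path, root), "tree"))
    return sorted(edges)


if __name__ == "__main__":
    # Hypothetical layout: one note filed under math/, linked from physics/.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "math"))
        os.makedirs(os.path.join(tmp, "physics"))
        note = os.path.join(tmp, "math", "tensors.org")
        open(note, "w").close()
        os.symlink(note, os.path.join(tmp, "physics", "tensors.org"))
        for edge in folder_graph(tmp):
            print(edge)
```

The "link" edges are what turn the plain tree into a more general directed graph: the same note becomes reachable from more than one folder.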

## The elements of exocortices

The elements of an exocortex, briefly touched on above and expanded
below, include

* artifacts,
* the artifact repository,
* notes,
* structure,
* a query interface,
* an exploratory interface,
* a presentation interface,
* an update interface,
* locality, and
* totality.

### Artifacts

An artifact is any object that is not a textual writeup by me that
should be referenceable as part of the exocortex. A copy of a paper
from arXiv might serve as an artifact. Importantly, artifacts must be
locally available. They serve as a snapshot of some source of
knowledge, and should not be subject to link decay, future pay-walling
(or loss of access to a pay-walled system), or loss of
connectivity. An artifact should be timestamped: when was it captured?
When was the artifact created upstream? An artifact must also have
some associated upstream information: how did it come to be in the
repository?

### The artifact repository

An artifact may be relevant to more than one field of interest;
accordingly, all artifacts should exist in a central repository. This
repository should support artifact histories (e.g. collecting updates
to artifacts, where the history is important in capturing a historical
view of knowledge), multiple formats (a book may exist in PDF, EPUB,
or other formats), and a mechanism for exploring, finding, and
updating docs. The repository must capture relevant metadata about
each artifact.

### Notes

A note is a written summary of a certain field. It should be in some
rich-text format that supports linking as well as basic
formatting. The ideal text format appears to be the org-mode format
given its rich formatting and ability to transition fluidly between
outline and full document; however, this may not be the final, most
effective format. A note is the distillation of artifacts into an
understandable form, providing avenues to discover specifics that may
need to be held in working memory only briefly.

### Structure

A structured format allows for fast and efficient knowledge
lookups. It grants the researcher a starting place with a set of rules
governing where and how things may be found. It imposes order over
chaos such that relevant kernels of knowledge may be retrieved and
examined in an expedient manner. The metaphor that humans seem to
adapt to the most readily is a graph structure, particularly one
that is generally hierarchical in nature.

### A query interface

The exocortex and the artifact repository both require a query
interface; they may be part of the same UI. A query UI allows a
researcher to pose questions of the exocortex, directly looking for
specific knowledge.

The four interfaces (query, exploration, presentation, and update) may
all be facets of the same interface, and they may benefit from a
cohesive and unified interface; however, it is important that all of
these use cases are considered and supported.

### An exploratory interface

The exploratory interface allows a researcher to meander through the
knowledge store, exploring topics and potentially identifying new
areas to push the knowledge sphere out further.

### A presentation interface

The presentation interface allows a set of notes to be shared with
others; it should be possible to include some or all artifacts
associated with these notes. For example, it may not be appropriate to
share a copy of a book with the presentation, but it may be
appropriate to share a copy of some of the supporting papers.

### An update interface

The update interface is where knowledge is added to the exocortex,
whether through capturing an artifact or writing notes.

### Locality

An exocortex must be localized to the user, with the full repository
available offline. Quick input or scratch pad notes might be
available remotely, but realistically, the cost of cloud storage and
the transfer sizes mean that having the full exocortex available
remotely is unlikely. Instead, a hybrid model allowing quick captures
of knowledge available remotely combined with a full exocortex on a
local system is probably the best solution.

### Totality

An exocortex represents the sum of the user's knowledge. There aren't
separate exocortices for different areas. Everything I know should go
into my exocortex.

## Exploring the problem space

In order to map out the structure of an exocortex, it's useful to
review what has worked and what hasn't. The discussion of each
alternative below considers what worked and what didn't, to clarify
what an effective exocortex looks like.

### Git-backed wikis and plaintext folders

At a high level, wikis like Gitit and folders of plain-text (including
org-mode) data are roughly equivalent; the differences lie primarily
in how they are presented. Neither approach works well for indexing or
organizing artifacts, though some additions help with search, such as
a scanner that adds notes to a SQLite database (for improved search
performance).

Using a folder of org-mode notes is probably one of the better
note-taking interfaces that I have found; however, there is no notion
of an artifact repository without considerable manual work.

The main downsides to this approach are the lack of good query and
exploration UIs, along with the lack of a useful artifact
repository. The upsides are good update and presentation interfaces.
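
The scanner idea mentioned in this section (adding notes to a SQLite database to speed up search) might look roughly like the sketch below. It is not the scanner I actually used; the table layout and the function names `scan_notes` and `search_notes` are assumptions made for illustration, and it uses a plain `LIKE` match rather than a real full-text index.

```python
import os
import sqlite3


def scan_notes(conn, notes_dir, suffix=".org"):
    """Index every note under notes_dir; return the number of files indexed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (path TEXT PRIMARY KEY, body TEXT)"
    )
    count = 0
    for dirpath, _dirnames, filenames in os.walk(notes_dir):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                body = f.read()
            # Re-scanning a note replaces its previously stored copy.
            conn.execute(
                "INSERT OR REPLACE INTO notes (path, body) VALUES (?, ?)",
                (os.path.relpath(path, notes_dir), body),
            )
            count += 1
    conn.commit()
    return count


def search_notes(conn, term):
    """Return the paths of notes whose body contains term.

    SQLite's LIKE is case-insensitive for ASCII by default; any % or _
    in term is treated as a wildcard.
    """
    rows = conn.execute(
        "SELECT path FROM notes WHERE body LIKE ? ORDER BY path",
        ("%" + term + "%",),
    )
    return [path for (path,) in rows]
```

A call such as `scan_notes(sqlite3.connect("notes.db"), "notes")` followed by `search_notes(conn, "graph")` would cover the query use case for notes, but it does nothing for artifacts, which is the gap described above.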

### Evernote and Notion

Evernote (and also Notion) provides a unified, searchable interface
across multiple machines. Evernote in particular has a usable artifact
repository, although information about upstream sources isn't
available, nor is metadata about the object or the idea of multiple
formats and history.

Evernote is a paid service, and neither is particularly extensible to
a user's needs. Exploring the exocortex is difficult, as there's no
notion of an entry point. Presenting nodes is met with some success,
albeit limited.

### Quiver

Quiver is an excellent note-taking application; however, it is
macOS-only. It does have some ability to import web pages, but in
general it lacks any idea of an artifact repository. The ability to
intersperse different cell types is good.

### Jupyter notebooks

Jupyter notebooks provide an excellent interface for interspersing
computational ideas with prose; there is no notion of an artifact
repository, however. Linking notebooks isn't supported, and there is
no overall structure besides manual hyperlinking and a directory
structure.

## The artifact repository

The artifact repository is one of the two pillars of the exocortex; it
stores the "first hand" sources of knowledge.

### The central index

The first part of an artifact repository is a central index that
provides

* references and linking to artifacts,
* a "blob" store that contains the artifacts, and
* some management interface that allows adding and editing metadata as
  well as adding artifacts.

An artifact entry in the index contains, at a minimum,

* An artifact identifier
* Authorship information

The artifact identifier is used to associate all related artifacts
(e.g. previous revisions, different formats, etc.).
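
As a rough illustration of how an identifier can tie related artifacts together, here is a minimal in-memory sketch. The names `IndexEntry` and `CentralIndex`, the `blob_refs` field, and the example identifiers are all hypothetical; a real index would live in a database rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """Minimal index entry: an identifier plus authorship information."""
    artifact_id: str
    authors: list
    # References to every blob (revision or alternate format) that
    # shares this artifact identifier.
    blob_refs: list = field(default_factory=list)


class CentralIndex:
    """Toy central index keyed by artifact identifier."""

    def __init__(self):
        self._entries = {}

    def add_artifact(self, artifact_id, authors):
        if artifact_id in self._entries:
            raise ValueError(f"duplicate artifact id: {artifact_id}")
        entry = IndexEntry(artifact_id, list(authors))
        self._entries[artifact_id] = entry
        return entry

    def attach_blob(self, artifact_id, blob_ref):
        """Associate another revision or format with an existing artifact."""
        self._entries[artifact_id].blob_refs.append(blob_ref)

    def related(self, artifact_id):
        """Return all blob references recorded for this artifact."""
        return list(self._entries[artifact_id].blob_refs)


if __name__ == "__main__":
    index = CentralIndex()
    index.add_artifact("book-0001", ["A. Author"])  # made-up identifier
    index.attach_blob("book-0001", "book.pdf")
    index.attach_blob("book-0001", "book.epub")
    print(index.related("book-0001"))
```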

### Artifacts

An artifact consists of multiple components:

* A primary metadata entry that organizes artifacts
* Pointers to artifact "blobs"
* A historical record of changed blobs

The metadata header for an artifact should contain, at a minimum,
fields for

* Artifact identifier
* A list of revisions

Each artifact can have zero or more blobs associated. For example, a
physical book reference might not have a blob associated; an ebook
might have multiple blobs corresponding to different formats; and a
webpage snapshot may have multiple blobs representing revisions to the
page.

A blob header stores

* The artifact identifier
* The date retrieved or stored
* The date of the artifact itself
* The source
* Blob type information (e.g. a MIME type)
* A list of categories
* A list of tags

The headers should probably be stored in a database of some kind;
SQLite is a good example for the first iteration. Blobs themselves
will need to be stored on disk, probably in a format related to a hash
of the blob contents, such as in a content-addressable store (CAS).
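
To make the storage idea concrete, here is a minimal sketch of blob headers in SQLite alongside a content-addressable blob store on disk. Everything specific here is an assumption: the table and column names, the choice of SHA-256, the two-character directory fan-out, and the function names `put_blob` and `get_blob`. Categories and tags are left out; they would likely go in separate tables.

```python
import hashlib
import os
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS blob_headers (
    blob_hash     TEXT PRIMARY KEY,  -- SHA-256 of the blob contents
    artifact_id   TEXT NOT NULL,
    retrieved_at  TEXT NOT NULL,     -- date retrieved or stored
    artifact_date TEXT,              -- date of the artifact itself
    source        TEXT,
    mime_type     TEXT
)
"""


def blob_path(store_dir, blob_hash):
    """Fan blobs out by the first two hex digits to keep directories small."""
    return os.path.join(store_dir, blob_hash[:2], blob_hash[2:])


def put_blob(conn, store_dir, data, artifact_id, retrieved_at,
             artifact_date=None, source=None, mime_type=None):
    """Store data under its SHA-256 hash and record its header; return the hash."""
    blob_hash = hashlib.sha256(data).hexdigest()
    path = blob_path(store_dir, blob_hash)
    # Identical contents hash to the same path, so they are written once.
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
    conn.execute(SCHEMA)
    # The first header recorded for a given hash wins; later ones are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO blob_headers VALUES (?, ?, ?, ?, ?, ?)",
        (blob_hash, artifact_id, retrieved_at, artifact_date, source, mime_type),
    )
    conn.commit()
    return blob_hash


def get_blob(store_dir, blob_hash):
    """Read a blob back from the store by its hash."""
    with open(blob_path(store_dir, blob_hash), "rb") as f:
        return f.read()
```

One consequence of addressing blobs by content is that capturing an unchanged page twice stores only one copy; a revision history then only grows when the contents actually change.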

## The exocortex

The exocortex consists of a graph database that links notes. At a
broad level, it should probably start with a root node that points to
broad fields. The update interface should allow manipulation of nodes
as graph nodes in addition to allowing for adding and editing notes. A
node might be thought of as "type node = Note | ArtifactLink". That
is, a note can link to other notes or to artifacts. A proper node
title is the full path from the root to that node. For example,
consider the structure linked below:

[![](/files/i/t/on_exocortices_graph.jpg)](/files/i/on_exocortices_graph.jpg)

Different possibilities for naming note3 include:

* root->note2->note3
* root=>note2=>note3
* root/note2/note3

Personally, I prefer the arrow notation with equal sign. Each note can
be shortened to a partial path; e.g. "note2=>note3". The title for
each note can be stored in a metadata entry.
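
The node type and the path-based titles can be sketched as follows. This is a toy in-memory model, not a graph database: the class names mirror the "Note | ArtifactLink" idea, while the functions `titles` and `shorten`, the field names, and the artifact identifier are hypothetical. Only the root, note2, and note3 nodes named above are modelled, since the rest of the pictured graph is not described in the text.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class ArtifactLink:
    artifact_id: str


@dataclass
class Note:
    name: str
    links: list = field(default_factory=list)  # Note or ArtifactLink items


# Python spelling of "type node = Note | ArtifactLink".
Node = Union[Note, ArtifactLink]


def titles(root, sep="=>"):
    """Map each reachable note's name to its full title (path from root).

    If a note is reachable by several paths, the first one found is kept,
    which also keeps the walk from looping on cycles.
    """
    found = {}
    stack = [(root, root.name)]
    while stack:
        note, title = stack.pop()
        if note.name in found:
            continue
        found[note.name] = title
        for child in note.links:
            if isinstance(child, Note):
                stack.append((child, title + sep + child.name))
    return found


def shorten(title, parts, sep="=>"):
    """Keep only the last `parts` components of a title."""
    return sep.join(title.split(sep)[-parts:])


if __name__ == "__main__":
    note3 = Note("note3", [ArtifactLink("paper-0001")])  # made-up artifact id
    note2 = Note("note2", [note3])
    root = Note("root", [note2])
    full = titles(root)["note3"]
    print(full)              # root=>note2=>note3
    print(shorten(full, 2))  # note2=>note3
```

Storing the computed title in each note's metadata entry, as suggested above, would avoid recomputing paths on every lookup, at the cost of updating titles when nodes move.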

## Next steps

A first step is to start constructing an artifact repository. Once
this is in place, a suitable graph database (for example,
[cayley](https://github.com/cayleygraph/cayley)) should be identified,
and an exocortex core developed. User interfaces will necessarily be
developed alongside these systems.