#+TITLE: On Exocortices
#+AUTHOR: Kyle Isom

* Document history
+ [2021-02-10 Wed] first draft

* Background

An exocortex is [[https://en.wiktionary.org/wiki/exocortex]["a
hypothetical artificial information-processing system that would
augment a brain's biological cognitive processes."]] I have made
many attempts at building my own, including

+ A web-based wiki (including my own custom solution, [[https://github.com/jgm/gitit][gitit]],
  [[https://www.mediawiki.org/wiki/MediaWiki][MediaWiki]], and others)
+ Org-mode based notes, including my current =notes/notes.org= system
  (with subdirectories for other things such as book notes)
+ [[https://evernote.com/][Evernote]] / [[https://www.notion.so/][Notion.so]]
+ The [[https://happenapps.com/][Quiver MacOS app]]
+ Experimenting in building custom exocortex software (e.g. kortex)
+ A daily weblog (e.g. the old ai6ua.net site) and gemlog to
  summarize important knowledge gained that day.

Each of these has its own shortcomings and doesn't quite match up
with my expectations or desires. An exocortex must be a personalized
system adapted to its user to maximise knowledge capture.

Succinctly put, the goal of an exocortex is to collect artifacts and
notes (including daily notes), organize them, and allow for written
summaries of current snapshots of my knowledge. Put another way,
/artifacts + notes + graph structure = exocortex/. Note that a folder
hierarchy is a tree, which is a form of directed graph. Symlinks
inside a folder act as edges to notes outside of that folder,
refining the graph structure.

This writeup is an attempt at characterising and exploring the
exocortex problem space to capture my goals, serve as a foundation
for the construction of such a system, and, through discussion of
the problem space, tease out the structure of the problem to
discover a closer approximation to the idealized reality of an
exocortex system.

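The folder-as-graph observation above (a folder hierarchy is a tree,
and symlinks add edges to notes outside a folder) can be sketched in
Python. This is a minimal illustration only: the function name
=folder_edges= and the choice to represent edges as pairs of paths
are assumptions made for the example, not part of any existing tool.

#+begin_src python
import os


def folder_edges(root):
    """Return a set of (parent, child) edges for the tree under root.

    Directory containment supplies the tree edges. A symlink adds an
    edge from the folder that contains it to the link's resolved
    target, which may live anywhere else in the hierarchy.
    """
    root = os.path.realpath(root)
    edges = set()
    # os.walk does not descend into symlinked directories by default,
    # so each symlink is seen exactly once, in its containing folder.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                edges.add((dirpath, os.path.realpath(path)))
            else:
                edges.add((dirpath, path))
    return edges
#+end_src

With a hypothetical layout in which =math/crypto-notes.org= is a
symlink to =crypto/notes.org=, the result contains the usual tree
edges plus one extra edge from the =math= folder to the crypto note.
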
* The elements of exocortices

The elements of an exocortex, briefly touched on above and expanded
below, include

+ artifacts,
+ the artifact repository,
+ notes,
+ structure,
+ a query interface,
+ an exploratory interface,
+ a presentation interface,
+ an update interface, and
+ locality.

** Artifacts

An artifact is any object that is not a textual writeup by me that
should be referenceable as part of the exocortex. A copy of a paper
from arXiv might serve as an artifact. Importantly, artifacts must
be locally available. They serve as a snapshot of some source of
knowledge, and should not be subject to link decay, future
pay-walling (or loss of access to a pay-walled system), or loss of
connectivity. An artifact should be timestamped: when was it
captured? When was the artifact created upstream? An artifact must
also have some associated upstream information: how did it come
to be in the repository?

** The artifact repository

An artifact may be relevant to more than one field of interest;
accordingly, all artifacts should exist in a central
repository. This repository should support artifact histories
(e.g. collecting updates to artifacts, where the history is
important in capturing a historical view of knowledge), multiple
formats (a book may exist in PDF, EPUB, or other formats), and a
mechanism for exploring, finding, and updating artifacts. The
repository must capture relevant metadata about each artifact.

** Notes

A note is a written summary of a certain field. It should be in
some rich-text format that supports linking as well as basic
formatting. The ideal text format appears to be the org-mode format
given its rich formatting and ability to transition fluidly between
outline and full document; however, this may not be the final, most
effective format. A note is the distillation of artifacts into an
understandable form, providing avenues to discover specifics that
may need to be held in working memory only briefly.

** Structure

A structured format allows for fast and efficient knowledge
lookups. It grants the researcher a starting place with a set of
rules governing where and how things may be found. It imposes order
over chaos such that relevant kernels of knowledge may be retrieved
and examined in an expedient manner. The metaphor that humans seem
to adapt to the most readily is a graph structure, particularly
one that is generally hierarchical in nature.

** A query interface

The exocortex and the artifact repository both require a query
interface; they may be part of the same UI. A query UI allows a
researcher to pose questions of the exocortex, directly looking for
specific knowledge.

The four interfaces (query, exploration, presentation, and update)
may all be facets of the same interface, and they may benefit from
a cohesive and unified interface; however, it is important that all
of these use cases are considered and supported.

** An exploratory interface

The exploratory interface allows a researcher to meander through
the knowledge store, exploring topics and potentially identifying
new areas to push the knowledge sphere out further.

** A presentation interface

The presentation interface allows a set of notes to be shared with
others; it should be possible to include some or all artifacts
associated with these notes. For example, it may not be appropriate
to share a copy of a book with the presentation, but it may be
appropriate to share a copy of some of the supporting papers.

** An update interface

The update interface is where knowledge is added to the exocortex,
whether through capturing an artifact or writing notes.

** Locality

An exocortex must be localized to the user, with the full
repository available offline. Quick input or scratch pad notes
might be available remotely, but realistically, the cost of cloud
storage and the transfer sizes mean that having the full exocortex
available remotely is unlikely. Instead, a hybrid model that
combines quick remote captures of knowledge with a full exocortex
on a local system is probably the best solution.

* Exploring the problem space

In order to map out the structure of an exocortex, it's useful to
review what has worked and what hasn't. For each alternative
presented, I consider what worked and what didn't to clarify what
an effective exocortex looks like.

** Git-backed wikis and plaintext folders

At a high level, wikis like Gitit and folders of plain-text
(including org-mode) data are roughly equivalent; the differences
lie primarily in how they are presented. Neither approach works
well for indexing or organizing artifacts, although some
approaches, like a scanner that adds notes to a SQLite database,
can improve search performance.

Using a folder of org-mode notes is probably one of the better
note-taking interfaces that I have found; however, there is no
notion of an artifact repository without considerable manual work.

The main downsides to this approach are the lack of good query and
exploration UIs, along with the lack of a useful artifact
repository. The upsides are good update and presentation
interfaces.

** Evernote and Notion

Evernote (and also Notion) provides a unified, searchable interface
across multiple machines. Evernote in particular has a usable
artifact repository, although information about upstream sources
isn't available, nor is metadata about the object, and there is no
idea of multiple formats or history.

Evernote is a paid service, and neither is particularly extensible
to a user's needs. Exploring the exocortex is difficult, as there's
no notion of an entry point. Presenting nodes is met with some
success, albeit limited.

** Quiver

Quiver is an excellent note-taking application; however, it is
MacOS-only. It does have some ability to import web pages, but in
general it lacks any idea of an artifact repository. The ability to
intersperse different cell types is good.

** Jupyter notebooks

Jupyter notebooks provide an excellent interface for interspersing
computational ideas with prose; there is no notion of an artifact
repository, however. Linking notebooks isn't supported, and there
is no overall structure besides manual hyperlinking and a directory
structure.

* The artifact repository

The artifact repository is one of the two pillars of the exocortex;
it stores the "first hand" sources of knowledge.

** The central index

The first part of an artifact repository is a central index that
provides

+ references and linking to artifacts,
+ a "blob" store that contains the artifacts, and
+ some management interface that allows adding and editing metadata
  as well as adding artifacts.

An artifact entry in the index contains, at a minimum,

+ An artifact identifier
+ Authorship information

The artifact identifier is used to associate all related artifacts
(e.g. previous revisions, different formats, etc.)

** Artifacts

An artifact consists of multiple components:

+ A primary metadata entry that organizes artifacts
+ Pointers to artifact "blobs"
+ A historical record of changed blobs

The metadata header for an artifact should contain, at a minimum,
fields for

+ Artifact identifier
+ A list of revisions

Each artifact can have zero or more blobs associated. For example,
a physical book reference might not have a blob associated; an
ebook might have multiple blobs corresponding to different formats;
and a webpage snapshot may have multiple blobs representing
revisions to the page.

A blob header stores

+ The artifact identifier
+ The date retrieved or stored
+ The date of the artifact itself
+ The source
+ Blob type information (e.g. a MIME type)
+ A list of categories
+ A list of tags

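The two headers listed above might be modelled as plain records. The
following Python sketch is one possible reading, not a settled
design: the class and field names are assumptions, and folding both
"pointers to blobs" and "a list of revisions" into a single ordered
list of blob headers is a simplification made for the example.

#+begin_src python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class BlobHeader:
    artifact_id: str   # ties the blob back to its artifact
    stored: datetime   # the date retrieved or stored
    source: str        # where the blob came from, e.g. a URL
    mime_type: str     # blob type information
    created: Optional[datetime] = None  # date of the artifact itself, if known
    categories: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


@dataclass
class Artifact:
    artifact_id: str
    authors: List[str] = field(default_factory=list)
    # Blob headers in capture order. The list stays empty for an
    # artifact with no blob, such as a physical book reference.
    revisions: List[BlobHeader] = field(default_factory=list)

    def formats(self):
        """Return the distinct MIME types across this artifact's blobs."""
        return sorted({blob.mime_type for blob in self.revisions})


# Hypothetical usage: an ebook held in two formats.
book = Artifact("example-book", authors=["Example Author"])
for mime in ("application/pdf", "application/epub+zip"):
    book.revisions.append(
        BlobHeader("example-book", datetime(2021, 2, 10),
                   "https://example.org/book", mime)
    )
#+end_src
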
The headers should probably be stored in a database of some kind;
SQLite is a good candidate for the first iteration. Blobs themselves
will need to be stored on disk, probably under a name derived from a
hash of the blob contents, such as in a [[https://en.wikipedia.org/wiki/Content-addressable_storage][content-addressable store]]
(CAS).

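A minimal sketch of that split, using only the Python standard
library, follows. Everything specific here is an assumption made for
illustration: the choice of SHA-256, the two-character directory
fan-out, the =blobs= table and its columns, and the function names
=open_index= and =store_blob=.

#+begin_src python
import hashlib
import sqlite3
from pathlib import Path


def open_index(db_path):
    """Open (or create) the SQLite database holding blob headers."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS blobs ("
        " digest TEXT NOT NULL,"
        " artifact_id TEXT NOT NULL,"
        " source TEXT NOT NULL,"
        " mime_type TEXT NOT NULL,"
        " stored TEXT NOT NULL,"
        " PRIMARY KEY (digest, artifact_id))"
    )
    return db


def store_blob(root, db, artifact_id, data, source, mime_type):
    """Write data under its SHA-256 digest and record a header row."""
    digest = hashlib.sha256(data).hexdigest()
    # Fan out on the first two hex digits to keep directories small.
    path = Path(root) / digest[:2] / digest[2:]
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    # Identical content for the same artifact is recorded only once.
    db.execute(
        "INSERT OR IGNORE INTO blobs"
        " (digest, artifact_id, source, mime_type, stored)"
        " VALUES (?, ?, ?, ?, datetime('now'))",
        (digest, artifact_id, source, mime_type),
    )
    db.commit()
    return digest
#+end_src

Because the file name is derived from the contents, storing the same
bytes twice is harmless, and a new revision of a webpage simply
lands under a different digest while sharing the artifact
identifier.
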
* The exocortex

The exocortex consists of a graph database that links notes. At a
broad level, it should probably start with a root node that points
to broad fields. The update interface should allow manipulation of
nodes as graph nodes in addition to allowing for adding and editing
notes. A node might be thought of as
=type node = Note | ArtifactLink=. That is, a note can link to other
notes or to artifacts. A proper node title is the sum of the paths.
For example, consider the following structure:

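A hypothetical structure, sketched in Python, shows one way to read
the node type and the path-based titles. The node names (=root=,
=cs=, =crypto=, =math=), the =/= separator, and the restriction to a
tree with no cycles are all assumptions made for this sketch.

#+begin_src python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ArtifactLink:
    artifact_id: str  # identifier in the artifact repository


@dataclass
class Note:
    name: str
    body: str = ""
    children: List["Node"] = field(default_factory=list)


# A node is either a note or a link to an artifact.
Node = Union[Note, ArtifactLink]


def titles(note, prefix=()):
    """Yield the full title of each note at or below the given note.

    A title is the chain of note names from the starting note down,
    joined with "/". Artifact links are leaves and carry no title.
    """
    path = prefix + (note.name,)
    yield "/".join(path)
    for child in note.children:
        if isinstance(child, Note):
            yield from titles(child, path)


# Hypothetical hierarchy: a root pointing at two broad fields.
root = Note("root", children=[
    Note("cs", children=[
        Note("crypto", children=[ArtifactLink("example-paper")]),
    ]),
    Note("math"),
])
#+end_src

Under these assumptions, the note named =crypto= has the full title
=root/cs/crypto=, which distinguishes it from any other =crypto=
note reachable through a different path.
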
* Next steps

A first step is to start constructing an artifact repository. Once
this is in place, a suitable graph database (for example, [[https://github.com/cayleygraph/cayley][cayley]])
should be identified, and an exocortex core developed. User
interfaces will necessarily be developed alongside these systems.