kte/docs/swap.md
Kyle Isom 2a6ff2a862 Introduce swap journaling crash recovery system with tests.
- Added detailed journaling system (`SwapManager`) for crash recovery, including edit recording and replay.
- Integrated recovery prompts for handling swap files during file open flows.
- Implemented swap file cleanup, checkpointing, and compaction mechanisms.
- Added extensive unit tests for swap-related behaviors such as recovery prompts, file pruning, and corruption handling.
- Updated CMake to include new test files.
2026-02-13 08:45:27 -08:00


Swap journaling (crash recovery)

kte has a small “swap” system: an append-only per-buffer journal that records edits so they can be replayed after a crash.

This document describes the swap system as currently implemented (stage 2) in Swap.h / Swap.cc.

What it is (and what it is not)

  • The swap file is a journal of editing operations (currently inserts, deletes, and periodic full-buffer checkpoints).
  • It is written by a single background writer thread owned by kte::SwapManager.
  • It is intended for best-effort crash recovery.

kte automatically deletes/resets swap journals after a clean save and when closing a clean buffer, so old swap files do not accumulate under normal workflows. A best-effort prune also runs at startup to remove very old swap files.

Automatic recovery prompt

When kte opens a file-backed buffer, it checks whether a corresponding swap journal exists.

  • If a swap file exists, replay succeeds, and the replayed content differs from what is currently on disk, kte prompts:

    Recover swap edits for <path>? (y/N, C-g cancel)
    
    • y: open the file and apply swap replay (buffer becomes dirty)
    • Enter (default) / any non-y: delete the swap file (best-effort) and open the file normally
    • C-g: cancel opening the file
  • If a swap file exists but is unreadable/corrupt, kte prompts:

    Swap file unreadable for <path>. Delete it? (y/N, C-g cancel)
    
    • y: delete the swap file (best-effort) and open the file normally
    • Enter (default): keep the swap file and open the file normally
    • C-g: cancel opening the file
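
The two prompts above amount to a small decision table. The sketch below is illustrative only: the names `SwapState`, `PromptReply`, `OpenAction`, and `decide_open_action` are hypothetical and are not taken from kte's source. It also assumes that, for the "unreadable" prompt, any reply other than y or C-g behaves like Enter, which the text above does not state.

```cpp
// Hypothetical types for illustration; kte's real code may be structured differently.
enum class SwapState { ReplayDiffers, Unreadable };
enum class PromptReply { Yes, Other, Cancel };  // 'y', Enter or any other key, C-g

enum class OpenAction {
    OpenAndReplay,      // open the file and apply swap edits (buffer becomes dirty)
    DeleteSwapAndOpen,  // best-effort delete of the swap file, then open normally
    KeepSwapAndOpen,    // leave the swap file on disk, open normally
    CancelOpen          // abort opening the file
};

// Map the swap state and the user's reply to the action described in the prompts.
OpenAction decide_open_action(SwapState state, PromptReply reply) {
    if (reply == PromptReply::Cancel)
        return OpenAction::CancelOpen;
    if (state == SwapState::ReplayDiffers) {
        return reply == PromptReply::Yes ? OpenAction::OpenAndReplay
                                         : OpenAction::DeleteSwapAndOpen;
    }
    // Unreadable/corrupt swap file: the default keeps the file (assumption for non-Enter keys).
    return reply == PromptReply::Yes ? OpenAction::DeleteSwapAndOpen
                                     : OpenAction::KeepSwapAndOpen;
}
```

Note that the default differs between the two prompts: declining recovery deletes a readable swap file, while declining deletion keeps an unreadable one.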

Where swap files live

Swap files are stored under an XDG-style per-user state directory:

  • If XDG_STATE_HOME is set and non-empty:
    • $XDG_STATE_HOME/kte/swap/…
  • Otherwise, if HOME is set:
    • ~/.local/state/kte/swap/…
  • Last resort fallback:
    • <system-temp>/kte/state/kte/swap/… (via std::filesystem::temp_directory_path())

Swap files are always created with permissions 0600.
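
The directory lookup order above can be sketched roughly as follows. This is a minimal illustration, not kte's actual code: `swap_dir_from` and `swap_dir` are hypothetical names, and treating an empty HOME as unset is an assumption (the text only says "if HOME is set").

```cpp
#include <cstdlib>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper. Environment values are passed in so the lookup order
// can be exercised without touching the real environment.
fs::path swap_dir_from(const char *xdg_state_home, const char *home, const fs::path &tmp) {
    if (xdg_state_home && *xdg_state_home)      // set and non-empty
        return fs::path(xdg_state_home) / "kte" / "swap";
    if (home && *home)                          // assumption: empty HOME counts as unset
        return fs::path(home) / ".local" / "state" / "kte" / "swap";
    return tmp / "kte" / "state" / "kte" / "swap";  // last-resort fallback
}

// Convenience wrapper that reads the real environment.
fs::path swap_dir() {
    return swap_dir_from(std::getenv("XDG_STATE_HOME"), std::getenv("HOME"),
                         fs::temp_directory_path());
}
```

Creating the file with mode 0600 is a separate step and is not shown here.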

Swap file naming

For file-backed buffers, the swap filename is derived from the buffer's path:

  1. Take a canonical-ish path key (std::filesystem::weakly_canonical, else absolute, else the raw Buffer::Filename()).
  2. Encode it so it's human-identifiable:
    • Strip one leading path separator (/ or \\)
    • Replace path separators (/ and \\) with !
    • Append .swp

Example:

/home/kyle/tmp/test.txt  ->  home!kyle!tmp!test.txt.swp

If the resulting name would be long (over ~200 characters), kte falls back to a shorter stable name:

<basename>.<fnv1a64(path-key-as-hex)>.swp

For unnamed/unsaved buffers, kte uses:

unnamed-<pid>-<counter>.swp
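
The naming rules for file-backed buffers could be sketched along these lines. The function names are hypothetical, the 200-character threshold is taken from the approximate figure above, and rendering the hash as 16 zero-padded lowercase hex digits is an assumption about what the fallback format means. The hash itself is the standard 64-bit FNV-1a.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Standard 64-bit FNV-1a over the bytes of s.
std::uint64_t fnv1a64(const std::string &s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Strip one leading separator, map remaining separators to '!', append ".swp".
std::string encode_swap_name(const std::string &path_key) {
    std::size_t start = 0;
    if (!path_key.empty() && (path_key[0] == '/' || path_key[0] == '\\'))
        start = 1;
    std::string out;
    for (std::size_t i = start; i < path_key.size(); ++i) {
        const char c = path_key[i];
        out += (c == '/' || c == '\\') ? '!' : c;
    }
    return out + ".swp";
}

// Fallback form: <basename>.<hash as hex>.swp (hex formatting is an assumption).
std::string hashed_swap_name(const std::string &path_key) {
    const std::size_t pos = path_key.find_last_of("/\\");
    const std::string base =
        (pos == std::string::npos) ? path_key : path_key.substr(pos + 1);
    char hex[17];
    std::snprintf(hex, sizeof hex, "%016llx",
                  static_cast<unsigned long long>(fnv1a64(path_key)));
    return base + "." + hex + ".swp";
}

// Pick the readable name unless it is too long (threshold is approximate).
std::string swap_name(const std::string &path_key) {
    const std::string name = encode_swap_name(path_key);
    return name.size() > 200 ? hashed_swap_name(path_key) : name;
}
```

With this sketch, `encode_swap_name("/home/kyle/tmp/test.txt")` yields `home!kyle!tmp!test.txt.swp`, matching the example above.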

Lifecycle (when swap is written)

kte::SwapManager is owned by Editor (see Editor.cc). Buffers are attached for journaling when they are added/opened.

  • SwapManager::Attach(Buffer*) starts tracking a buffer and establishes its swap path.
  • Buffer emits swap events from its low-level edit APIs:
    • Buffer::insert_text() calls SwapRecorder::OnInsert()
    • Buffer::delete_text() calls SwapRecorder::OnDelete()
    • Buffer::split_line() / join_lines() are represented as insert/delete of \n (they do not emit SPLIT/JOIN records in stage 1).
  • SwapManager::Detach(Buffer*) flushes queued records, fsync()s, and closes the journal.
  • On Save As / filename changes, SwapManager::NotifyFilenameChanged(Buffer&) closes the existing journal and switches to a new path.
    • Note: the old swap file is currently left on disk (no cleanup/rotation yet).

Durability and performance

Swap writing is best-effort and asynchronous:

  • Records are queued from the UI/editing thread(s).
  • A background writer thread wakes at least every SwapConfig::flush_interval_ms (default 200ms) to write any queued records.
  • fsync() is throttled to at most once per SwapConfig::fsync_interval_ms (default 1000ms) per open swap file.
  • SwapManager::Flush() blocks until the queue is fully written; it is primarily used by tests and shutdown paths.

If a crash happens while writing, the swap file may end with a partial record. Replay detects truncation/CRC mismatch and fails safely.
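
The fsync throttling described above can be illustrated with a small helper that the writer thread might consult after each batch of writes. This is a sketch under assumptions: `FsyncThrottle` and `should_sync` are hypothetical names, and the real writer may track time differently.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical per-swap-file throttle: allows at most one fsync per interval.
struct FsyncThrottle {
    std::chrono::milliseconds interval{1000};  // mirrors the fsync_interval_ms default
    Clock::time_point last{};
    bool synced_once = false;

    // Returns true if an fsync should be issued at `now`, and records that time.
    bool should_sync(Clock::time_point now) {
        if (synced_once && now - last < interval)
            return false;  // too soon since the previous fsync
        last = now;
        synced_once = true;
        return true;
    }
};
```

Passing the current time in as a parameter keeps the helper deterministic, so the throttling rule can be tested without sleeping.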

On-disk format (v1)

The file is:

  1. A fixed-size 64-byte header
  2. Followed by a stream of records

All multi-byte integers in the swap file are little-endian.

Header (64 bytes)

Layout (stage 1):

  • magic (8 bytes): KTE_SWP\0
  • version (u32): currently 1
  • flags (u32): currently 0
  • created_time (u64): Unix seconds
  • remaining bytes are reserved/padding (currently zeroed)
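
As an illustration of the header layout, the sketch below serializes the fields in the order listed. The byte offsets (magic at 0, version at 8, flags at 12, created_time at 16) assume the fields are packed back to back with no padding, which the list above implies but does not state. `make_header` and `put_le` are hypothetical names.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Write the low `nbytes` bytes of v at p in little-endian order.
static void put_le(unsigned char *p, std::uint64_t v, int nbytes) {
    for (int i = 0; i < nbytes; ++i)
        p[i] = static_cast<unsigned char>(v >> (8 * i));
}

// Build a 64-byte v1 header. Offsets assume packed fields (an assumption).
std::array<unsigned char, 64> make_header(std::uint64_t created_time) {
    std::array<unsigned char, 64> h{};       // zero-initialized: reserved bytes stay 0
    std::memcpy(h.data(), "KTE_SWP\0", 8);   // magic, including the trailing NUL
    put_le(h.data() + 8, 1, 4);              // version = 1
    put_le(h.data() + 12, 0, 4);             // flags = 0
    put_le(h.data() + 16, created_time, 8);  // Unix seconds
    return h;
}
```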

Record framing

Each record is:

[type: u8][len: u24][payload: len bytes][crc32: u32]
  • len is a 24-bit little-endian length of the payload (0..0xFFFFFF).
  • crc32 is computed over the 4-byte record header (type + len) followed by the payload bytes.
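
A record could be framed roughly as shown below. The text above does not say which CRC-32 variant is used; this sketch assumes the common reflected CRC-32 (polynomial 0xEDB88320, as in zlib), and it assumes the trailing crc32 field is little-endian like the other integers. `crc32_ieee` and `frame_record` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Bitwise CRC-32 (reflected, polynomial 0xEDB88320). Assumed variant.
std::uint32_t crc32_ieee(const std::uint8_t *data, std::size_t n) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < n; ++i) {
        crc ^= data[i];
        for (int b = 0; b < 8; ++b)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// Build [type][len: u24 LE][payload][crc32 LE], with the CRC covering
// the 4-byte record header followed by the payload.
std::vector<std::uint8_t> frame_record(std::uint8_t type,
                                       const std::vector<std::uint8_t> &payload) {
    if (payload.size() > 0xFFFFFF)
        throw std::length_error("payload exceeds 24-bit length");
    const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    std::vector<std::uint8_t> rec;
    rec.reserve(4 + payload.size() + 4);
    rec.push_back(type);
    for (int i = 0; i < 3; ++i)
        rec.push_back(static_cast<std::uint8_t>(len >> (8 * i)));
    rec.insert(rec.end(), payload.begin(), payload.end());
    const std::uint32_t crc = crc32_ieee(rec.data(), rec.size());
    for (int i = 0; i < 4; ++i)
        rec.push_back(static_cast<std::uint8_t>(crc >> (8 * i)));
    return rec;
}
```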

Record types

Type codes are defined in SwapRecType (Swap.h). Stage 1 primarily emits:

  • INS (1): insert bytes at (row, col)
  • DEL (2): delete len bytes at (row, col)

Other type codes exist for forward compatibility (SPLIT, JOIN, META, CHKPT), but are not produced by the current SwapRecorder interface.

Payload encoding (v1)

Every payload starts with:

[encver: u8]

Currently encver must be 1.

INS payload (encver = 1)

[encver: u8 = 1]
[row:   u32]
[col:   u32]
[nbytes:u32]
[bytes: nbytes]

DEL payload (encver = 1)

[encver: u8 = 1]
[row:   u32]
[col:   u32]
[len:   u32]

row/col are 0-based and are interpreted the same way as Buffer::insert_text() / Buffer::delete_text().
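
The INS and DEL payload layouts can be sketched as two small encoders. The names `encode_ins`, `encode_del`, and `append_u32le` are hypothetical; the field order and widths follow the layouts above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Append v as 4 little-endian bytes.
static void append_u32le(std::vector<std::uint8_t> &out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// INS payload: [encver=1][row][col][nbytes][bytes]
std::vector<std::uint8_t> encode_ins(std::uint32_t row, std::uint32_t col,
                                     const std::string &bytes) {
    std::vector<std::uint8_t> out;
    out.push_back(1);  // encver
    append_u32le(out, row);
    append_u32le(out, col);
    append_u32le(out, static_cast<std::uint32_t>(bytes.size()));
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}

// DEL payload: [encver=1][row][col][len]
std::vector<std::uint8_t> encode_del(std::uint32_t row, std::uint32_t col,
                                     std::uint32_t len) {
    std::vector<std::uint8_t> out;
    out.push_back(1);  // encver
    append_u32le(out, row);
    append_u32le(out, col);
    append_u32le(out, len);
    return out;
}
```

For example, `encode_ins(2, 3, "hi")` produces a 15-byte payload: one version byte, three 4-byte integers, and the two inserted bytes.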

Replay / recovery

Swap replay is implemented as a low-level API:

bool kte::SwapManager::ReplayFile(Buffer &buf, const std::string &swap_path, std::string &err)

Behavior:

  • The caller supplies an already-open Buffer (typically loaded from the on-disk file) and a swap path.
  • ReplayFile() validates header magic/version, then iterates records.
  • On a truncated file or CRC mismatch, it returns false and sets err.
  • On unknown record types, it ignores them (forward compatibility).
  • On failure, the buffer may have had a prefix of records applied; callers should treat this as “recovery failed”.

Important: if the buffer is currently attached to a SwapManager, suspend/disable recording during replay (or detach first); otherwise, replayed edits would be re-journaled.
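
The record-walking part of replay might look something like the sketch below, which operates on the bytes that follow the 64-byte header. It is not kte's ReplayFile(): `walk_records` and `crc32_bitwise` are hypothetical names, the CRC variant (reflected, polynomial 0xEDB88320) is an assumption, and only INS (1) and DEL (2) are dispatched because the other type codes are not given in this document.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Bitwise CRC-32 (reflected, polynomial 0xEDB88320). Assumed variant.
std::uint32_t crc32_bitwise(const std::uint8_t *data, std::size_t n) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < n; ++i) {
        crc ^= data[i];
        for (int b = 0; b < 8; ++b)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

using RecordFn = std::function<void(std::uint8_t type, const std::uint8_t *payload,
                                    std::size_t len)>;

// Walk the record stream. Returns false and sets err on truncation or CRC
// mismatch; records seen before the failure have already been dispatched.
bool walk_records(const std::vector<std::uint8_t> &buf, const RecordFn &on_record,
                  std::string &err) {
    std::size_t off = 0;
    while (off < buf.size()) {
        if (buf.size() - off < 4) { err = "truncated record header"; return false; }
        const std::uint8_t type = buf[off];
        const std::size_t len = static_cast<std::size_t>(buf[off + 1]) |
                                (static_cast<std::size_t>(buf[off + 2]) << 8) |
                                (static_cast<std::size_t>(buf[off + 3]) << 16);
        if (buf.size() - off - 4 < len + 4) { err = "truncated record"; return false; }
        const std::uint8_t *c = buf.data() + off + 4 + len;
        const std::uint32_t stored =
            static_cast<std::uint32_t>(c[0]) | (static_cast<std::uint32_t>(c[1]) << 8) |
            (static_cast<std::uint32_t>(c[2]) << 16) | (static_cast<std::uint32_t>(c[3]) << 24);
        if (crc32_bitwise(buf.data() + off, 4 + len) != stored) {
            err = "crc mismatch";
            return false;
        }
        if (type == 1 || type == 2)  // INS or DEL; other types are skipped
            on_record(type, buf.data() + off + 4, len);
        off += 4 + len + 4;
    }
    return true;
}
```

Because earlier records are dispatched before a later failure is detected, a caller that applies edits directly to a buffer ends up with the "prefix applied" state described above and should discard that buffer on failure.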

Tests

Swap behavior and format are validated by unit tests:

  • tests/test_swap_writer.cc (header, permissions, record CRC framing)
  • tests/test_swap_replay.cc (record replay and truncation handling)