Add swap file journaling for crash recovery.
- Introduced `SwapManager` for buffering and writing incremental edits to sidecar `.kte.swp` files. - Implemented basic operations: insertion, deletion, split, join, and checkpointing. - Added recovery design doc (`docs/plans/swap-files.md`). - Updated editor initialization to integrate `SwapManager` instance for crash recovery across buffers.
This commit is contained in:
@@ -2,27 +2,43 @@
|
||||
|
||||
## Overview
|
||||
|
||||
`TestFrontend` is a headless implementation of the `Frontend` interface designed to facilitate programmatic testing of editor features. It allows you to queue commands and text input manually, execute them step-by-step, and inspect the editor/buffer state.
|
||||
`TestFrontend` is a headless implementation of the `Frontend` interface
|
||||
designed to facilitate programmatic testing of editor features. It
|
||||
allows you to queue commands and text input manually, execute them
|
||||
step-by-step, and inspect the editor/buffer state.
|
||||
|
||||
## Components
|
||||
|
||||
### TestInputHandler
|
||||
|
||||
A programmable input handler that uses a queue-based system:
|
||||
- `QueueCommand(CommandId id, const std::string &arg = "", int count = 0)` - Queue a specific command
|
||||
- `QueueText(const std::string &text)` - Queue text for insertion (character by character)
|
||||
|
||||
-
|
||||
`QueueCommand(CommandId id, const std::string &arg = "", int count = 0)` -
|
||||
Queue a specific command
|
||||
- `QueueText(const std::string &text)` - Queue text for insertion (
|
||||
character by character)
|
||||
- `Poll(MappedInput &out)` - Returns queued commands one at a time
|
||||
- `IsEmpty()` - Check if the input queue is empty
|
||||
|
||||
### TestRenderer
|
||||
|
||||
A minimal no-op renderer for testing:
|
||||
- `Draw(Editor &ed)` - No-op implementation, just increments draw counter
|
||||
|
||||
- `Draw(Editor &ed)` - No-op implementation, just increments draw
|
||||
counter
|
||||
- `GetDrawCount()` - Returns the number of times Draw() was called
|
||||
- `ResetDrawCount()` - Resets the draw counter
|
||||
|
||||
### TestFrontend
|
||||
The main frontend class that integrates TestInputHandler and TestRenderer:
|
||||
- `Init(Editor &ed)` - Initializes the frontend (sets editor dimensions to 24x80)
|
||||
- `Step(Editor &ed, bool &running)` - Processes one command from the queue and renders
|
||||
|
||||
The main frontend class that integrates TestInputHandler and
|
||||
TestRenderer:
|
||||
|
||||
- `Init(Editor &ed)` - Initializes the frontend (sets editor dimensions
|
||||
to 24x80)
|
||||
- `Step(Editor &ed, bool &running)` - Processes one command from the
|
||||
queue and renders
|
||||
- `Shutdown()` - Cleanup (no-op for TestFrontend)
|
||||
- `Input()` - Access the TestInputHandler
|
||||
- `Renderer()` - Access the TestRenderer
|
||||
@@ -75,31 +91,55 @@ int main() {
|
||||
|
||||
## Key Features
|
||||
|
||||
1. **Programmable Input**: Queue any sequence of commands or text programmatically
|
||||
1. **Programmable Input**: Queue any sequence of commands or text
|
||||
programmatically
|
||||
2. **Step-by-Step Execution**: Run the editor one command at a time
|
||||
3. **State Inspection**: Access and verify editor/buffer state between commands
|
||||
4. **No UI Dependencies**: Headless operation, no terminal or GUI required
|
||||
5. **Integration Testing**: Test command sequences, undo/redo, multi-line editing, etc.
|
||||
3. **State Inspection**: Access and verify editor/buffer state between
|
||||
commands
|
||||
4. **No UI Dependencies**: Headless operation, no terminal or GUI
|
||||
required
|
||||
5. **Integration Testing**: Test command sequences, undo/redo,
|
||||
multi-line editing, etc.
|
||||
|
||||
## Available Commands
|
||||
|
||||
All commands from `CommandId` enum can be queued, including:
|
||||
|
||||
- `CommandId::InsertText` - Insert text (use `QueueText()` helper)
|
||||
- `CommandId::Newline` - Insert newline
|
||||
- `CommandId::Backspace` - Delete character before cursor
|
||||
- `CommandId::Backspace` - Delete character before cursor
|
||||
- `CommandId::DeleteChar` - Delete character at cursor
|
||||
- `CommandId::MoveLeft`, `MoveRight`, `MoveUp`, `MoveDown` - Cursor movement
|
||||
- `CommandId::MoveLeft`, `MoveRight`, `MoveUp`, `MoveDown` - Cursor
|
||||
movement
|
||||
- `CommandId::Undo`, `CommandId::Redo` - Undo/redo operations
|
||||
- `CommandId::Save`, `CommandId::Quit` - File operations
|
||||
- And many more (see Command.h)
|
||||
|
||||
## Integration
|
||||
|
||||
TestFrontend is built into both `kte` and `kge` executables as part of the common source files. You can create standalone test programs by linking against the same source files and ncurses.
|
||||
TestFrontend is built into both `kte` and `kge` executables as part of
|
||||
the common source files. You can create standalone test programs by
|
||||
linking against the same source files and ncurses.
|
||||
|
||||
## Notes
|
||||
|
||||
- Always call `InstallDefaultCommands()` before using any commands
|
||||
- Buffer must be initialized (via `OpenFile()` or `AddBuffer()`) before queuing edit commands
|
||||
- Buffer must be initialized (via `OpenFile()` or `AddBuffer()`) before
|
||||
queuing edit commands
|
||||
- Undo/redo requires the buffer to have an UndoSystem attached
|
||||
- The test frontend sets editor dimensions to 24x80 by default
|
||||
|
||||
## Highlighter stress harness
|
||||
|
||||
For renderer/highlighter race testing without a UI, `kte` provides a
|
||||
lightweight stress mode:
|
||||
|
||||
```
|
||||
kte --stress-highlighter=5
|
||||
```
|
||||
|
||||
This runs a short synthetic workload (5 seconds by default) that edits
|
||||
and scrolls a buffer while
|
||||
exercising `HighlighterEngine::PrefetchViewport` and `GetLine`
|
||||
concurrently. Use Debug builds with
|
||||
AddressSanitizer enabled for best effect.
|
||||
|
||||
144
docs/plans/swap-files.md
Normal file
144
docs/plans/swap-files.md
Normal file
@@ -0,0 +1,144 @@
|
||||
Swap files for kte — design plan
|
||||
================================
|
||||
|
||||
Goals
|
||||
-----
|
||||
|
||||
- Preserve user work across crashes, power failures, and OS kills.
|
||||
- Keep the editor responsive; avoid blocking the UI on disk I/O.
|
||||
- Bound recovery time and swap size.
|
||||
- Favor simple, robust primitives that work well on POSIX and macOS;
|
||||
keep Windows feasibility in mind.
|
||||
|
||||
Model overview
|
||||
--------------
|
||||
Per open buffer, maintain a sidecar swap journal next to the file:
|
||||
|
||||
- Path: `.<basename>.kte.swp` in the same directory as the file (for
|
||||
unnamed/unsaved buffers, use a per‑session temp dir like
|
||||
`$TMPDIR/kte/` with a random UUID).
|
||||
- Format: append‑only journal of editing operations with periodic
|
||||
checkpoints.
|
||||
- Crash safety: only append, fsync as per policy; checkpoint via
|
||||
write‑to‑temp + fsync + atomic rename.
|
||||
|
||||
File format (v1)
|
||||
----------------
|
||||
Header (fixed 64 bytes):
|
||||
|
||||
- Magic: `KTE_SWP\0` (8 bytes)
|
||||
- Version: 1 (u32)
|
||||
- Flags: bitset (u32) — e.g., compression, checksums, endian.
|
||||
- Created time (u64)
|
||||
- Host info hash (u64) — optional, for telemetry/debug.
|
||||
- File identity: hash of canonical path (u64) and original file
|
||||
size+mtime (u64+u64) at start.
|
||||
- Reserved/padding.
|
||||
|
||||
Records (stream after header):
|
||||
|
||||
- Each record: [type u8][len u24][payload][crc32 u32]
|
||||
- Types:
|
||||
- `CHKPT` — full snapshot checkpoint of entire buffer content and
|
||||
minimal metadata (cursor pos, filetype). Payload optionally
|
||||
compressed. Written occasionally to cap replay time.
|
||||
- `INS` — insert at (row, col) text bytes (text may contain
|
||||
newlines). Encoded with varints.
|
||||
- `DEL` — delete length at (row, col). If spanning lines, semantics
|
||||
defined as in Buffer::delete_text.
|
||||
- `SPLIT`, `JOIN` — explicit structural ops (optional; can be
|
||||
expressed via INS/DEL).
|
||||
- `META` — update metadata (e.g., filetype, encoding hints).
|
||||
|
||||
Durability policy
|
||||
-----------------
|
||||
Configurable knobs (sane defaults in parentheses):
|
||||
|
||||
- Time‑based flush: group edits and flush every 150–300 ms (200 ms).
|
||||
- Operation count flush: after N ops (200).
|
||||
- Idle flush: on 500 ms idle lull, flush immediately.
|
||||
- Checkpoint cadence: after M KB of journal (512–2048 KB) or T seconds (
|
||||
30–120 s), whichever first.
|
||||
- fsync policy:
|
||||
- `always`: fsync every flush (safest, slowest).
|
||||
- `grouped` (default): fsync at most every 1–2 s or on
|
||||
idle/blur/quit.
|
||||
- `never`: rely on OS flush (fastest, riskier).
|
||||
- On POSIX, prefer `fdatasync` when available; fall back to `fsync`.
|
||||
|
||||
Performance & threading
|
||||
-----------------------
|
||||
|
||||
- Background writer thread per editor instance (shared) with a bounded
|
||||
MPSC queue of per‑buffer records.
|
||||
- Each Buffer has a small in‑memory journal buffer; UI thread enqueues
|
||||
ops (non‑blocking) and may coalesce adjacent inserts/deletes.
|
||||
- Writer batch‑writes records to the swap file, computes CRCs, and
|
||||
decides checkpoint boundaries.
|
||||
- Backpressure: if the queue grows beyond a high watermark, signal the
|
||||
UI to start coalescing more aggressively and slow enqueue (never block
|
||||
hard editing path; at worst drop optional `META`).
|
||||
|
||||
Recovery flow
|
||||
-------------
|
||||
|
||||
On opening a file:
|
||||
|
||||
1. Detect swap sidecar `.<basename>.kte.swp`.
|
||||
2. Validate header, iterate records verifying CRCs.
|
||||
3. Compare recorded original file identity against actual file; if
|
||||
mismatch, warn user but allow recovery (content wins).
|
||||
4. Reconstruct buffer: start from the last good `CHKPT` (if any), then
|
||||
replay subsequent ops. If trailing partial record encountered (EOF
|
||||
mid‑record), truncate at last good offset.
|
||||
5. Present a choice: Recover (load recovered buffer; keep the swap file
|
||||
until user saves) or Discard (delete swap file and open clean file).
|
||||
|
||||
Stability & corruption mitigation
|
||||
---------------------------------
|
||||
|
||||
- Append‑only with per‑record CRC32 guards against torn writes.
|
||||
- Atomic checkpoint rotation: write `.<basename>.kte.swp.tmp`, fsync,
|
||||
then rename over old `.swp`.
|
||||
- Size caps: rotate or compact when `.swp` exceeds a threshold (e.g.,
|
||||
64–128 MB). Compaction creates a fresh file with a single checkpoint.
|
||||
- Low‑disk‑space behavior: on write failures, surface a non‑modal
|
||||
warning and temporarily fall back to in‑memory only; retry
|
||||
opportunistically.
|
||||
|
||||
Security considerations
|
||||
-----------------------
|
||||
|
||||
- Swap files mirror buffer content, which may be sensitive. Options:
|
||||
- Configurable location (same dir vs. `$XDG_STATE_HOME/kte/swap`).
|
||||
- Optional per‑file encryption (future work) using OS keychain.
|
||||
- Ensure permissions are 0600.
|
||||
|
||||
Interoperability & UX
|
||||
---------------------
|
||||
|
||||
- Use a distinctive extension `.kte.swp` to avoid conflicts with other
|
||||
editors.
|
||||
- Status bar indicator when swap is active; commands to purge/compact.
|
||||
- On save: do not delete swap immediately; keep until the buffer is
|
||||
clean and idle for a short grace period (allows undo of accidental
|
||||
external changes).
|
||||
|
||||
Implementation plan (staged)
|
||||
----------------------------
|
||||
|
||||
1. Minimal journal writer (append‑only INS/DEL) with grouped fsync;
|
||||
single per‑editor writer thread.
|
||||
2. Reader/recovery path with CRC validation and replay.
|
||||
3. Checkpoints + atomic rotation; compaction path.
|
||||
4. Config surface and UI prompts; telemetry counters.
|
||||
5. Optional compression and advanced coalescing.
|
||||
|
||||
Defaults balancing performance and stability
|
||||
-------------------------------------------
|
||||
|
||||
- Grouped flush with fsync every ~1 s or on idle/quit.
|
||||
- Checkpoint every 1 MB or 60 s.
|
||||
- Bounded queue and batch writes to minimize syscalls.
|
||||
- Immediate flush on critical events (buffer close, app quit, power
|
||||
source change on laptops if detectable).
|
||||
119
docs/syntax.md
119
docs/syntax.md
@@ -4,67 +4,118 @@ Syntax highlighting in kte
|
||||
Overview
|
||||
--------
|
||||
|
||||
kte provides lightweight syntax highlighting with a pluggable highlighter interface. The initial implementation targets C/C++ and focuses on speed and responsiveness.
|
||||
kte provides lightweight syntax highlighting with a pluggable
|
||||
highlighter interface. The initial implementation targets C/C++ and
|
||||
focuses on speed and responsiveness.
|
||||
|
||||
Core types
|
||||
----------
|
||||
|
||||
- `TokenKind` — token categories (keywords, types, strings, comments, numbers, preprocessor, operators, punctuation, identifiers, whitespace, etc.).
|
||||
- `HighlightSpan` — a half-open column range `[col_start, col_end)` with a `TokenKind`.
|
||||
- `LineHighlight` — a vector of `HighlightSpan` and the buffer `version` used to compute it.
|
||||
- `TokenKind` — token categories (keywords, types, strings, comments,
|
||||
numbers, preprocessor, operators, punctuation, identifiers,
|
||||
whitespace, etc.).
|
||||
- `HighlightSpan` — a half-open column range `[col_start, col_end)` with
|
||||
a `TokenKind`.
|
||||
- `LineHighlight` — a vector of `HighlightSpan` and the buffer `version`
|
||||
used to compute it.
|
||||
|
||||
Engine and caching
|
||||
------------------
|
||||
|
||||
- `HighlighterEngine` maintains a per-line cache of `LineHighlight` keyed by row and buffer version.
|
||||
- Cache invalidation occurs when the buffer version changes or when the buffer calls `InvalidateFrom(row)`, which clears cached lines and line states from `row` downward.
|
||||
- The engine supports both stateless and stateful highlighters. For stateful highlighters, it memoizes a simple per-line state and computes lines sequentially when necessary.
|
||||
- `HighlighterEngine` maintains a per-line cache of `LineHighlight`
|
||||
keyed by row and buffer version.
|
||||
- Cache invalidation occurs when the buffer version changes or when the
|
||||
buffer calls `InvalidateFrom(row)`, which clears cached lines and line
|
||||
states from `row` downward.
|
||||
- The engine supports both stateless and stateful highlighters. For
|
||||
stateful highlighters, it memoizes a simple per-line state and
|
||||
computes lines sequentially when necessary.
|
||||
|
||||
Stateful highlighters
|
||||
---------------------
|
||||
|
||||
- `LanguageHighlighter` is the base interface for stateless per-line tokenization.
|
||||
- `StatefulHighlighter` extends it with a `LineState` and the method `HighlightLineStateful(buf, row, prev_state, out)`.
|
||||
- The engine detects `StatefulHighlighter` via dynamic_cast and feeds each line the previous line’s state, caching the resulting state per line.
|
||||
- `LanguageHighlighter` is the base interface for stateless per-line
|
||||
tokenization.
|
||||
- `StatefulHighlighter` extends it with a `LineState` and the method
|
||||
`HighlightLineStateful(buf, row, prev_state, out)`.
|
||||
- The engine detects `StatefulHighlighter` via dynamic_cast and feeds
|
||||
each line the previous line’s state, caching the resulting state per
|
||||
line.
|
||||
|
||||
C/C++ highlighter
|
||||
-----------------
|
||||
|
||||
- `CppHighlighter` implements `StatefulHighlighter`.
|
||||
- Stateless constructs: line comments `//`, strings `"..."`, chars `'...'`, numbers, identifiers (keywords/types), preprocessor at beginning of line after leading whitespace, operators/punctuation, and whitespace.
|
||||
- Stateless constructs: line comments `//`, strings `"..."`, chars
|
||||
`'...'`, numbers, identifiers (keywords/types), preprocessor at
|
||||
beginning of line after leading whitespace, operators/punctuation, and
|
||||
whitespace.
|
||||
- Stateful constructs (v2):
|
||||
- Multi-line block comments `/* ... */` — the state records whether the next line continues a comment.
|
||||
- Raw strings `R"delim(... )delim"` — the state tracks whether we are inside a raw string and its delimiter `delim` until the closing sequence appears.
|
||||
- Multi-line block comments `/* ... */` — the state records whether
|
||||
the next line continues a comment.
|
||||
- Raw strings `R"delim(... )delim"` — the state tracks whether we
|
||||
are inside a raw string and its delimiter `delim` until the
|
||||
closing sequence appears.
|
||||
|
||||
Limitations and TODOs
|
||||
---------------------
|
||||
|
||||
- Raw string detection is intentionally simple and does not handle all corner cases of the C++ standard.
|
||||
- Preprocessor handling is line-based; continuation lines with `\\` are not yet tracked.
|
||||
- No semantic analysis; identifiers are classified via small keyword/type sets.
|
||||
- Additional languages (JSON, Markdown, Shell, Python, Go, Rust, Lisp, …) are planned.
|
||||
- Terminal color mapping is conservative to support 8/16-color terminals. Rich color-pair themes can be added later.
|
||||
- Raw string detection is intentionally simple and does not handle all
|
||||
corner cases of the C++ standard.
|
||||
- Preprocessor handling is line-based; continuation lines with `\\` are
|
||||
not yet tracked.
|
||||
- No semantic analysis; identifiers are classified via small
|
||||
keyword/type sets.
|
||||
- Additional languages (JSON, Markdown, Shell, Python, Go, Rust,
|
||||
Lisp, …) are planned.
|
||||
- Terminal color mapping is conservative to support 8/16-color
|
||||
terminals. Rich color-pair themes can be added later.
|
||||
|
||||
Renderer integration
|
||||
--------------------
|
||||
|
||||
- Terminal and GUI renderers request line spans via `Highlighter()->GetLine(buf, row, buf.Version())`.
|
||||
- Search highlight and cursor overlays take precedence over syntax colors.
|
||||
- Terminal and GUI renderers request line spans via
|
||||
`Highlighter()->GetLine(buf, row, buf.Version())`.
|
||||
- Search highlight and cursor overlays take precedence over syntax
|
||||
colors.
|
||||
|
||||
Renderer-side robustness
|
||||
------------------------
|
||||
|
||||
- Renderers defensively sanitize `HighlightSpan` data before use to
|
||||
ensure stability even if a highlighter misbehaves:
|
||||
- Clamp `col_start/col_end` to the line length and ensure
|
||||
`end >= start`.
|
||||
- Drop empty/invalid spans and sort by start.
|
||||
- Clip drawing to the horizontally visible region and the
|
||||
tab-expanded line length.
|
||||
- The highlighter engine returns `LineHighlight` by value to avoid
|
||||
cross-thread lifetime issues; renderers operate on a local copy for
|
||||
each frame.
|
||||
|
||||
Extensibility (Phase 4)
|
||||
-----------------------
|
||||
|
||||
- Public registration API: external code can register custom highlighters by filetype.
|
||||
- Use `HighlighterRegistry::Register("mylang", []{ return std::make_unique<MyHighlighter>(); });`
|
||||
- Registered factories are preferred over built-ins for the same filetype key.
|
||||
- Filetype keys are normalized via `HighlighterRegistry::Normalize()`.
|
||||
- Optional Tree-sitter adapter: disabled by default to keep dependencies minimal.
|
||||
- Enable with CMake option `-DKTE_ENABLE_TREESITTER=ON` and provide
|
||||
`-DTREESITTER_INCLUDE_DIR=...` and `-DTREESITTER_LIBRARY=...` if needed.
|
||||
- Register a Tree-sitter-backed highlighter for a language (example assumes you link a grammar):
|
||||
```c++
|
||||
extern "C" const TSLanguage* tree_sitter_c();
|
||||
kte::HighlighterRegistry::RegisterTreeSitter("c", &tree_sitter_c);
|
||||
```
|
||||
- Current adapter is a stub scaffold; it compiles and integrates cleanly when enabled, but
|
||||
intentionally emits no spans until Tree-sitter node-to-token mapping is implemented.
|
||||
- Public registration API: external code can register custom
|
||||
highlighters by filetype.
|
||||
- Use
|
||||
`HighlighterRegistry::Register("mylang", []{ return std::make_unique<MyHighlighter>(); });`
|
||||
- Registered factories are preferred over built-ins for the same
|
||||
filetype key.
|
||||
- Filetype keys are normalized via
|
||||
`HighlighterRegistry::Normalize()`.
|
||||
- Optional Tree-sitter adapter: disabled by default to keep dependencies
|
||||
minimal.
|
||||
- Enable with CMake option `-DKTE_ENABLE_TREESITTER=ON` and provide
|
||||
`-DTREESITTER_INCLUDE_DIR=...` and `-DTREESITTER_LIBRARY=...` if
|
||||
needed.
|
||||
- Register a Tree-sitter-backed highlighter for a language (example
|
||||
assumes you link a grammar):
|
||||
```c++
|
||||
extern "C" const TSLanguage* tree_sitter_c();
|
||||
kte::HighlighterRegistry::RegisterTreeSitter("c", &tree_sitter_c);
|
||||
```
|
||||
- Current adapter is a stub scaffold; it compiles and integrates
|
||||
cleanly when enabled, but
|
||||
intentionally emits no spans until Tree-sitter node-to-token
|
||||
mapping is implemented.
|
||||
|
||||
Reference in New Issue
Block a user