Syntax highlighting in kte
Overview
kte provides lightweight syntax highlighting with a pluggable highlighter interface. The initial implementation targets C/C++ and focuses on speed and responsiveness.
Core types
- `TokenKind`: token categories (keywords, types, strings, comments, numbers, preprocessor, operators, punctuation, identifiers, whitespace, etc.).
- `HighlightSpan`: a half-open column range `[col_start, col_end)` with a `TokenKind`.
- `LineHighlight`: a vector of `HighlightSpan` and the buffer `version` used to compute it.
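To make the shapes of these types concrete, here is a minimal sketch. The field names follow the descriptions above, but the enumerator list, the integer types, and the two helper functions (`SpanWidth`, `Contains`) are assumptions for illustration, not kte's actual definitions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical enumerators; the real TokenKind may name or order these differently.
enum class TokenKind {
    Default, Keyword, Type, String, Comment, Number,
    Preprocessor, Operator, Punctuation, Identifier, Whitespace
};

struct HighlightSpan {
    int col_start = 0;  // first column covered (inclusive)
    int col_end = 0;    // one past the last column covered (exclusive)
    TokenKind kind = TokenKind::Default;
};

struct LineHighlight {
    std::vector<HighlightSpan> spans;
    std::uint64_t version = 0;  // buffer version the spans were computed for
};

// Illustrative helpers: with a half-open range the width is a plain subtraction
// and the end column itself is never part of the span.
inline int SpanWidth(const HighlightSpan &s) { return s.col_end - s.col_start; }

inline bool Contains(const HighlightSpan &s, int col) {
    return col >= s.col_start && col < s.col_end;
}
```

The half-open convention means two adjacent tokens can share a boundary column (`a.col_end == b.col_start`) without overlapping.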
Engine and caching
- `HighlighterEngine` maintains a per-line cache of `LineHighlight` keyed by row and buffer version.
- Cache invalidation occurs when the buffer version changes or when the buffer calls `InvalidateFrom(row)`, which clears cached lines and line states from `row` downward.
- The engine supports both stateless and stateful highlighters. For stateful highlighters, it memoizes a simple per-line state and computes lines sequentially when necessary.
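The caching rules above can be sketched as a small version-checked map. This is an illustration under assumptions, not the engine's code: `LineCache`, `CachedLine`, the `Compute` callback, and `Size()` are hypothetical names, and the real engine additionally tracks per-line highlighter state.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

struct Span { int col_start; int col_end; int kind; };
struct CachedLine { std::vector<Span> spans; std::uint64_t version; };

// Hypothetical stand-in for the engine's per-line cache.
class LineCache {
public:
    using Compute = std::function<std::vector<Span>(int row)>;

    explicit LineCache(Compute compute) : compute_(std::move(compute)) {}

    // Returns a copy; recomputes when the row is missing or was cached
    // for a different buffer version.
    CachedLine GetLine(int row, std::uint64_t buf_version) {
        auto it = cache_.find(row);
        if (it != cache_.end() && it->second.version == buf_version)
            return it->second;
        CachedLine fresh{compute_(row), buf_version};
        cache_[row] = fresh;
        return fresh;
    }

    // Drops every cached entry for `row` and all rows after it.
    void InvalidateFrom(int row) {
        cache_.erase(cache_.lower_bound(row), cache_.end());
    }

    std::size_t Size() const { return cache_.size(); }

private:
    Compute compute_;
    std::map<int, CachedLine> cache_;  // ordered, so a range erase is cheap
};
```

An ordered map is used here only because it makes "from `row` downward" a single range erase; a vector indexed by row would serve equally well.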
Stateful highlighters
- `LanguageHighlighter` is the base interface for stateless per-line tokenization.
- `StatefulHighlighter` extends it with a `LineState` and the method `HighlightLineStateful(buf, row, prev_state, out)`.
- The engine detects `StatefulHighlighter` via `dynamic_cast` and feeds each line the previous line's state, caching the resulting state per line.
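The dispatch described above might look roughly like the following. The signatures are simplified assumptions: the real methods take a buffer and a row, whereas this sketch passes the line text directly, and `HighlightAll` and the single-field `LineState` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Span { int col_start; int col_end; int kind; };

// Simplified interface: one line in, spans out, no carried state.
class LanguageHighlighter {
public:
    virtual ~LanguageHighlighter() = default;
    virtual void HighlightLine(const std::string &line, std::vector<Span> &out) const = 0;
};

class StatefulHighlighter : public LanguageHighlighter {
public:
    struct LineState { bool in_block_comment = false; };  // assumed minimal state

    // Returns the state to feed into the following line.
    virtual LineState HighlightLineStateful(const std::string &line,
                                            const LineState &prev,
                                            std::vector<Span> &out) const = 0;

    // Stateless entry point falls back to a default (empty) state.
    void HighlightLine(const std::string &line, std::vector<Span> &out) const override {
        HighlightLineStateful(line, LineState{}, out);
    }
};

// Engine-side sketch: thread state through the lines only when the
// highlighter turns out to be stateful.
inline std::vector<std::vector<Span>> HighlightAll(const LanguageHighlighter &hl,
                                                   const std::vector<std::string> &lines) {
    std::vector<std::vector<Span>> result(lines.size());
    const auto *stateful = dynamic_cast<const StatefulHighlighter *>(&hl);
    StatefulHighlighter::LineState state;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (stateful)
            state = stateful->HighlightLineStateful(lines[i], state, result[i]);
        else
            hl.HighlightLine(lines[i], result[i]);
    }
    return result;
}
```

Because each line depends on the state left by the previous one, a stateful highlighter has to be run sequentially from the last line whose state is known, which is why the engine memoizes the state per line.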
C/C++ highlighter
- `CppHighlighter` implements `StatefulHighlighter`.
- Stateless constructs: line comments `//`, strings `"..."`, chars `'...'`, numbers, identifiers (keywords/types), preprocessor at beginning of line after leading whitespace, operators/punctuation, and whitespace.
- Stateful constructs (v2):
  - Multi-line block comments `/* ... */`: the state records whether the next line continues a comment.
  - Raw strings `R"delim(... )delim"`: the state tracks whether we are inside a raw string and its delimiter `delim` until the closing sequence appears.
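As an illustration of the block-comment state, the sketch below scans one line given whether the previous line ended inside a comment. It is a deliberately reduced example and not `CppHighlighter`'s code: `ScanBlockComments` is a hypothetical name, and the function ignores string literals and `//` comments, which a real tokenizer must check first.

```cpp
#include <string>
#include <utility>
#include <vector>

// Returns the [start, end) column ranges covered by block comments on `line`.
// `in_comment` is the carried state: on entry it says whether the line starts
// inside a comment, on exit whether the next line does.
inline std::vector<std::pair<int, int>> ScanBlockComments(const std::string &line,
                                                          bool &in_comment) {
    std::vector<std::pair<int, int>> spans;
    const int n = static_cast<int>(line.size());
    std::string::size_type pos = 0;  // where to resume searching
    int start = 0;                   // column where the current comment begins
    for (;;) {
        if (!in_comment) {
            const auto open = line.find("/*", pos);
            if (open == std::string::npos) break;
            start = static_cast<int>(open);
            pos = open + 2;  // skip the opener so "/*/" is not read as closed
            in_comment = true;
        }
        const auto close = line.find("*/", pos);
        if (close == std::string::npos) {
            if (start < n) spans.emplace_back(start, n);  // runs past end of line
            break;
        }
        spans.emplace_back(start, static_cast<int>(close) + 2);
        pos = close + 2;
        in_comment = false;
    }
    return spans;
}
```

Raw strings need the same kind of carried flag plus the delimiter text, since the closing sequence `)delim"` can only be recognized if `delim` is remembered across lines.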
Limitations and TODOs
- Raw string detection is intentionally simple and does not handle all corner cases of the C++ standard.
- Preprocessor handling is line-based; continuation lines ending in `\` are not yet tracked.
- No semantic analysis; identifiers are classified via small keyword/type sets.
- Additional languages (JSON, Markdown, Shell, Python, Go, Rust, Lisp, …) are planned.
- Terminal color mapping is conservative to support 8/16-color terminals. Rich color-pair themes can be added later.
Renderer integration
- Terminal and GUI renderers request line spans via `Highlighter()->GetLine(buf, row, buf.Version())`.
- Search highlight and cursor overlays take precedence over syntax colors.
Renderer-side robustness
- Renderers defensively sanitize `HighlightSpan` data before use to ensure stability even if a highlighter misbehaves:
  - Clamp `col_start`/`col_end` to the line length and ensure `end >= start`.
  - Drop empty/invalid spans and sort by start.
  - Clip drawing to the horizontally visible region and the tab-expanded line length.
- The highlighter engine returns `LineHighlight` by value to avoid cross-thread lifetime issues; renderers operate on a local copy for each frame.
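The clamp, drop, and sort steps listed above could be written along these lines. This is a sketch under assumptions rather than the renderers' actual code: `SanitizeSpans` and the plain `Span` struct are hypothetical, and clipping to the visible region is left out because it depends on the renderer's scroll state.

```cpp
#include <algorithm>
#include <vector>

struct Span { int col_start; int col_end; int kind; };

// Takes the spans by value so the caller's copy is never modified,
// mirroring the "operate on a local copy" rule.
inline std::vector<Span> SanitizeSpans(std::vector<Span> spans, int line_len) {
    // 1. Clamp both ends into [0, line_len].
    for (Span &s : spans) {
        s.col_start = std::max(0, std::min(s.col_start, line_len));
        s.col_end = std::max(0, std::min(s.col_end, line_len));
    }
    // 2. Drop spans that are empty or inverted after clamping.
    spans.erase(std::remove_if(spans.begin(), spans.end(),
                               [](const Span &s) { return s.col_end <= s.col_start; }),
                spans.end());
    // 3. Sort by start column so drawing can proceed left to right.
    std::stable_sort(spans.begin(), spans.end(),
                     [](const Span &a, const Span &b) { return a.col_start < b.col_start; });
    return spans;
}
```

This version discards inverted spans instead of repairing them; swapping the two ends would be an equally defensible choice.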
Extensibility (Phase 4)
- Public registration API: external code can register custom highlighters by filetype.
  - Use `HighlighterRegistry::Register("mylang", []{ return std::make_unique<MyHighlighter>(); });`
  - Registered factories are preferred over built-ins for the same filetype key.
  - Filetype keys are normalized via `HighlighterRegistry::Normalize()`.
- Optional Tree-sitter adapter: disabled by default to keep dependencies minimal.
  - Enable with CMake option `-DKTE_ENABLE_TREESITTER=ON` and provide `-DTREESITTER_INCLUDE_DIR=...` and `-DTREESITTER_LIBRARY=...` if needed.
  - Register a Tree-sitter-backed highlighter for a language (example assumes you link a grammar):

        extern "C" const TSLanguage* tree_sitter_c();
        kte::HighlighterRegistry::RegisterTreeSitter("c", &tree_sitter_c);

  - Current adapter is a stub scaffold; it compiles and integrates cleanly when enabled, but intentionally emits no spans until Tree-sitter node-to-token mapping is implemented.
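To illustrate the lookup order of the public registration API (registered factories before built-ins, with normalized keys), here is a standalone sketch. Everything in it is hypothetical: `Registry`, `RegisterBuiltin`, `Create`, and the `Highlighter` base are stand-ins, and lowercasing is only an assumption about what `HighlighterRegistry::Normalize()` does.

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the highlighter base class.
struct Highlighter {
    virtual ~Highlighter() = default;
    virtual std::string Name() const = 0;
};

class Registry {
public:
    using Factory = std::function<std::unique_ptr<Highlighter>()>;

    // Assumption: normalization lowercases the key; kte's Normalize() may do more.
    static std::string Normalize(std::string key) {
        std::transform(key.begin(), key.end(), key.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        return key;
    }

    void RegisterBuiltin(const std::string &filetype, Factory f) {
        builtins_[Normalize(filetype)] = std::move(f);
    }

    void Register(const std::string &filetype, Factory f) {
        custom_[Normalize(filetype)] = std::move(f);
    }

    // Custom factories win over built-ins for the same normalized key.
    // Returns nullptr when nothing is registered for the filetype.
    std::unique_ptr<Highlighter> Create(const std::string &filetype) const {
        const std::string key = Normalize(filetype);
        auto it = custom_.find(key);
        if (it != custom_.end()) return it->second();
        it = builtins_.find(key);
        if (it != builtins_.end()) return it->second();
        return nullptr;
    }

private:
    std::map<std::string, Factory> builtins_;
    std::map<std::string, Factory> custom_;
};
```

Storing factories instead of instances lets each buffer get its own highlighter object, which matters once highlighters carry per-buffer state.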