- Remove outdated `undo-state.md` - Add two code quality/optimization reports that were used to guide previous work: - `code-report.md` (optimization) - `code-report-quality.md` (stability and code health) - Add `themes.md`. - Update undo system docs and roadmap.
6.9 KiB
6.9 KiB
Objective
Introduce fast, minimal‑dependency syntax highlighting to kte, consistent with current architecture (Editor/Buffer + GUI/Terminal renderers), preserving ke UX and performance.
Guiding principles
- Keep core small and fast; no heavy deps (C++17 only).
- Start simple (stateless line regex), evolve incrementally (stateful, caching).
- Work in both Terminal (ncurses) and GUI (ImGui) with consistent token classes and theme mapping.
- Integrate without disrupting existing search highlight, selection, or cursor rendering.
Scope of v1
- Languages: plain text (off), C/C++ minimal set (keywords, types, strings, chars, comments, numbers, preprocessor).
- Stateless per‑line highlighting; handle single‑line comments and strings; defer multi‑line state to v2.
- Toggle:
:syntax on|offand per‑buffer filetype selection.
Architecture
-
Core types (new):
enum class TokenKind { Default, Keyword, Type, String, Char, Comment, Number, Preproc, Constant, Function, Operator, Punctuation, Identifier, Whitespace, Error };struct HighlightSpan { int col_start; int col_end; TokenKind kind; };// 0‑based columns in buffer indices per rendered linestruct LineHighlight { std::vector<HighlightSpan> spans; uint64_t version; };
-
Interfaces (new):
class LanguageHighlighter { public: virtual ~LanguageHighlighter() = default; virtual void HighlightLine(const Buffer& buf, int row, std::vector<HighlightSpan>& out) const = 0; virtual bool Stateful() const { return false; } };class HighlighterEngine { public: void SetHighlighter(std::unique_ptr<LanguageHighlighter>); const LineHighlight& GetLine(const Buffer&, int row, uint64_t buf_version); void InvalidateFrom(int row); };class HighlighterRegistry { public: static const LanguageHighlighter& ForFiletype(std::string_view ft); static std::string DetectForPath(std::string_view path, std::string_view first_line); };
-
Editor/Buffer integration:
- Per‑Buffer settings:
bool syntax_enabled; std::string filetype; std::unique_ptr<HighlighterEngine> highlighter; - Buffer emits a monotonically increasing
versionon edit; renderers request line highlights by(row, version). - Invalidate cache minimally on edits (v1: current line only; v2: from current line down when stateful constructs present).
- Per‑Buffer settings:
Rendering integration
- TerminalRenderer/GUIRenderer changes:
- During line rendering, query
Editor.CurrentBuffer()->highlighter->GetLine(buf, row, buf_version)to obtain spans. - Apply token styles while drawing glyph runs.
- During line rendering, query
- Z‑order and blending:
- Backgrounds (e.g., selection, search highlight rectangles)
- Text with syntax colors
- Cursor/IME decorations
- Search highlights must remain visible over syntax colors:
- Terminal: combine color/attr with reverse/bold for search; if color conflicts, prefer search.
- GUI: draw semi‑transparent rects behind text (already present); keep syntax color for text.
Theme and color mapping
- Extend
GUITheme.hwith aSyntaxPalettemappingTokenKind -> ImVec4 ink(and optional background tint for comments/strings disabled by default). Provide default Light/Dark palettes. - Terminal: map
TokenKindto ncurses color pairs where available; degrade gracefully on 8/16‑color terminals (e.g., comments=dim, keywords=bold, strings=yellow/green if available).
Language detection
- v1: by file extension; allow manual
:set filetype=<lang>. - v2: add shebang detection for scripts, simple modelines (optional).
Commands/UX
:syntax on|off— global default; buffer inherits on open.:set filetype=<lang>— per‑buffer override.:syntax reload— rebuild patterns/themes.- Status line shows filetype and syntax state when changed.
Implementation plan (phased)
-
Phase 1 — Minimal regex highlighter for C/C++
- Implement
CppRegexHighlighter : LanguageHighlighterwith precompiledstd::regex(or hand‑rolled simple scanners to avoid regex backtracking). Classes: line comment//…, block comment start/*(no state), string"…", char'…'(no multiline), numbers, keywords/types, preprocessor^\s*#\w+. - Add
HighlighterEnginewith a simple per‑row cache keyed by(row, buf_version); no background worker. - Integrate into both renderers; add palette to
GUITheme.h; add terminal color selection. - Add commands.
- Implement
-
Phase 2 — Stateful constructs and more languages
- Add state machine for multiline comments
/*…*/and multiline strings (C++11 raw strings), with invalidation from edit line downward until state stabilizes. - Add simple highlighters: JSON (strings, numbers, booleans, null, punctuation), Markdown (headers/emphasis/code fences), Shell (comments, strings, keywords), Go (types, constants, keywords), Python (strings, comments, keywords), Rust (strings, comments, keywords), Lisp (comments, strings, keywords),.
- Filetype detection by extension + shebang.
- Add state machine for multiline comments
-
Phase 3 — Performance and caching
- Viewport‑first highlighting: compute only visible rows each frame; background task warms cache around viewport.
- Reuse span buffers, avoid allocations; small‑vector optimization if needed.
- Bench with large files; ensure O(n_visible) cost per frame.
-
Phase 4 — Extensibility
- Public registration API for external highlighters.
- Optional Tree‑sitter adapter behind a compile flag (off by default) to keep dependencies minimal.
Data flow (per frame)
- Renderer asks Editor for Buffer and viewport rows.
- For each row:
engine.GetLine(buf, row, buf.version)→ spans. - Renderer emits runs with style from
SyntaxPalette[kind]. - Search highlights are applied as separate background rectangles (GUI) or attribute toggles (Terminal), not overriding text color.
Testing
- Unit tests for tokenization per language: golden inputs → spans.
- Fuzz/edge cases: escaped quotes, numeric literals, preprocessor lines.
- Renderer tests with
TestRendererasserting the sequence of style changes for a line. - Performance tests: highlight 1k visible lines repeatedly; assert time under threshold.
Risks and mitigations
- Regex backtracking/perf: prefer linear scans; precompute keyword tables; avoid nested regex.
- Terminal color limitations: feature‑detect colors; provide bold/dim fallbacks.
- Stateful correctness: invalidate conservatively (from edit line downward) and cap work per frame.
Deliverables
- New files:
Highlight.h/.cc,HighlighterEngine.h/.cc,LanguageHighlighter.h,CppHighlighter.h/.cc, optionalHighlighterRegistry.h/.cc. - Renderer updates:
GUIRenderer.cc,TerminalRenderer.ccto consume spans. - Theming:
GUITheme.hadditions for syntax colors. - Editor/Buffer: per‑buffer syntax settings and highlighter handle.
- Commands in
Command.ccand help text updates. - Docs: README/ROADMAP update and a brief
docs/syntax.md. - Tests: unit and renderer golden tests.