Add benchmarks, migration tests, and dev guide

Add benchmarks for core operations, migration edge case tests, improved buffer I/O tests, and developer guide

- Introduced `test_benchmarks.cc` for performance benchmarking of key operations in `PieceTable` and `Buffer`, including syntax highlighting and iteration patterns.
- Added `test_migration_coverage.cc` to provide comprehensive tests for migration of `Buffer::Rows()` to `PieceTable` APIs, with edge cases, boundary handling, and consistency checks.
- Enhanced `test_buffer_io.cc` with additional cases for save/load workflows, file handling, and better integration with the core API.
- Documented architectural details and core concepts in a new `DEVELOPER_GUIDE.md`. Highlighted design principles, code organization, and contribution workflows.
2026-02-17 16:08:23 -08:00
parent 337b585ba0
commit 8ec0d6ac41
30 changed files with 1916 additions and 102 deletions
--- a/docs/BENCHMARKS.md
+++ b/docs/BENCHMARKS.md
@@ -0,0 +1,245 @@
+# kte Benchmarking and Testing Guide
+
+This document describes the benchmarking infrastructure and testing
+improvements added to ensure high performance and correctness of core
+operations.
+
+## Overview
+
+The kte test suite now includes comprehensive benchmarks and migration
+coverage tests to:
+
+- Measure performance of core operations (PieceTable, Buffer, syntax
+  highlighting)
+- Ensure no performance regressions from refactorings
+- Validate correctness of API migrations (Buffer::Rows() →
+  GetLineString/GetLineView)
+- Provide performance baselines for future optimizations
+
+## Running Tests
+
+### All Tests (including benchmarks)
+
+```bash
+cmake --build cmake-build-debug --target kte_tests && ./cmake-build-debug/kte_tests
+```
+
+### Test Organization
+
+- **58 existing tests**: Core functionality, undo/redo, swap recovery,
+  search, etc.
+- **15 benchmark tests**: Performance measurements for critical
+  operations
+- **30 migration coverage tests**: Edge cases and correctness validation
+
+Total: **98 tests**
+
+## Benchmark Results
+
+### Buffer Iteration Patterns (5,000 lines)
+
+| Pattern                                 | Time    | Speedup vs Rows() |
+|-----------------------------------------|---------|-------------------|
+| `Rows()` + iteration                    | 3.1 ms  | 1.0x (baseline)   |
+| `Nrows()` + `GetLineString()`           | 1.9 ms  | **1.7x faster**   |
+| `Nrows()` + `GetLineView()` (zero-copy) | 0.28 ms | **11x faster**    |
+
+**Key Insight**: `GetLineView()` provides zero-copy access and, in this
+benchmark, is about 11x faster than materializing the entire rows cache
+with `Rows()`.
+
+### PieceTable Operations (10,000 lines)
+
+| Operation                   | Time    |
+|-----------------------------|---------|
+| Sequential inserts (10K)    | 2.1 ms  |
+| Random inserts (5K)         | 32.9 ms |
+| `GetLine()` sequential      | 4.7 ms  |
+| `GetLineRange()` sequential | 1.3 ms  |
+
+### Buffer Operations
+
+| Operation                            | Time    |
+|--------------------------------------|---------|
+| `Nrows()` (1M calls)                 | 13.0 ms |
+| `GetLineString()` (10K lines)        | 4.8 ms  |
+| `GetLineView()` (10K lines)          | 1.6 ms  |
+| `Rows()` materialization (10K lines) | 6.2 ms  |
+
+### Syntax Highlighting
+
+| Operation                          | Time    | Notes          |
+|------------------------------------|---------|----------------|
+| C++ highlighting (~1000 lines)     | 2.0 ms  | First pass     |
+| HighlighterEngine cache population | 19.9 ms |                |
+| HighlighterEngine cache hits       | 0.52 ms | **38x faster** |
+
+### Large File Performance
+
+| Operation                       | Time    |
+|---------------------------------|---------|
+| Insert 50K lines                | 0.53 ms |
+| Iterate 50K lines (GetLineView) | 2.7 ms  |
+| Random access (10K accesses)    | 1.8 ms  |
+
+## API Differences: GetLineString vs GetLineView
+
+The line-access APIs differ in return type, newline handling, and
+lifetime, so choosing the right one matters:
+
+### `GetLineString(row)`
+
+- Returns: `std::string` (copy)
+- Content: Line text **without** trailing newline
+- Use case: When you need to modify the string or store it
+- Example: `"hello"` for line `"hello\n"`
+
+### `GetLineView(row)`
+
+- Returns: `std::string_view` (zero-copy)
+- Content: Raw line range **including** trailing newline
+- Use case: Read-only access, maximum performance
+- Example: `"hello\n"` for line `"hello\n"`
+- **Warning**: View becomes invalid after buffer modifications
+
+### `Rows()`
+
+- Returns: `std::vector<Buffer::Line>&` (materialized cache)
+- Content: Lines **without** trailing newlines
+- Use case: Legacy code, being phased out
+- Performance: Slower due to materialization overhead
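
The newline and lifetime differences described above can be illustrated with a small stand-alone sketch. `ToyBuffer` is a hypothetical stand-in, not kte's `Buffer`; it only mimics the documented behavior (a view that includes the trailing newline, and a copy that strips it) over a plain `std::string`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; not kte's Buffer.
class ToyBuffer {
public:
    explicit ToyBuffer(std::string text) : text_(std::move(text)) {
        // Record the byte offset at which each line starts.
        starts_.push_back(0);
        for (std::size_t i = 0; i < text_.size(); ++i)
            if (text_[i] == '\n' && i + 1 < text_.size())
                starts_.push_back(i + 1);
    }

    std::size_t Nrows() const { return starts_.size(); }

    // Zero-copy: raw line range, including the trailing '\n' if present.
    // The view points into text_, so it dangles once text_ is modified.
    std::string_view GetLineView(std::size_t row) const {
        assert(row < starts_.size());
        const std::size_t begin = starts_[row];
        const std::size_t end =
            row + 1 < starts_.size() ? starts_[row + 1] : text_.size();
        return std::string_view(text_).substr(begin, end - begin);
    }

    // Owning copy with the trailing '\n' stripped; safe to store or modify.
    std::string GetLineString(std::size_t row) const {
        std::string_view v = GetLineView(row);
        if (!v.empty() && v.back() == '\n')
            v.remove_suffix(1);
        return std::string(v);
    }

private:
    std::string text_;
    std::vector<std::size_t> starts_;
};
```

With `ToyBuffer buf("hello\nworld\n")`, `buf.GetLineView(0)` compares equal to `"hello\n"` while `buf.GetLineString(0)` compares equal to `"hello"`. Code that migrates from `Rows()` to `GetLineView()` therefore has to account for the extra newline byte.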
+
+## Migration Coverage Tests
+
+The `test_migration_coverage.cc` file provides 30 tests covering:
+
+### Edge Cases
+
+- Empty buffers
+- Single lines (with/without newlines)
+- Very long lines (10,000 characters)
+- Many empty lines (1,000 newlines)
+
+### Consistency
+
+- `GetLineString()` vs `GetLineView()` vs `Rows()`
+- Consistency after edits (insert, delete, split, join)
+
+### Boundary Conditions
+
+- First line access
+- Last line access
+- Line range boundaries
+
+### Special Characters
+
+- Tabs, carriage returns, null bytes
+- Unicode (UTF-8 multibyte characters)
+
+### Stress Tests
+
+- Large files (10,000 lines)
+- Many small operations (100+ inserts)
+- Alternating insert/delete patterns
+
+### Regression Tests
+
+- Shebang detection pattern (Editor.cc)
+- Empty buffer check pattern (Editor.cc)
+- Syntax highlighter pattern (all highlighters)
+- Swap snapshot pattern (Swap.cc)
+
+## Performance Recommendations
+
+Based on benchmark results:
+
+1. **Prefer `GetLineView()` for read-only access**
+    - 11x faster than `Rows()` for iteration
+    - Zero-copy, minimal overhead
+    - Use immediately (view invalidates on edit)
+
+2. **Use `GetLineString()` when you need a copy**
+    - Still 1.7x faster than `Rows()`
+    - Safe to store and modify
+    - Strips trailing newlines automatically
+
+3. **Avoid `Rows()` in hot paths**
+    - Materializes entire line cache
+    - Slower for large files
+    - Being phased out (legacy API)
+
+4. **Cache `Nrows()` in tight loops**
+    - Already cheap: about 13 ns per call (13.0 ms for 1M calls)
+    - Still worth hoisting out of inner loops that do not modify the
+      buffer
+
+5. **Leverage HighlighterEngine caching**
+    - 38x speedup on cache hits
+    - Automatically invalidates on edits
+    - Prefetch viewport for smooth scrolling
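
One common way to get "invalidate on edit" behavior is to tag each cached result with the buffer version it was computed for. The sketch below shows that idea in isolation. `ToyHighlightCache`, its callback type, and the whole-buffer version counter are assumptions made for illustration, not a description of how `HighlighterEngine` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical version-keyed line cache. A cached entry is reused only
// while the buffer version it was computed for is still current.
class ToyHighlightCache {
public:
    using Compute = std::function<std::string(std::size_t row)>;

    explicit ToyHighlightCache(Compute compute) : compute_(std::move(compute)) {
        assert(compute_ && "compute callback must be callable");
    }

    // Returns the result for `row`, recomputing on a miss or when the
    // buffer version has changed since the entry was stored.
    const std::string &Get(std::size_t row, std::uint64_t version) {
        auto it = entries_.find(row);
        if (it == entries_.end() || it->second.version != version) {
            ++misses_;
            Entry fresh{version, compute_(row)};
            it = entries_.insert_or_assign(row, std::move(fresh)).first;
        }
        return it->second.result;
    }

    std::size_t Misses() const { return misses_; }

private:
    struct Entry {
        std::uint64_t version;
        std::string result;
    };
    Compute compute_;
    std::unordered_map<std::size_t, Entry> entries_;
    std::size_t misses_ = 0;
};
```

Keying on a single buffer-wide version is deliberately coarse: any edit invalidates every line. A real engine would likely track which lines an edit actually affects.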
+
+## Adding New Benchmarks
+
+To add a new benchmark:
+
+1. Add a `TEST(Benchmark_YourName)` in `tests/test_benchmarks.cc`
+2. Use `BenchmarkTimer` to measure critical sections:
+   ```cpp
+   {
+       BenchmarkTimer timer("Operation description");
+       // ... code to benchmark ...
+   }
+   ```
+3. Print section headers with `std::cout` for clarity
+4. Use `ASSERT_EQ` or `EXPECT_TRUE` to validate results
+
+Example:
+
+```cpp
+TEST(Benchmark_MyOperation) {
+    std::cout << "\n=== My Operation Benchmark ===\n";
+    
+    // Setup
+    Buffer buf;
+    std::string data = generate_test_data();
+    buf.insert_text(0, 0, data);
+    
+    std::size_t result = 0;
+    {
+        BenchmarkTimer timer("My operation on 10K lines");
+        for (std::size_t i = 0; i < buf.Nrows(); ++i) {
+            result += my_operation(buf, i);
+        }
+    }
+    
+    EXPECT_TRUE(result > 0);
+}
+```
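
The scoped-block pattern above works because the timer reports when it goes out of scope. A minimal sketch of such an RAII timer follows. `ScopedTimer` and its `Sink` callback are hypothetical names; the actual `BenchmarkTimer` in `tests/test_benchmarks.cc` may be implemented differently.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical RAII timer; not kte's BenchmarkTimer.
class ScopedTimer {
public:
    using Sink = std::function<void(const std::string &label, double ms)>;

    // Default sink: print "<label>: <ms> ms" to stdout.
    static void PrintToStdout(const std::string &label, double ms) {
        std::cout << label << ": " << ms << " ms\n";
    }

    explicit ScopedTimer(std::string label, Sink sink = PrintToStdout)
        : label_(std::move(label)), sink_(std::move(sink)), start_(Clock::now()) {
        assert(sink_ && "sink must be callable");
    }

    ScopedTimer(const ScopedTimer &) = delete;
    ScopedTimer &operator=(const ScopedTimer &) = delete;

    // The elapsed time is reported when the enclosing scope ends.
    ~ScopedTimer() {
        const std::chrono::duration<double, std::milli> elapsed =
            Clock::now() - start_;
        if (sink_)
            sink_(label_, elapsed.count());
    }

private:
    using Clock = std::chrono::steady_clock;  // monotonic, unaffected by clock changes
    std::string label_;
    Sink sink_;
    Clock::time_point start_;
};
```

Passing a custom sink makes it possible to collect timings into a data structure instead of parsing them back out of the printed output.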
+
+## Continuous Performance Monitoring
+
+Run benchmarks regularly to detect regressions:
+
+```bash
+# Run tests and save output
+./cmake-build-debug/kte_tests > benchmark_results.txt
+
+# Compare with baseline
+diff benchmark_baseline.txt benchmark_results.txt
+```
+
+Look for:
+
+- Significant time increases (>20%) in any benchmark
+- New operations that are slower than expected
+- Cache effectiveness degradation
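
Because raw timings vary from run to run, a plain `diff` tends to flag every line. The 20% rule above can instead be checked mechanically. The sketch below assumes the timings have already been parsed into name-to-milliseconds maps; `Timings` and `FindRegressions` are hypothetical names, not part of the kte test suite.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Timings in milliseconds, keyed by benchmark name.
using Timings = std::map<std::string, double>;

// Returns the names of benchmarks whose time grew by more than `threshold`
// (0.20 means 20%) relative to the baseline. Benchmarks missing from the
// baseline are skipped because there is nothing to compare against.
std::vector<std::string> FindRegressions(const Timings &baseline,
                                         const Timings &current,
                                         double threshold = 0.20) {
    assert(threshold >= 0.0);
    std::vector<std::string> regressed;
    for (const auto &[name, ms] : current) {
        const auto it = baseline.find(name);
        if (it == baseline.end() || it->second <= 0.0)
            continue;
        if (ms > it->second * (1.0 + threshold))
            regressed.push_back(name);
    }
    return regressed;
}
```

For example, with a baseline of 0.28 ms for the `GetLineView()` iteration, a new measurement of 0.40 ms exceeds 0.28 × 1.2 = 0.336 ms and would be reported.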
+
+## Conclusion
+
+The benchmark suite provides:
+
+- **Performance validation**: Ensures migrations don't regress
+  performance
+- **Optimization guidance**: Identifies fastest APIs for each use case
+- **Regression detection**: Catches performance issues early
+- **Documentation**: Demonstrates correct API usage patterns
+
+All 98 tests pass with 0 failures, confirming both correctness and
+performance of the migrated codebase.
--- a/docs/DEVELOPER_GUIDE.md
+++ b/docs/DEVELOPER_GUIDE.md
@@ -0,0 +1,522 @@
+# kte Developer Guide
+
+Welcome to kte development! This guide will help you understand the
+codebase, make changes, and contribute effectively.
+
+## Table of Contents
+
+1. [Architecture Overview](#architecture-overview)
+2. [Core Components](#core-components)
+3. [Code Organization](#code-organization)
+4. [Building and Testing](#building-and-testing)
+5. [Making Changes](#making-changes)
+6. [Code Style](#code-style)
+7. [Common Tasks](#common-tasks)
+
+## Architecture Overview
+
+kte follows a clean separation of concerns with three main layers:
+
+```
+┌─────────────────────────────────────────┐
+│  Frontend Layer (Terminal/ImGui/Qt)     │
+│  - TerminalFrontend / ImGuiFrontend     │
+│  - InputHandler + Renderer interfaces   │
+└─────────────────────────────────────────┘
+                    ↓
+┌─────────────────────────────────────────┐
+│  Command Layer                          │
+│  - Command registry and execution       │
+│  - All editing operations               │
+└─────────────────────────────────────────┘
+                    ↓
+┌─────────────────────────────────────────┐
+│  Core Model Layer                       │
+│  - Editor (top-level state)             │
+│  - Buffer (document model)              │
+│  - PieceTable (text storage)            │
+│  - UndoSystem (undo/redo)               │
+│  - SwapManager (crash recovery)         │
+└─────────────────────────────────────────┘
+```
+
+### Design Principles
+
+- **Frontend Independence**: Core editing logic is independent of UI.
+  Frontends implement `Frontend`, `InputHandler`, and `Renderer`
+  interfaces.
+- **Command Pattern**: All editing operations go through the command
+  system, enabling consistent undo/redo and testing.
+- **Piece Table**: Efficient text storage using a piece table data
+  structure that avoids copying large buffers.
+- **Lazy Materialization**: Text is materialized on-demand to minimize
+  memory allocations.
+
+## Core Components
+
+### Editor (`Editor.h/.cc`)
+
+The top-level editor state container. Manages:
+
+- Multiple buffers
+- Editor modes (normal, k-command prefix, prompts)
+- Kill ring (clipboard history)
+- Universal argument state
+- Search state
+- Status messages
+- Swap file management
+
+**Key Insight**: Editor is primarily a state holder with many
+getter/setter pairs. It doesn't contain editing logic - that's in
+commands.
+
+### Buffer (`Buffer.h/.cc`)
+
+Represents an open document. Manages:
+
+- File I/O (open, save, external modification detection)
+- Cursor position and viewport offsets
+- Mark (selection start point)
+- Visual line mode state
+- Syntax highlighting integration
+- Undo system integration
+- Swap recording integration
+
+**Key Insight**: Buffer wraps a PieceTable and provides a higher-level
+interface. The nested `Buffer::Line` class is a legacy wrapper that has
+been largely phased out in favor of direct PieceTable operations.
+
+**Line Access APIs**: Buffer provides three ways to access line content:
+
+- `GetLineView(row)` - Zero-copy `string_view` (fastest, 11x faster than
+  Rows())
+- `GetLineString(row)` - Returns `std::string` copy (1.7x faster than
+  Rows())
+- `Rows()` - Materializes all lines into cache (legacy, avoid in new
+  code)
+
+See `docs/BENCHMARKS.md` for detailed performance analysis and usage
+guidance.
+
+### PieceTable (`PieceTable.h/.cc`)
+
+The core text storage data structure. Provides:
+
+- Efficient insert/delete operations without copying entire buffer
+- Line-based queries (line count, get line, line ranges)
+- Position conversion (byte offset ↔ line/column)
+- Substring extraction
+- Search functionality
+- Automatic consolidation to prevent piece fragmentation
+
+**Key Insight**: PieceTable uses lazy materialization - the full text is
+only assembled when `Data()` is called. Most operations work directly on
+the piece list.
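
To make the piece list and lazy materialization concrete, here is a deliberately small piece table. `ToyPieceTable` is an illustrative sketch, not kte's `PieceTable`: it has no line index, deletion, consolidation, or versioning, and its method names are assumptions apart from `Data()`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy piece table for illustration only.
class ToyPieceTable {
public:
    explicit ToyPieceTable(std::string original)
        : original_(std::move(original)) {
        if (!original_.empty())
            pieces_.push_back({false, 0, original_.size()});
        total_size_ = original_.size();
    }

    std::size_t Size() const { return total_size_; }

    // Insert `text` at byte offset `pos` by appending it to the add buffer
    // and splicing a new piece into the list. No existing text is moved.
    void Insert(std::size_t pos, const std::string &text) {
        assert(pos <= total_size_);
        if (text.empty())
            return;
        const Piece added{true, add_.size(), text.size()};
        add_ += text;

        // Find the piece that contains (or ends at) `pos`.
        std::size_t offset = 0;
        std::size_t i = 0;
        for (; i < pieces_.size(); ++i) {
            if (pos <= offset + pieces_[i].len)
                break;
            offset += pieces_[i].len;
        }
        if (i == pieces_.size()) {
            pieces_.push_back(added);  // the table was empty
        } else {
            // Split piece i into [left][added][right], dropping empty parts.
            const std::size_t left = pos - offset;
            const Piece p = pieces_[i];
            std::vector<Piece> repl;
            if (left > 0)
                repl.push_back({p.in_add, p.start, left});
            repl.push_back(added);
            if (left < p.len)
                repl.push_back({p.in_add, p.start + left, p.len - left});
            pieces_.erase(pieces_.begin() + i);
            pieces_.insert(pieces_.begin() + i, repl.begin(), repl.end());
        }
        total_size_ += text.size();
    }

    // Lazy materialization: the full text is only assembled here.
    std::string Data() const {
        std::string out;
        out.reserve(total_size_);
        for (const Piece &p : pieces_)
            out.append(p.in_add ? add_ : original_, p.start, p.len);
        return out;
    }

private:
    struct Piece {
        bool in_add;        // true: range in add_, false: range in original_
        std::size_t start;
        std::size_t len;
    };
    std::string original_;
    std::string add_;
    std::vector<Piece> pieces_;
    std::size_t total_size_ = 0;
};
```

Starting from `"hello world"`, inserting `","` at offset 5 only appends one byte to the add buffer and splits one piece into three; the original buffer is never modified.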
+
+### UndoSystem (`UndoSystem.h/.cc`, `UndoTree.h/.cc`, `UndoNode.h/.cc`)
+
+Implements undo/redo with a tree structure supporting:
+
+- Linear undo/redo
+- Branching history (future enhancement)
+- Checkpointing and compaction
+- Memory-efficient node pooling
+
+**Key Insight**: The undo system records operations at the PieceTable
+level, not at the command level.
+
+### Command System (`Command.h/.cc`)
+
+All editing operations are implemented as commands:
+
+- File operations (save, open, close)
+- Navigation (move cursor, page up/down, word movement)
+- Editing (insert, delete, kill, yank)
+- Search and replace
+- Buffer management
+- Configuration (syntax, theme, font)
+
+**Key Insight**: `Command.cc` is currently a monolithic 5000-line file.
+This is the biggest maintainability challenge in the codebase.
+
+### Frontend Abstraction
+
+Three interfaces define the frontend contract:
+
+- **Frontend** (`Frontend.h`): Top-level lifecycle (Init/Step/Shutdown)
+- **InputHandler** (`InputHandler.h`): Converts UI events to commands
+- **Renderer** (`Renderer.h`): Draws the editor state
+
+Implementations:
+
+- **Terminal**: ncurses-based (`TerminalFrontend`,
+  `TerminalInputHandler`, `TerminalRenderer`)
+- **ImGui**: Dear ImGui-based (`ImGuiFrontend`, `ImGuiInputHandler`,
+  `ImGuiRenderer`)
+- **Qt**: Qt-based (`QtFrontend`, `QtInputHandler`, `QtRenderer`)
+- **Test**: Programmatic testing (`TestFrontend`, `TestInputHandler`,
+  `TestRenderer`)
+
+## Code Organization
+
+### Directory Structure
+
+```
+kte/
+├── *.h, *.cc           # Core implementation (root level)
+├── main.cc             # Entry point
+├── docs/               # Documentation
+│   ├── ke.md          # Original ke editor reference (keybindings)
+│   ├── swap.md        # Swap file design
+│   ├── syntax.md      # Syntax highlighting
+│   ├── themes.md      # Theme system
+│   └── plans/         # Design documents
+├── tests/              # Test suite
+│   ├── Test.h         # Minimal test framework
+│   ├── TestRunner.cc  # Test runner
+│   └── test_*.cc      # Individual test files
+├── syntax/             # Syntax highlighting engines
+├── fonts/              # Embedded fonts for GUI
+├── themes/             # Color themes
+└── ext/                # External dependencies (imgui)
+```
+
+### File Naming Conventions
+
+- Headers: `ComponentName.h`
+- Implementation: `ComponentName.cc`
+- Tests: `test_feature_name.cc`
+
+### Key Files by Size
+
+Large files that may need attention:
+
+- `Command.cc` (4995 lines) - **Needs refactoring**: Consider splitting
+  into logical groups
+- `Swap.cc` (1300 lines) - Crash recovery system (migrated to direct
+  PieceTable operations)
+- `QtFrontend.cc` (985 lines) - Qt integration
+- `ImGuiRenderer.cc` (930 lines) - ImGui rendering
+- `PieceTable.cc` (800 lines) - Core data structure
+- `Buffer.cc` (763 lines) - Document model
+
+## Building and Testing
+
+### Build System
+
+kte uses CMake with multiple build profiles:
+
+```bash
+# Debug build (terminal only)
+cmake -S . -B cmake-build-debug -DCMAKE_BUILD_TYPE=Debug
+cmake --build cmake-build-debug
+
+# Release build with GUI
+cmake -S . -B cmake-build-release -DCMAKE_BUILD_TYPE=Release -DBUILD_GUI=ON
+cmake --build cmake-build-release
+
+# Build specific target
+cmake --build cmake-build-debug --target kte_tests
+```
+
+### CMake Targets
+
+- `kte` - Terminal editor executable
+- `kge` - GUI editor executable (when `BUILD_GUI=ON`)
+- `kte_tests` - Test suite
+- `imgui` - Dear ImGui library (when `BUILD_GUI=ON`)
+
+### Running Tests
+
+```bash
+# Build and run all tests
+cmake --build cmake-build-debug --target kte_tests && ./cmake-build-debug/kte_tests
+
+# Re-run the already-built test binary without rebuilding
+./cmake-build-debug/kte_tests
+```
+
+### Test Organization
+
+The test suite uses a minimal custom framework (`Test.h`):
+
+```cpp
+TEST(TestName) {
+    // Test body
+    ASSERT_EQ(actual, expected);
+    ASSERT_TRUE(condition);
+    EXPECT_TRUE(condition);  // Non-fatal
+}
+```
+
+Test files by category:
+
+- **Core Data Structures**:
+    - `test_piece_table.cc` - PieceTable operations, line indexing,
+      random edits
+    - `test_buffer_rows.cc` - Buffer row operations
+    - `test_buffer_io.cc` - File I/O (open, save, SaveAs)
+
+- **Editing Operations**:
+    - `test_command_semantics.cc` - Command execution
+    - `test_kkeymap.cc` - Keybinding system
+    - `test_visual_line_mode.cc` - Visual line selection
+
+- **Search and Replace**:
+    - `test_search.cc` - Search functionality
+    - `test_search_replace_flow.cc` - Interactive search/replace
+
+- **Text Reflow**:
+    - `test_reflow_paragraph.cc` - Paragraph reformatting
+    - `test_reflow_indented_bullets.cc` - Indented list handling
+
+- **Undo System**:
+    - `test_undo.cc` - Undo/redo operations
+
+- **Swap Files** (Crash Recovery):
+    - `test_swap_recorder.cc` - Recording operations
+    - `test_swap_writer.cc` - Writing swap files
+    - `test_swap_replay.cc` - Replaying operations
+    - `test_swap_recovery_prompt.cc` - Recovery UI
+    - `test_swap_cleanup.cc` - Cleanup logic
+    - `test_swap_git_editor.cc` - Git editor integration
+
+- **Performance and Migration**:
+    - `test_benchmarks.cc` - Performance benchmarks for core operations
+    - `test_migration_coverage.cc` - Buffer::Line migration validation
+
+- **Integration Tests**:
+    - `test_daily_workflows.cc` - Real-world editing scenarios
+    - `test_daily_driver_harness.cc` - Workflow test infrastructure
+
+**Total**: 98 tests across 22 test files. See `docs/BENCHMARKS.md` for
+performance benchmark results.
+
+### Writing Tests
+
+When adding new functionality:
+
+1. **Add a test first** - Write a failing test that demonstrates the
+   desired behavior
+2. **Use descriptive names** - Test names should explain what's being
+   validated
+3. **Test edge cases** - Empty buffers, EOF, beginning of file, etc.
+4. **Use TestFrontend** - For integration tests, use the programmatic
+   test frontend
+
+Example test structure:
+
+```cpp
+TEST(Feature_Behavior_Scenario) {
+    // Setup
+    Buffer buf;
+    buf.insert_text(0, 0, "test content\n");
+    
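+    // Exercise: remove " content", assuming delete_text(row, col, length)
+    buf.delete_text(0, 4, 8);
+    
+    // Verify: GetLineString() strips the trailing newline
+    ASSERT_EQ(buf.GetLineString(0), std::string("test"));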
+}
+```
+
+## Making Changes
+
+### Development Workflow
+
+1. **Understand the change scope**:
+    - Pure UI change? → Modify frontend only
+    - New editing operation? → Add command in `Command.cc`
+    - Core data structure? → Modify `PieceTable` or `Buffer`
+
+2. **Find relevant code**:
+    - Use `git grep` or IDE search to find similar functionality
+    - Check `Command.cc` for existing command patterns
+    - Look at tests to understand expected behavior
+
+3. **Make the change**:
+    - Follow existing code style (see below)
+    - Add or update tests
+    - Update documentation if needed
+
+4. **Test thoroughly**:
+    - Run the full test suite
+    - Manually test in both terminal and GUI (if applicable)
+    - Test edge cases (empty files, large files, EOF, etc.)
+
+### Common Pitfalls
+
+- **Don't mutate the lines returned by `Buffer::Rows()`** - Edit through
+  the PieceTable-backed API (`insert_text`, `delete_text`, etc.) so that
+  undo and swap recording work correctly.
+- **Prefer efficient line access** - Use `GetLineView()` for read-only
+  access (11x faster than `Rows()`), or `GetLineString()` when you need
+  a copy. Avoid `Rows()` in new code.
+- **Remember to invalidate caches** - If you modify PieceTable
+  internals, ensure line index and materialization caches are
+  invalidated.
+- **Cursor visibility** - After editing operations, call
+  `ensure_cursor_visible()` to update viewport offsets.
+- **Undo boundaries** - Use `buf.Undo()->BeginGroup()` and `EndGroup()`
+  to group related operations.
+- **GetLineView() lifetime** - The returned `string_view` is only valid
+  until the next buffer modification. Use immediately or copy to
+  `std::string`.
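
For the undo-boundary pitfall, pairing `BeginGroup()` with `EndGroup()` by hand is easy to get wrong when a command returns early. A small RAII guard is one way to keep the calls balanced. In the sketch below, `UndoGroupGuard` and `FakeUndo` are hypothetical, and the guard assumes both methods take no arguments, as the calls in the bullet above suggest.

```cpp
#include <cassert>

// Calls BeginGroup() on construction and EndGroup() on destruction, so the
// group is closed even on early return or exception.
template <typename Undo>
class UndoGroupGuard {
public:
    explicit UndoGroupGuard(Undo *undo) : undo_(undo) {
        if (undo_)
            undo_->BeginGroup();
    }
    ~UndoGroupGuard() {
        if (undo_)
            undo_->EndGroup();
    }
    UndoGroupGuard(const UndoGroupGuard &) = delete;
    UndoGroupGuard &operator=(const UndoGroupGuard &) = delete;

private:
    Undo *undo_;  // may be null, in which case the guard does nothing
};

// Stand-in used only to demonstrate the guard; not kte's UndoSystem.
struct FakeUndo {
    int depth = 0;
    int groups_closed = 0;
    void BeginGroup() { ++depth; }
    void EndGroup() {
        assert(depth > 0);
        --depth;
        ++groups_closed;
    }
};
```

In a command this would look like `UndoGroupGuard<FakeUndo> group(&undo);` at the top of the function, with every later `return` closing the group automatically.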
+
+## Code Style
+
+kte uses C++20 with these conventions:
+
+### Naming
+
+- **Classes/Structs**: `PascalCase` (e.g., `PieceTable`, `Buffer`)
+- **Functions/Methods**: `PascalCase` (e.g., `GetLine`, `Insert`)
+- **Variables**: `snake_case`, with a trailing underscore for members
+  (e.g., `total_size_`, `line_index_`)
+- **Constants**: `snake_case` or `UPPER_CASE` depending on context
+- **Private members**: Trailing underscore (e.g., `pieces_`, `dirty_`)
+
+### Formatting
+
+- **Indentation**: Tabs (width 8 in most files, but follow existing
+  style)
+- **Braces**: Opening brace on same line for functions, control
+  structures
+- **Line length**: No strict limit, but keep reasonable (~100-120 chars)
+- **Includes**: Group by category (system, external, project) with blank
+  lines between
+
+### Comments
+
+- **File headers**: Brief description of the file's purpose
+- **Function comments**: Explain non-obvious behavior, not what the code
+  obviously does
+- **Inline comments**: Explain *why*, not *what*
+- **TODO comments**: Use `TODO:` prefix for future work
+
+Example:
+
+```cpp
+// Consolidate small pieces to prevent fragmentation.
+// This is a heuristic: we only consolidate when piece count exceeds
+// a threshold, and we cap the bytes processed per consolidation run.
+void maybeConsolidate() {
+    if (pieces_.size() < piece_limit_)
+        return;
+    // ... implementation
+}
+```
+
+## Common Tasks
+
+### Adding a New Command
+
+1. **Define the command function** in `Command.cc`:
+
+```cpp
+bool cmd_my_feature(CommandContext &ctx) {
+    Editor &ed = ctx.ed;
+    Buffer *buf = ed.CurrentBuffer();
+    if (!buf) return false;
+    
+    // Implement the command
+    buf->insert_text(buf->Cury(), buf->Curx(), "text");
+    
+    return true;
+}
+```
+
+2. **Register the command** in `InstallDefaultCommands()`:
+
+```cpp
+CommandRegistry::Register({
+    CommandId::MyFeature,
+    "my-feature",
+    "Description of what it does",
+    cmd_my_feature
+});
+```
+
+3. **Add keybinding** in the appropriate `InputHandler` (e.g.,
+   `TerminalInputHandler.cc`).
+
+4. **Write tests** in `tests/test_command_semantics.cc` or a new test
+   file.
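
The define-then-register flow above amounts to a lookup table from command names to functions. The sketch below shows that mechanism in isolation. `ToyContext`, `ToyCommandFn`, and `ToyRegistry` are simplified, hypothetical types; kte's `CommandContext` and `CommandRegistry` carry more state (editor, command ids, descriptions) and may differ in signature.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified context: just the text being edited.
struct ToyContext {
    std::string text;
};

using ToyCommandFn = std::function<bool(ToyContext &)>;

class ToyRegistry {
public:
    // Returns false if a command with this name is already registered.
    bool Register(std::string name, ToyCommandFn fn) {
        assert(fn && "command must be callable");
        return commands_.emplace(std::move(name), std::move(fn)).second;
    }

    // Looks the command up by name and runs it; unknown names fail.
    bool Execute(const std::string &name, ToyContext &ctx) const {
        const auto it = commands_.find(name);
        return it != commands_.end() && it->second(ctx);
    }

private:
    std::unordered_map<std::string, ToyCommandFn> commands_;
};
```

Because commands are plain functions over a context, a test can execute one by name without any frontend, which is the property the command pattern is meant to provide here.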
+
+### Adding a New Frontend
+
+1. **Implement the three interfaces**:
+    - `Frontend` - Lifecycle management
+    - `InputHandler` - Event → Command translation
+    - `Renderer` - Draw the editor state
+
+2. **Study existing implementations**:
+    - `TerminalFrontend` - Simplest, good starting point
+    - `ImGuiFrontend` - More complex, shows GUI patterns
+
+3. **Register in `main.cc`** to make it selectable.
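
As a rough picture of the lifecycle a new frontend has to implement, the sketch below drives an object through Init, Step, and Shutdown. The method names come from this guide, but the signatures, the `ToyFrontend` base class, and `RunFrontend` are assumptions; check `Frontend.h` for the real interface.

```cpp
#include <cassert>

// Hypothetical lifecycle interface; signatures are assumptions.
class ToyFrontend {
public:
    virtual ~ToyFrontend() = default;
    virtual bool Init() = 0;      // false aborts startup
    virtual bool Step() = 0;      // false requests exit
    virtual void Shutdown() = 0;
};

// Drives a frontend through its lifecycle and returns a process exit code.
int RunFrontend(ToyFrontend &fe) {
    if (!fe.Init())
        return 1;
    while (fe.Step()) {
        // Each Step() would poll input, run commands, and redraw.
    }
    fe.Shutdown();
    return 0;
}

// Minimal implementation that runs a fixed number of steps, in the spirit
// of a programmatic test frontend.
class CountingFrontend : public ToyFrontend {
public:
    explicit CountingFrontend(int steps) : remaining_(steps) {
        assert(steps >= 0);
    }
    bool Init() override {
        initialized = true;
        return true;
    }
    bool Step() override {
        if (remaining_ == 0)
            return false;
        --remaining_;
        ++steps_run;
        return true;
    }
    void Shutdown() override { shut_down = true; }

    bool initialized = false;
    bool shut_down = false;
    int steps_run = 0;

private:
    int remaining_;
};
```

A real frontend would split input handling and drawing into its `InputHandler` and `Renderer`; this sketch folds them into `Step()` to keep the lifecycle visible.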
+
+### Modifying the PieceTable
+
+The PieceTable is performance-critical. When making changes:
+
+1. **Understand the piece list** - Each piece references a range in
+   either `original_` or `add_` buffer
+2. **Maintain invariants**:
+    - `total_size_` must match sum of piece lengths
+    - Line index must be invalidated on content changes
+    - Version must increment on mutations
+3. **Test thoroughly** - Use `test_piece_table.cc` random edit test as a
+   reference model
+4. **Profile if needed** - Large file performance is a key goal
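
The size invariant listed above is cheap to check in debug builds. The sketch below shows one way to express it. `PieceRef`, `CheckInvariants`, and `AssertInvariants` are hypothetical; the real piece representation in `PieceTable.h` may use different fields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical piece descriptor: a range in either backing buffer.
struct PieceRef {
    bool in_add;        // true: add buffer, false: original buffer
    std::size_t start;
    std::size_t len;
};

// Returns true when every piece lies inside its backing buffer and the
// piece lengths add up to the recorded total size.
bool CheckInvariants(const std::vector<PieceRef> &pieces,
                     std::size_t original_size,
                     std::size_t add_size,
                     std::size_t total_size) {
    std::size_t sum = 0;
    for (const PieceRef &p : pieces) {
        const std::size_t cap = p.in_add ? add_size : original_size;
        if (p.start > cap || p.len > cap - p.start)
            return false;  // range runs past the end of its buffer
        sum += p.len;
    }
    return sum == total_size;
}

// Debug-only wrapper; compiles to nothing when NDEBUG is defined.
inline void AssertInvariants(const std::vector<PieceRef> &pieces,
                             std::size_t original_size,
                             std::size_t add_size,
                             std::size_t total_size) {
    assert(CheckInvariants(pieces, original_size, add_size, total_size));
    (void)pieces; (void)original_size; (void)add_size; (void)total_size;
}
```

Calling such a check at the end of each mutating operation in a Debug build would catch a stale `total_size_` at the edit that caused it rather than at a later read.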
+
+### Adding Syntax Highlighting
+
+1. **Create a new highlighter** in `syntax/` directory:
+    - Inherit from `HighlighterEngine`
+    - Implement `HighlightLine()` method
+
+2. **Register in `HighlighterRegistry`** (
+   `syntax/HighlighterRegistry.cc`)
+
+3. **Add file extension mapping** in the registry
+
+4. **Test with sample files** of that language
+
+### Debugging Tips
+
+- **Use the test frontend** - Write a test that reproduces the issue
+- **Enable assertions** - Build in Debug mode
+- **Check swap files** - Look in `/tmp/kte-swap-*` for recorded
+  operations
+- **Print debugging** - Use `std::cerr` (stdout is used by ncurses)
+- **GDB/LLDB** - Standard debuggers work fine with kte
+
+## Getting Help
+
+- **Read the code** - kte is designed to be understandable; follow the
+  data flow
+- **Check existing tests** - Tests often show how to use APIs correctly
+- **Look at git history** - See how similar features were implemented
+- **Read design docs** - Check `docs/plans/` for design rationale
+
+## Future Improvements
+
+Areas where the codebase could be improved:
+
+1. **Split Command.cc** - Break into logical groups (editing,
+   navigation, file ops, etc.)
+2. **Complete Buffer::Line migration** - A few legacy editing functions
+   in Command.cc still use `Buffer::Rows()` directly (see lines 86-90
+   comment)
+3. **Add more inline documentation** - Especially for complex algorithms
+4. **Improve test coverage** - Add more edge case tests (current: 98
+   tests)
+5. **Performance profiling** - Continue monitoring performance with
+   benchmark suite
+6. **API documentation** - Consider adding Doxygen-style comments
+
+---
+
+Welcome aboard! Start small, read the code, and don't hesitate to ask
+questions.