fix: panic on multi-byte UTF-8 in grep submatches #2170

Open
LeonidasZhak wants to merge 1 commit into dandavison:main from LeonidasZhak:codex/fix-grep-panic-unicode

Summary

Fixes a panic when git grep output contains multi-byte UTF-8 characters (e.g. ©, emoji, CJK characters).

Problem

The byte offsets from ripgrep submatches may not align with UTF-8 char boundaries. When delta tries to slice the string at these offsets, it panics:

thread 'main' panicked at 'byte index 57 is not a char boundary; it is inside '©' (bytes 56..59)
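The failure mode can be illustrated without triggering the panic. The following is a minimal sketch, not code from delta: `checked_slice` is a hypothetical helper that uses the standard `str::get`, which returns `None` in exactly the cases where `&line[start..end]` would panic.

```rust
/// Hypothetical helper (not part of delta): returns `line[start..end]`,
/// or `None` when an offset is out of range or falls inside a multi-byte
/// character, i.e. the cases where `&line[start..end]` would panic.
fn checked_slice(line: &str, start: usize, end: usize) -> Option<&str> {
    line.get(start..end)
}

/// Lists the offsets that do not land on a UTF-8 char boundary of `line`.
/// Offsets past the end of the string are also reported.
fn misaligned_offsets(line: &str, offsets: &[usize]) -> Vec<usize> {
    offsets
        .iter()
        .copied()
        .filter(|&offset| !line.is_char_boundary(offset))
        .collect()
}
```

In the string `"a©b"`, the character `©` occupies bytes 1 and 2, so offset 2 lies inside it: `checked_slice("a©b", 0, 2)` is `None`, while `checked_slice("a©b", 0, 3)` is `Some("a©")`.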

Solution

Use floor_char_boundary() to snap each byte index down to the nearest valid char boundary before slicing. Slicing then cannot panic, no matter how many bytes each character occupies.
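As a rough sketch of the snapping idea: `floor_boundary` and `safe_slice` below are hypothetical names, not the code in this PR. `floor_boundary` is a hand-written stand-in for `str::floor_char_boundary`, built only on `str::is_char_boundary` so that the sketch does not depend on a particular toolchain version.

```rust
/// Hypothetical stand-in for `str::floor_char_boundary`: returns the largest
/// char boundary that is less than or equal to `index`, clamped to `s.len()`.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Offset 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Slices `s` after snapping both offsets down to char boundaries.
/// Assumption for illustration: an offset inside a character is moved to
/// the start of that character, so a partially covered character at the
/// end of the range is left out of the result.
fn safe_slice(s: &str, start: usize, end: usize) -> &str {
    let start = floor_boundary(s, start);
    // Keep the range well-formed even if `end` floors below `start`.
    let end = floor_boundary(s, end).max(start);
    &s[start..end]
}
```

With `"a©b"`, `safe_slice("a©b", 0, 2)` yields `"a"` instead of panicking, and `safe_slice("a©b", 2, 4)` yields `"©b"`. Flooring is one possible policy; rounding the end offset up instead would keep a partially covered character inside the highlighted range.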

Changes

src/handlers/grep.rs: In make_style_sections(), added char boundary checks before string slicing.

Fixes #1448

When git grep output contains multi-byte UTF-8 characters (e.g.
©, emoji, CJK), the byte offsets from ripgrep may not align with
char boundaries, causing a panic when slicing the string.

Use floor_char_boundary() to snap indices to valid char boundaries
before slicing.

Fixes dandavison#1448

Development

Successfully merging this pull request may close these issues.

🐛 Panic running git grep through delta pager when txt file contains ©
