fix: handle invalid UTF-8 in Ruby and Vue preprocessors #19588

khasinski · 2026-01-21T22:52:27Z

Summary

This PR fixes a panic that occurs when the Ruby or Vue preprocessors encounter files with invalid UTF-8 bytes.

The issue:

ruby.rs:37 and vue.rs:18 used std::str::from_utf8(content).unwrap()
This panics when processing files containing invalid UTF-8 bytes

Error message:

thread panicked at crates/oxide/src/extractor/pre_processors/ruby.rs:37:59:
called `Result::unwrap()` on an `Err` value: Utf8Error { valid_up_to: 45, error_len: Some(1) }

The fix:

Wrap UTF-8 conversion in if let Ok(...) to gracefully handle invalid UTF-8
Skip regex-based template extraction when UTF-8 conversion fails
Allow byte-level processing to continue (in Ruby's case)
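
The shape of the fix can be sketched as follows. This is a minimal illustration, not the code from `ruby.rs` or `vue.rs`: `extract_templates` and `pre_process` are hypothetical names, and the tab-to-space pass is only a placeholder for whatever byte-level work the real preprocessor does.

```rust
use std::str;

/// Hypothetical stand-in for the regex-based template extraction step.
/// Here it simply collects lines that open a squiggly heredoc.
fn extract_templates(text: &str) -> Vec<String> {
    text.lines()
        .filter(|line| line.contains("<<~"))
        .map(|line| line.trim().to_string())
        .collect()
}

/// Hypothetical preprocessor entry point operating on raw bytes.
fn pre_process(content: &[u8]) -> (Vec<String>, Vec<u8>) {
    // Previously: `str::from_utf8(content).unwrap()`, which panics on invalid UTF-8.
    // Now the text-based step is skipped when the bytes are not valid UTF-8.
    let templates = if let Ok(text) = str::from_utf8(content) {
        extract_templates(text)
    } else {
        Vec::new()
    };

    // Byte-level processing does not require valid UTF-8, so it always runs.
    // Placeholder transformation (an assumption): replace tabs with spaces.
    let bytes: Vec<u8> = content
        .iter()
        .map(|&b| if b == b'\t' { b' ' } else { b })
        .collect();

    (templates, bytes)
}
```

With this structure, input such as `&[b'a', 0xFF, b'\t']` yields no extracted templates but still goes through the byte-level pass instead of aborting the whole scan.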

This can happen in Rails projects when:

Binary files are inadvertently scanned
Files contain non-UTF-8 encodings
Files are truncated in the middle of a multi-byte character during parallel processing
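
To see how these cases surface in `std::str::from_utf8`, the small helper below (a hypothetical name, `utf8_error_info`) returns the same two fields that appear in the panic message. It is only a sketch for experimenting with inputs, not part of the PR.

```rust
use std::str;

/// Returns `(valid_up_to, error_len)` when `bytes` is not valid UTF-8,
/// or `None` when the whole slice decodes cleanly.
fn utf8_error_info(bytes: &[u8]) -> Option<(usize, Option<usize>)> {
    str::from_utf8(bytes)
        .err()
        .map(|e| (e.valid_up_to(), e.error_len()))
}
```

For example, `"café".as_bytes()` is five bytes long; cutting it to four bytes leaves a dangling lead byte, and the helper reports `error_len` as `None` (the input ended mid-character). A stray byte such as `0xFF` in the middle of ASCII text reports `error_len` as `Some(1)`. The panic quoted above shows `error_len: Some(1)`, which per the standard library documentation points to an unexpected byte rather than input that ended mid-character.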

Test plan

Added test_invalid_utf8_does_not_panic test for Ruby preprocessor
Added test_valid_utf8_with_multibyte_chars test for Ruby preprocessor
Added test_invalid_utf8_does_not_panic test for Vue preprocessor
All existing tests pass (cargo test pre_processors - 43 tests)

The Ruby and Vue preprocessors were using `from_utf8().unwrap()` which panics when processing files containing invalid UTF-8 bytes. This can happen when:

- Binary files are inadvertently scanned
- Files are truncated at multi-byte character boundaries
- Files use non-UTF-8 encodings

This change wraps the UTF-8 conversion in `if let Ok(...)` to gracefully skip the regex-based template extraction when UTF-8 conversion fails, while still allowing the byte-level processing to continue (in Ruby's case).

Fixes panic: `thread panicked at crates/oxide/src/extractor/pre_processors/ruby.rs:37:59`

coderabbitai · 2026-01-21T22:55:55Z

Walkthrough

The changes introduce UTF-8 validation checks in two pre-processor modules. In the Ruby processor, HEREDOC extraction logic is now conditional on UTF-8 validity; invalid UTF-8 skips this extraction while byte-level processing continues. In the Vue processor, template processing similarly gates execution on UTF-8 validation. Both modifications include tests that verify invalid UTF-8 handling and valid character processing. No public API signatures were altered.

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: fixing invalid UTF-8 handling in Ruby and Vue preprocessors, which matches the changeset content.
Description check	✅ Passed	The description is directly related to the changeset, clearly explaining the UTF-8 panic issue and the fixes applied to both Ruby and Vue preprocessors.


khasinski requested a review from a team as a code owner January 21, 2026 22:52
