wc: fix word undercount with invalid byte sequences #10348
CodSpeed Performance Report: Merging this PR will not alter performance.
Some jobs are failing.
They're failing because the test_utf8 test expects a count of 2119, but after the commit it counts invalid byte sequences as words, increasing the total to 2178. Should I go ahead and update the test?
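To illustrate why the count goes up, here is a minimal sketch of word counting that treats invalid bytes as word constituents. It is not the code from this PR: `count_words` is a hypothetical helper, and it assumes Unicode whitespace (`char::is_whitespace`) as the only word separator, which may differ from the locale-dependent rules GNU uses.

```rust
/// Hypothetical helper (not the uutils implementation): counts
/// whitespace-delimited words in raw bytes, treating every invalid
/// UTF-8 byte as a non-whitespace word constituent.
fn count_words(mut input: &[u8]) -> usize {
    let mut words = 0;
    let mut in_word = false;

    while !input.is_empty() {
        // Split off the longest valid UTF-8 prefix and measure the
        // invalid run (if any) that follows it.
        let (valid, invalid_len) = match std::str::from_utf8(input) {
            Ok(s) => (s, 0),
            Err(e) => {
                let valid = std::str::from_utf8(&input[..e.valid_up_to()]).unwrap();
                // error_len() is None when the input ends mid-sequence;
                // in that case the rest of the input is the invalid run.
                let bad = e.error_len().unwrap_or(input.len() - e.valid_up_to());
                (valid, bad)
            }
        };

        for ch in valid.chars() {
            if ch.is_whitespace() {
                in_word = false;
            } else if !in_word {
                in_word = true;
                words += 1;
            }
        }

        // Assumption: invalid bytes start or continue a word instead of
        // being skipped, which is what raises the total.
        if invalid_len > 0 && !in_word {
            in_word = true;
            words += 1;
        }

        input = &input[valid.len() + invalid_len..];
    }

    words
}
```

Under these assumptions, `count_words(b"a \xff b")` returns 3, whereas a counter that silently skips the invalid byte would report 2.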
The general guideline: if the GNU implementation conflicts with the test, then the test is wrong and needs to be updated. If GNU gives the same value as the test, then something is wrong with the implementation.
Validated it locally. This is tricky because the output differs between GNU versions. Yes, GNU 9.9 outputs 2178, so it should be okay to change this test.
Can you clean up the Clippy lint failures? Everything else looks good to go; only some of the CI tests are failing.
* wc: fix word undercount with invalid byte sequences
* wc: update utf8 test counts to account invalid byte sequences
* wc: remove unnecessary borrow in test
Closes #10322