Fix HTMLStripCharFilter dropping content after =''> #15756
msfroh merged 3 commits into apache:main
This fixes an issue where HTMLStripCharFilter fails to recognize the closing double quote of an attribute value if the value ends with an equals sign and is immediately followed by the tag closer (>). Closes apache#15754
I think this looks promising. (The fact that existing test cases don't break is a good sign.) Thanks a lot @Seungmin123! It looks like this was a side-effect of #11724. Can you double-check with the unit test in the PR opened by @mjustice3 in #13157? It looks very similar to your test cases, so I'm pretty sure it's the same bug.
Adds the testForIssue10520Regression test case originally from PR apache#13157 to verify the fix correctly handles general attribute values without breaking backwards compatibility. Also adds a test case for single quoted attributes ending with an equals sign to ensure comprehensive test coverage.
798375f to c221d58
Hi @msfroh, thank you for the helpful suggestion and the review! I have added the testForIssue10520Regression test case from PR #13157 as you suggested, along with another edge case test for single-quoted attributes ending with an equals sign. All tests, including existing ones and the newly added cases, pass successfully. I appreciate you pointing that out!
Thanks for the review! I've moved the entry in CHANGES.txt to the Lucene 10.5.0 section as suggested for backporting.
Backport commit to 10.x: b2640a8
Description
This PR fixes an issue where HTMLStripCharFilter fails to recognize the closing double quote of an attribute value if the value ends with an equals sign and is immediately followed by the tag closer (>). This causes the filter to incorrectly discard content until the next double quote.

Resolves #15754.
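For readers who want to see the input shape concretely, here is a minimal sketch in plain Java. It is not Lucene's scanner and does not use any Lucene API: `TagStripSketch` and `stripTags` are hypothetical names for a toy tag stripper, written only to show what "attribute value ends with an equals sign, immediately followed by >" looks like and which text should survive. The real filter has its own whitespace and offset handling, so its exact output may differ.

```java
public class TagStripSketch {

    // Hypothetical helper (not part of Lucene): drops everything between '<' and
    // the matching '>', treating quoted attribute values as opaque text.
    static String stripTags(String html) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        int n = html.length();
        while (i < n) {
            char c = html.charAt(i);
            if (c != '<') {
                out.append(c); // ordinary text outside a tag is kept
                i++;
                continue;
            }
            i++; // consume '<' and scan the tag body
            char quote = 0; // 0 means "not inside a quoted attribute value"
            while (i < n) {
                char t = html.charAt(i++);
                if (quote != 0) {
                    // Inside quotes only the matching quote ends the value,
                    // even when the previous character was '='.
                    if (t == quote) {
                        quote = 0;
                    }
                } else if (t == '"' || t == '\'') {
                    quote = t;
                } else if (t == '>') {
                    break; // end of tag
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Double-quoted value ending in '=' directly before '>'.
        System.out.println(stripTags("<a href=\"x=\">kept</a> tail"));
        // Single-quoted variant, matching the extra test case mentioned above.
        System.out.println(stripTags("<a href='x='>kept</a> tail"));
    }
}
```

In this sketch both calls keep the text after the tag. The bug described above corresponds to the closing quote in `x=">` not being recognized, so the text following the tag was treated as part of the attribute value until the next double quote.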