Route TOC-without-page-numbers documents to the correct strategy #285
Open
Me3sP wants to merge 1 commit into
tree_parser only had two branches: a TOC with page numbers, or everything else. A document with a printed TOC that lists no page numbers fell into the else branch and was processed with process_no_toc, regenerating the structure from scratch and ignoring the existing TOC entirely. process_toc_no_page_numbers was therefore unreachable as a primary strategy and only ran as a fallback from process_toc_with_page_numbers. Add the missing branch so a TOC with no page numbers is dispatched to process_toc_no_page_numbers directly. Also forward start_index from meta_processor into process_toc_no_page_numbers, which previously relied on the default and would index incorrectly for non-top-level nodes.
Problem
`tree_parser` only had two dispatch branches: a TOC with page numbers, or everything else. A document with a printed table of contents that lists no page numbers fell into the `else` branch and was handled by `process_no_toc`, regenerating the structure from scratch and ignoring the existing TOC entirely.

As a result, `process_toc_no_page_numbers` was unreachable as a primary strategy. It only ever ran as a fallback from `process_toc_with_page_numbers` inside `meta_processor`.

Fix
- Add the missing `tree_parser` branch so a TOC with no page numbers is dispatched to `process_toc_no_page_numbers` directly, using the TOC instead of discarding it.
- Forward `start_index` from `meta_processor` into `process_toc_no_page_numbers`. It previously relied on the default (1), which would index incorrectly when invoked for non-top-level nodes.

The existing fallback chain is preserved: `process_toc_no_page_numbers` still degrades to `process_no_toc` on low verification accuracy.

Impact
Additive only: no existing branch behavior changes. Documents that previously hit `process_no_toc` despite having a usable TOC now keep their authored structure.
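
For reviewers who want the routing at a glance, here is a minimal, self-contained sketch of the three-way dispatch and the preserved fallback chain. It is not the code in this PR. The helpers `choose_mode`, `run_with_fallback`, and `page_indices`, the `accuracy_by_mode` stand-in for the verification step, and the `0.6` threshold are all hypothetical and exist only to illustrate the behavior described above. Only the three strategy names and `start_index` come from the description.

```python
from typing import Dict, List, Optional, Tuple

# Strategy names taken from the PR description; everything else is illustrative.
WITH_PAGES = "process_toc_with_page_numbers"
NO_PAGES = "process_toc_no_page_numbers"
NO_TOC = "process_no_toc"

# Fallback chain as described: each strategy degrades to the next one.
FALLBACK: Dict[str, str] = {WITH_PAGES: NO_PAGES, NO_PAGES: NO_TOC}


def choose_mode(toc_content: Optional[str], has_page_numbers: bool) -> str:
    """Pick the primary strategy (hypothetical helper, not the real tree_parser)."""
    if toc_content and toc_content.strip():
        if has_page_numbers:
            return WITH_PAGES
        # The branch this PR adds: a TOC exists but lists no page numbers.
        return NO_PAGES
    return NO_TOC


def run_with_fallback(
    mode: str,
    accuracy_by_mode: Dict[str, float],
    start_index: int = 1,
    threshold: float = 0.6,  # assumed value, for illustration only
) -> Tuple[str, int]:
    """Walk the fallback chain until a strategy verifies well enough.

    accuracy_by_mode stands in for the real verification step. The same
    start_index is carried through every hop instead of resetting to 1.
    """
    while True:
        next_mode = FALLBACK.get(mode)
        if accuracy_by_mode.get(mode, 0.0) >= threshold or next_mode is None:
            return mode, start_index
        mode = next_mode


def page_indices(num_pages: int, start_index: int = 1) -> List[int]:
    """Physical page indices for a node whose first page is start_index."""
    return list(range(start_index, start_index + num_pages))
```

Under these assumptions, `choose_mode("1. Intro\n2. Methods", False)` selects `process_toc_no_page_numbers` rather than falling through to `process_no_toc`. For a sub-node that starts at page 40, `page_indices(3, start_index=40)` yields pages 40 to 42, whereas relying on the default of 1 would label the same pages 1 to 3.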