Fix Windows lexer error on CRLF code blocks #2804
Open
hhugo wants to merge 1 commit into
8af14ea to 3c7630c
handle_last_newline scans backward and matches '\n' immediately, leaving
the preceding '\r' (if any) on the stripped string. Because deindent then
splits on '\n' and rejoins with '\n', every '\r' on an intermediate line
ends up followed by a '\n' — but the '\r' on the last line is bare,
since String.concat puts no separator after the final element.
The OCaml lexer's newline regex is '\r*\n', so a bare trailing '\r'
falls through to the Illegal_character rule. This caused the Windows CI
to fail on any {@ocaml[...]} block in a CRLF-checked-out file.
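The split-and-rejoin effect described above can be reproduced in isolation. This is a minimal sketch, not the odoc-parser code: `split_and_rejoin` and `has_bare_cr` are hypothetical names, and the real `deindent` also strips indentation, which is omitted here.

```ocaml
(* Hypothetical stand-in for the split-and-rejoin step of deindent;
   indentation stripping is left out. *)
let split_and_rejoin s = String.concat "\n" (String.split_on_char '\n' s)

(* True when some '\r' is not immediately followed by '\n'. *)
let has_bare_cr s =
  let n = String.length s in
  let rec go i =
    if i >= n then false
    else if s.[i] = '\r' && (i + 1 >= n || s.[i + 1] <> '\n') then true
    else go (i + 1)
  in
  go 0

let () =
  (* Body of a CRLF block after only the final '\n' was stripped. *)
  let stripped = "let x = 1\r\nlet y = 2\r" in
  let rejoined = split_and_rejoin stripped in
  (* Intermediate '\r's regain their '\n'; the last one stays bare. *)
  assert (rejoined = stripped);
  assert (has_bare_cr rejoined)
```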
Mirror handle_first_newline (which already eats a leading '\r' before
matching '\n'): when handle_last_newline finds '\n', also drop a
preceding '\r' so the full CRLF is consumed.
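A sketch of what the fix amounts to, assuming a backward scan over spaces and tabs. `strip_last_newline` is a hypothetical name rather than the patched function, and its handling of all-whitespace input and of input without a trailing newline is an assumption made for illustration.

```ocaml
(* Sketch only: not the odoc-parser implementation. Scans backward over
   trailing spaces and tabs; on reaching '\n' it also consumes a
   preceding '\r', so a full CRLF is removed. *)
let strip_last_newline s =
  let rec scan i =
    if i < 0 then "" (* assumption: all-whitespace input yields "" *)
    else
      match s.[i] with
      | ' ' | '\t' -> scan (i - 1)
      | '\n' ->
          (* Drop the '\r' of a CRLF pair along with the '\n'. *)
          let stop = if i > 0 && s.[i - 1] = '\r' then i - 1 else i in
          String.sub s 0 stop
      | _ -> s (* assumption: no trailing newline, input returned as-is *)
  in
  scan (String.length s - 1)

let () =
  assert (strip_last_newline "let x = 1\r\n  " = "let x = 1");
  assert (strip_last_newline "let x = 1\n" = "let x = 1")
```

With the `'\r'` consumed here, the later split-and-rejoin step no longer sees a bare carriage return on the last line.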
3c7630c to 77b7b28
Julow (Collaborator) reviewed on Jun 1, 2026 and left a comment:
Thanks! The build is failing in CI on Linux with dune-pkg. Do you have any idea why?
Summary
Fixes the Windows CI failure (e.g. run 25160035941) where `{@ocaml[...]}` blocks in `.mld`/`.mli` doc comments produce `invalid code block: Ocamlformat_parser_extended.Lexer.Error(_, _)` on CRLF-checked-out files.

In `vendor/odoc-parser/odoc_parser.ml`, `handle_last_newline` scans backward over trailing whitespace and stops on `'\n'`, returning the substring up to that index. The symmetric `handle_first_newline` already eats a leading `'\r'` before matching `'\n'`, but `handle_last_newline` did not, so the `'\r'` immediately preceding the final `'\n'` survived. `deindent` then splits on `'\n'` and rejoins with `'\n'`, which leaves intermediate `\r`s followed by `\n` (tolerated by the lexer's `\r*\n` rule) but a bare `'\r'` on the last line, which falls through to `Illegal_character`.

The fix mirrors `handle_first_newline`: when `handle_last_newline` hits `'\n'`, also drop a preceding `'\r'` so the full CRLF is consumed.

Test plan
- Convert `test/passing/tests/doc_repl.mld` to CRLF and run `ocamlformat --parse-toplevel-phrases`: reproduces the failure before the patch, succeeds after.
- `dune build @runtest` clean on Linux (no regressions on LF input).
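
For the CRLF conversion step of the test plan, a string-level helper along these lines could be used to prepare the input. `lf_to_crlf` is a hypothetical name and not part of the PR; reading and writing the file (in binary mode, so the platform does not translate line endings) is left to the caller.

```ocaml
(* Hypothetical helper: rewrite every lone '\n' as "\r\n", leaving
   existing CRLF pairs untouched. *)
let lf_to_crlf s =
  let b = Buffer.create (String.length s + 16) in
  String.iteri
    (fun i c ->
      (* Insert '\r' only when this '\n' is not already preceded by one. *)
      if c = '\n' && (i = 0 || s.[i - 1] <> '\r') then Buffer.add_char b '\r';
      Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  assert (lf_to_crlf "a\nb\n" = "a\r\nb\r\n");
  assert (lf_to_crlf "a\r\nb" = "a\r\nb")
```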