Proof of concept of exact printer by leana8959 · Pull Request #11425 · haskell/cabal

leana8959 · 2026-01-15T15:21:58Z

Hello,

I made a proof of concept of the exact printer. The idea is to collect all the trivia that is lost at the leaves, label each piece with its corresponding data, and propagate it upwards to the ParsecFieldGrammar. We call this data TriviaTree. The TriviaTree is later passed to the pretty printer through PrettyFieldGrammar, and within each pretty printer instance we can make better choices based on which whitespace was lost and recreate the original output.

I would like to know what you think about this approach!

In 408a8b3, where I accept the format test changes, you can see that some whitespace has changed. The changes are not in the right order yet, but it proves that I can retrieve whitespace data from the printer.

leana8959 · 2026-01-15T15:35:37Z

The TriviaTree should make the following functions work as intended, besides having instances of Semigroup and Monoid. It is very similar to the idea of "complement" in the biparser paper.

The idea is to model the parser as having type Text -> (Complement, a) and the printer as having type (Maybe Complement, a) -> Text.
Concretely, ParsecFieldGrammar is changed to always output a TriviaTree alongside the data, i.e. (TriviaTree, a), and on the other hand PrettyFieldGrammar takes an extra argument of type TriviaTree to pretty print.
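
As an illustration of the parser/printer shapes described above, here is a minimal sketch for a single `key: value` line. It is not code from this PR: `Complement`, `Field`, `parseField` and `printField` are hypothetical names, `String` stands in for `Text` to stay within `base`, and the parser is wrapped in `Maybe` to report failure.

```haskell
import Data.Char (isSpace)

-- Hypothetical complement: the whitespace around the colon, which the
-- semantic value does not keep.
data Complement = Complement
  { beforeColon :: String
  , afterColon  :: String
  } deriving (Eq, Show)

-- Hypothetical semantic value of a field.
data Field = Field
  { fieldName  :: String
  , fieldValue :: String
  } deriving (Eq, Show)

-- Parser of shape: input -> (Complement, a).
parseField :: String -> Maybe (Complement, Field)
parseField input =
  case break (== ':') input of
    (lhs, ':' : rhs) ->
      let (name, ws1)  = spanEnd isSpace lhs
          (ws2, value) = span isSpace rhs
      in Just (Complement ws1 ws2, Field name value)
    _ -> Nothing
  where
    -- Split the trailing characters satisfying p off the end of a string.
    spanEnd p s =
      let (revSuffix, revPrefix) = span p (reverse s)
      in (reverse revPrefix, reverse revSuffix)

-- Printer of shape: (Maybe Complement, a) -> output.
-- With a complement the original spacing is reproduced; without one a
-- default layout is used.
printField :: (Maybe Complement, Field) -> String
printField (Just (Complement ws1 ws2), Field name value) =
  name ++ ws1 ++ ":" ++ ws2 ++ value
printField (Nothing, Field name value) =
  name ++ ": " ++ value
```

In this sketch, parsing `"name  :   foo"` and printing the result together with its complement gives back the same string, while printing with `Nothing` falls back to `"name: foo"`.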

-- Namespace is a sum type representing all the data I want to index at the moment, but I'm changing it to a type class. See "annotating generic outputs" below.

-- | Label a trivia with its associated data "Namespace"
fromNamedTrivia :: Namespace -> Trivia -> TriviaTree

-- | Mark a given tree as being associated with aggregated data "Namespace"
mark :: Namespace -> TriviaTree -> TriviaTree

-- | What's the trivia associated with data "Namespace"
unmark :: Namespace -> TriviaTree -> TriviaTree
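
One possible model that satisfies these signatures is a tree of nested maps. The sketch below is an assumption for illustration, not the PR's actual definition: `Namespace` and `Trivia` are simplified stand-ins, and the `here`/`children` layout is invented. It relies on `Data.Map.Strict` from `containers`.

```haskell
import qualified Data.Map.Strict as Map

-- Simplified stand-ins; the real Namespace and Trivia types are richer.
newtype Namespace = Namespace String deriving (Eq, Ord, Show)
newtype Trivia = Trivia String deriving (Eq, Show)

-- A node holds the trivia attached directly to it, plus subtrees keyed by
-- the data they belong to.
data TriviaTree = TriviaTree
  { here     :: [Trivia]
  , children :: Map.Map Namespace TriviaTree
  } deriving (Eq, Show)

-- Merging two trees concatenates local trivia and merges subtrees key by key.
instance Semigroup TriviaTree where
  TriviaTree t1 c1 <> TriviaTree t2 c2 =
    TriviaTree (t1 <> t2) (Map.unionWith (<>) c1 c2)

instance Monoid TriviaTree where
  mempty = TriviaTree [] Map.empty

-- | Label a trivia with its associated data.
fromNamedTrivia :: Namespace -> Trivia -> TriviaTree
fromNamedTrivia ns t = mark ns (TriviaTree [t] Map.empty)

-- | Nest a whole tree under the given key.
mark :: Namespace -> TriviaTree -> TriviaTree
mark ns tree = TriviaTree [] (Map.singleton ns tree)

-- | Look up the subtree stored under the given key; empty if absent.
unmark :: Namespace -> TriviaTree -> TriviaTree
unmark ns tree = Map.findWithDefault mempty ns (children tree)
```

Under this model `unmark ns (mark ns t) == t`, and unmarking with a key that was never marked at that level silently yields `mempty`.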

However, there are some problems I am seeing. I don't think they are covered in the biparser paper, since these problems arise from the non-sequential nature of the final stage of the cabal parser.

performance

Data is duplicated all over the place as keys to indicate which sub-TriviaTree contains the trivia that interests us. I know little about how sharing works in Haskell/GHC and I wonder if this will cause memory pressure.

annotating generic outputs

I can't use data as a Namespace when I don't know its exact type. This is a problem for parser combinators that take a Parser (TriviaTree, a) and output a Parser (f (TriviaTree, a)), where f is a list, a non-empty list, etc.
To annotate the output trivia tree of the subparser in a combinator (e.g. adding a numbering annotation), I need to be able to turn its output data into a Namespace so that the numbering is associated with the right data. But I can't, because the data is generic and I don't know the right constructor to turn it into a Namespace.

I suppose one solution among others is to make Namespace a type class whose instances can all be turned into a Namespace, and to add that constraint where I need it. Something like https://hackage.haskell.org/package/xmonad-0.18.0/docs/src/XMonad.Core.html#fromMessage
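
A rough sketch of that idea, following the `SomeMessage`/`fromMessage` pattern from xmonad: an existential wrapper plus `Data.Typeable.cast`. All names here (`SomeNamespace`, `toNamespace`, `fromNamespace`, `keyed`, and the two example newtypes) are hypothetical and not taken from the PR.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Typeable (Typeable, cast, typeOf)

-- Hypothetical class: any type that may be used as a key in the trivia tree.
class (Typeable a, Ord a, Show a) => Namespace a

-- Existential wrapper that hides the concrete key type.
data SomeNamespace = forall a. Namespace a => SomeNamespace a

toNamespace :: Namespace a => a -> SomeNamespace
toNamespace = SomeNamespace

-- Recover the concrete key, if the requested type matches.
fromNamespace :: Namespace a => SomeNamespace -> Maybe a
fromNamespace (SomeNamespace a) = cast a

instance Eq SomeNamespace where
  a == b = compare a b == EQ

-- Keys of the same type compare by value; otherwise by type representation,
-- so SomeNamespace can serve as a Map key.
instance Ord SomeNamespace where
  compare (SomeNamespace a) (SomeNamespace b) =
    case cast b of
      Just b' -> compare a b'
      Nothing -> compare (typeOf a) (typeOf b)

instance Show SomeNamespace where
  showsPrec d (SomeNamespace a) = showsPrec d a

-- Example key types, simplified for illustration.
newtype PackageName = PackageName String deriving (Eq, Ord, Show)
newtype Version = Version [Int] deriving (Eq, Ord, Show)

instance Namespace PackageName
instance Namespace Version

-- A generic combinator can now key its results without knowing the
-- concrete type, as long as it carries the Namespace constraint.
keyed :: Namespace a => [a] -> [(SomeNamespace, a)]
keyed = map (\x -> (toNamespace x, x))
```

The cost of this design is that key mismatches move from compile time to run time: `fromNamespace` returns `Nothing` when the requested type is wrong.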

type-safety

The marking and unmarking is very error-prone. Currently I annotate and mark the trees as simply as possible, but there's no way to check at build time whether I mark things in the right order or whether I double-marked something. A mismatch between the parser's "marking" order and the printer's "unmarking" order will cause the trivia to go unseen and unused.

Bodigrim

It looks like great work, but I personally find it too big to review. There seems to be a lot of shuffling stuff around intermingled with actual changes. Could the splitting of Distribution.Types.Foo into Distribution.Types.Foo.{Internal,Parser,Pretty} be moved into its own commits at the beginning of the branch?

(I'm not a maintainer here and you don't need my review, so feel free to ignore)

Bodigrim · 2026-01-18T13:53:36Z

    -- See also https://github.com/ekmett/transformers-compat/issues/35
    , transformers (>= 0.3      && < 0.4) || (>=0.4.1.0 && <0.7)

+    , pretty-simple


Cabal-syntax is a GHC boot library and cannot depend on non-boot ones like pretty-simple.

I know, this was for the demonstration. The trivia map is highly nested and is hard to visualize without a pretty printer.

Bodigrim · 2026-01-18T13:55:27Z

 -- | Skips /zero/ or more white space characters. See also 'skipMany'.
-spaces :: CharParsing m => m ()
+spaces :: (Monad m, CharParsing m) => m ()
 spaces = skipMany space <?> "white space"


You can define spaces = void spaces' now, so that the relation between combinators becomes even more apparent.
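
A minimal sketch of this suggestion, using `ReadP` from `base` instead of Cabal's `CharParsing` class. It assumes that `spaces'` returns the whitespace it consumed; that return type is a guess, since the PR's definition is not shown here.

```haskell
import Control.Monad (void)
import Data.Char (isSpace)
import Text.ParserCombinators.ReadP (ReadP, munch, readP_to_S)

-- Assumed shape: consume zero or more whitespace characters and return them,
-- so that the caller can keep them as trivia.
spaces' :: ReadP String
spaces' = munch isSpace

-- The trivia-discarding variant is the same parser with its result dropped.
spaces :: ReadP ()
spaces = void spaces'

-- Helper for trying both parsers on an input string.
runBoth :: String -> ([(String, String)], [((), String)])
runBoth s = (readP_to_S spaces' s, readP_to_S spaces s)
```

Both parsers consume exactly the same input; they differ only in whether the whitespace is returned.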

Bodigrim · 2026-01-18T13:59:17Z

+{-# LANGUAGE DeriveGeneric #-}
+{-# LANGUAGE StrictData #-}
+
+module Distribution.Types.Annotation where


Could we possibly have more comments about the purpose and semantics of this module and its constituents? It's hard to review otherwise.

ulysses4ever · 2026-01-18T17:26:18Z

Point of order

(I'm not a maintainer here and you don't need my review, so feel free to ignore)

@Bodigrim's comments are very much welcomed by the Cabal maintainers, and we hope they will be addressed (and we hope for more of them!).

leana8959 · 2026-01-19T07:27:41Z

Thanks for the comments bodigrim :)
In fact, I have started implementing my idea of using an existential type to represent the namespace. This should solve the module cycle problem. I'll do that, clean up the code and let you know!


leana8959 added the exact-print label Jan 15, 2026

leana8959 marked this pull request as draft January 15, 2026 15:24

leana8959 added 6 commits January 16, 2026 10:40

checkpoint

e26def3

checkpoint

1a2721e

split pretty type classes

d8c7d11

checkpoint

fdbe263

checkpoint

84f7588

complete rewrite in prettier super class

20f1ec2

Bodigrim reviewed Jan 18, 2026


leana8959 added 18 commits January 19, 2026 09:07

fix trivia passing

9c1db53

restore UnqualComponentName

19d3455

restore VersionRange

8a7ef94

restore Backpack

81a1783

restore Dependency

1daf21e

restore Version

22c2b5c

restore PackageName

056b286

restore System

51cf689

restore LicenseReference

72ebadc

restore LicenseId

3845195

restore Flag

4e501ea

restore License

d0e6229

restore LicenseExceptionId

6c81d4a

resotre LicenseExpression

ef23729

restore IncludeRenaming

4ac13bc

restore LibraryName

0163db3

restore Parsec

dce50a1

add notes for reviewer

00ba256

leana8959 added 30 commits February 24, 2026 15:55

accept packageDescription golden tests

6dddd34

retrieve position in freeTextFieldDef

b2abfa0

register placement in freetextfield

3fe83ec

register exact placement perline instead

0afdc9f

retrieve exact fieldline representation for fieldlines

f9aa14b

render the freetext field using exact representation

68ca48e

propagate prettyfieldline position to exactdoc

a1844c4

fix linejump logic and decribe post condition in doc

8051ae8

fix assertion call

7e43702

fix cabal-version exact print

0d4c463

remove blankline for now

ce6caa4

Blankline doesn't roundtrip: our triviatree model doesn't associate it with anything. To be fixed in another way.

display parser error reason in parsertests

ff3c234

add packageDescription4 test

2220132

register optionalFieldAla position

dbeb91c

recover optionalFieldAla position

ee18323

recover optionalFieldAla position in exactdoc

ae72c3b

add packagedescription 4 roundtrip test

f41cf06

recover position in BuildType

f5b86a3

packagedescription 4 roundtrips

update comments

b9700cd

we don't use ordering numbers anymore

test "tested-with" field in packageDescription4

559887b

accept prettyfield

1d0391a

register TestedWith position

110f4c2

recover position in tested-with

fe5ae71

accept tested-with exactdoc transformation

3d3d517

recover ordering in fsep and pass on trivia

2a2628e

tested-with field content golden

aa0ace5

test-with field roundtrip test

6935764

write down problem about monoidal field grouping logic

f904dff

refactor exactParseSep

9479592

deduplicate function in pretty printer

f0eba2a

leana8959 wants to merge 345 commits into haskell:master from leana8959:field-grammar-trivia

4 participants
