Proof of concept of exact printer#11425
Conversation
|
The The idea is to model the parser as of type -- Namespace is a sumtype representing all the data I want to index at the moment, but I'm changing it to a type class. See "annotating generic outputs" below.
-- | Label a trivia with it's associated data "Namespace"
fromNamedTrivia :: Namespace -> Trivia -> TriviaTree
-- | Mark a given tree being associated with an aggregated data "Namespace"
mark :: Namespace -> TriviaTree -> TriviaTree
-- | What's the trivia associated with data "Namespace"
unmark :: Namespace -> TriviaTree -> TriviaTreeHowever there are some problems I am seeing. I don't think they are covered in the biparser paper since these problems arise due to the non-sequential nature of final stage of cabal parser. performanceData are duplicated all over the place as keys to indicate which of the sub triviatree contains trivia that interests us. I know little about how sharing works in Haskell/GHC and I wonder if this will cause memory pressure. annotating generic outputsI can't use data as I suppose one solution among others is to make type-safetyThe marking and unmarking is very error-prone. Currently I annotate and mark the trees in a way as simple as possible, but there's no way to check at build time if I mark things in the right order or if I double marked something. Mismatch in the parser "marking" order and the printer "unmarking" order will cause the trivia to not be seen and not used. |
Bodigrim
left a comment
There was a problem hiding this comment.
It looks like a great work, but I personally find it too big to review. There seems to be a lot of shuffling stuff around intermingled with actual changes. Could we separate splitting Distribution.Types.Foo into Distribution.Types.Foo.{Internal,Parser,Pretty} into their own commits at the beginning of the branch?
(I'm not a maintainer here and you don't need my review, so feel free to ignore)
| -- See also https://github.com/ekmett/transformers-compat/issues/35 | ||
| , transformers (>= 0.3 && < 0.4) || (>=0.4.1.0 && <0.7) | ||
|
|
||
| , pretty-simple |
There was a problem hiding this comment.
Cabal-syntax is a GHC boot library and cannot depend on non-boot ones like pretty-simple.
There was a problem hiding this comment.
I know, this was for the demonstration. The trivia map is highly nested and is hard to visualize without a pretty printer.
| -- | Skips /zero/ or more white space characters. See also 'skipMany'. | ||
| spaces :: CharParsing m => m () | ||
| spaces :: (Monad m, CharParsing m) => m () | ||
| spaces = skipMany space <?> "white space" |
There was a problem hiding this comment.
You can define spaces = void spaces' now, so that the relation between combinators becomes even more apparent.
| {-# LANGUAGE DeriveGeneric #-} | ||
| {-# LANGUAGE StrictData #-} | ||
|
|
||
| module Distribution.Types.Annotation where |
There was a problem hiding this comment.
Could we possibly have more comments about the purpose and semantics of this module and its constituents? It's hard to review otherwise.
|
Point of order
@Bodigrim's comments are very welcome by the Cabal maintainers, and we hope they will be addressed (and we hope for more of them!). |
|
Thanks for the comments bodigrim :) |
Blankline doesn't roundtrip: our triviatree model doesn't associate it with anything. To be fixed in another way.
packagedescription 4 roundtrips
we don't use ordering numbers anymore
Hello,
I made a proof of concept of the exact printer. The idea is that we collect all the trivia that is lost at the leaves, label them with corresponding data and propagate them upwards to the
ParsecFieldGrammar. We call this dataTriviaTree. ThisTriviaTreeis later propagated to the pretty printer fromPrettyFieldGrammarand within each pretty printer instance we can make wiser choice based on what whitespaces is lost and recreate the original output.I would like to know what you think about this approach!
In 408a8b3 where I accept the format test changes, you can see that some whitespaces have changed. The changes are not in the right order yet, but it proves that I can retrive whitespace data from the printer.