I'm writing a pretty-printer for a simple white-space sensitive language.
I like the Leijen pretty-printer library more than I like the Wadler library, but the Leijen library has one problem in my domain: any line break I insert may be overridden by the group
construct, which may compress any line, which might change the semantics of the output.
I don't think I can implement an ungroupable line in the wl-pprint (although I'd love to be wrong).
Looking a bit at the wl-pprint-extras package, I don't think that even the exposed internal interface allows me to create a line which will not be squashed by group
.
Do I just have to rely on the fact that I never use group
, or do I have some better option?
Given that you want to be able to group and you also need to be able to ensure some lines aren't uninserted, why don't we use the fact that the library designers encoded the semantics in the data type, instead of in code. This fabulous decision makes it eminently re-engineerable.
The Doc
data type encodes a line break using the constructor Line :: Bool -> Doc
.
The Bool represents whether to omit a space when removing a line. (Lines indent when they're there.)
Let's replace the Bool:
data LineBehaviour = OmitSpace | AddSpace | Keep
data Doc = ...
...
Line !LineBehaviour -- not Bool any more
The beautiful thing about the semantics-as-data design is that if we replace
this Bool
data with LineBehaviour
data, functions that didn't use it but
passed it on unchanged don't need editing. Functions that look inside at what
the Bool is break with the change - we'll rewrite exactly the parts of the code
that need changing to support the new semantics by changing the data type where
the old semantics resided. The program won't compile until we've made all the
changes we should, while we won't need to touch a line of code that doesn't
depend on line break semantics. Hooray!
For example, renderPretty
uses the Line
constructor, but in the pattern Line _
,
so we can leave that alone.
First, we need to replace Line True
with Line OmitSpace
, and Line False
with Line AddSpace
,
line = Line AddSpace
linebreak = Line OmitSpace
but perhaps we should add our own
hardline :: Doc
hardline = Line Keep
and we could perhaps do with a binary operator that uses it
infixr 5 <->
(<->) :: Doc -> Doc -> Doc
x <-> y = x <> hardline <> y
and the equvalent of the vertical seperator, which I can't think of a better name than very vertical separator:
vvsep,vvcat :: [Doc] -> Doc
vvsep = fold (<->)
vvcat = fold (<->)
The actual removing of lines happens in the group
function. Everything can stay the same except:
flatten (Line break) = if break then Empty else Text 1 " "
should be changed to
flatten (Line OmitSpace) = Empty
flatten (Line AddSpace) = Text 1 " "
flatten (Line Keep) = Line Keep
That's it: I can't find anything else to change!