
Trying to recreate pandoc CLI behavior by using the pandoc library


I've been using the pandoc executable to convert a markdown file to a man-page via

pandoc -f markdown -s -t man foo.1.md -o foo.1

and it's been working like a charm. However, for organization's sake, I decided I wanted to separate the markdown source into multiple files so that I can combine particular pieces to generate multiple man pages. I figured the easiest way (for my use case) would be to just use the pandoc library directly; however, I'm struggling to even recreate the above functionality. I assumed the following would do just that:

{-# LANGUAGE OverloadedStrings #-}
import Data.Default
import qualified Data.Text.IO as TIO
import Text.Pandoc

main :: IO ()
main = do
    input <- TIO.readFile "foo.1.md"
    res <- runIOorExplode $ 
        readMarkdown (def { readerExtensions = pandocExtensions, readerStandalone = True }) input >>=
            writeMan (def { writerExtensions = getDefaultExtensions "man" })
    TIO.writeFile "foo.1" res

However, this loses almost all of the formatting that is retained by the CLI invocation.

Even after looking at the haddocks and briefly skimming the source code, I don't understand what options the CLI invocation passes to the readMarkdown and writeMan functions. Perhaps I could work this out by spending time digesting the source code for Text.Pandoc.App, but I'm hoping someone here can save me some time and effort, either by telling me outright what I should be doing differently or by pointing me to the relevant blocks of code.

Note: I've additionally tried setting readerExtensions = getDefaultExtensions "markdown" and leaving off the readerStandalone option.

Thanks in advance!

Edit: After playing around with this some more, I realized the issue. Pandoc isn't actually eating up chunks of the formatting; the issue is that pandoc isn't generating the .TH block that's required at the top of a man-page in roff (and hence man -l ./foo.1 wasn't rendering the formatting that actually was present). Currently, my markdown file begins with

% FOO(1) foo 0.1.0.0
% Mark Down
% Today

which the cli invocation correctly translates into

.TH "FOO" "1" "Today" "foo 0.1.0.0" ""

and places the author credit at the end. So my question is now: what options do I need to pass to either readMarkdown or writeMan in order to have this "header" parsed correctly? I've tested the various options/extensions that seemed relevant, to no avail.


Solution

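  • Sorry for answering my own question, but I finally figured it out. The issue is that I needed to load the proper template for a man page: with writerTemplate left at its default of Nothing, writeMan emits only the document body, so the .TH line built from the title block never appears. The following code correctly renders the man page that I wanted: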

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Default
    import qualified Data.Text.IO as TIO
    import Text.Pandoc
    
    main :: IO ()
    main = do
        input <- TIO.readFile "foo.1.md"
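        res <- runIOorExplode $ do
            -- Load pandoc's built-in man template; it supplies the .TH header.
            t <- compileDefaultTemplate "man"
            readMarkdown (def { readerExtensions = pandocExtensions, readerStandalone = True }) input >>=
                -- With a template set, the writer produces a standalone page (like -s).
                writeMan (def { writerTemplate = Just t })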
        TIO.writeFile "foo.1" res
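
For the original goal of assembling one man page from several Markdown files, a minimal sketch along the same lines might look like the following. The fragment file names in `parts` and the helper `renderMan` are hypothetical names introduced here for illustration; the pandoc calls are the same ones used in the solution above, and the sketch assumes the first fragment carries the `% FOO(1) ...` title block.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Default (def)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.Pandoc

-- Hypothetical fragment files; the first one is assumed to start with
-- the "% FOO(1) ..." title block.
parts :: [FilePath]
parts = ["foo-header.md", "foo-synopsis.md", "foo-options.md"]

-- Hypothetical helper: render Markdown text as a standalone man page,
-- using the same reader and writer options as the solution above.
renderMan :: T.Text -> IO T.Text
renderMan input = runIOorExplode $ do
    t <- compileDefaultTemplate "man"
    doc <- readMarkdown (def { readerExtensions = pandocExtensions, readerStandalone = True }) input
    writeMan (def { writerTemplate = Just t }) doc

main :: IO ()
main = do
    -- Join fragments with a blank line so the last block of one file
    -- does not run into the first block of the next.
    input <- T.intercalate "\n\n" <$> mapM TIO.readFile parts
    renderMan input >>= TIO.writeFile "foo.1"
```

Concatenating the text before parsing keeps a single title block at the top of the combined document; generating a different man page is then a matter of changing the list in `parts`.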