haskell latex pandoc mathml

Pandoc filter mimicking default MathML conversion


I am writing a Pandoc JSON filter in Haskell that should transform display LaTeX math to SVG with an external application, whereas inline LaTeX math should be transformed to MathML by pandoc internally.

The SVG part is working fine; it is the MathML part, which should mimic standard pandoc behaviour, that is giving me problems.

Browsing Hackage, I found the texMathToMathML code example (see below). This function returns Either String Element.

However, what I need is a function tex2mml (see below) returning an IO String. What needs to be added to the definition of tex2mml to achieve this?

tex2mml latex = texMathToMathML DisplayInline latex

I am doing this on (X)Ubuntu LTS 16.04 with the following pandoc 1.16.0.2 packages installed:

$ sudo apt install pandoc libghc-pandoc-prof

Here is an excerpt of what I have so far:

#!/usr/bin/env runhaskell

import Text.Pandoc.JSON
import Control.Applicative ((<$>))
import Text.TeXMath (writeMathML, readTeX, DisplayType( DisplayInline ) )
import Text.XML.Light (Element)


texMathToMathML :: DisplayType -> String -> Either String Element
texMathToMathML dt s = writeMathML dt <$> readTeX s


tex2mml :: String -> IO String
tex2mml latex = texMathToMathML DisplayInline latex


main :: IO ()
main = toJSONFilter tex2math
  where tex2math (Math (DisplayMath) latex) = do
          svg <- tex2svg latex
          return (Math (DisplayMath) (svg))

        tex2math (Math (InlineMath) latex) = do
          mml <- tex2mml latex
          return (Math (InlineMath) (mml))

        tex2math other = return other

Please bear with me, as I am an absolute Haskell beginner. Any suggestions for code improvement are more than welcome!


Solution

  • Admittedly I'm not familiar with Pandoc and the problem domain, but if I have correctly understood the purpose of the tex2mml function, then I believe this should achieve what you want:

    import Control.Applicative ((<$>))
    import Text.Pandoc.JSON
    import Text.TeXMath
           (writeMathML, readTeX, DisplayType(DisplayInline))
    import Text.XML.Light (Element,showElement)
    
    texMathToMathML :: DisplayType -> String -> Either String Element
    texMathToMathML dt s = writeMathML dt <$> readTeX s
    
    tex2mml :: String -> String
    tex2mml latex = either id showElement (texMathToMathML DisplayInline latex)
    
    -- placeholder: the actual definition of tex2svg goes here
    tex2svg :: String -> IO String
    tex2svg = undefined
    
    main :: IO ()
    main = toJSONFilter tex2math
      where
        tex2math :: Inline -> IO Inline
        tex2math (Math DisplayMath latex) = do
          svg <- tex2svg latex
          return (Math DisplayMath svg)
        tex2math (Math InlineMath latex) = return (Math InlineMath (tex2mml latex))
        tex2math other = return other
    

    I'm using the either function to scrutinise the result of the conversion function texMathToMathML: in case of failure (Left) the error message is returned as is (id), and in case of success (Right) the showElement function is used to convert the Element into its XML string representation.

    This could also be rewritten using pattern matching if you find that clearer:

    tex2mml :: String -> String
    tex2mml latex = case texMathToMathML DisplayInline latex of
      Left err -> err
      Right xml -> showElement xml
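
    Note that both versions return a String in either case, so the caller cannot tell an error message from MathML. If that matters, one option is to fall back to the original input on failure. Here is a minimal sketch of that idea using only base; fakeConvert is a hypothetical stand-in for texMathToMathML and is not part of texmath:

    ```haskell
    -- Hypothetical stand-in for the real conversion: fails on empty input,
    -- otherwise wraps the input in a <math> tag. Not part of texmath.
    fakeConvert :: String -> Either String String
    fakeConvert "" = Left "empty formula"
    fakeConvert s  = Right ("<math>" ++ s ++ "</math>")

    -- Collapse error and result into one String, as tex2mml does above.
    collapse :: String -> String
    collapse = either id id . fakeConvert

    -- Alternative: keep the original input when the conversion fails.
    orOriginal :: String -> String
    orOriginal s = either (const s) id (fakeConvert s)
    ```

    With the real functions, the same shape would be either (const latex) showElement (texMathToMathML DisplayInline latex), assuming that leaving the LaTeX untouched is the behaviour you want on failure.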
    

    As the computation is pure, it doesn't need to be embedded in the IO monad, and the result can be passed straight into the Math constructor.
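
    If you do want to keep the String -> IO String signature from the question, a pure function can be lifted into IO with return. A minimal sketch, assuming a hypothetical pure stand-in pureConvert in place of tex2mml so that it runs with base alone:

    ```haskell
    import Data.Char (toUpper)

    -- Hypothetical pure stand-in for tex2mml (it just upper-cases its input).
    pureConvert :: String -> String
    pureConvert = map toUpper

    -- The same conversion with the IO String result type from the question:
    -- return wraps the pure result in IO without performing any effects.
    convertIO :: String -> IO String
    convertIO = return . pureConvert
    ```

    Inside a do block this can then be used as mml <- convertIO latex, exactly like tex2svg. The pure version is usually preferable, though, because its type says that no side effects are involved.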

    The Text.XML.Light.Output module also has other functions, for example ppElement if you wish to pretty-print the XML string, or showTopElement if you wish to include the XML document header in the output.
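
    As a rough illustration of those output functions, the sketch below renders one small element in four ways. It assumes the xml package is installed; sample is a hypothetical hand-built element standing in for the result of writeMathML:

    ```haskell
    import Text.XML.Light
           (Element, unode, showElement, showTopElement, ppElement, ppTopElement)

    -- Hypothetical sample element standing in for the output of writeMathML.
    sample :: Element
    sample = unode "math" (unode "mi" "x")

    compact, withHeader, pretty, prettyWithHeader :: String
    compact          = showElement sample     -- single line, no XML header
    withHeader       = showTopElement sample  -- XML header, then the element
    pretty           = ppElement sample       -- indented, no XML header
    prettyWithHeader = ppTopElement sample    -- XML header, indented
    ```

    Any of these can replace showElement in tex2mml, since they all have the type Element -> String.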