
Pandoc latex to html: cannot keep code formatting


I'm trying to convert LaTeX to HTML while preserving the formatting of the included source code. I have the following .tex file (book.tex):

\documentclass{article}
\begin{document}

\chapter{Heading}

First paragraph of text.
\begin{code}
function sum(a, b) {
  return a + b;
}
\end{code}

Last paragraph of text.

\end{document}

When I convert it with a plain pandoc book.tex -o book.html, I get the following HTML:

<h1 id="heading">Heading</h1>
<p>First paragraph of text.</p>
<div class="code">
<p>function sum(a, b) <span> return a + b; </span></p>
</div>
<p>Last paragraph of text.</p>

The code formatting is completely lost.

Ideally I would get something like this:

<h1 id="heading">Heading</h1>
<p>First paragraph of text.</p>
<pre>
function sum(a, b) {
  return a + b;
}
</pre>
<p>Last paragraph of text.</p>

I tried a Lua filter, run with pandoc --lua-filter=transform.lua book.tex -o book.html:

-- Replace every Div with class "code" by a raw HTML <pre> element.
function Div(div)
    if div.classes:includes("code") then
        -- stringify flattens the already-parsed content to plain text
        local content = pandoc.utils.stringify(div.content)

        return pandoc.RawBlock('html', '<pre>' .. content .. '</pre>')
    end

    return div
end

This gets me the pre element, but the line breaks, indentation, and braces are still gone:

<h1 id="heading">Heading</h1>
<p>First paragraph of text.</p>
<pre>function sum(a, b)  return a + b; </pre>
<p>Last paragraph of text.</p>

Thank you


Solution

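  • Pandoc doesn't know the code environment, so it parses the contents as ordinary LaTeX: the braces are read as grouping and the line breaks as plain whitespace. The formatting is therefore already gone before a Lua filter sees the Div. Better results can be achieved with the lstlisting or minted environment, which pandoc reads as verbatim code blocks. For example: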

    \begin{minted}{javascript}
    function sum(a, b) {
      return a + b;
    }
    \end{minted}
    

    This will result in properly highlighted code blocks. The downside is that this requires changes to the LaTeX input.
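
    If syntax highlighting isn't needed, the standard verbatim environment is a smaller change, since pandoc also reads it as a code block. As a sketch, assuming the rest of book.tex stays as in the question, only the environment name changes:

    ```latex
    First paragraph of text.
    \begin{verbatim}
    function sum(a, b) {
      return a + b;
    }
    \end{verbatim}
    ```

    Running pandoc book.tex -o book.html on this should produce a <pre><code> block with the line breaks, indentation, and braces intact.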