What might cause "commitAndReleaseBuffer: invalid argument (invalid character)" when using pandoc as a library?

I'm using pandoc as a library, and the relevant code snippet is:

module Lib
    ( latexDirToTex, latexToTxt
    ) where

import qualified Data.ByteString as BS
import           Data.List (isSuffixOf)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import           ForeignLib (chdir)
import           Path
import           System.Directory (getDirectoryContents )
import           Text.Pandoc
import           Text.Pandoc.UTF8 (toText)

latexToTxt :: Path b File -> IO T.Text
latexToTxt  fPath = do
  fileBS <- BS.readFile $ toFilePath fPath
  result <- runIO $ do
    doc <- readLaTeX def $ toText fileBS
    writePlain def doc
  handleError result

As you can see, I'm essentially just calling readLaTeX on the file contents and then rendering the result with writePlain.

However, when I run this code on real documents it fails with output like the following, ending in the error from the title:

[WARNING] Could not convert TeX math '\begin{array}{ccccccccccc}
       &  & 1 & 2 & 4 & 7 & 11 & 15 & 15 &  &  \\
  \hline
      0 & \vline & 1 & 0 & 0 & 0 & 0 & 0 & 0 & \vline & 1 \\
      1 & \vline & 1 & 1 & 0 & 0 & 0 & 0 & 0 & \vline & 3 \\
      2 & \vline & 1 & 2 & 1 & 0 & 0 & 0 & 0 & \vline & 9 \\
      3 & \vline & 1 & 3 & 3 & 1 & 0 & 0 & 0 & \vline & 26 \\
      4 & \vline & 1 & 4 & 6 & 4 & 1 & 0 & 0 & \vline & 72 \\
      5 & \vline & 1 & 5 & 10 & 10 & 5 & 1 & 0 & \vline & 191 \\
      6 & \vline & 0 & 6 & 15 & 20 & 15 & 6 & 1 & \vline & 482 \\
      7 & \vline & 0 & 0 & 21 & 35 & 35 & 21 & 7 & \vline & 1134 \\
      8 & \vline & 0 & 0 & 0 & 56 & 70 & 56 & 28 & \vline & 2422 \\
      9 & \vline & 0 & 0 & 0 & 0 & 126 & 126 & 34 & \vline & 4536 \\
      10 & \vline & 0 & 0 & 0 & 0 & 0 & 252 & 210 & \vline & 6930 \\
      11 & \vline & 0 & 0 & 0 & 0 & 0 & 0 & 462 & \vline & 6930
    \end{array}', rendering as TeX:
      0 & \vline & 1 & 0 & 0 & 0 & 0 & 0 &
          ^
  unexpected "\\"
  expecting "&", "\\\\", white space or "\\end"
arxiv-pandoc-static: <stdout>: commitAndReleaseBuffer: invalid argument (invalid character)

By contrast, when I use the pandoc executable directly, no such errors occur and the output is quite good. I'd like to configure the pandoc readers to be as flexible as possible and not to bail out on errors (or, better yet, to avoid the errors in the first place). How can I achieve this through the pandoc API?

Solution

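I believe this is less a pandoc issue than an issue with GHC or the text package. Note the <stdout> in the message: the failure happens while writing the output to standard output, not while parsing the LaTeX (the math conversion problem above it is only a warning). The answer can be found in the docs of a completely unrelated Haskell project, hledger: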

Getting errors like "Illegal byte sequence" or "Invalid or incomplete multibyte or wide character" or "commitAndReleaseBuffer: invalid argument (invalid character)"

Programs compiled with GHC (hledger, haskell build tools, etc.) need to have a UTF-8-aware locale configured in the environment, otherwise they will fail with these kinds of errors when they encounter non-ascii characters.

To fix it, set the LANG environment variable to some locale which supports UTF-8. The locale you choose must be installed on your system.

So running something like export LANG=C.UTF-8 in your shell should resolve this.
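
If you can't control the environment the program runs in, it may also be possible to set the encoding from inside the program. The following is only a minimal sketch, assuming the failing write goes to stdout or stderr as the error message suggests. The helper name forceUtf8Output is made up for illustration; setLocaleEncoding and hSetEncoding are from base (GHC.IO.Encoding and System.IO).

```haskell
import GHC.IO.Encoding (setLocaleEncoding, utf8)
import System.IO

-- Hypothetical helper (not part of pandoc): make the program write UTF-8
-- regardless of what LANG / LC_ALL say.
forceUtf8Output :: IO ()
forceUtf8Output = do
  -- Default encoding for text handles created from here on.
  setLocaleEncoding utf8
  -- Set the standard handles explicitly as well, in case they were
  -- already initialised with the locale's encoding.
  hSetEncoding stdout utf8
  hSetEncoding stderr utf8

main :: IO ()
main = do
  forceUtf8Output
  -- Non-ASCII output of this kind is what triggers the error under an
  -- ASCII-only locale such as LANG=C.
  putStrLn "α ≤ β"
```

In the code from the question, the equivalent would be to call such a helper once at the start of main, before the text returned by latexToTxt is printed.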