
How can I strip figures and tables during a pandoc LaTeX to Word conversion?


I am trying to use pandoc to convert a thesis from LaTeX to docx. In general, this works well with the following command:

pandoc input.tex -f latex -t docx -s -o output.docx --bibliography references.bib --csl=mystyle.csl

However, I have an additional requirement that I am unable to fulfill: I want the output to be stripped of any figures and tables that are included in the source files. Reading the pandoc documentation and related Stack Overflow questions has not helped me so far.

Do you have suggestions on what could do the trick?


Solution

  • This is a textbook use case for pandoc filters. The following Lua filter will delete all tables and images:

    -- Returning an empty list from an element handler deletes that element.
    function Image () return {} end
    function Table () return {} end
    

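    Save it to a file, say remove-tables-images.lua, and pass the file to pandoc via the --lua-filter option. The input and output formats are inferred from the file extensions, so -f latex and -t docx can be omitted: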

    pandoc input.tex -s -o output.docx \
        --bibliography references.bib --csl=mystyle.csl \
        --lua-filter remove-tables-images.lua
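
One caveat, offered as a hedged sketch rather than as part of the original answer: from pandoc 3.0 onward, figures are represented by a dedicated Figure block element, so deleting only the Image inside one may leave the figure's caption behind. Assuming pandoc 3.0 or newer, a variant that also drops Figure elements might look like the following. The file name remove-tables-figures.lua is just an illustrative choice.

```lua
-- Sketch of a variant filter; the Figure handler assumes pandoc >= 3.0.
-- Returning an empty list from a handler deletes the matched element.

-- Inline images that are not wrapped in a figure.
function Image () return {} end

-- Whole figures, including their captions.
function Figure () return {} end

-- Tables, including their captions.
function Table () return {} end
```

Saved as remove-tables-figures.lua, it is passed to pandoc with --lua-filter in the same way as the filter above. Older pandoc versions have no Figure element, so that handler should simply never be called there.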