Unwrapping a Markdown table with a Pandoc Lua filter


I have a DOCX file with this content:

# Heading

+---------------------+
| Paragraph           |
|                     |
| ## Subheading       |
|                     |
| +-----------------+ |
| | Nested table    | |
| +-----------------+ |
+---------------------+

One last paragraph

Here is a sample file.

I want to run it through Pandoc and get this Markdown, with all tables unwrapped:

# Heading

Paragraph      

## Subheading  

Nested table 

One last paragraph

I'm trying to write a Lua filter using walk_block, but I have no experience with Lua and am not making any progress. Can anyone point me in a helpful direction?

function Table(table)
    pandoc.walk_block(table, {
        Str = function(el)
            -- TODO now what???
        end
    })
end

Solution

  • The Lua interface to tables is currently rather complex, so it's much simpler to first convert the table into a so-called simple table with pandoc.utils.to_simple_table. A simple table has a header row (header) and a list of body rows (rows); iterating over a row gives access to its cells. Each cell is just a Blocks list, so we can collect the contents of all cells in a single accumulator and return that in place of the table.

    Here's what this looks like:

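    function Table (tbl)
      -- Convert to the simpler table representation (header + rows).
      local simpleTable = pandoc.utils.to_simple_table(tbl)
      -- Accumulator for the contents of every cell.
      local blocks = pandoc.Blocks{}
      -- Header cells come first ...
      for _, headercell in ipairs(simpleTable.header) do
        blocks:extend(headercell)
      end
      -- ... followed by the body cells, row by row.
      for _, row in ipairs(simpleTable.rows) do
        for _, cell in ipairs(row) do
          blocks:extend(cell)
        end
      end
      -- Replace the table with the collected blocks.
      return blocks, false
    end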
    

    Running that filter should unwrap all tables, leaving just their contents.
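
To make the order of the collected contents concrete, the accumulator logic can be tried in plain Lua without Pandoc. This is only a sketch under stated assumptions: the `simple_table` data, the `extend` helper, and the `flatten` function are hypothetical stand-ins (strings in place of Pandoc blocks) and are not part of the Pandoc API.

```lua
-- Plain-Lua stand-in for the shape of a simple table: a list of header
-- cells plus a list of rows, where each cell is a list of "blocks".
-- Strings stand in for Pandoc blocks; no Pandoc function is used here.
local simple_table = {
  header = { {"Paragraph"}, {} },          -- second header cell is empty
  rows = {
    { {"Subheading"}, {"Nested table"} },  -- one row with two cells
  },
}

-- Append every item of `src` to `dst` (plays the role of blocks:extend).
local function extend(dst, src)
  for _, item in ipairs(src) do
    dst[#dst + 1] = item
  end
  return dst
end

-- Collect the contents of all cells: header first, then row by row.
local function flatten(tbl)
  local blocks = {}
  for _, cell in ipairs(tbl.header) do
    extend(blocks, cell)
  end
  for _, row in ipairs(tbl.rows) do
    for _, cell in ipairs(row) do
      extend(blocks, cell)
    end
  end
  return blocks
end

print(table.concat(flatten(simple_table), " | "))
-- prints: Paragraph | Subheading | Nested table
```

Empty cells contribute nothing, so the result is simply the cell contents in reading order, which is what the filter hands back to Pandoc in place of the table.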