
Pandoc filter in Lua to alter text that is not in headings


I am writing a Lua filter for pandoc that adds a glossary function to the HTML output of a Markdown file. The goal is to add mouseover text to each occurrence of an acronym or key definition in the document.

However, I don't want this to occur for text in headings.

My MWE works on most* text in the document:

-- Parse glossary file (summarised here for brevity)
local glossary = {CO = "Cardiac Output", DBP = "Diastolic Blood Pressure", SBP = "Systolic Blood Pressure"}

-- Substitute glossary term for span with a mouseover link
function Str(elem)
  for key, value in next, glossary do
    if elem.text == key then
      return pandoc.Span (key, {title = value, class = "glossary"})
    end
  end
end

From the documentation and from poking at the AST, my understanding is that I need to use a block-level function first and then walk_block to alter the inline elements.

function Pandoc(doc)
  for i, el in pairs(doc.blocks) do
    if (el.t ~= "Header") then
      return pandoc.walk_block(el, {
        Str = function (el)
          for key, value in next, glossary do
            if el.text == key then
              return pandoc.Span (key, {title = value, class = "glossary"})
            end
          end
        end })
    end
  end
end

However, this attempt doesn't work and returns the error: Error while trying to get a filter's return value from Lua stack. PandocLuaError "Could not get Pandoc value: expected table, got 'nil' (nil)". I think my return structure is wrong, but I haven't been able to debug it.


My test markdown file contains:

# Acronyms: SBP, DBP & CO

Spaced acronyms: CO and SBP and DBP.

In a comma-separated list: CO, SBP, DBP; with backslashes; CO/DBP/SBP, and in bullet points:
  
* CO
* SBP
* DBP

*It fails on terms that are directly adjacent to non-space characters, such as punctuation.


Solution

  • After a couple more days, I have found a partial solution which may help anyone else with a similar problem.

    I think (but am not certain) that a Pandoc(doc) function has to return a whole document, that is, both the list of block elements and doc.meta. My attempt above did not do that: it returned a single block from inside the loop.
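
For anyone who would rather keep the single Pandoc(doc) function from the question, here is a minimal sketch of how it might be repaired. It assumes that the function has to hand back the document itself: each non-heading top-level block is replaced in place and the modified doc is returned at the end. The name glossary_filter is introduced here for illustration only. Note that only top-level blocks are checked, so a heading nested inside a block quote or Div would still be processed.

```lua
-- Glossary table, as in the question.
local glossary = {
  CO = "Cardiac Output",
  DBP = "Diastolic Blood Pressure",
  SBP = "Systolic Blood Pressure",
}

-- Inline filter (hypothetical name): wrap a Str whose text is exactly
-- a glossary key in a Span carrying the definition as its title.
local glossary_filter = {
  Str = function(el)
    local definition = glossary[el.text]
    if definition then
      return pandoc.Span({pandoc.Str(el.text)},
                         {title = definition, class = "glossary"})
    end
  end,
}

function Pandoc(doc)
  for i, block in ipairs(doc.blocks) do
    -- Leave top-level headings alone; walk every other block.
    if block.t ~= "Header" then
      doc.blocks[i] = pandoc.walk_block(block, glossary_filter)
    end
  end
  -- Return the whole document rather than a single block.
  return doc
end
```

The key differences from the question's version are that nothing is returned from inside the loop and that the final return value is doc itself, which still carries doc.meta.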

    My solution has been to separate the glossary function out and then call it individually for each desired block element. This works, even though it is a little clunky.

    function glos_sub (el)
      return pandoc.walk_block(el, {
        Str = function (el)
          for key, value in next, glossary do
            if el.text == key then
              return pandoc.Span (key, {title = value, class = "glossary"})
            end
          end
        end
      })
    end
    
    -- Run on desired elements
    return {
      {BulletList = glos_sub},
      {Para = glos_sub},
      {Table = glos_sub}
    }
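
The footnote in the question (terms next to punctuation are missed) is not addressed by the code above, because a token such as "DBP;" or "CO/DBP/SBP," arrives as one Str whose text is not exactly a glossary key. A rough sketch of one way around this is to split each Str on runs of ASCII letters and digits and wrap only the runs that are glossary keys. The helpers split_terms and glossary_str are hypothetical names introduced for this example, and the Span call simply reuses the form from the question.

```lua
-- Glossary table, as in the question.
local glossary = {
  CO = "Cardiac Output",
  DBP = "Diastolic Blood Pressure",
  SBP = "Systolic Blood Pressure",
}

-- Split text into pieces: glossary terms (with a definition) and the
-- plain text between them. Words are runs of ASCII letters/digits.
local function split_terms(text)
  local pieces, plain, pos = {}, "", 1
  while pos <= #text do
    local s, e = text:find("%w+", pos)
    if not s then
      plain = plain .. text:sub(pos)  -- trailing punctuation
      break
    end
    local word = text:sub(s, e)
    plain = plain .. text:sub(pos, s - 1)
    if glossary[word] then
      if plain ~= "" then
        pieces[#pieces + 1] = {text = plain}
        plain = ""
      end
      pieces[#pieces + 1] = {text = word, definition = glossary[word]}
    else
      plain = plain .. word  -- ordinary word, keep as plain text
    end
    pos = e + 1
  end
  if plain ~= "" then
    pieces[#pieces + 1] = {text = plain}
  end
  return pieces
end

-- Str handler: returns a list of inlines when at least one glossary
-- term was found, and nil (element unchanged) otherwise.
local function glossary_str(el)
  local inlines, found = {}, false
  for i, piece in ipairs(split_terms(el.text)) do
    if piece.definition then
      found = true
      inlines[i] = pandoc.Span({pandoc.Str(piece.text)},
                               {title = piece.definition, class = "glossary"})
    else
      inlines[i] = pandoc.Str(piece.text)
    end
  end
  if found then
    return inlines
  end
end
```

To try it, glossary_str could be used as the Str entry of the table passed to pandoc.walk_block in glos_sub. Whole words are compared, so "COPD" is left alone, but an underscore counts as a separator, so "CO_2" would still match CO.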