
Solving the first exercise on "Pandoc Filters" page


The first exercise at https://pandoc.org/filters.html#exercises asks to convert all regular text to uppercase, except text that is part of a URL or a link title. I read the discussion of "Execution Order" for Lua filters at https://pandoc.org/lua-filters.html#execution-order and came up with

text = require 'text'

links = {}

function Link(el)
  links[el.target] = el.content
  return el
end

function Str(el)
  el.text = text.upper(el.text)
  return el
end

function Inlines(elems)
  for i=1,#elems,1 do
    if elems[i].tag == 'Link' then
      elems[i].content = '<====' .. links[elems[i].target] .. '====>' -- just so that I can see it in the document.
      -- elems[i].content = pandoc.Str 'hello'
    end
  end
  return elems
end

--[[ -- Explicitly force order of filters -- from "Execution Order" list...
return {
  { Link = Link,
    Str = Str,
    Inlines = Inlines
  }
}
]]

thinking that this would solve my problem, but I cannot get it to work. I have also tried forcing the order in which the filter functions are called by returning an explicit filter table (commented out at the end of the script), and that doesn't work either. What am I doing wrong?


Solution

  • The exercise asks:

    Put all the regular text in a markdown document in ALL CAPS (without touching text in URLs or link titles).

    This can be done the way you describe above:

    local text = require 'text'
    function Str (s)
      s.text = text.upper(s.text)
      return s
    end
    

    This leaves URLs and link titles alone, because they are stored as plain string fields of the Link element rather than as Str elements.


    Leaving the link text alone is a bit more difficult. Pandoc Lua filters traverse the document tree in depth-first postorder, so a Link node will be handled only after its content has been handled. We can verify and visualize this with a simple filter like

    function Inline (i)
      print(i.tag, pandoc.utils.stringify(i))
    end
    

    Running the above on an input like Hello, [Free Encyclopedia](https://en.wikipedia.org) will produce

    Str     Hello,
    Space    
    Str     Free
    Space    
    Str     Encyclopedia
    Link    Free Encyclopedia
    
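    The same ordering can be reproduced outside pandoc with a toy tree. The following is only an illustrative sketch in plain Lua: the `doc` table and the `walk` function are made-up stand-ins, not pandoc's actual data structures or API.

    ```lua
    -- Toy model of "Hello, [Free Encyclopedia](...)"; NOT pandoc's real types.
    local doc = {
      {tag = 'Str', text = 'Hello,'},
      {tag = 'Space'},
      {tag = 'Link', content = {
        {tag = 'Str', text = 'Free'},
        {tag = 'Space'},
        {tag = 'Str', text = 'Encyclopedia'},
      }},
    }

    local visited = {}

    -- Depth-first postorder: recurse into the children first,
    -- then visit the node itself.
    local function walk (nodes)
      for _, node in ipairs(nodes) do
        if node.content then walk(node.content) end
        visited[#visited + 1] = node.tag
      end
    end

    walk(doc)
    print(table.concat(visited, ' '))  -- Str Space Str Space Str Link
    ```

    The Link is recorded last, after all of its children, which matches the order printed by the filter above.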

    Using Inlines instead of Inline is no different: the nested elements are processed before we even know which element they belong to. This effectively means we cannot (easily) prevent conversions from affecting a specific subtree.

    That's unfortunate (and, as the author of the Lua filter system, something I'd like to change in the future). However, not all is lost. We can work around this with a simple trick: save, then restore the original link contents:

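    local text = require 'text'
    -- Original links, in the order in which they are visited.
    local links = pandoc.List()
    
    -- Pass 2: uppercase every Str, including those inside links.
    local function to_allcaps (s)
      s.text = text.upper(s.text)
      return s
    end
    
    -- Pass 1: remember each link; returning nothing leaves it unchanged.
    local function save_link (l)
      links:insert(l)
    end
    
    -- Pass 3: links are visited in the same order as in pass 1, so the
    -- first saved link is the original of the current one.
    local function restore_link (_)
      return links:remove(1)
    end
    
    -- Each table is a separate filter; the filters run one after another.
    return {
      {Link = save_link},
      {Str = to_allcaps},
      {Link = restore_link},
    }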
    

    Here, we traverse the document three times, as indicated by the three separate filters in the returned filter list. First, we collect all links into a list; then we make everything ALL CAPS; finally, we restore the original links, thus undoing the uppercase modifications in their link texts.

    Compact version:

    local text = require 'text'
    local links = pandoc.List{}
    return {
      {Link = function (l) links:insert(l) end},
      {Str = function (s) return pandoc.Str(text.upper(s.text)) end},
      {Link = function (_) return links:remove(1) end},
    }
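
    A possible alternative, assuming pandoc 2.17 or later: newer releases let a filter opt into top-down traversal through a `traverse` field, and a filter function can return `false` as a second value to skip the children of the returned element. The sketch below relies on that assumption and uses `pandoc.text` instead of `require 'text'`; neither detail is part of the answer above.

    ```lua
    -- Assumes pandoc >= 2.17 (top-down traversal support).
    local text = pandoc.text

    local filter = {
      -- Visit parents before their children.
      traverse = 'topdown',

      -- Uppercase every Str that is reached by the traversal.
      Str = function (s)
        return pandoc.Str(text.upper(s.text))
      end,

      -- Keep the link as it is; `false` stops descent into its content,
      -- so the Str elements inside the link are never visited.
      Link = function (l)
        return l, false
      end,
    }

    return {filter}
    ```

    With this ordering a Link is seen before its content, so a single pass is enough and no save and restore step is needed.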