I have a Markdown document containing raw LaTeX commands. I am trying to use a Lua filter with Pandoc (2.0.1.1) to convert the LaTeX commands into something more portable. In particular, commands that specify the language of text should be converted into spans with a lang attribute. The problem is that I don't know how to pass the attributes to the pandoc.Span constructor. This is my attempt at a filter (filter.lua):
function RawInline(elem)
  if elem.format == "tex" then
    -- capture the argument of \textspanish{...}
    local text = string.match(elem.text, "\\textspanish{(.+)}")
    if text then
      local contents = {pandoc.Str(text)}
      local attrs = pandoc.Attr("", {}, {lang = "es-SP"})
      return pandoc.Span(contents, attrs)
    end
  else
    return elem
  end
end
Sample usage:
echo '\textspanish{hola}' | pandoc -f markdown -t native --lua-filter=filter.lua
The output is [Para [Span ("",[],[]) [Str "hola"]]], with no attributes on the span.
If I pass a name and/or class to pandoc.Attr, these come through, e.g., attrs = pandoc.Attr("name",{"class"},{lang = "es-SP"}) produces [Para [Span ("name",["class"],[]) [Str "hola"]]]. But attributes I pass to the constructor never appear in the output. What is the right way to pass attributes to pandoc.Attr?
This used to be one of the rough edges in the Lua filter implementation; it has since been ironed out and made more user-friendly, so the above example now works as expected.
Internally, pandoc uses two-element tables to hold key-value pairs. It roughly looks like this:
attrs = pandoc.Attr("", {}, {{"lang", "es-SP"}})
Of course, this is not a great way to represent pairs. The reason for the current implementation is two-fold:
The last part is important when one wants to guarantee that the order of attributes won't be changed when passing through a filter. Lua does not define the order in which the keys of a table are traversed: the Lua table {one = 1, two = 2} could be read back into pandoc as the attribute list {one="1" two="2"} or as {two="2" one="1"}. Now, the order of attributes shouldn't matter for most applications, but we cannot be sure. Hence the less-than-intuitive representation.
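To make the ordering point concrete, here is a small sketch in plain Lua (no pandoc required). The helper to_pair_list is hypothetical and not part of pandoc; it converts a key-value table into a list of two-element tables, and it sorts the keys only to get a reproducible result, which is an assumption of this example rather than something pandoc does.

```lua
-- Hypothetical helper (not part of pandoc): turn a key-value table into an
-- ordered list of {key, value} pairs. Assumes all keys are strings.
local function to_pair_list(tbl)
  local keys = {}
  for key in pairs(tbl) do        -- traversal order of pairs() is unspecified
    keys[#keys + 1] = key
  end
  table.sort(keys)                -- impose an explicit, stable order
  local list = {}
  for i, key in ipairs(keys) do
    list[i] = {key, tostring(tbl[key])}
  end
  return list
end

local attributes = to_pair_list({one = 1, two = 2})
for _, pair in ipairs(attributes) do
  print(pair[1], pair[2])         -- ipairs walks the list in index order
end
```

Once the data is in list form, the order is part of the value itself, which is why a list of pairs can survive a round trip through a filter unchanged.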
The internal representation hasn't changed, but we have since improved the representation of Attr objects in Lua, extended the marshaling code, and added a Lua metatable. As a result, attribute tables are treated as expected. Furthermore, many users may find it more intuitive to use HTML-like attribute lists instead of "identifier, classes, attributes" triples. That is supported as well now:
attr = pandoc.Attr{id='some-id', class="one two", lang='es-SP'}
In fact, it is not necessary to use the pandoc.Attr constructor at all, just passing a table will work:
return pandoc.Span(contents, {lang='es-SP'})
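Putting this together, a minimal sketch of the filter from the question might look like the following. It assumes a pandoc version recent enough to accept a plain table as the attributes argument; spanish_text is a hypothetical helper introduced here only to keep the pattern matching separate from the pandoc calls.

```lua
-- filter.lua: sketch using the plain-table shorthand for attributes

-- Hypothetical helper: return the argument of \textspanish{...}, or nil.
local function spanish_text(raw)
  return string.match(raw, "\\textspanish{(.+)}")
end

function RawInline(elem)
  if elem.format ~= "tex" then
    return nil                    -- nil leaves the element unchanged
  end
  local text = spanish_text(elem.text)
  if text then
    -- the attribute table is passed directly, without pandoc.Attr
    return pandoc.Span({pandoc.Str(text)}, {lang = "es-SP"})
  end
end
```

It can be run with the same command as in the question. Raw TeX that does not match the pattern falls through and is left as it was.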