I'm trying to write a Lua filter for Quarto/pandoc that removes all code blocks that do not match the target language defined in the YAML header of the document. This is the filter I have so far:
taget_lang = nil
function Meta(m)
  if m.taget_lang then
    taget_lang = pandoc.utils.stringify(m.taget_lang)
  end
  print("In Meta, taget lang is " .. taget_lang)
  return m
end
function CodeBlock(el)
  print("In CodeBlock, taget lang is " .. taget_lang)
  if taget_lang then
    if el.attr.classes[1] ~= taget_lang then
      return {}
    end
    return el
  end
end
And this is an example markdown (or rather Quarto) document:
---
title: Some title
author: Some author
date: last-modified
format:
  ipynb:
    toc: false
filters:
  - langsplit.lua
taget_lang: "python"
---
Here is some text.
```{python test-py}
print("some python code")
```
```{r test-r}
print("some R code")
```
When I run quarto render test.qmd, I get this print output:
nil
nil
nil
nil
In Meta, taget lang is python
And the rendered document contains all code, telling me that the CodeBlock function has no access to the taget_lang defined inside Meta. But this should work, based on the documentation. Any clues?
(I'm also unhappy with return {}, which returns an empty code block instead of nothing, but that's a separate issue.)
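The docs specify that metadata is processed after blocks have been filtered: within a single filter, pandoc applies the functions for inline elements first, then those for block elements, then Meta, and finally Pandoc. As a result, taget_lang is set only after the CodeBlock elements have already been processed.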
There are two ways to deal with this. One method is to filter the main Pandoc element, which gives more control:
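function Pandoc (doc)
  -- Metadata values are pandoc objects, not plain strings; stringify before comparing.
  -- The key is spelled taget_lang to match the YAML header in the question.
  local meta_lang = doc.meta.taget_lang
  if not meta_lang then
    return doc -- no target language set, keep every block
  end
  local target_lang = pandoc.utils.stringify(meta_lang)
  return doc:walk {
    CodeBlock = function (cb)
      if cb.classes[1] ~= target_lang then
        return {} -- delete block
      end
    end
  }
end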
The alternative is to control the execution order of the filters by explicitly returning a sequence of filters, like so:
return {
  { Meta = Meta },
  { CodeBlock = CodeBlock },
}
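For reference, here is a minimal sketch of what the whole langsplit.lua could look like with the second approach. It assumes the language is the first class of each code block and keeps the metadata key spelled taget_lang, as in the question. The handler names read_meta and filter_code are placeholders chosen for this example, not anything pandoc requires.

```lua
-- langsplit.lua (sketch): keep only code blocks in the target language.
-- Assumption: the YAML key is spelled "taget_lang", as in the question.
local taget_lang = nil

-- Hypothetical handler name: stores the target language from the metadata.
local function read_meta(meta)
  if meta.taget_lang then
    taget_lang = pandoc.utils.stringify(meta.taget_lang)
  end
  -- Returning nothing leaves the metadata unchanged.
end

-- Hypothetical handler name: removes code blocks in other languages.
local function filter_code(el)
  if taget_lang and el.classes[1] ~= taget_lang then
    return {} -- an empty list removes the block
  end
  -- Returning nothing keeps the block unchanged.
end

-- Each table is a separate filter; pandoc applies them in list order,
-- so the metadata is read before any code block is visited.
return {
  { Meta = read_meta },
  { CodeBlock = filter_code },
}
```

Because taget_lang is a file-local variable here, the two filters share it without touching the global namespace. If the key is missing from the header, taget_lang stays nil and every code block is kept.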