Filter CodeBlocks for target language in Quarto to ipynb conversion


I'm trying to write a Lua filter for Quarto/pandoc that removes all code blocks whose language does not match the target language defined in the YAML header of the document. This is the filter I got from another answer:

function Pandoc (doc)
  if doc.meta.target_lang then
    local target_lang = pandoc.utils.stringify(doc.meta.target_lang)
    print("target lang is " .. target_lang)
    return doc:walk {
      CodeBlock = function (cb)
        if cb.classes[1] ~= target_lang then
          return {}
        end
      end
    }
  end
  return doc
end

However, this leaves an empty code cell behind, which is annoying when rendering to ipynb:

{
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [],
      "id": "c0eaebc8-2af1-42bc-840a-dda2e54fddbb"
}

This is the original document:

---
title: Some title
author: Some author
date: last-modified
format:
  ipynb: 
    toc: false
    filters: 
      - langsplit.lua
target_lang: "python"
---

Here is some text.

```{python test-py}
print("some python code")
```

```{r test-r}
print("some R code")
```

Solution

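  • I can propose the following hack. The empty cell in the question presumably appears because removing only the CodeBlock leaves the surrounding cell Div in place. So, for each Div with class cell, we check whether it contains a code block (class cell-code) whose language is not the target language; if it does, that code block is replaced with an empty pandoc.Para. We then drop every Div whose first element is a pandoc.Para.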

    local str = pandoc.utils.stringify
    
    local function lang_checker(target_lang)
      -- Look for Quarto code blocks (class `cell-code`) whose language is not
      -- `target_lang` and replace each one with an empty `pandoc.Para`, which
      -- serves as a marker for the Div filter below.
      return {
        CodeBlock = function(cb)
          if cb.classes:includes('cell-code') and (not cb.classes:includes(target_lang)) then
            return pandoc.Para("")
          else 
            return cb
          end
        end
      }
    end
    
    
    local function remove_div_with_para(target_lang)
      -- Remove each cell Div whose first element became a pandoc.Para
      return {
        Div = function(el)
          if el.classes:includes('cell') then
            local check_lang = el:walk(lang_checker(target_lang))
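            local first = check_lang.content[1]
            if first and first.t == "Para" then
              -- Returning an empty list deletes the Div; this avoids the
              -- pandoc.Null constructor, which newer pandoc versions may not provide.
              return {}
            else
              return el
            end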
          end
        end
      }
    end
    
    
    function Pandoc(doc)
      local target_lang = doc.meta.target_lang and str(doc.meta.target_lang) or nil
      if target_lang then
        return doc:walk(remove_div_with_para(target_lang))
      end
      return doc
    end
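
The same idea can probably be written as a single pass without the intermediate pandoc.Para marker. The following is only a sketch under the same assumptions as the filter above: Quarto wraps each cell in a Div with class cell and tags its code block with class cell-code. The helper name has_foreign_code is made up for illustration. Unlike the version above, it drops a cell if a non-matching code block appears anywhere inside it, not only as the first element.

```lua
local stringify = pandoc.utils.stringify

-- Hypothetical helper: returns true if `div` contains a Quarto code block
-- (class `cell-code`) whose classes do not include `target_lang`.
local function has_foreign_code(div, target_lang)
  local found = false
  -- The walk is used only for inspection; its result is discarded.
  div:walk {
    CodeBlock = function(cb)
      if cb.classes:includes('cell-code')
          and not cb.classes:includes(target_lang) then
        found = true
      end
    end
  }
  return found
end

function Pandoc(doc)
  -- Leave the document untouched when no target language is set.
  if not doc.meta.target_lang then
    return doc
  end
  local target_lang = stringify(doc.meta.target_lang)
  return doc:walk {
    Div = function(div)
      if div.classes:includes('cell')
          and has_foreign_code(div, target_lang) then
        -- An empty list deletes the whole cell Div, so no empty cell is left.
        return {}
      end
      -- Returning nil keeps every other Div unchanged.
    end
  }
end
```

Saved as langsplit.lua, this would be picked up through the same filters entry in the YAML header shown in the question.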