Read YAML metadata from a Pandoc markdown file

Is it possible to extract Pandoc's metadata (title, date, etc.) from a markdown file without writing a Haskell filter or parsing the --to=json output?

The JSON output is particularly inconvenient for this, since a two-word title looks like:

$ pandoc -t json posts/test.md | jq '.meta | .title'
{
  "t": "MetaInlines",
  "c": [
    {
      "t": "Str",
      "c": "Test"
    },
    {
      "t": "Space"
    },
    {
      "t": "Str",
      "c": "post"
    }
  ]
}

So even after jq has extracted the title, we still need to reassemble the words ourselves, and any emphasis, inline code, or other formatting in the title only makes that more complicated.
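To make the cost of that reassembly concrete, here is a minimal Python sketch that flattens the inline list shown above into plain text. The function name inlines_to_text is hypothetical, and only a handful of node types are handled (Str, Space, SoftBreak, LineBreak, Emph, Strong, Code); anything else is silently skipped, so treat it as an illustration rather than a complete converter.

```python
def inlines_to_text(inlines):
    """Flatten a list of Pandoc inline nodes to plain text (formatting is dropped)."""
    parts = []
    for node in inlines:
        kind, content = node["t"], node.get("c")
        if kind == "Str":
            parts.append(content)
        elif kind in ("Space", "SoftBreak", "LineBreak"):
            parts.append(" ")
        elif kind in ("Emph", "Strong"):
            # content is itself a list of inline nodes
            parts.append(inlines_to_text(content))
        elif kind == "Code":
            # content is [attributes, code text]
            parts.append(content[1])
        # Other node types (links, quotes, math, ...) are skipped in this sketch.
    return "".join(parts)


# The title node from the jq output above.
title = {
    "t": "MetaInlines",
    "c": [{"t": "Str", "c": "Test"}, {"t": "Space"}, {"t": "Str", "c": "post"}],
}
print(inlines_to_text(title["c"]))  # prints: Test post
```

Every additional inline type needs its own branch, which is the complication the question is trying to avoid.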

Solution

We can use the template variable $meta-json$ for this.

Put the variable in a file (give the file an extension, to stop Pandoc looking in its own directories) and then pass it with pandoc --template=file.ext.

Pandoc's output is then a JSON object with keys such as "title", "date", and "tags", holding their respective values from the markdown document, which we can easily parse, filter, and manipulate with jq.

$ echo '$meta-json$' > /tmp/metadata.pandoc-tpl
$ pandoc --template=/tmp/metadata.pandoc-tpl post.md | jq '.title,.tags'
"The Title"
[
  "a tag",
  "another tag"
]
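
The same approach can be driven from a script instead of jq. The following is a sketch only: read_metadata is a hypothetical helper name, and it assumes pandoc is on the PATH. It writes the one-line template to a temporary directory, runs pandoc with it, and parses the resulting JSON.

```python
import json
import subprocess
import tempfile
from pathlib import Path


def read_metadata(markdown_path):
    """Return a Pandoc document's metadata as a dict, via a $meta-json$ template."""
    with tempfile.TemporaryDirectory() as tmp:
        # The template contains nothing but the $meta-json$ variable.
        template = Path(tmp) / "metadata.pandoc-tpl"
        template.write_text("$meta-json$")
        result = subprocess.run(
            ["pandoc", f"--template={template}", str(markdown_path)],
            check=True,           # raise if pandoc exits with an error
            capture_output=True,  # collect stdout instead of printing it
            text=True,
        )
    return json.loads(result.stdout)
```

Calling read_metadata("posts/test.md")["title"] would then return the title as a single string. Pandoc's manual notes that meta-json field values are transformed to the selected output format, so adding --to=plain to the command may be preferable when plain strings are wanted.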