I want to convert a Markdown file to HTML with numbered headers, where the heading levels start from <h2>. How can I achieve this?
pandoc provides the option --number-sections (or -N) so that headers are numbered in the output. I am trying to convert Markdown to HTML with this option.
By default, the heading levels in pandoc's HTML output start from <h1>. This is not ideal, so I want them to start from <h2> instead (the original Markdown may contain many first-level headers, whereas the output HTML should contain at most one <h1>).
It is possible to specify --shift-heading-level-by=1; the output heading levels then start from <h2> (see the official Pandoc User's Guide and maybe also this question).
However, this messes up the section numbering, because the numbering level shifts too: all sections now fall under "0" (0.1, 0.2, 0.2.1, …) and no section 1 exists.
pandoc provides another option, --number-offset=1, but all it does is offset the numbers, e.g. "0.1"→"1.1". All section numbers then start with 1, and no section is numbered 2. Obviously, that makes no sense. The initial prefix "1." is redundant and should be removed from all the section numbers: 1.1→1, 1.1.4→1.4, 1.2.3→2.3, etc.
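The behaviour described above can be reproduced with a small model of hierarchical section counters. This is only a sketch in Python, not pandoc's actual implementation: the function name number_headings and its offset parameter are made up for illustration, and the offset is assumed to apply to the top-level counter only, as with --number-offset=1.

```python
def number_headings(levels, offset=0):
    """Return dotted section numbers for a sequence of heading levels.

    Simplified model (an assumption, not pandoc's code): one counter per
    level, and deeper counters are discarded when a shallower heading
    appears. `offset` is added to the top-level counter only.
    """
    counters = []
    numbers = []
    for level in levels:
        # Make sure a counter exists for this level; skipped levels stay at 0.
        while len(counters) < level:
            counters.append(0)
        # Discard counters of deeper levels, then bump the current one.
        del counters[level:]
        counters[level - 1] += 1
        shown = list(counters)
        shown[0] += offset
        numbers.append(".".join(str(n) for n in shown))
    return numbers


# Heading levels of the sample file after shifting by 1: h2, h3, h2, h3, h4.
shifted = [2, 3, 2, 3, 4]
print(number_headings(shifted))            # numbers all start with "0."
print(number_headings(shifted, offset=1))  # numbers all start with "1."
```

Because no level-1 heading is left after the shift, the top-level counter never moves, which is why every number carries the same leading component.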
For demonstration purposes, here is a sample Markdown text file (abc.md):
%Test-md
# First Header (1) #
## Header (1-1) ##
# Second Header (2) #
## Header (2-2) ##
### Header (2-3) ###
and its output HTML (simplified) with
pandoc -N --section-divs --shift-heading-level-by=1 -t html5 abc.md
<section id="first-header-1" data-number="0.1">
  <h2 data-number="0.1">0.1 First Header (1)</h2>
  <section id="header-1-1" data-number="0.1.1">
    <h3 data-number="0.1.1">0.1.1 Header (1-1)</h3>
  </section>
</section>
<section id="second-header-2" data-number="0.2">
  <h2 data-number="0.2">0.2 Second Header (2)</h2>
  <section id="header-2-2" data-number="0.2.1">
    <h3 data-number="0.2.1">0.2.1 Header (2-2)</h3>
    <section id="header-2-3" data-number="0.2.1.1">
      <h4 data-number="0.2.1.1">0.2.1.1 Header (2-3)</h4>
    </section>
  </section>
</section>
How can one make pandoc number the sections in the ordinary way (1, 2, 2.1, 2.2, 2.2.1) yet output HTML whose heading levels start from <h2>?
Pandoc first shifts the headings, then does the numbering. That is not what we want here, though: we'd like the numbering to happen first. A pandoc Lua filter can be used to take control of this.
The function pandoc.utils.make_sections performs the action that's triggered by passing --section-divs or --number-sections on the command line. The effect of --shift-heading-level-by=1 can then be matched by modifying all Header elements manually:
function Pandoc (doc)
  -- Create and number sections. Setting the first parameter to
  -- `true` ensures that headings are numbered.
  doc.blocks = pandoc.utils.make_sections(true, nil, doc.blocks)
  -- Shift the heading levels by 1
  doc.blocks = doc.blocks:walk {
    Header = function (h)
      h.level = h.level + 1
      return h
    end
  }
  -- Return the modified document
  return doc
end
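To see why doing the numbering before the shift gives the desired result, the order of operations can be modelled outside pandoc. The following is a rough Python sketch, not a port of the Lua filter: the function number_then_shift is hypothetical, and its counter logic is a simplified assumption about how section numbers are assigned.

```python
def number_then_shift(levels, shift=1):
    """Number headings on their original levels, then shift the levels.

    Simplified model of the filter's order of operations (an assumption,
    not pandoc's code). Returns (output_level, section_number) pairs.
    """
    counters = []
    result = []
    for level in levels:
        # Step 1: assign the number while the heading is still unshifted.
        while len(counters) < level:
            counters.append(0)
        del counters[level:]
        counters[level - 1] += 1
        number = ".".join(str(n) for n in counters)
        # Step 2: shift the level only after the number is fixed.
        result.append((level + shift, number))
    return result


# Heading levels of the sample file abc.md: h1, h2, h1, h2, h3.
for out_level, number in number_then_shift([1, 2, 1, 2, 3]):
    print(f"<h{out_level}> {number}")
```

In this model the numbers come out as 1, 1.1, 2, 2.1, 2.1.1 while the output levels run from 2 to 4, which is the combination the question asks for.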
To use the filter, save it to a file named shifted-numbered-headings.lua. It can then be passed to pandoc via the --lua-filter/-L parameter. The --number-sections/-N option must still be passed for the numbering to become visible, and --section-divs is still required to get <section> elements.
pandoc \
  --lua-filter=shifted-numbered-headings.lua \
  --number-sections \
  --section-divs \
  ...
The class that pandoc sets on the <section> elements will always reflect the actual tagging level: the <section> that wraps an <h2> heading will have class="level2", even if, conceptually, it is a first-level heading. This may be confusing and, unfortunately, cannot be changed with a filter.