Good morning!
I was following the answer at "Pandoc: avoid paragraphs or add css class to paragraph?", but it does not solve my problem. I also looked at another question: "Disable pandoc convert the image’s alt text to a paragraph when docx to markdown".
Here is a small example:
1. Lorem ipsum dolor sit amet consectetur adipisicing elit. Hic, reprehenderit.
2. Lorem ipsum dolor sit, amet consectetur adipisicing elit:

3. Lorem ipsum dolor sit amet consectetur adipisicing elit. Enim voluptates similique ab doloremque delectus veniam.
I ran the following:
pandoc bug.md -f markdown_github+fenced_divs-implicit_figures-native_divs+raw_html -t html -o bug.html
Here is the output:
<ol>
<li>
<p>Lorem ipsum dolor sit amet consectetur adipisicing elit. Hic,
reprehenderit.</p>
</li>
<li>
<p>Lorem ipsum dolor sit, amet consectetur adipisicing elit:</p>
<p><img src="assets/images/iuacessos-preferences.png" alt="example" /></p>
</li>
<li>
<p>Lorem ipsum dolor sit amet consectetur adipisicing elit. Enim
voluptates similique ab doloremque delectus veniam.</p>
</li>
</ol>
You can see that Pandoc adds a p element around every line, in every element, including inside the li elements. It also wrapped the img element in a p, and nested the p + img inside the li element.
The code should be like:
<ol>
<li>Lorem ipsum dolor sit amet consectetur adipisicing elit. Hic, reprehenderit.</li>
<li>Lorem ipsum dolor sit, amet consectetur adipisicing elit:</li>
<img src="assets/images/iuacessos-preferences.png" alt="example" />
<li>Lorem ipsum dolor sit amet consectetur adipisicing elit. Enim voluptates similique ab doloremque delectus veniam.</li>
</ol>
It is elegant and clean. GitHub, by contrast, produces exactly this output: it doesn't wrap every line in a p element, and it doesn't nest the image inside the li element.
Note that I mostly use markdown_github because it supports more features than other Pandoc Markdown variants.
Note that recent pandoc versions say Deprecated: markdown_github. Use gfm instead.
So what you should be using is:
pandoc -f gfm -o bug.html bug.md
which will use the exact same Markdown parser that GitHub itself uses.
Note that the HTML you posted under "the code should be like" is invalid, since an <ol> can only have <li> elements as direct children. Perhaps you meant:
<ol>
<li>Lorem ipsum dolor sit amet consectetur adipisicing elit. Hic, reprehenderit.</li>
<li>
Lorem ipsum dolor sit, amet consectetur adipisicing elit:
<img src="assets/images/iuacessos-preferences.png" alt="example" />
</li>
<li>Lorem ipsum dolor sit amet consectetur adipisicing elit. Enim voluptates similique ab doloremque delectus veniam.</li>
</ol>
For that HTML, pandoc -f html -t gfm gives the correct Markdown:
1. Lorem ipsum dolor sit amet consectetur adipisicing elit. Hic,
reprehenderit.
2. Lorem ipsum dolor sit, amet consectetur adipisicing elit:

3. Lorem ipsum dolor sit amet consectetur adipisicing elit. Enim
voluptates similique ab doloremque delectus veniam.
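To make the point about <ol> only allowing <li> as direct children concrete, here is a minimal sketch using Python's standard html.parser. The class name ListChildChecker and the function invalid_list_children are hypothetical names chosen for this illustration, and the check is deliberately simplified (it is not a full HTML validator):

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class ListChildChecker(HTMLParser):
    """Collects tags that sit directly inside <ol>/<ul> but are not <li>."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open (non-void) tags
        self.offenders = []  # tags found directly under a list element

    def _check(self, tag):
        if self.stack and self.stack[-1] in ("ol", "ul") and tag != "li":
            self.offenders.append(tag)

    def handle_starttag(self, tag, attrs):
        self._check(tag)
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <img ... />: check it, but do not push it.
        self._check(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop until the matching open tag has been removed.
            while self.stack and self.stack.pop() != tag:
                pass


def invalid_list_children(html):
    """Return the tags that appear as direct, non-<li> children of a list."""
    checker = ListChildChecker()
    checker.feed(html)
    checker.close()
    return checker.offenders


if __name__ == "__main__":
    wanted = '<ol><li>a</li><li>b:</li><img src="x.png" alt="example" /><li>c</li></ol>'
    fixed = '<ol><li>a</li><li>b: <img src="x.png" alt="example" /></li><li>c</li></ol>'
    print(invalid_list_children(wanted))
    print(invalid_list_children(fixed))
```

With the image placed between two <li> elements, the checker reports img as an offending child; once the image is moved inside the second <li>, nothing is reported.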
If you're wondering why you get the <p> around the image, here is what the MANUAL says:
A paragraph is one or more lines of text followed by one or more blank lines.
And why you get the <p> around the list items:
A bullet list is a list of bulleted list items. A bulleted list item begins with a bullet (*, +, or -). Here is a simple example:
* one
* two
* three
This will produce a “compact” list. If you want a “loose” list, in which each item is formatted as a paragraph, put spaces between the items:
* one

* two

* three
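The compact versus loose rule quoted above can be illustrated with a toy model. This is only a sketch under simplifying assumptions (single-line bullet items, no nesting); it is not pandoc's actual parser, and render_bullet_list is a hypothetical helper name:

```python
import re

# A bullet item: a bullet character (*, + or -), whitespace, then the item text.
ITEM = re.compile(r"^[*+-]\s+(.*)$")


def render_bullet_list(markdown):
    """Render a flat bullet list, wrapping items in <p> only when the list is loose."""
    lines = markdown.strip().split("\n")
    items = [m.group(1) for m in map(ITEM.match, lines) if m]
    # After stripping the ends, any remaining blank line lies between two items,
    # which is what makes the list "loose" in this simplified model.
    loose = any(line.strip() == "" for line in lines)
    rendered = [f"<li><p>{t}</p></li>" if loose else f"<li>{t}</li>" for t in items]
    return "<ul>\n" + "\n".join(rendered) + "\n</ul>"


if __name__ == "__main__":
    print(render_bullet_list("* one\n* two\n* three"))
    print(render_bullet_list("* one\n\n* two\n\n* three"))
```

The same idea explains the output in the question: the blank lines around the image inside the second item make the list loose, so every item's text ends up in its own paragraph.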