I have a markdown document named input.md that contains remote image links:
Using Pandoc to convert documents

Pandoc is really *awesome*!
I converted it to docx with the command below:
pandoc input.md -o output.docx
After getting output.docx, I converted it back to markdown:
pandoc output.docx --extract-media=. -t commonmark-raw_html -o output.md
Here the output format commonmark-raw_html was used to disable the image size tag. The converted content of output.md:
Using Pandoc to convert documents

this is the image caption
Pandoc is really *awesome*!
You can see that the image’s alt text, this is the image caption, is displayed twice. The first occurrence is the actual alt text of the image, and the second is a separate paragraph below it. I would like to remove that second paragraph, which is redundant.
Why did I convert markdown twice here? Because I wanted to replace the remote image links with local image links in markdown.
It would be great if you have any suggestions for resolving the issue. Thanks in advance!
The easiest way is possibly to convert directly to Markdown while using the --extract-media option:
pandoc input.md --extract-media=media -t commonmark -o output.md
That option can be used with any input format; it's just not as common. This is one of the use cases where it makes sense.
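If the docx round trip cannot be avoided for some reason, the redundant caption paragraph could also be removed afterwards with a small script. The following is only a minimal sketch in Python, not a Pandoc feature. The function name drop_duplicate_captions and the path media/image1.png are made up for illustration, and it assumes that the image stands alone in its own paragraph and that the duplicated text is the paragraph right after it.

```python
import re

# Matches a paragraph that consists of a single Markdown image: ![alt](target)
IMAGE_ONLY = re.compile(r"^!\[(?P<alt>[^\]]*)\]\([^)]*\)$")


def drop_duplicate_captions(markdown: str) -> str:
    """Drop a paragraph that only repeats the alt text of the image paragraph before it."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", markdown.strip())
    kept = []
    previous_alt = None
    for paragraph in paragraphs:
        text = paragraph.strip()
        if previous_alt and text == previous_alt:
            # Same text as the alt of the image just above: treat it as the redundant caption.
            previous_alt = None
            continue
        kept.append(paragraph)
        match = IMAGE_ONLY.match(text)
        previous_alt = match.group("alt") if match else None
    return "\n\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical content of output.md; the image path is an assumption.
    sample = (
        "Using Pandoc to convert documents\n\n"
        "![this is the image caption](media/image1.png)\n\n"
        "this is the image caption\n\n"
        "Pandoc is really *awesome*!\n"
    )
    print(drop_duplicate_captions(sample))
```

The check is deliberately strict: a paragraph is dropped only when it matches the alt text exactly, so ordinary text that follows an image is left alone. With the direct conversion above this step should not be needed at all.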