I have an HTML file that looks like this:
<!DOCTYPE html>
<html>
<head>
<style>
h1 {text-align:center;}
p {text-align:center;}
</style>
</head>
<body>
<h1>My heading</h1>
<p>Some poetry here.</p>
</body>
</html>
I want to convert it to docx with pandoc. I tried the usual command:
pandoc -s test.html -o test.docx
The text is rendered correctly, but it is not centered. I am generating hundreds of HTML files automatically, so a manual fix isn't an option. Basically, I need some paragraphs left-aligned (the default) and some centered, since they are poetry. How can this be achieved?
Thank you.
PS: I could also use Markdown as the input language instead of HTML.
You need to customize a docx template and apply it when converting HTML to docx. In your case, <h1> is converted into the Heading 1 style in Word, and the <p> that directly follows it is converted into First Paragraph (later paragraphs use Body Text).
Steps:
1. Create a docx template:
   pandoc -o custom-reference.docx --print-default-data-file reference.docx
2. Open custom-reference.docx in Word and modify these styles (for example, set their alignment to centered):
   Heading 1
   First Paragraph
3. Save custom-reference.docx.
4. Convert, passing the template with --reference-doc:
   pandoc input.html -o output.docx --reference-doc custom-reference.docx
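
Since you mention hundreds of generated files, the convert step could be scripted. Below is a minimal Python sketch, assuming pandoc is on your PATH and custom-reference.docx has already been edited as described. The function names (build_pandoc_command, convert_all) and the directory layout are hypothetical, made up for illustration; only the pandoc flags come from the command above.

```python
import subprocess
from pathlib import Path


def build_pandoc_command(html_path, reference_doc):
    """Return the pandoc argument list for converting one HTML file to docx.

    The output file sits next to the input, with a .docx extension.
    """
    html_path = Path(html_path)
    output_path = html_path.with_suffix(".docx")
    return [
        "pandoc",
        str(html_path),
        "-o",
        str(output_path),
        "--reference-doc",
        str(reference_doc),
    ]


def convert_all(input_dir, reference_doc="custom-reference.docx"):
    """Convert every .html file in input_dir (hypothetical layout).

    Assumes pandoc is installed; check=True raises if pandoc fails.
    """
    for html_path in sorted(Path(input_dir).glob("*.html")):
        subprocess.run(build_pandoc_command(html_path, reference_doc), check=True)
```

Calling convert_all("html_files") would then run the same command as in the last step once per file in that (assumed) directory. Note that this applies the same template to every file, so it centers whatever styles you changed in the template everywhere.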