I'm using pdfkit to convert html files that have links with href attributes in them.
Inside of the html, href's are written with relative paths, e.g.:
<a href="folder/picture.jpg">PIC</a>
When I convert this to pdf, the hrefs seem to be automatically rewritten to absolute paths (C:/Users/...
).
Why does pdf change the href?
Wkhtmltopdf, which pdfkit relies on, converts relative links to absolute links by default.
This can be stopped by using the command line tool with a special flag:
wkhtmltopdf --keep-relative-links src destination
Or by telling pdfkit to apply this option:
def convert_to_pdf(path):
try:
# run the conversion and write the result to a file
config = pdfkit.configuration(wkhtmltopdf=path_wkthmltopdf)
options = {
'--keep-relative-links': ''
}
pdfkit.from_url(path+'.htm', path+'.pdf', configuration=config, options=options)
except Exception as why:
# report the error
sys.stderr.write('Pdf Conversion Error: {}\n'.format(why))
raise