How can I force pdftohtml output to be UTF-8?
$ pdftohtml -enc utf8 my.pdf
Error: Couldn't find unicodeMap file for the 'utf-8' encoding
And -listenc doesn't seem to be a valid option.
I think it is using ISO-8859-1 by default (although for some reason Vim reads the file and displays the special characters fine, even though :set enc? reports utf-8).
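Pass the encoding name as UTF-8 (uppercase, with the hyphen), i.e. pdftohtml -enc UTF-8 file.pdf. The name seems to be matched exactly, so utf8 is not recognized.
Like: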
$ pdftohtml -enc UTF-8 my.pdf
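
To confirm that the generated file really is UTF-8, rather than relying on what the editor reports, one option is to round-trip it through iconv. This is only a minimal sketch: is_utf8 is a hypothetical helper name, and the output file name in the usage comment is an assumption, since pdftohtml may write more than one HTML file depending on the options used.

```shell
# Hypothetical helper: returns success if the file decodes cleanly as UTF-8.
# iconv exits non-zero when it meets a byte sequence that is invalid in the
# source encoding, so converting UTF-8 to UTF-8 acts as a validity check.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage after conversion (the output file name below is an assumption):
#   pdftohtml -enc UTF-8 my.pdf
#   is_utf8 my.html && echo "valid UTF-8" || echo "not valid UTF-8"
```

Note that a file containing only plain ASCII also passes this check, because ASCII is a subset of UTF-8. The test is only informative when the document contains accented or other non-ASCII characters.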