php phpword phpoffice

docx to html with phpword issue


I'm encountering an issue when converting a docx document to HTML with the PHPWord library (https://github.com/PHPOffice/PHPWord).

Here is the code snippet I use:

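require 'vendor/autoload.php'; // assumes PHPWord was installed with Composer

// Load the docx file, then write the document back out as HTML.
$phpWord = \PhpOffice\PhpWord\IOFactory::load('test.docx');
$htmlWriter = new \PhpOffice\PhpWord\Writer\HTML($phpWord);
$htmlWriter->save('test.html');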

The issue is that each block of text is wrapped in <p> tags, regardless of whether I defined headings in the docx document. I would expect <h1>, <h2>, ... tags to be generated. Bullet lists are lost too.

Does it work as designed or did I miss something?

Thank you for your feedback.

Regards


Solution

  • IOFactory::load in PHPWord can run into problems like the one you describe, depending on which application saved the file or which version of Microsoft Word created it. If PHPWord cannot find the encoding and tags it expects in the docx file, it produces unexpected results.

    Your code is fine; the problem lies in the dependency itself.
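
To check whether the file itself is the cause, it can help to look at which paragraph styles the docx actually contains. The following is a minimal sketch, not part of PHPWord: it uses only PHP's ZipArchive and DOM extensions, and the function name countParagraphStyles is made up for illustration. It assumes the file uses the standard (transitional) WordprocessingML namespace and that headings and list items are marked with a w:pStyle element, as Word normally does.

```php
<?php
// Hypothetical helper (not a PHPWord function): counts the paragraph style IDs
// found in the XML of a docx main document part (word/document.xml).
function countParagraphStyles(string $documentXml): array
{
    $dom = new DOMDocument();
    $dom->loadXML($documentXml);

    $xpath = new DOMXPath($dom);
    // Assumption: transitional WordprocessingML namespace.
    $xpath->registerNamespace('w', 'http://schemas.openxmlformats.org/wordprocessingml/2006/main');

    $counts = [];
    foreach ($xpath->query('//w:p/w:pPr/w:pStyle/@w:val') as $attr) {
        $style = $attr->nodeValue;
        $counts[$style] = ($counts[$style] ?? 0) + 1;
    }
    return $counts;
}

// Usage: a docx file is a ZIP archive; the body text lives in word/document.xml.
if (class_exists('ZipArchive') && is_file('test.docx')) {
    $zip = new ZipArchive();
    if ($zip->open('test.docx') === true) {
        $xml = $zip->getFromName('word/document.xml');
        $zip->close();
        if ($xml !== false) {
            print_r(countParagraphStyles($xml));
        }
    }
}
```

If the headings appear under style IDs other than Heading1, Heading2, and so on (for example, localized names written by a non-English Word installation or by another editor), that would be consistent with the reader not recognizing them as titles. Treat this as an assumption to verify against the PHPWord version you are using.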