Convert PDF to Word document using PHP


I am trying to convert a PDF to DOC using LibreOffice from PHP, but it isn't working.

path/to/soffice --infilter="writer_pdf_import" --convert-to doc file.pdf /path/to/test.docx

PS: Is there a better solution that parses the PDF, extracts images as well as text, and then converts it to a DOC representation?


Solution

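  • Well, you didn't show us the error, so I can't say for certain why your command isn't working. Note that you are calling the soffice binary rather than the libreoffice launcher (on most installations both start LibreOffice). Your command also requests doc while naming a .docx output path; --convert-to takes only a target extension, so the trailing path is treated as another input file rather than as the output name.

    Here is an example using the libreoffice command: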

    path/to/libreoffice --headless --invisible --convert-to doc your_source_file.pdf
    

    Note:

    This solution only converts the text without the images.
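
As a rough sketch of how the command above could be wrapped for calling from PHP (for example via exec()), the script below adds the --infilter option from the question and LibreOffice's --outdir option to choose where the result is written, then checks that the expected file exists. The SOFFICE variable and the function names expected_output and convert_pdf are assumptions made for illustration, not part of LibreOffice.

```shell
#!/bin/sh
# SOFFICE is an assumed variable name; point it at your soffice/libreoffice binary.
SOFFICE="${SOFFICE:-soffice}"

# expected_output INPUT EXT OUTDIR
# Print the path the converted file should have: OUTDIR/<input basename>.EXT
expected_output() {
    base=$(basename "$1")
    printf '%s/%s.%s\n' "$3" "${base%.*}" "$2"
}

# convert_pdf INPUT EXT OUTDIR
# Run the conversion, then print the output path only if the file was created.
convert_pdf() {
    "$SOFFICE" --headless --infilter="writer_pdf_import" \
        --convert-to "$2" --outdir "$3" "$1" || return 1
    out=$(expected_output "$1" "$2" "$3")
    [ -f "$out" ] && printf '%s\n' "$out"
}
```

With these definitions, convert_pdf file.pdf doc /path/to would be expected to produce /path/to/file.doc, and a non-zero exit status signals that no output file appeared.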

    Alternative 01:

    If LibreOffice doesn't work on your system, Abiword also works in a similar way.

    Install Abiword by typing the following command in a terminal:
    sudo apt-get install abiword
    

    Then perform the conversion:

    abiword --to=doc your_source_file.pdf
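
If several files need converting, the same Abiword call could be looped over a directory. This is only a minimal sketch: the ABIWORD variable and the convert_dir function name are assumptions for illustration, and the only Abiword option used is the --to=doc form shown above.

```shell
#!/bin/sh
# ABIWORD is an assumed variable name; override it if the binary lives elsewhere.
ABIWORD="${ABIWORD:-abiword}"

# convert_dir DIR
# Run "abiword --to=doc" on every .pdf file directly inside DIR.
convert_dir() {
    for f in "$1"/*.pdf; do
        # Skip the unexpanded pattern when DIR contains no PDF files.
        [ -e "$f" ] || continue
        "$ABIWORD" --to=doc "$f" || echo "failed: $f" >&2
    done
}
```

Calling convert_dir /path/to/pdfs would then attempt each file in turn and report failures on standard error without stopping the loop.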
    

    Alternative 02:

    If you want to keep using the soffice command, you can probably use this syntax:

    path/to/soffice --headless --convert-to <TargetFileExtension>:<NameOfFilter> your_source_file.pdf
    

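    In your example, pick the filter that matches the target extension: "MS Word 97" for doc files, or "MS Word 2007 XML" for docx files (depending on the version, "Microsoft Word 2007/2010/2013 XML" or "Microsoft Word 2007-2013 XML" may also be accepted for docx):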

    path/to/soffice --headless --convert-to docx:"Microsoft Word 2007/2010/2013 XML" your_source_file.pdf
    

    Here you can find more filters.
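
To illustrate the <TargetFileExtension>:<NameOfFilter> syntax, here is a small sketch that builds the --convert-to argument from an extension. The filter_for and convert_arg function names are hypothetical, and the mapping assumes that "MS Word 97" and "MS Word 2007 XML" are the Writer filter names for doc and docx in your LibreOffice version; check the filter list if a conversion fails.

```shell
#!/bin/sh
# filter_for EXT
# Print the assumed Writer export filter name for EXT; fail for unknown extensions.
filter_for() {
    case "$1" in
        doc)  echo "MS Word 97" ;;
        docx) echo "MS Word 2007 XML" ;;
        *)    return 1 ;;
    esac
}

# convert_arg EXT
# Print "EXT:FILTER" when a filter is known, otherwise just "EXT".
convert_arg() {
    name=$(filter_for "$1") || { echo "$1"; return 0; }
    printf '%s:%s\n' "$1" "$name"
}
```

The result has to be quoted when passed on, because the filter names contain spaces:

    path/to/soffice --headless --convert-to "$(convert_arg docx)" your_source_file.pdf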