ruby, ms-word, libreoffice, abiword

Remove macros etc. from office documents via Ruby


Is there a way to specify components to remove from MS Office or OpenOffice documents via Ruby? I mean removing macros and metadata, and also removing or replacing images. I've looked at a number of conversion programs, with a view to converting from and to the same file format, but I can't find any that allow such options to be specified.

I've looked at:


Solution

  • Docx files are really zip archives. You can unzip them into a directory, delete or change the files you need, update any references to those files, and then zip the directory back up. Most of the files inside the archive are XML text (embedded images and macro projects are binary), so you can edit them with LibXML-Ruby or Nokogiri.
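
The "delete the file and update references" step might look roughly like the sketch below. It is only an illustration under a few assumptions: the document has already been unzipped into a directory, the part names used (`word/vbaProject.bin` for macros, `docProps/core.xml` for metadata) are the ones typically found in Word packages but should be checked against your own files, and `edit_xml` and `remove_part` are hypothetical helper names. It uses REXML, which ships with Ruby, rather than Nokogiri or LibXML-Ruby, purely to avoid extra dependencies.

```ruby
require 'fileutils'
require 'pathname'
require 'rexml/document'

# Hypothetical helper: parse an XML file, yield the document for editing,
# then write it back. Does nothing if the file is missing.
def edit_xml(path)
  return unless File.exist?(path)

  doc = REXML::Document.new(File.read(path))
  yield doc
  File.write(path, doc.to_s)
end

# Hypothetical helper: remove one part (for example "word/vbaProject.bin")
# from an unzipped package directory, together with the entries that refer
# to it in [Content_Types].xml and in the *.rels relationship files.
def remove_part(package_dir, part_name)
  FileUtils.rm_f(File.join(package_dir, part_name))

  # Drop the part's <Override> entry from the content-type list.
  edit_xml(File.join(package_dir, '[Content_Types].xml')) do |doc|
    doc.root.elements.to_a.each do |el|
      el.remove if el.name == 'Override' &&
                   el.attributes['PartName'] == "/#{part_name}"
    end
  end

  # Drop every <Relationship> whose target resolves to the part.
  # FNM_DOTMATCH is needed so that the root "_rels/.rels" file is found.
  pattern = File.join(package_dir, '**', '_rels', '*.rels')
  Dir.glob(pattern, File::FNM_DOTMATCH).each do |rels_path|
    # Targets are relative to the directory that contains the _rels folder.
    base = Pathname.new(rels_path).dirname.dirname
                   .relative_path_from(Pathname.new(package_dir))

    edit_xml(rels_path) do |doc|
      doc.root.elements.to_a.each do |el|
        next unless el.name == 'Relationship'
        next if el.attributes['TargetMode'] == 'External' # links, not parts

        target = el.attributes['Target'].to_s
        resolved = if target.start_with?('/')
                     target.delete_prefix('/')
                   else
                     (base + target).cleanpath.to_s
                   end
        el.remove if resolved == part_name
      end
    end
  end
end
```

With the document unzipped into, say, `extracted/`, calling `remove_part('extracted', 'word/vbaProject.bin')` and `remove_part('extracted', 'docProps/core.xml')` would strip the macro project and the core metadata. Replacing an image is simpler: overwrite the file under `word/media/` with another image of the same type and leave the references alone. Unzipping and re-zipping are not shown here; the rubyzip gem or the `unzip` and `zip` command-line tools are common choices. A macro-enabled document may need further changes (for instance to the main document's content type) before Word accepts it as a plain .docx.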