google-apps-script google-drive-api google-docs docx

How do I extract the text from a docx file using Apps Script?


The files are saved in a Drive folder. I need to send the text content of every .docx file as an API payload. I've tried using Blob, but to no avail. Is there a way to get this done?


Solution

  • If I understand you correctly, you want to send the text content of a docx file that you have in Drive. If that's correct, then you can do the following:

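    // Requires the Advanced Drive Service (Drive API v2) to be enabled for the script.
    function docx() {
      var docxId = "your-docx-id"; // ID of the .docx file in Drive
      var docxFile = DriveApp.getFileById(docxId);
      var blob = docxFile.getBlob();
      // Upload the blob and let Drive convert it to a Google Docs file.
      var file = Drive.Files.insert({}, blob, {convert: true});
      var id = file.id;
      // Open the converted document and read its plain text.
      var doc = DocumentApp.openById(id);
      var text = doc.getBody().getText();
      return text;
    }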
    

    This code uses the Advanced Drive Service, which has to be enabled for the script, to create a Google Docs file from the blob of the docx via Drive.Files.insert with convert set to true. You can then open the newly created file with DocumentApp and read its text with getText.
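
    Drive.Files.insert belongs to version 2 of the Drive API. If the Advanced Drive Service is enabled at v3 instead, the same conversion can be sketched roughly as follows. This is a sketch under assumptions, not part of the original answer: the function name docxToTextV3 and the temporary file name are made up for illustration.

```javascript
// Sketch: the same docx-to-text conversion for the Advanced Drive Service at v3.
// Assumes the Drive service is enabled at version v3 for this script.
function docxToTextV3(docxId) {
  var blob = DriveApp.getFileById(docxId).getBlob();
  // v3 has no "convert" flag: asking for the Google Docs MIME type in the
  // metadata tells Drive to convert the uploaded docx.
  var metadata = {
    name: 'temp-docx-conversion', // hypothetical name for the converted copy
    mimeType: 'application/vnd.google-apps.document'
  };
  var file = Drive.Files.create(metadata, blob);
  // Read the plain text of the converted document.
  return DocumentApp.openById(file.id).getBody().getText();
}
```

    Note that v3 file metadata uses name where v2 used title.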

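    Bear in mind that this creates a new Docs file every time you run it. To avoid piling up files, delete the converted file once you have read its text. In Apps Script, the API's Files.delete method is exposed as Drive.Files.remove.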
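
    A minimal sketch of that cleanup, applied to the original question (every .docx file in a folder, sent as one payload). It assumes the Advanced Drive Service at v2, as in the code above. The names sendDocxTexts, folderId and endpointUrl, and the shape of the JSON payload, are illustrative assumptions rather than part of the original answer.

```javascript
// Sketch: extract the text of every .docx file in a folder and POST it as JSON.
// Assumes the Advanced Drive Service (v2) is enabled; folderId and endpointUrl
// are hypothetical placeholders supplied by the caller.
function sendDocxTexts(folderId, endpointUrl) {
  var files = DriveApp.getFolderById(folderId)
      .getFilesByType(MimeType.MICROSOFT_WORD); // .docx files only
  var items = [];
  while (files.hasNext()) {
    var docxFile = files.next();
    // Convert to a temporary Google Docs file, as in the function above.
    var temp = Drive.Files.insert({}, docxFile.getBlob(), {convert: true});
    try {
      var text = DocumentApp.openById(temp.id).getBody().getText();
      items.push({name: docxFile.getName(), text: text});
    } finally {
      // Apps Script exposes the API's Files.delete method as Files.remove.
      Drive.Files.remove(temp.id);
    }
  }
  // The payload shape ({files: [...]}) is an assumption; adapt it to the API.
  return UrlFetchApp.fetch(endpointUrl, {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify({files: items})
  });
}
```

    The try/finally block removes the temporary file even if reading the text fails. Drive.Files.remove deletes permanently instead of moving the file to the trash, so pass it only the ID of the converted copy, never the ID of the original docx.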

    I hope this helps.