Search code examples
node.jsgoogle-apps-scriptgoogle-docs-apinamed-ranges

Retrieve text in namedRange with Google Docs API


Using the Google Docs/Drive APIs with Node, I've successfully made a service which produces 'template' style documents which feature namedRanges for other users to write into. I'd like to use the Google Docs API to read the text that gets entered inside of these ranges, but can't see a clean way of doing so. Given that I have the start and end indices of each range, I thought this would be very simple! Unfortunately I can't see any built-in way of doing it?

Currently it looks like I will have to request the whole google doc, and for each range that I'm watching, compare each node's start/end index and recursively traverse down the tree until they match. Is there not a better way of doing this?

Cheers

Edit:

Tanaike's solution below is cleaner, but I had already got a version working on my Firebase Function so thought I might as well share it. This code retrieves a Google Doc with the given ID and stores the contents of the namedRanges as strings within the a Firebase Realtime Database, keeping images and tables intact though "BBCode" style tags. Relevent code below (note that I know that each namedRange is inside of a table cell, which makes finding them easier):

async function StoreResponses(oauth2Client, numSections, documentId, meetingId, revisionId, roomId) 
{
    var gdocsApi = google.docs({version: 'v1', auth: oauth2Client});

    return gdocsApi.documents.get({ "documentId": documentId })
    .then((document) => {
        
        var ranges = document.data.namedRanges;
        var docContent = document.data.body.content;

        var toStore = [];

        for(var i = 0; i < numSections; i++)
        {
            var range = ranges[`zoomsense_section_${i}`].namedRanges[0].ranges[0]
            
            // loop through document contents until we hit the right index
            for(var j = 0; j < docContent.length; j++)
            {
                if(docContent[j].startIndex <= range.startIndex && docContent[j].endIndex >= range.endIndex)
                {
                    // we know that the ranges are inside single table cells
                    var sectionContents = docContent[j].table.tableRows[0].tableCells[0].content;

                    toStore.push(readStructuralElementsRecursively(document, sectionContents));
                }
            }
        }

        return db.ref(`/data/gdocs/${meetingId}/${roomId}/${documentId}/revisions/${revisionId}/responses`).set(toStore);
    })
    .catch((exception) => {
        console.error(exception)
        res.status(500).send(exception);
    })
}
// uses https://developers.google.com/docs/api/samples/extract-text
function readStructuralElementsRecursively(document, elements)
{
    var text = "";
    elements.forEach(element => {
        if(element.paragraph)
        {
            element.paragraph.elements.forEach(elem => {
                text += readParagraphElement(document, elem);
            });
        }
        else if(element.table)
        {
            // The text in table cells are in nested Structural Elements, so this is recursive
            text += "[table]"
            element.table.tableRows.forEach(row => {
                text += "[row]"
                row.tableCells.forEach(cell => {
                    text += `[cell]${readStructuralElementsRecursively(document, cell.content)}[/cell]`;
                })
                text += "[/row]"
            })
            text+= "[/table]"
        }
    });

    return text;
}
// handle text and inline content
function readParagraphElement(document, element)
{
    if(element.textRun)
    {
        // standard text
        return element.textRun.content;
    }
    if(element.inlineObjectElement)
    {
        var objId = element.inlineObjectElement.inlineObjectId;
        var imgTag = "\n[img]404[/img]"

        try
        {
            var embeddedObj = document.data.inlineObjects[objId].inlineObjectProperties.embeddedObject;
            if(embeddedObj.imageProperties)
            {
                // this is an image
                imgTag = `[img]${embeddedObj.imageProperties.contentUri}[/img]`
            }
            else if(embeddedObj.embeddedDrawingProperties)
            {
                // this is a shape/drawing
                // can't find any way to meaningfully reference them externally,
                // so storing the ID in case we can do it later
                imgTag = `[drawing]${objId}[/drawing]`
            }
        }
        catch(exception)
        {
            console.log(exception)
        }
         
        return imgTag;
    }
}

Solution

  • I believe your goal as follows.

    • You want to retrieve the values from the named range on Google Document.
    • In your Google Document, the named ranges have already been set.
    • You want to achieve this using Node.js.
      • Unfortunately, from your question, I couldn't confirm the library, you are using, for using Docs API.

    In order to achieve above, I would like to propose the following workarounds.

    Issue and workarounds:

    Unfortunately, in the current stage, there are no methods for directly retrieving the values from the named range in the Google Docs API. I believe that such method might be added in the future, because Docs API is growing now. So as the current workaround using Docs API, it is required to do the following flow.

    1. Retrieve the Google Document object using the method of documents.get in Docs API.
    2. Retrieve startIndex and endIndex using the name of the named range.
    3. Retrieve the values using startIndex and endIndex.

    This has already been mentioned in your question. When Google Docs API is used, in the curent stage, this method is require to be used. But when Google Document service is used, the values of the named range can be directly retrieved by the name and/or the ID of the named range. In this answer, I would like to propose this method as another workaround.

    Usage:

    Please do the following flow.

    1. Create new project of Google Apps Script.

    Sample script of Web Apps is a Google Apps Script. So please create a project of Google Apps Script. In order to use Document service, in this case, Web Apps is used as the wrapper.

    If you want to directly create it, please access to https://script.new/. In this case, if you are not logged in Google, the log in screen is opened. So please log in to Google. By this, the script editor of Google Apps Script is opened.

    2. Prepare script.

    Please copy and paste the following script (Google Apps Script) to the script editor. This script is for the Web Apps.

    function doGet(e) {
      Object.prototype.getText = function() {return this.getRange().getRangeElements().map(e => e.getElement().asText().getText().slice(e.getStartOffset(), e.getEndOffsetInclusive() + 1))};
      const doc = DocumentApp.openById(e.parameter.id);
      let res;
      if (e.parameter.name) {
        const ranges = doc.getNamedRanges(e.parameter.name);
        res = ranges.length > 0 ? ranges[0].getText() : [];
      } else if (e.parameter.rangeId) {
        const range = doc.getNamedRangeById(e.parameter.rangeId.split(".")[1]);
        res = range ? range.getText() : [];
      } else {
        res = [];
      }
      return ContentService.createTextOutput(JSON.stringify(res));
    }
    

    3. Deploy Web Apps.

    1. On the script editor, Open a dialog box by "Publish" -> "Deploy as web app".
    2. Select "Me" for "Execute the app as:".
      • By this, the script is run as the owner.
    3. Select "Anyone, even anonymous" for "Who has access to the app:".
      • In this case, no access token is required to be request. I think that I recommend this setting for testing your goal.
      • Of course, you can also use the access token. At that time, please set this to "Only myself" or "Anyone". And please include the scope of https://www.googleapis.com/auth/drive.readonly and https://www.googleapis.com/auth/drive to the access token. These scopes are required to access to Web Apps.
    4. Click "Deploy" button as new "Project version".
    5. Automatically open a dialog box of "Authorization required".
      1. Click "Review Permissions".
      2. Select own account.
      3. Click "Advanced" at "This app isn't verified".
      4. Click "Go to ### project name ###(unsafe)"
      5. Click "Allow" button.
    6. Click "OK".
    7. Copy the URL of Web Apps. It's like https://script.google.com/macros/s/###/exec.
      • When you modified the Google Apps Script, please redeploy as new version. By this, the modified script is reflected to Web Apps. Please be careful this.

    4. Run the function using Web Apps.

    You can retrieve the values from Google Spreadsheet using the following script.

    const request = require("request");
    const url = "https://script.google.com/macros/s/###/exec";  // Please set the URL of Web Apps.
    let qs = {
      id: "###",  // Please set the Document ID.
      name: "###",  // Please set the name of named range.
      // rangeId: "kix.###",  // Please set the ID of named range.
    };
    let options = {
      url: url,
      qs: qs,
      method: "get",
    };
    request(options, (err, res, result) => {
      if (err) {
        console.log(err);
        return;
      }
      console.log(result);
    });
    
    • In this case, the result is returned as an array including the values.
    • In above Web Apps, the values can be retrieved with the name and/or ID of named range. When you want to use the name of named range, please use let qs = {id: "###", name: "###"};. When you want to use the ID of named range, please use let qs = {id: "###", rangeId: "kix.###"};.

    Note:

    • When you modified the script of Web Apps, please redeploy the Web Apps as new version. By this, the latest script is reflected to the Web Apps. Please be careful this.

    References: