google-apps-script google-sheets web-scraping google-tag-manager urlfetch

How do I fetch a column of URLs in Google Apps Script/Javascript?


I have a list of client sites that I want to check continuously to see whether they have installed our GTM container tags. I can't connect the spreadsheet to our GTM because my company uses a universal login that is not the same as the Google account I'm using for the spreadsheet, and I can't add my own personal tag to the containers at this time. I've got it working perfectly for one site at a time by changing the cell in the script, but I can't get it to work when I try to check them all. I'd also like it to loop without having to run it from the script editor. It seemed like Array.map would solve both of those at once, but I can't figure out how to use it. I'm still really new to all of this. Here's my working code for the individual sites (the URL and GTM columns are XLOOKUPs, but I could reference the actual data if XLOOKUPs won't work):

function verifyGTM() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var s = ss.getSheetByName('February 2023 Paste');
  var urlRange = s.getRange(5,10);
  var gtmRange = s.getRange(5,11);
  var url = urlRange.getDisplayValues();
  var gtmID = gtmRange.getDisplayValues();
  var str = UrlFetchApp.fetch(url).getContentText();
  const mainRegex = /<!-- Google Tag Manager -->([\s\S]*?)<!-- End Google Tag Manager -->/gi;
  var results = str.match(mainRegex);
  
    Logger.log(results)
    if(str.includes("GTM") && str.includes(gtmID) && !gtmID == "N/A"){
      s.getRange(5,12).setValue("Yes");
    }
    else if(str.includes("GTM") && !str.includes(gtmID)){
      s.getRange(5,12).setValue("Incorrect GTM");
    }
    else if(gtmID == "N/A"){
      s.getRange(5,12).setValue("No GTM");
    }
    else {
      s.getRange(5,12).setValue("No");
    }
    
    

}

And here's an example spreadsheet with some sensitive data removed: https://docs.google.com/spreadsheets/d/10xgrGVbIiPJiYis3jkBsoyzAUkudOjLLKgYRMGPVsYQ/edit?usp=sharing

I've got some onEdit scripts running, but I don't think those would affect this one.

I tried using fetchAll, as seen below:

function verifyGTM() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var s = ss.getSheetByName('February 2023 Paste');
  var urlRange = s.getRange("J:J");
  var gtmRange = s.getRange("K:K");
  var url = urlRange.getDisplayValues();
  var gtmID = gtmRange.getDisplayValues();
  var str = UrlFetchApp.fetchAll(url).getContentText();
  const mainRegex = /<!-- Google Tag Manager -->([\s\S]*?)<!-- End Google Tag Manager -->/gi;
  var results = str.match(mainRegex);
  
    Logger.log(results)
    if(str.includes("GTM") && str.includes(gtmID) && !gtmID == "N/A"){
      s.getRange("L:L").setValue("Yes");
    }
    else if(str.includes("GTM") && !str.includes(gtmID)){
      s.getRange("L:L").setValue("Incorrect GTM");
    }
    else if(gtmID == "N/A"){
      s.getRange("L:L").setValue("No GTM");
    }
    else {
      s.getRange("L:L").setValue("No");
    }
    
    

}

And I received this error: Exception: Service invoked too many times in a short time: urlfetch. Try Utilities.sleep(1000) between calls.

I tried adding Utilities.sleep, but that didn't work either.


Solution

  • I believe your goal is as follows.

    • You want to retrieve the URLs and the GTM IDs from columns "J" and "K", respectively.
    • You want to fetch each URL, evaluate the response with your if statement, and put the result in column "L".
    • From your comment "As of right now, just hitting "run" from the script editor.", you run verifyGTM() from the script editor.

    Modification points:

    • As I mentioned in my comment, the way fetchAll is used needs to be fixed. The argument of fetchAll must be a 1-dimensional array of URLs or request objects. In your script, however, the 2-dimensional array returned by getDisplayValues() is passed directly.
    • fetchAll returns an array of HTTPResponse objects (UrlFetchApp.HTTPResponse[]). In your script, however, getContentText() is called directly on that array, as in UrlFetchApp.fetchAll(url).getContentText();.
    • !gtmID == "N/A" should be gtmID != "N/A". Because ! is applied before ==, the original expression compares a boolean with "N/A" and is always false.
    • In your situation, it is probably also necessary to check that both the URL and the gtmID exist in each row.
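
To illustrate the first and third points, here is a minimal sketch in plain JavaScript that runs without the Apps Script services. The sample URLs and the GTM ID are made up; only the shapes matter. It flattens a column in the 2-dimensional form that getDisplayValues() returns into the 1-dimensional array of request objects that fetchAll expects, and it shows why the negation has to be written as !=.

```javascript
// Shape returned by getDisplayValues() for one column: one inner array per row.
// These sample values are made up for illustration.
const urlColumn = [["https://example.com"], [""], ["https://example.org"]];

// fetchAll expects a flat array of URL strings or request objects,
// so unwrap each row and skip empty cells.
const requests = urlColumn
  .map(([url]) => url)
  .filter((url) => url !== "")
  .map((url) => ({ url, muteHttpExceptions: true }));

// Operator precedence: `!gtmID == "N/A"` negates gtmID first, then compares
// the resulting boolean with "N/A", which is never equal.
const gtmID = "GTM-ABC123"; // hypothetical ID
const buggy = !gtmID == "N/A"; // false, whatever gtmID contains
const fixed = gtmID != "N/A"; // true for this ID
```

With the buggy form, the "Yes" branch can never be reached, which is why the comparison is rewritten in the modified script.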

    With these points applied to your script, how about the following modification?

    Modified script:

    function verifyGTM() {
      var ss = SpreadsheetApp.getActiveSpreadsheet();
      var s = ss.getSheetByName('February 2023 Paste');
      var range = s.getRange("J1:K" + s.getLastRow());
      var values = range.getDisplayValues();
      // Build one request per row that has both a URL and a GTM ID,
      // and remember the index of the row it came from.
      var { requests, gtmIDs, index } = values.reduce((o, [url, id], i) => {
        if (url && id) {
          o.requests.push({ url, muteHttpExceptions: true });
          o.gtmIDs.push(id);
          o.index.push(i);
        }
        return o;
      }, { requests: [], gtmIDs: [], index: [] });
      const mainRegex = /<!-- Google Tag Manager -->([\s\S]*?)<!-- End Google Tag Manager -->/gi;
      // fetchAll returns one HTTPResponse per request, in the same order.
      var responses = UrlFetchApp.fetchAll(requests).map((r, i) => {
        // Leave the cell empty when the page could not be fetched.
        if (r.getResponseCode() != 200) return [null];
        var gtmID = gtmIDs[i];
        var str = r.getContentText();
        var results = str.match(mainRegex);
        Logger.log(results)
        if (str.includes("GTM") && str.includes(gtmID) && gtmID != "N/A") {
          return ["Yes"];
        } else if (str.includes("GTM") && !str.includes(gtmID)) {
          return ["Incorrect GTM"];
        } else if (gtmID == "N/A") {
          return ["No GTM"];
        }
        return ["No"];
      });
      // Put each result back on its original row; skipped rows stay empty.
      var len = values.length;
      var res = Array(len).fill([null]);
      index.forEach((e, i) => res[e] = responses[i]);
      // Column "L" is 2 columns to the right of column "J".
      range.offset(0, 2, len, 1).setValues(res);
    }
    
    • When this script is run, the values are retrieved from columns "J" and "K", all URLs in column "J" are fetched, an output array is built from the values of column "K" and the fetched content, and the output values are put into column "L".
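
The flow described above can be sketched without the spreadsheet or network calls. In this sketch the function names classify and alignToRows are hypothetical, and the page snippets and IDs are made up. It keeps the same branch order as the if statement and shows how the saved row indexes put each result back on the row it came from.

```javascript
// Same branch order as the if/else chain in verifyGTM, as a pure function.
function classify(html, gtmID) {
  const hasGtm = html.includes("GTM");
  if (hasGtm && html.includes(gtmID) && gtmID != "N/A") return "Yes";
  if (hasGtm && !html.includes(gtmID)) return "Incorrect GTM";
  if (gtmID == "N/A") return "No GTM";
  return "No";
}

// Place each result on the sheet row it came from; skipped rows stay [null].
function alignToRows(rowCount, index, results) {
  const out = Array.from({ length: rowCount }, () => [null]);
  index.forEach((row, i) => {
    out[row] = [results[i]];
  });
  return out;
}

// Made-up data: four sheet rows, where row 1 had no URL and was skipped.
const pages = [
  "<script>GTM-AAA111</script>",
  "<p>no tag here</p>",
  "<script>GTM-ZZZ999</script>",
];
const ids = ["GTM-AAA111", "N/A", "GTM-BBB222"];
const index = [0, 2, 3];

const results = pages.map((html, i) => classify(html, ids[i]));
const column = alignToRows(4, index, results);
// column is [["Yes"], [null], ["No GTM"], ["Incorrect GTM"]]
```

The nested one-element arrays match the shape that setValues expects for a single column.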

    Note:

    • When I tested this modified script with your provided Spreadsheet, the following values were put into column "L". If you change the structure of your spreadsheet, this script might no longer work. Please be careful about this.

      Yes
      No GTM
      No
      Yes
      Incorrect GTM
      Yes
      Incorrect GTM
      No GTM
      Incorrect GTM
      Yes
      Yes
      Yes
      Yes
      
    • This modified script assumes that your if statement and the values of columns "J" and "K" can be used to obtain your expected values. Please be careful about this.
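
If the "Service invoked too many times" error from the question comes back with a longer URL column, one possible approach is to send the requests in smaller batches and pause between the fetchAll calls. This is an assumption and was not tested against the provided Spreadsheet. The helper name chunk and the batch size of 20 are hypothetical, and the request list is made up.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Made-up request list for illustration.
const requests = Array.from({ length: 45 }, (_, i) => ({
  url: "https://example.com/page" + i,
  muteHttpExceptions: true,
}));
const batches = chunk(requests, 20); // batches of 20, 20 and 5 requests

// In Apps Script, each batch could then be fetched with a pause in between,
// keeping the responses in the original request order:
// const responses = batches.flatMap((batch, n) => {
//   if (n > 0) Utilities.sleep(1000);
//   return UrlFetchApp.fetchAll(batch);
// });
```

Because the batches keep the original order, the saved row indexes would still line up with the combined responses.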

    References: