arrays, for-loop, google-apps-script, google-sheets

How to loop an API import with multiple pages


I have been trying to import the JSON links from an API into a Google Sheet via Apps Script, and then parse those links with a separate formula. However, the database is split across multiple pages, so I am wondering if there is a way to loop the import and combine all of the pages into one sheet. This is the import code I am using:

function getBaseJson() {

  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName("Import1");

  // Write the header row.
  var tempHeaderArray = [];
  var headerRow = ['JSONs'];
  tempHeaderArray.push(headerRow);
  var countRow = 1;
  var cellRange = sheet.getRange(countRow, 1, tempHeaderArray.length, 1);
  cellRange.setValues(tempHeaderArray);
  Logger.log("DEBUG: Saved header");

  // Fetch a single page of results.
  var url = "https://sports.core.api.espn.com/v2/sports/football/leagues/college-football/athletes?limit=1000&active=true&page=1";
  var response = UrlFetchApp.fetch(url);
  var data = response.getContentText();
  var result = JSON.parse(data);

  // Flatten every value of every item into a single column.
  var myItems = result.items;
  var tempItemDataArray = [];
  for (var i = 0; i < myItems.length; i++) {
    for (var [key, value] of Object.entries(myItems[i])) {
      tempItemDataArray.push([value]);
    }
  }
  countRow++;
  cellRange = sheet.getRange(countRow, 1, tempItemDataArray.length, 1);
  cellRange.setValues(tempItemDataArray);
  Logger.log("DEBUG: saved items");
}

I tried putting a for loop in before the myItems section, but no matter where I put it, it either stopped the array from importing or did not change the import at all. I know how to get it to pull the number of pages from the JSON, and in this case I know it needs to loop through 28 pages. If I could get help on this, that would be great!
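
For reference, each page of the response looks roughly like this (abridged; only the items/"$ref" entries and pageCount matter for the scripts here, and the other field names are approximate):

{
  "count": 27865,
  "pageIndex": 1,
  "pageSize": 1000,
  "pageCount": 28,
  "items": [
    { "$ref": "http://sports.core.api.espn.com/v2/sports/football/leagues/college-football/athletes/..." }
  ]
}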


Solution

  • In your situation, how about the following modification?

    Modified script:

    function getBaseJson() {
      var ss = SpreadsheetApp.getActiveSpreadsheet();
      var sheet = ss.getSheetByName("Import1");
      var countRow = 1;
      var page = 1;
      var url = "https://sports.core.api.espn.com/v2/sports/football/leagues/college-football/athletes?limit=1000&active=true&page=";

      // 1st request: retrieves the first page of items and the total page count.
      var response = UrlFetchApp.fetch(url + page);
      var data = response.getContentText();
      var result = JSON.parse(data);
      var { items, pageCount } = result;
      items = items.map(e => [e["$ref"]]);

      // Create the request URLs for pages 2 through pageCount.
      var reqs = [];
      for (var p = 2; p <= pageCount; p++) {
        reqs.push(url + p);
      }

      // Run all remaining requests concurrently and collect the "$ref" values.
      var responses = UrlFetchApp.fetchAll(reqs);
      var temp = responses.flatMap(r => r.getResponseCode() == 200 ? JSON.parse(r.getContentText()).items.map(e => [e["$ref"]]) : []);

      // Put the header row and all values into the sheet at once.
      var res = [['JSONs'], ...items, ...temp];
      sheet.getRange(countRow, 1, res.length).setValues(res);
    }
    
    • In this modification, pageCount is retrieved by the 1st request. The requests for retrieving the values of pages 2 through pageCount are then created in a loop, all of those requests are run with the fetchAll method, and finally all values are put into the sheet with a single setValues call.

    • The fetchAll method runs its requests with concurrent (asynchronous) processing. Ref (Author: me) So, when it is used in your situation, the processing cost might be reduced a little compared with fetching each page one at a time.

    • When this script is run, 27865 items are put into the "Import1" sheet.

    • If an error related to the concurrent HTTP requests occurs, UrlFetchApp.fetch might be suitable instead of UrlFetchApp.fetchAll; see the sketches after this list.
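
    Two minimal sketches of those fallbacks, assuming the response shape shown in the question. The first snippet reuses url and pageCount from the modified script above; the function name getBaseJsonSequential in the second is hypothetical, not part of the original script.

      // Option 1: keep fetchAll, but pass request objects with
      // muteHttpExceptions, so that a failed page is skipped by the
      // 200-check above instead of the whole run throwing an exception.
      var reqs = [];
      for (var p = 2; p <= pageCount; p++) {
        reqs.push({ url: url + p, muteHttpExceptions: true });
      }
      var responses = UrlFetchApp.fetchAll(reqs);

      // Option 2: fetch each page sequentially with UrlFetchApp.fetch.
      // Slower, but it avoids concurrent HTTP requests entirely.
      function getBaseJsonSequential() {
        var ss = SpreadsheetApp.getActiveSpreadsheet();
        var sheet = ss.getSheetByName("Import1");
        var url = "https://sports.core.api.espn.com/v2/sports/football/leagues/college-football/athletes?limit=1000&active=true&page=";
        var res = [['JSONs']];
        // The 1st request also returns the total page count.
        var first = JSON.parse(UrlFetchApp.fetch(url + 1).getContentText());
        res = res.concat(first.items.map(e => [e["$ref"]]));
        for (var p = 2; p <= first.pageCount; p++) {
          var r = JSON.parse(UrlFetchApp.fetch(url + p).getContentText());
          res = res.concat(r.items.map(e => [e["$ref"]]));
        }
        sheet.getRange(1, 1, res.length).setValues(res);
      }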

    References: