google-apps-script google-sheets google-drive-api csv-import

Downloading big CSV files and putting them into a Google Sheet


This is a small automation project. I regularly retrieve CSV attachments from emailed reports and convert them directly to Google Sheets using Google Apps Script. However, one report is too big to fit within the blob size limit (50 MB), so the script fails with execution errors.

Hence, downloading it and storing it in Google Drive isn't an option.

I tried storing the contentText and parsing it with this CSVToArray function I found online:

function CSVToArray( strData, strDelimiter ) {
  // Check to see if the delimiter is defined. If not,
  // then default to COMMA.
  strDelimiter = (strDelimiter || ",");
  // Create a regular expression to parse the CSV values.
  var objPattern = new RegExp(
    (
      // Delimiters.
      "(\\" + strDelimiter + "|\\r?\\n|\\r|^)" +

      // Quoted fields.
      "(?:\"([^\"]*(?:\"\"[^\"]*)*)\"|" +

      // Standard fields.
      "([^\"\\" + strDelimiter + "\\r\\n]*))"
    ),
    "gi"
  );

  // Create an array to hold our data. Give the array
  // a default empty first row.
  var arrData = [[]];

  // Create an array to hold our individual pattern
  // matching groups.
  var arrMatches = null;

  // Keep looping over the regular expression matches
  // until we can no longer find a match.
  while (arrMatches = objPattern.exec( strData )){
    // Get the delimiter that was found.
    var strMatchedDelimiter = arrMatches[ 1 ];
    // Check to see if the given delimiter has a length
    // (is not the start of string) and if it matches
    // field delimiter. If it does not, then we know
    // that this delimiter is a row delimiter.
    if (
      strMatchedDelimiter.length &&
      (strMatchedDelimiter != strDelimiter)
    ){

      // Since we have reached a new row of data,
      // add an empty row to our data array.
      arrData.push( [] );

    }
    // Now that we have our delimiter out of the way,
    // let's check to see which kind of value we
    // captured (quoted or unquoted).
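    // Compare against undefined so that an empty quoted field ("")
    // is still treated as a quoted value.
    if (arrMatches[ 2 ] !== undefined){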
      // We found a quoted value. When we capture
      // this value, unescape any double quotes.
      var strMatchedValue = arrMatches[ 2 ].replace(
        new RegExp( "\"\"", "g" ),
        "\""
      );
    } else {
      // We found a non-quoted value.
      var strMatchedValue = arrMatches[ 3 ];
    }
    // Now that we have our value string, let's add
    // it to the data array.
    arrData[ arrData.length - 1 ].push( strMatchedValue );
  }
  // Return the parsed data.
  Logger.log(arrData);
  return( arrData );
};
function GetCSVFromLink(link){

  var urlData = UrlFetchApp.fetch(link);
  var stringData = urlData.getContentText();
  //
  // All the folder creation etc. is here
  //
  var CSVArray = CSVToArray(stringData);
  var newsheet = ss.insertSheet("NewReport");
  // Writes the parsed data one row at a time (one setValues call per row).
  for (var i = 0, lenCsv = CSVArray.length; i < lenCsv; i++) {
    newsheet.getRange(i + 1, 1, 1, CSVArray[i].length).setValues([CSVArray[i]]);
  }
}

In the end, I hit the maximum execution time. This particular report has 30k rows, so even the long execution time limit of 30 minutes was not enough to finish. The approach does work for other, smaller CSV files (though I would rather not use it when I can convert directly to a Google Sheet through the Drive API).
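A likely contributor to the timeout is that the loop above issues one setValues call per row, so 30k rows means roughly 30k separate calls to the Sheets service. The following is only a minimal sketch of writing the parsed array in a few large batches instead. The names padRows and writeRowsInBatches and the default batch size of 5000 are hypothetical, and the sheet argument is assumed to behave like an Apps Script Sheet (getRange(...).setValues(...)). Rows are padded because setValues expects a rectangular 2D array.

```javascript
// Pad every row to the width of the longest row, since setValues
// requires a rectangular 2D array. Returns a new array of new rows.
function padRows(rows) {
  var width = rows.reduce(function (max, row) {
    return Math.max(max, row.length);
  }, 0);
  return rows.map(function (row) {
    var padded = row.slice();
    while (padded.length < width) {
      padded.push("");
    }
    return padded;
  });
}

// Write all rows starting at A1, batchSize rows per setValues call.
// `sheet` is assumed to be an Apps Script Sheet (hypothetical helper,
// not part of any library). Returns the number of setValues calls made.
function writeRowsInBatches(sheet, rows, batchSize) {
  batchSize = batchSize || 5000; // assumed default, tune as needed
  var data = padRows(rows);
  if (data.length === 0 || data[0].length === 0) {
    return 0;
  }
  var calls = 0;
  for (var start = 0; start < data.length; start += batchSize) {
    var chunk = data.slice(start, start + batchSize);
    sheet.getRange(start + 1, 1, chunk.length, chunk[0].length).setValues(chunk);
    calls++;
  }
  return calls;
}
```

Under these assumptions, the row-by-row loop in GetCSVFromLink could be replaced by a single call such as writeRowsInBatches(newsheet, CSVArray, 5000). This does not address the 50 MB fetch limit; it only reduces the number of write calls.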

I also found out that the file would be WAY smaller if I converted it from CSV to xlsm, and it would be much easier to convert from there. But I cannot get the CSV file to download automatically to my Drive, and I don't know how to convert CSV to xlsm using Apps Script.

Are there any other workarounds for this? Or is there any other way you think might work?


Solution

  • You might be able to leverage resumable uploads via the Drive API. See Tanaike's solution:

    Resumable Upload for Web Apps using Google Apps Script
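A resumable upload sends the file to Drive in pieces, each labelled with a Content-Range header, and the Drive API documentation asks for every chunk except the last to be a multiple of 256 KiB. As a rough illustration only, the sketch below computes the byte ranges for such an upload. It assumes the total file size is known up front; buildChunkRanges and CHUNK_UNIT are hypothetical names and are not taken from Tanaike's code or from the Drive API.

```javascript
// Drive resumable uploads expect chunk sizes in multiples of 256 KiB
// (the final chunk may be smaller).
var CHUNK_UNIT = 256 * 1024;

// Split totalBytes into consecutive byte ranges of chunkBytes each and
// build the matching Content-Range header value for every chunk.
// Hypothetical helper for illustration; it performs no network calls.
function buildChunkRanges(totalBytes, chunkBytes) {
  if (chunkBytes <= 0 || chunkBytes % CHUNK_UNIT !== 0) {
    throw new Error("chunkBytes must be a positive multiple of 256 KiB");
  }
  var ranges = [];
  for (var start = 0; start < totalBytes; start += chunkBytes) {
    // End offsets are inclusive in Content-Range.
    var end = Math.min(start + chunkBytes, totalBytes) - 1;
    ranges.push({
      start: start,
      end: end,
      contentRange: "bytes " + start + "-" + end + "/" + totalBytes
    });
  }
  return ranges;
}
```

Each range would then be sent as one PUT request to the upload session URL, with contentRange as the Content-Range header. Fetching the source CSV in matching pieces would additionally require the server hosting it to honor HTTP Range requests, which is an assumption about that server and not something the question confirms.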