I am trying to fetch a large CSV from an external URL in Apps Script. CSV dimensions for reference: (no. of rows = 124653, no. of columns = 52, fileSize = 72 MB)
When I download the CSV manually via the URL in a browser, it contains all the records (no. of records = 124653). But when I fetch the same URL in Apps Script using UrlFetchApp.fetch(URL), only some of the records are returned (no. of records = 98701).
I guess this is because of the response size limit of UrlFetchApp.fetch, which is 50 MB per call according to the documentation: https://developers.google.com/apps-script/guides/services/quotas#current_limitations
Is there any alternative to this? Ideally the solution should use only Google Apps Script, but I am open to exploring other options if that is not possible. Once the file is parsed, the data will be stored in BigQuery.
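This is possible using the Range HTTP header, provided your server supports partial downloads. Each request then fetches one byte range that stays under the 50 MB limit, and the parts are joined before parsing.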
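let csvStr = '';
// Each range stays under the 50 MB UrlFetchApp response limit.
const ranges = ["bytes=0-49999999", "bytes=50000000-72074760"];
for (const range of ranges) {
  const options = {
    "headers": {
      "Range": range
    }
  };
  // redirectURL is the CSV download URL, assumed to be defined earlier.
  const response = UrlFetchApp.fetch(redirectURL, options);
  const responseCode = response.getResponseCode();
  // 206 Partial Content means the server honoured the Range header.
  if (responseCode != 206) {
    throw new Error('Expected 206 Partial Content but got ' + responseCode);
  }
  csvStr = csvStr + response.getContentText();
}
const arr = Utilities.parseCsv(csvStr);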
...
To check whether your server supports partial downloads, log the response headers (response.getHeaders()) and check that the Accept-Ranges header is present and does not have the value none. Servers that support byte ranges typically send Accept-Ranges: bytes.
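That header check could be wrapped in a small helper along these lines. This is only a sketch: supportsByteRanges and probeRangeSupport are hypothetical names, and the case-insensitive lookup is an assumption, since header names may come back in different casing depending on the server. Some servers honour Range without advertising it, so a 206 response to a ranged request remains the definitive test.

```javascript
// Hypothetical helper: returns true when the given headers object
// advertises range support (Accept-Ranges present and not "none").
function supportsByteRanges(headers) {
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() === 'accept-ranges') {
      return String(headers[name]).trim().toLowerCase() !== 'none';
    }
  }
  // Header absent: the server does not advertise range support.
  return false;
}

// Apps Script usage (hypothetical wrapper): request a single byte so the
// probe stays small, then inspect the headers of that response.
function probeRangeSupport(url) {
  const response = UrlFetchApp.fetch(url, {
    headers: { Range: 'bytes=0-0' },
    muteHttpExceptions: true
  });
  return supportsByteRanges(response.getHeaders());
}
```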
My CSV file was 72 MB. I hardcoded the end of the last range in the example code for simplicity, but the exact file size can be read from the response headers and the ranges set dynamically. A request without a Range header reports the size in Content-Length; a 206 response to a ranged request reports it after the slash in Content-Range (its Content-Length is only the size of that part).
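As a rough sketch of setting the ranges dynamically, the helpers below read the total size from a Content-Range value and split it into chunks. The names parseTotalSize, buildRanges and getRangesForUrl are hypothetical, and the 50000000-byte chunk size is simply taken from the hardcoded example above.

```javascript
// Hypothetical helper: extracts the total size from a Content-Range value
// such as "bytes 0-0/72074761". Throws if the total is missing or "*".
function parseTotalSize(contentRange) {
  const match = /\/(\d+)\s*$/.exec(String(contentRange));
  if (!match) {
    throw new Error('Total size not available in Content-Range: ' + contentRange);
  }
  return Number(match[1]);
}

// Hypothetical helper: splits totalBytes into inclusive "bytes=start-end"
// ranges of at most chunkBytes each.
function buildRanges(totalBytes, chunkBytes) {
  const ranges = [];
  for (let start = 0; start < totalBytes; start += chunkBytes) {
    const end = Math.min(start + chunkBytes, totalBytes) - 1;
    ranges.push('bytes=' + start + '-' + end);
  }
  return ranges;
}

// Apps Script usage (hypothetical wrapper): probe with a one-byte range,
// then build the full list of ranges from the reported total size.
function getRangesForUrl(url) {
  const probe = UrlFetchApp.fetch(url, { headers: { Range: 'bytes=0-0' } });
  const headers = probe.getHeaders();
  const contentRange = headers['Content-Range'] || headers['content-range'];
  return buildRanges(parseTotalSize(contentRange), 50000000);
}
```

For a 72074761-byte file, buildRanges(72074761, 50000000) produces the same two ranges that are hardcoded in the example above.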
Thank you @TheMaster for the suggestion in another post which helped in finding this answer!