I have a folder (let's call it the source folder) on Google Drive that is updated from time to time with new zip files (underlying files are PDFs). I am trying to use Google Apps Script to unzip only the new zip files and place the underlying PDFs in another folder (let's call it the destination folder).
I am currently using the following code to unzip the files in the source folder, running on a time-based trigger. My current code does not differentiate between old and new zip files so I am getting a large number of duplicates accumulating in the destination folder. (I found this code on WeirdGeek: https://www.weirdgeek.com/2019/10/unzip-files-using-google-apps-script/)
function Unzip() {
//Add folder ID to select the folder where zipped files are placed
var SourceFolder = DriveApp.getFolderById("1KbyB2vTUfbwYdzBEyIwzTliXKjATbW8A")
//Add folder ID to select the folder where unzipped files are to be placed
var DestinationFolder = DriveApp.getFolderById("1Z-iVlcROe5kVX8IkBlV9a98WKlvlfp3U")
//Select the Zip files from source folder using the Mimetype of ZIP
var ZIPFiles = SourceFolder.getFilesByType(MimeType.ZIP)
//Loop over all the Zip files
while (ZIPFiles.hasNext()){
// Get the blob of all the zip files one by one
var fileBlob = ZIPFiles.next().getBlob();
//Use the Utilities Class to unzip the blob
var unZippedfile = Utilities.unzip(fileBlob);
//Unzip the file and save it on destination folder
var newDriveFile = DestinationFolder.createFile(unZippedfile[0]);
}
}
I initially thought to add some sort of time-based restriction to the function, but because the source folder is being synced (using MultCloud) with an sFTP site, I don't want to go that direction.
I also found the following code, which puts a "replace" restriction on saving new spreadsheets, but I couldn't figure out how to integrate it with my code. (Code is from user Tainake.)
function saveAsSpreadsheet() {
var folderId = "0B8xnkPYxGFbUMktOWm14TVA3Yjg";
var folder = DriveApp.getFolderById(folderId);
var files = folder.getFilesByName(getFilename());
if (files.hasNext()) {
files.next().setTrashed(true);
}
var sheet = SpreadsheetApp.getActiveSpreadsheet();
DriveApp.getFileById(sheet.getId()).makeCopy(getFilename(), folder);
}
Any ideas on how to solve this problem would be appreciated! I am a complete noob so I apologize in advance if this is a stupid question.
EDIT: I could not figure out how to unzip only "new" files in the source folder, so my new code moves all files in the destination folder to the trash and then unzips all files in the source folder. Code is below:
function Unzip() {
//Add folder ID to select the folder where zipped files are placed
var SourceFolder = DriveApp.getFolderById("1KbyB2vTUfbwYdzBEyIwzTliXKjATbW8A")
//Add folder ID to select the folder where unzipped files are to be placed
var DestinationFolder = DriveApp.getFolderById("1Z-iVlcROe5kVX8IkBlV9a98WKlvlfp3U")
//Delete files from the destination folder
//Get the files in the destination folder
var files = DestinationFolder.getFiles();
//Loop through the files in the destination folder
while(files.hasNext()){
//Get the individual file in the destination folder to process
var file = files.next();
//Trash that file
file.setTrashed(true);
}
//Select the Zip files from source folder using the Mimetype of ZIP
var ZIPFiles = SourceFolder.getFilesByType(MimeType.ZIP)
//Loop over all the Zip files
while (ZIPFiles.hasNext()){
// Get the blob of all the zip files one by one
var fileBlob = ZIPFiles.next().getBlob();
//Use the Utilities Class to unzip the blob
var unZippedfile = Utilities.unzip(fileBlob);
//Unzip the file and save it on destination folder
var newDriveFile = DestinationFolder.createFile(unZippedfile[0]);
}
}
I can see how this may not be the best solution, but it lets MultCloud sync the zip files into my Google Drive and then unzips them with a function that runs from time to time. Does anyone have a better idea for accomplishing the same thing without deleting and recreating all the files every time?
EDIT 2: Thank you to Cameron, this question is answered. I am pasting the full code I am using below, for posterity / other newbies so that they don't have to piece it together:
function Unzip() {
  //Add folder ID to select the folder where zipped files are placed
  var SourceFolder = DriveApp.getFolderById("1KbyB2vTUfbwYdzBEyIwzTliXKjATbW8A");
  //Add folder ID to select the folder where unzipped files are to be placed
  var DestinationFolder = DriveApp.getFolderById("1Z-iVlcROe5kVX8IkBlV9a98WKlvlfp3U");
  //Select the Zip files from source folder using the Mimetype of ZIP
  var ZIPFiles = SourceFolder.getFilesByType(MimeType.ZIP);
  var now = new Date(); //get current time after you fetch the file list from Drive.
  //Get script properties and check for stored "last_execution_time"
  var properties = PropertiesService.getScriptProperties();
  var cutoff_datetime = properties.getProperty('last_execution_time');
  //if we have last execution date, stored as a string, convert it to a Date object.
  if (cutoff_datetime) {
    cutoff_datetime = new Date(cutoff_datetime);
  }
  //Loop over all the Zip files
  while (ZIPFiles.hasNext()) {
    var file = ZIPFiles.next();
    //if no stored last execution, or file is newer than last execution, process the file.
    if (!cutoff_datetime || file.getDateCreated() > cutoff_datetime) {
      var fileBlob = file.getBlob();
      //Use the Utilities Class to unzip the blob
      var unZippedfile = Utilities.unzip(fileBlob);
      //Save the first file from the archive to the destination folder
      var newDriveFile = DestinationFolder.createFile(unZippedfile[0]);
    }
  }
  //store "now" as last execution time as a string, to be referenced on next run.
  properties.setProperty('last_execution_time', now.toString());
}
You can use the getDateCreated() method on a File object to determine when the file was created. By comparing this value against a cutoff time, you can determine whether the file is new. If your script is triggered with at least a few hours between executions, you could use a hardcoded cutoff. For example, if the script is triggered every six hours, you could ignore any files not created within the last six hours.
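As a rough sketch of the hardcoded-cutoff idea, something like the following could work. The helper name isCreatedWithinHours, the function name unzipRecent, and the placeholder folder IDs are made up for illustration; only DriveApp, MimeType, Utilities.unzip and getDateCreated() come from Apps Script itself. The six-hour window is an assumption that should match your trigger interval.

```javascript
// Hypothetical helper (not part of Apps Script): returns true if `created`
// is later than `hours` hours before `now`.
function isCreatedWithinHours(created, hours, now) {
  var cutoff = new Date(now.getTime() - hours * 60 * 60 * 1000);
  return created.getTime() > cutoff.getTime();
}

// Sketch of how the helper might be used. This part only runs inside
// Apps Script, where DriveApp, MimeType and Utilities are defined.
function unzipRecent() {
  var sourceFolder = DriveApp.getFolderById('SOURCE_FOLDER_ID'); // placeholder ID
  var destinationFolder = DriveApp.getFolderById('DESTINATION_FOLDER_ID'); // placeholder ID
  var zipFiles = sourceFolder.getFilesByType(MimeType.ZIP);
  var now = new Date();
  while (zipFiles.hasNext()) {
    var file = zipFiles.next();
    // Assumed window: 6 hours, matching a six-hour trigger.
    if (isCreatedWithinHours(file.getDateCreated(), 6, now)) {
      var blobs = Utilities.unzip(file.getBlob());
      // Saves only the first entry of the archive, as in the question's code.
      destinationFolder.createFile(blobs[0]);
    }
  }
}
```

The weakness of this approach is that a skipped or delayed trigger run leaves a gap, and files created during that gap are never processed.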
However, a more robust approach would be to store the last successful execution time in a Script Property, so you could always process any files created since the last successful execution.
Note that this code will process all files currently in the folder the first time it runs. After that, it will only process files created since the last run.
var ZIPFiles = SourceFolder.getFilesByType(MimeType.ZIP);
var now = new Date(); //get current time after you fetch the file list from Drive.
//Get script properties and check for stored "last_execution_time"
var properties = PropertiesService.getScriptProperties();
var cutoff_datetime = properties.getProperty('last_execution_time');
//if we have last execution date, stored as a string, convert it to a Date object.
if (cutoff_datetime) {
  cutoff_datetime = new Date(cutoff_datetime);
}
//Loop over all the Zip files
while (ZIPFiles.hasNext()) {
  var file = ZIPFiles.next();
  //if no stored last execution, or file is newer than last execution, process the file.
  if (!cutoff_datetime || file.getDateCreated() > cutoff_datetime) {
    var fileBlob = file.getBlob();
    //Use the Utilities Class to unzip the blob
    var unZippedfile = Utilities.unzip(fileBlob);
    //Save the first file from the archive to the destination folder
    var newDriveFile = DestinationFolder.createFile(unZippedfile[0]);
  }
}
//store "now" as last execution time as a string, to be referenced on next run.
properties.setProperty('last_execution_time', now.toString());
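If it helps to see the decision logic on its own, here is a minimal sketch of the "first run processes everything, later runs only process newer files" rule. The names shouldProcess and nextCutoffValue are hypothetical helpers, not part of Apps Script. Storing the timestamp with toISOString() instead of toString() is my own variation: it is an assumption that you would prefer an unambiguous UTC format, and either string can be parsed back with new Date().

```javascript
// Hypothetical helper: decides whether a file should be processed.
// `storedCutoff` is the raw property value: null on the first run,
// otherwise a date string written by a previous run.
function shouldProcess(created, storedCutoff) {
  if (!storedCutoff) {
    return true; // no previous run recorded, so process everything
  }
  return created.getTime() > new Date(storedCutoff).getTime();
}

// Hypothetical helper: the string to store for the next run.
function nextCutoffValue(now) {
  return now.toISOString(); // e.g. "2020-01-01T12:00:00.000Z"
}
```

Inside the loop, the condition would then read shouldProcess(file.getDateCreated(), properties.getProperty('last_execution_time')), and the final line would store nextCutoffValue(now).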