Search code examples
arraysgoogle-apps-scriptgoogle-sheetsmergecopy-paste

Find unique ID, copy and paste rows to new tab and merge certain rows together if ID is duplicate


I am new to GAS with a little knowledge in Javascript

I am trying to read a list of IDs (column A in 'Outbound' sheet) and paste IDs to new 'temp' sheet (col A) and only show ID once if ID is duplicated, This part of my code is working fine.

Next I want to copy the rows of data over from 'Outbound' sheet to the new 'temp' sheet if ID match, but if a ID is duplicated then it will merge columns E:K. I haven't got to the merging part as my code is not working when looking through the IDs and pasting the relevant rows across.

Link to Google Sheet and script: Click Here

This is my code so far, I appreciate some variables/lines of codes are not used as I have been playing around with my code and there may be ways to speed things up.

var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getActiveSheet();
var newdata = new Array();
var data = ss.getDataRange().getValues();         // get all data
var destSheet = ss.getSheetByName("temp");
var lastRow = sheet.getLastRow();
var lastCol = sheet.getLastColumn();

function main(){
  var data = findUnique();
  sort();
  copyRowData();
}

function findUnique(){
  for(nn in data){
    var duplicate = false;
    for(j in newdata){
      if(data[nn][col] == newdata[j][0]){
        duplicate = true;
      }
    }
    if(!duplicate){
      newdata.push([data[nn][col]]);
    }
  }
  //Logger.log(newdata);
}

function sort(){
  newdata.sort(function(x,y){
    var xp = Number(x[0]);                      // ensure you get numbers
    var yp = Number(y[0]);
    return xp == yp ? 0 : xp < yp ? -1 : 1;     // sort on numeric ascending
  });
  //Logger.log(newdata);

  destSheet.clear();
  destSheet.getRange(1,1,newdata.length,newdata[0].length).setValues(newdata);    // Paste unique HS ID to new tab
}

function copyRowData() {  
  //var sheet = ss.getSheetByName('Outbound'); //source sheet
  var range = sheet.getRange(2,1,lastRow,5).getValues();
  Logger.log(range);
  var destlastRow = destSheet.getLastRow();
  var criteria = destSheet.getRange(1,1,destlastRow).getValues();
  Logger.log(criteria);
  var data1 = [];
  var j =[];
  Logger.log(range.length);

  //Condition to check in A:A, if true, copy the same row to data array 
  for (i=0;i<range.length;i++) {
    for (j=0; j<criteria.length; j++){
      if (range[i] == criteria[j]) {
        data1.push(range[i]);
      }
    }
  }
  Logger.log(data1.length);
  //Copy data array to destination sheet
  destSheet.getRange(2,2,data1.length).setValues(data1);
  //targetrange.setValues(data1)
}  

I am looking for an output similar to this, where Shaun and Kennedy have merged data in cells E to K: Click for image of expected outcome

Any help is much appreciated.


Solution

  • Modified Script

    I approached this a bit differently from your script.

    function main() {
      let file = SpreadsheetApp.getActive();
      let sourceSheet = file.getSheetByName("Outbound");
      let sourceRange = sourceSheet.getDataRange();
      let sourceValues = sourceRange.getValues();
    
      // Removing header row into its own variable
      let headers = sourceValues.shift();
    
      //==========================================
    
      // PHASE 1 - dealing with duplicates
    
      // Initializing the duplicate checking object
      // Using the ID as the key, objects will not
      // allow duplicate keys.
      let data = {}
    
      // For each row in the source
      // create another object with a key for each header
      // for each key assign an array with the values
      sourceValues.forEach(row => {
        let rowId = row[0]
        // If the id has already been added
        if (rowId in data) {
          // add the data to the array for each header
          headers.forEach((header, index) => {
            data[rowId][header].push(row[index]);
          })
        } else {
          // create a new object with an array for each header
          // initialize the array with one item
          // the value of the cell
          let entry = {}
          headers.forEach((header, index) => {
            entry[header] = [row[index]];
          })
          data[rowId] = entry
        }
      })
    
      // PHASE 2 - creating the output
    
      let output = []
      // You don't want the name to be merged
      // so put the indices of the columns that need to be merged here
      let indicesToMerge = [4,5,6,7,9,10]
    
      // For each unique id
      for (let id in data) {
        // create a row
        let newRow = []
        // temporary variable of id's content
        let entry = data[id]
        // for each header
        headers.forEach((header, index) => {
          // If this field should be merged
          if (indicesToMerge.includes(index)) {
            // joing all the values with a new line
            let content = entry[header].join("\n")
            // add to the new row
            newRow.push(content)
          } else {
            // if should not be merged
            // take the first value and add to new row
            newRow.push(entry[header][0])
          }
        })
        // add the newly constructed row to the output
        output.push(newRow)
      }
    
      //==========================================
    
      // update the target sheet with the output
      let targetSheet = file.getSheetByName("temp");
      let targetRange = targetSheet.getRange(
        2,1,output.length, output[0].length
      )
      targetRange.setValues(output)
    }
    

    Which outputs this on the temp sheet:

    enter image description here

    How the script works

    This script uses an object to store the data, here would be an example entry after the first phase of the script is done:

    '87817': 
    {
      ID: [87817, 87817, 87817],
      Name: ["Kennedy", "Kennedy", "Kennedy"],
      Surname: ["FFF", "FFF", "FFF"],
      Shift: ["NIGHTS", "NIGHTS", "NIGHTS"],
      "Area Manager completing initial conversation": ["AM1", "AM1", "AM1"],
      "WC Date ": [
        Sun Nov 29 2020 19:00:00 GMT-0500 (Eastern Standard Time),
        Sun Feb 14 2021 19:00:00 GMT-0500 (Eastern Standard Time),
        Sun Mar 07 2021 19:00:00 GMT-0500 (Eastern Standard Time),
      ],
      "Score ": [0.833, 0.821, 0.835],
      Comments: ["Comment 6", "Comment 10", "Comment 13"],
      "Intial Conversation date": ["Continue to monitor - no action", "", ""],
      "Stage 1 Meeting Date": [
        "N/A",
        Fri Feb 19 2021 19:00:00 GMT-0500 (Eastern Standard Time),
        Mon Mar 29 2021 19:00:00 GMT-0400 (Eastern Daylight Time),
      ],
      "Stage 1 Outcome": ["", "Go to Stage 1", "Stage 2"],
    };
    
    

    As you can see, if it finds a duplicate ID, for the first pass, it just copies all the information, including the name and surname etc.

    The next phase involves going through each of these entries and merging the headers that need to be merged by concatenating the results with a newline \n, resulting in a row like this:

    [
      87817,
      "Kennedy",
      "FFF",
      "NIGHTS",
      "AM1\nAM1\nAM1",
      "Sun Nov 29 2020 19:00:00 GMT-0500 (Eastern Standard Time)\nSun Feb 14 2021 19:00:00 GMT-0500 (Eastern Standard Time)\nSun Mar 07 2021 19:00:00 GMT-0500 (Eastern Standard Time)",
      "0.833\n0.821\n0.835",
      "Comment 6\nComment 10\nComment 13",
      "Continue to monitor - no action",
      "N/A\nFri Feb 19 2021 19:00:00 GMT-0500 (Eastern Standard Time)\nMon Mar 29 2021 19:00:00 GMT-0400 (Eastern Daylight Time)",
      "\nGo to Stage 1\nStage 2",
    ]
    

    Comments

    I believe the main difference between this script and yours is that it does everything in memory. That is, it gets all the data, and then never calls getRange or getValues again. Only at the end does it use getRange just for the purposes of outputting to the sheet.

    The other difference appears to be that this one uses the inbuilt property of objects to identify duplicates. I.e. an object key cannot be duplicated inside an object.

    Perhaps also, this approach takes two passes at the data, because the overhead is minimal and otherwise the code just gets hard to follow.

    Merging data like this can get endless as there are many tweaks and checks that can be implemented, but this is a working bare-bones solution that can get you started.