Search code examples
google-apps-scriptgoogle-sheetsworksheet-function

Google Sheets - Identify Duplicate Cells and Replace Cell Value App Script


Is there a way to identify duplicate cells and replace the duplicate's cell value to blank or clear it but not the entire row, only one column value. In my example below. I want to identify duplicates if Fruits and Item has the same value. In this case. Lime with item A and Apple with Item B are duplicates, and I want to replace Item duplicate value to blank.

Fruits Price Item
Lime 1 A
Apple 2 B
Lime 3 A
Apple 2 C
Apple 4 B

I've got this code but it deletes the entire row. I only want to compare column A and C and should remove cell value in Column C and keep the first instance and not delete the row.

function removeDuplicates() {
var sheet = SpreadsheetApp.getActiveSheet();
var rng = sheet.getRange("A2:C")
var data = rng.getValues();
var newData = new Array();
for(i in data){
var row = data[i];
var duplicate = false;
for(j in newData){
  if(row.join() == newData[j].join()){
    duplicate = true;
  }
}
if(!duplicate){
  newData.push(row);
}
}
rng.clearContent();
sheet.getRange(2, 1, newData.length, 
newData[0].length).setValues(newData);
}

Here's the result that I want.

Fruits Price Item
Lime 1 A
Apple 2 B
Lime 3
Apple 2 C
Apple 4

Solution

  • I believe your goal is as follows.

    • You want to achieve your showing sample input and output situations using Google Apps Script.

    In this case, how about the following modification?

    Modification points:

    • In your script, by row.join() == newData[j].join(), all values of columns "A", "B" and "C" are used. By this, the duplicated values of only columns "A" and "C" cannot be checked.
    • And, in order to search the duplicated values, a loop is included in a loop. By this, I'm worried that the process cost will become high.
    • In the case of var rng = sheet.getRange("A2:C"), all rows of the sheet are retrieved. In this case, the process cost will become high. So, you can use var rng = sheet.getRange("A2:C" + sheet.getLastRow()). By this, the data range is retrieved.

    When these points are reflected in your script, it becomes as follows.

    Modified script 1:

    In this pattern, your showing script is modified.

    function removeDuplicates() {
      var sheet = SpreadsheetApp.getActiveSheet();
      var rng = sheet.getRange("A2:C" + sheet.getLastRow()); // Modified
      var data = rng.getValues();
      var newData = new Array();
      for (i in data) {
        var row = data[i];
        var duplicate = false;
        for (j in newData) {
          if ([row[0] + row[2]].join() == [newData[j][0] + newData[j][2]].join()) { // Modified
            duplicate = true;
          }
        }
        if (!duplicate) {
          newData.push(row);
        } else {
          newData.push([row[0], row[1], null]); // Added
        }
      }
      rng.clearContent().setValues(newData); // Modified
    }
    

    Modified script 2:

    In this pattern, only the duplicated cell values of column "C" are removed.

    function removeDuplicates() {
      var sheet = SpreadsheetApp.getActiveSheet();
      var data = sheet.getRange("A2:C" + sheet.getLastRow()).getValues();
      var { ranges } = data.reduce((o, [a, , c], i) => {
        var key = a + c;
        if (o.obj[key]) {
          o.ranges.push(`C${i + 2}`);
        } else {
          o.obj[key] = true;
        }
        return o;
      }, { ranges: [], obj: {} });
      sheet.getRangeList(ranges).clearContent();
    }
    
    • When this script is run for your sample input situation, your sample output situation can be retrieved.

    References: