How can I highlight cells with duplicate text strings in an Excel column?
Although it works perfectly well for duplicated cells, there doesn't seem to be any way to do this in the Conditional Formatting feature.
I frequently have an issue with customer bill of materials, where they contain duplicate references. In the provided example, reference R60 is listed twice in item 103, and reference R32 is in two different rows, in Item 105 and 106. So only looking for duplicate cells won't work.
Example (pasted from Excel, for some reason it won't let me insert a picture):
Item | Qty | Reference |
---|---|---|
100 | 1 | U12 |
101 | 1 | U3 |
102 | 5 | R38,R39,R40,R41,R45 |
103 | 1 | R60,R60 |
104 | 1 | R13 |
105 | 2 | R17,R32 |
106 | 2 | R32,R43 |
107 | 8 | R8-9,R26,R30,R36,R44,R58,R61 |
108 | 2 | R19,R24 |
109 | 2 | R53,R59 |
110 | 3 | R16,R46-47 |
Additionally, different customers will separate references in different ways. Some use a comma, some use a space, and some use both a comma and a space. Occasionally they will use a combination of them. There can be hundreds of references in a given cell, so using Text to Columns and then using Conditional Formatting (which I saw suggested as a possible solution in a similar post) isn't going to work for me. Ideally, if there is a solution to this, it will take all this into account. Being able to select a delimiter might work too.
Based on several hours of web searching and experimentation, COUNTIF seems to be the key, but I am not at all familiar with the function or how to manipulate it.
Below is the VBA code I've been working on. The first part is just using the Conditional Formatting feature. The second part is a mash-up of two different codes I found that I thought would work, but I'm probably not using them right. I am very much a neophyte at VBA coding. My apologies in advance for that.
Sub DuplicateRed()
'
' DuplicateRed Macro
' First, turn duplicate cells red and second, duplicate text strings Within cells Red
' - November 15 2021
' First, turn duplicate cells Red
Dim r As Range ' Runs the macro on the selected column / cells
Selection.FormatConditions.AddUniqueValues
Selection.FormatConditions(Selection.FormatConditions.Count).SetFirstPriority
Selection.FormatConditions(1).DupeUnique = xlDuplicate
With Selection.FormatConditions(1).Font
.Color = -16383844
.TintAndShade = 0
End With
With Selection.FormatConditions(1).Interior
.PatternColorIndex = xlAutomatic
.Color = 13551615
.TintAndShade = 0
End With
Selection.FormatConditions(1).StopIfTrue = False
'Second, turn any duplicate text stings within cells Red
Range(Addr) = Evaluate("IF(COUNTIF(" & Addr & "," & Addr &
")>1,""=""&" & Addr & "," & Addr & ")") On Error Resume Next
Range(Addr).SpecialCells(xlFormulas).Interior.ColorIndex = 6
Range(Addr).Replace "=", "", xlPart
' Locate duplicate values in selected range
If Application.Evaluate("COUNTIF(" & myDataRng.Address & "," & cell.Address & ")") > 1 Then
cell.Offset(0, 0).Font.Color = vbRed ' CHANGE COLOR TO RED.
End If Next cell
Set myDataRng = Nothing ErrHandler:
Application.EnableEvents = True
Application.ScreenUpdating = True
End Sub' ' DuplicateRed Macro ' First, turn duplicate
Ultimately, what I'd like to be able to do is select the references column (or a portion of it), and then run a VBA script on that selection to highlight the cells with duplicated references in them. I do not want to delete any of the duplicates, since I need to be able to tell customers that there is an error in their bill of materials.
Edit: New VBA code with help from JNevill
Sub highlight_duplicates()
' Turn duplicate cells and duplicate text strings within cells Red
' November 15 2021
' Credit to JNevill
'First, turn any duplicated text stings within cells Red
'Declare variables used in this script
Dim referenceRange As Range
Dim referenceCell As Range
Dim referenceArray As Variant
Dim referenceVal As String
Dim referenceItem As Variant
'Grab the selection into a variable
Set referenceRange = Selection
'iterate through each cell in the range
For Each referenceCell In referenceRange
'Because we can have either a space or a comma as a delimiter,
' lets make them all comma so it's easier to deal with.
' Note this doesn't change the value in the cell, just the
' variable here in VBA.
referenceVal = Replace(referenceCell.Value, ", ", ",")
referenceVal = Replace(referenceCell.Value, " ", ",")
'Break this thing into an array so it's easier to work with each
' value. The big advantage here is that we can iterate through
' an array, where iterating through a string is a nightmare.
referenceArray = Split(referenceVal, ",")
'We will use a dictionary to determine if there are duplicates in
' in this array. By definition an item in a dictionary can not be
' a duplicate so we just dump all the values of the array into
' the dictionary and then count elements of both the dictionary
' and the array. If they are they same, then the array has no
' duplicates.
With CreateObject("Scripting.Dictionary")
'Dump array into dictionary
For Each referenceItem In referenceArray
If Not .Exists(referenceItem) Then .Add referenceItem, 1
Next referenceItem
'Toggle looks of cell based on uniqueness
If .Count < UBound(referenceArray) + 1 Then
With referenceCell.Font
.Color = -16383844
.Bold = True
End With
Else
With referenceCell.Font
.Color = 1
.Bold = False
End With
End If
End With
Next referenceCell
' Second, turn duplicate cells Red
Dim r As Range
' Runs the macro on the selected column
Selection.FormatConditions.AddUniqueValues
Selection.FormatConditions(Selection.FormatConditions.Count).SetFirstPriority
Selection.FormatConditions(1).DupeUnique = xlDuplicate
With Selection.FormatConditions(1).Font
.Color = -16383844
.TintAndShade = 0
End With
With Selection.FormatConditions(1).Interior
.PatternColorIndex = xlAutomatic
.Color = 13551615
.TintAndShade = 0
End With
Selection.FormatConditions(1).StopIfTrue = False
End Sub
This is definitely one of those "This should be easy" things that can end up getting ugly quick. Thankfully there's a few tricks we can employ in VBA to make this easier to deal with.
Here's a working example:
Sub highlight_duplicates()
'Declare variables used in this script
Dim referenceRange As Range
Dim referenceCell As Range
Dim referenceArray As Variant
Dim referenceVal As String
Dim referenceItem As Variant
'Grab the selection into a variable
Set referenceRange = Selection
'iterate through each cell in the range
For Each referenceCell In referenceRange
'Because we can have either a space or a comma as a delimiter,
' lets make them all comma so it's easier to deal with.
' Note this doesn't change the value in the cell, just the
' variable here in VBA.
referenceVal = Replace(referenceCell.Value, " ", ",")
'Break this thing into an array so it's easier to work with each
' value. The big advantage here is that we can iterate through
' an array, where iterating through a string is a nightmare.
referenceArray = Split(referenceVal, ",")
'We will use a dictionary to determine if there are duplicates in
' in this array. By definition an item in a dictionary can not be
' a duplicate so we just dump all the values of the array into
' the dictionary and then count elements of both the dictionary
' and the array. If they are they same, then the array has no
' duplicates.
With CreateObject("Scripting.Dictionary")
'Dump array into dictionary
For Each referenceItem In referenceArray
If Not .Exists(referenceItem) Then .Add referenceItem, 1
Next referenceItem
'Toggle looks of cell based on uniqueness
If .Count < UBound(referenceArray) + 1 Then
With referenceCell.Font
.Color = -16383844
.Bold = True
End With
Else
With referenceCell.Font
.Color = 1
.Bold = False
End With
End If
End With
Next referenceCell
End Sub