I found out that an office at my work spent weeks manually going through an Excel spreadsheet containing a database with >500,000 rows looking for duplicate rows matching certain criteria. The duplicates could not simply be erased before being researched, as a single mistake could have potentially lost hundreds of thousands of dollars in lost production. I decided simply flagging them and referencing the originating row would be the best answer in this case. So I decided to look into macros to see how much time could have been saved by using a simple macro instead. I am using this as a programming learning experience, so please no "here's a =function()" answers.
I've written a macro and changed it several times to no avail (most current is below). I wanted to use String variables because there's no telling what has been entered into the cells that will be checked. Here's what I've tried, failed, and learned(?) from this site:
Initially, I tried declaring a variable, and attaching a value from a cell directly to it. e.g. Dim myString As String Set myString = Cells(x, x).Value
However, I kept getting object errors. Thanks to Michael's response here, I learned that you have to use the Range
variable to use Set
.
My next issue has been getting a "type mismatch" error. I'm trying to assign and compare a stored variable against another stored variable, and I'm sure this is causing the issue. I initially tried Dim myRange As Range, myString As String Set myRange = Cells(x, x).Value myString = myRange
. This obviously didn't work, so I tried using the CStr()
"change to string" function to convert the Range variable to the String variable I want. And that's where I'm stuck.
Sub Duplicate()
'Declare the variables
Dim NSNrange, PNrange, KitIDrange As Range
Dim NSN, PN, KitID As String
Dim NSNCheck, PNCheck, KitIDCheck As String
Dim i, j, printColumn, rowCount As Integer
'Set which column we want to print duplicates on, and count the number of rows used
rowCount = ActiveSheet.UsedRange.Rows.Count
printColumn = 9
'Lets get started!
'Clear the duplicate list column for a fresh start
Columns(printColumn).EntireColumn.Delete
'Start on line 2, and grab the cell values for the NSN, Part number and kit ID.
For i = 2 To rowCount
Set NSNrange = Cells(i, 5).Value
Set PNrange = Cells(i, 7).Value
Set KitIDrange = Cells(i, 2).Value
'Change whatever is contained in those cells into a string and store them into their respective containers
NSN = CStr(NSNrange)
PN = CStr(PNrange)
KitID = CStr(KitIDrange)
'Now let's look through the rest of the sheet and find any others that match the 3 variables that we stored above
For j = 2 To rowCount
'To avoid needless checks, we'll check to see if it's already had a duplicate found. If so, we'll just skip to the next row
If Cells(j, printColumn).Value = "" Then
'If the print column is blank, we'll grab the 3 values from the current row to compare against the above variables
Set NSNrange = Cells(j, 5).Value
Set PNrange = Cells(j, 7).Value
Set KitIDrange = Cells(j, 2).Value
'Now we store the contents into their very own container
NSNCheck = CStr(NSNrange)
PNCheck = CStr(PNrange)
KitIDCheck = CStr(KitIDrange)
'Check the initial row with the current row to see if the contents match. If so, print which row it is duplicated on.
If NSN = NSNCheck And PN = PNCheck And KitID = KitIDCheck Then Cells(j, printColumn).Value = "Duplicated on row " & i
End If
Next j
Next i
MsgBox "Search Complete"
End Sub
As you asked for comments in relation to type errors. There are a number of place where confusion could arise
1) Every line where you do multiple declarations on the same line like this:
Dim NSNrange, PNrange, KitIDrange As Range
Only the last variable is explicitly type declared (in this case as a Range
). The others are implicit Variant
. So, I have gone through and put on separate lines and declared them as I believe you may have meant them to be.
2) Using Activesheet
and, in other places, just Cells
or Range
, which implicitly references the Activesheet
, means if you have changed sheets by then you may longer be referring to the sheet you intended. So whilst I have kept Activesheet
in, and used an overarching With Activesheet
statement that then allows me to say .Cells
or .Range
etc, you should change this to using explicit sheet names.
3) Where ever you use the Set
keyword the expectation is your are working with an object (e.g. a Range
). Going by your naming convention I am going to say that you mean
Set NSNrange = Cells(i, 5)
when you say
Set NSNrange = Cells(i, 5).Value
Which sets a range to another range rather than a cell value.
4) I have changed your Integers to Longs. You are working with rows which can go beyond what Integer
type can handle so you risked overflow. Long
is safer.
5) Rather than doing a conversion on the Range
as follows
NSN = CStr(NSNrange)
Where the default property of the range, .Value
, will be taken, as you want a string you can drop the CStr
conversion and just take the .Text
property which will give you the string you want.
6) Rather than the empty string literal ""
comparison, I have used vbNullString
which is faster to assign and to check.
Option Explicit
Sub Duplicate()
Dim NSNrange As Range
Dim PNrange As Range
Dim KitIDrange As Range
Dim NSN As String
Dim PN As String
Dim KitID As String
Dim NSNCheck As String
Dim PNCheck As String
Dim KitIDCheck As String
Dim i As Long
Dim j As Long
Dim printColumn As Long
Dim rowCount As Long
With ActiveSheet
rowCount = .UsedRange.Rows.Count
printColumn = 9
.Columns(printColumn).EntireColumn.Delete
For i = 2 To rowCount
Set NSNrange = .Cells(i, 5)
Set PNrange = .Cells(i, 7)
Set KitIDrange = .Cells(i, 2)
NSN = NSNrange.Text
PN = PNrange.Text
KitID = KitIDrange.Text
For j = 2 To rowCount
If .Cells(j, printColumn).Value = vbNullString Then
Set NSNrange = .Cells(j, 5)
Set PNrange = .Cells(j, 7)
Set KitIDrange = .Cells(j, 2)
NSNCheck = NSNrange.Text
PNCheck = PNrange.Text
KitIDCheck = KitIDrange.Text
If NSN = NSNCheck And PN = PNCheck And KitID = KitIDCheck Then
.Cells(j, printColumn).Value = "Duplicated on row " & i
End If
End If
Next j
Next i
End With
MsgBox "Search Complete"
End Sub