I'm using this YouTuber's csv as an example:
I'd like to move all the lines with duplicated names into another csv file. Is there a way to do this with an Excel formula?
My aim would be to have the first Mariya, Andrew and Nancy rows in this csv and move all the duplicates into another one. Then in that 2nd csv do the same thing so that I end up with a few csv files each with the row data for each name only appearing once (1001, 1003, 1008... will be in the 1st csv; 1002, 1004, 1014... will be in the 2nd csv and so on).
I have so far consolidated csv data into one file using the cmd prompt.
'VBA Codeto movethe entire row if cells in column 4 has the value Duplicate
Sub move_rows_to_another_sheet()
For Each myCell In Selection.Columns(4).Cells
If myCell.Value = "Duplicate" Then
myCell.EntireRow.Copy Sheet2.Range("A" & Rows.Count).End(3)(2)
myCell.EntireRow.Delete
End If
Next
End Sub
Sheet 1:
.
I want my code to put all the duplicate rows into sheet 2. Then run the same code in sheet 2 and create a new sheet automatically, and continue doing this until I have several sheets without any duplicates in the specified column (Co Name).
The formula: =IF(COUNTIF($A$2:$A5,A5)>1, "Duplicate","")
works intermittently. It ignores the first occurrence. It sometimes misses duplicates.
If I understand you correctly:
Sub test()
Dim rg As Range: Dim rgHdr As Range: Dim cell As Range
Dim arr: Dim el: Dim cnt As Long: Dim nm As Range
'delete certain sheets if exist
Application.DisplayAlerts = False
For Each sh In Sheets
If InStr(sh.Name, "result") Then sh.Delete
Next
Application.DisplayAlerts = True
'csv file has only one sheet, so it's the active sheet
'set the range of the name (column D) as rg variable on the active sheet
With ActiveSheet
Set rg = .Range("D2", .Range("D" & Rows.Count).End(xlUp))
Set rgHdr = .Range("A1", .Range("A1").End(xlToRight))
End With
'get each unique name in rg as arr variable
Set arr = CreateObject("scripting.dictionary")
For Each cell In rg: arr.Item(cell.Value) = 1: Next
'loop to each item (the unique name) in arr as el variable to get the most occurence of the name
'then have the count value as cnt variable
'for example, if under column D (SalesPerson) Sales X appear for 6 times while all other sales < 6
'then it means the total sheet needed is 6 sheets where the 6th sheet will only has this Sales X and his data
cnt = 0
For Each el In arr
If Application.CountIf(rg, el) > cnt Then cnt = Application.CountIf(rg, el): nm = el
Next
'create new sheets as many as the cnt value and put the rgHdr value
For i = 1 To cnt
Sheets.Add.Name = "result_" & Format(i, "00")
Range("A1").Resize(1, rgHdr.Columns.Count).Value = rgHdr.Value
Next i
'again, loop to each item in arr as el variable
'this time is to get the range of cells which value = the looped el (the name) as nm variable
For Each el In arr
With rg
.Replace el, True, xlWhole, , False, , False, False
Set nm = .SpecialCells(xlConstants, xlLogical) '.EntireRow.Delete
.Replace True, el, xlWhole, , False, , False, False
End With
'now the nm variable has the range of cells of the looped name
'next is to loop to each cell of nm variable and copy each looped cell row to the designated sheet
i = 1
For Each cell In nm
cell.EntireRow.Copy Destination:=Sheets("result_" & Format(i, "00")).Range("A" & Rows.Count).End(xlUp).Offset(1, 0)
i = i + 1
Next
Next
End Sub
FYI, the sub does not create a new csv file, but put the result in created sheet.
If in each created sheet result is the one that you expected but then you want that each created sheet is saved as csv file, then it need additional code.