I currently trying to clean up a large dataset using Excel VBA. The dataset structure looks like this.
However, I would like to make it look like this instead, whereby if the cells in columns A:D all contain the same value, transpose the cells in column E. (And remove the duplicated cells from A:D)
Here is the code I did
Dim ws As Worksheet: Set ws = Sheets("test")
lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
Dim j As Integer
j = 6
For i = 2 To lastrow
If (Range("A" & i).Value = Range("A" & i + 1).Value) And (Range("B" & i).Value = Range("B" & i + 1).Value) And (Range("C" & i).Value = Range("C" & i + 1).Value) Then
Cells(i, j).Value = Cells(i + 1, 5).Value
j = j + 1
End If
'Reset J back to 6 if columns A to D does not match previous
If (Range("A" & i).Value <> Range("A" & i + 1).Value) Or (Range("B" & i).Value <> Range("B" & i + 1).Value) Or (Range("C" & i).Value <> Range("C" & i + 1).Value) Then
j = 6
End If
Next i
How can this be done?
This ended up more complex than I'd thought but seems to work OK
Sub Compact()
Const KEY_COLS As Long = 4
Dim ws As Worksheet, i As Long, k As String, nextEmpty As Long
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), "~~")
If Not dict.exists(k) Then
'move this row up?
If nextEmpty > 0 Then
ws.Cells(i, 1).Resize(1, KEY_COLS + 1).Cut ws.Cells(nextEmpty, 1)
dict.Add k, nextEmpty 'new key - store row#
nextEmpty = 0
Else
dict.Add k, i 'new key - store row#
End If
Else
'seen this key before - move value to that row and clear
ws.Cells(dict(k), Columns.Count).End(xlToLeft).Offset(0, 1).Value = _
ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents
If nextEmpty = 0 Then nextEmpty = i 'available row
End If
Next i
End Sub
Edit: this is a bit cleaner I think. It's split into separate "read" and "write" parts.
Sub Compact2()
Const KEY_COLS As Long = 4
Const SEP As String = "~~"
Dim ws As Worksheet, i As Long, k, col As Long, v
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
Set ws = Sheets("test")
'collect all the unique combinations and associated values
For i = 1 To ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
'create a row "key" from first KEY_COLS cells
k = Join(Application.Transpose(Application.Transpose( _
ws.Cells(i, 1).Resize(1, KEY_COLS))), SEP)
If Not dict.exists(k) Then dict.Add k, New Collection
dict(k).Add ws.Cells(i, KEY_COLS + 1).Value
ws.Rows(i).ClearContents 'clear row
Next i
're-populate the sheet from the dictionary
i = 1
For Each k In dict
ws.Cells(i, 1).Resize(1, KEY_COLS).Value = Split(k, SEP)
col = KEY_COLS + 1
For Each v In dict(k)
ws.Cells(i, col) = v
col = col + 1
Next v
i = i + 1
Next k
End Sub