Search code examples
vbaexcelexcel-2013

Better method to Split Cell values in multiple Rows and Concatenate these values in the next Column with formatting intact using Excel-VBA


Here is the view of my excel sheet: enter image description here

Document Overview:

It has multiple columns which may vary as per the requirement.Document ID's and Document Versions are always available in each sheet, however the Name of the Column (Ex; Doc ID or ID / Doc Version or Doc #) and Columns (Ex; Column G & H / Column J & K) may vary.

In this case, Document ID - Column C and Document version - Column D may contain multiple values in each cell.

Document Id always has 9 digits (filled with trailing zeros if the Id does not have enough digits). Ex; 000987094, 123456100 , 234567899, 023456789 etc.

Document Version always has the fixed format of "0.0" or "00.0", Ex; 1.0, 23.0, 2.1 etc.

Description of what I have done so far:

I use VBA Macro to split the Cells containing multiple values (ID and related Versions, highlighted in the uploaded image) into rows below them. After that I Concatenate the Split Values into next column by inserting a new column manually and then using another Macro to Concatenate.

Here is the Output after I run the Macro:

enter image description here

Macros:

    Sub SplitCellValuesIntoRows()

        Dim rng_all_data As Range
        'Set rng_all_data = Application.InputBox(prompt:="Select the Range of cells:", Type:=8)
        Set rng_all_data = ActiveSheet.UsedRange
        Dim int_row As Integer
        int_row = 0

        On Error Resume Next

        Dim sht_out As Worksheet
        Set sht_out = Worksheets.Add

        Dim rng_row As Range
        For Each rng_row In rng_all_data.Rows

            Dim int_col As Integer
            int_col = 0

            Dim int_max_splits As Integer
            int_max_splits = 0

            Dim rng_col As Range
            For Each rng_col In rng_row.Columns

                Dim col_parts As Variant
                col_parts = Split(rng_col, vbLf)

                If UBound(col_parts) > int_max_splits Then
                    int_max_splits = UBound(col_parts)
                End If

                 sht_out.Range("A1").Offset(int_row, int_col).Resize(UBound(col_parts) + 1) = Application.Transpose(col_parts)

                int_col = int_col + 1
            Next

            int_row = int_row + int_max_splits + 1
        Next

    End Sub



Sub Join_em()

    For i = 2 To ActiveSheet.UsedRange.Rows.Count
        Range("E" & i).Formula = (Range("C" & i).Value & " " & Range("D" & i).Value)
    Next i

End Sub

In the macro Join_em(), I fill the values manually after using the first Macro SplitCellValuesIntoRows(), based on the output columns of Document ID and Document Version to get the Concatenated values.

Range("E" & i).Formula = (Range("C" & i).Value & " " & Range("D" & i).Value)

C, D and E in this case.

What I want to achieve:

I am trying to achieve something like this as the output:

enter image description here

  1. Split the Cells with multiple values by adding rows in the same sheet and keep the destination cell formatting intact.
  2. Add a new Column E (in this case) and Concatenate the values from Document ID and Document Version with the leading and trailing zeroes intact.
  3. Since the format for Document ID (9 digits with/without trailing zeroes) and Document versions ("0.0" or "00.0") are always fixed, but the Name and Column # are not fixed, Is it possible to use regex and combine the respective cells into newly inserted column beside them automatically after splitting multiple cell values into individual rows. (It would be awesome to get to know how to make it work,I have tried it without success. I don't know the logic to make it work)

Here is the link for downloading the dummy Excel Sheet, in case it is needed for clarity.

DummyBook.xlsx


Solution

  • Finding your columns

    Regex solutions are extremely useful when you are looking for complex string combinations, but in VBA they can be a little slow. Given the simplicity of your match patterns, it'd probably easier and faster to use more 'primitive' string comparisons. Say, for example, your Document Id's are between 10000 and 1000000000, you could simply try to convert your string to a Long and see if the value is between those numbers. A similar approach could be used comparing each side of a decimal for your Document Version comparison.

    With any string comparison, Regex or otherwise, you need to guard against false matches. The value of cell "A3", for example, matches the pattern of a Document Version. So you need to put in place some safeguards to prevent your code selecting the wrong column; only you will know what those could reliably be, but it might be something as simple as saying a Document Version can only occur in Column "C" or after.

    Concatenating the values

    In your spreadsheet, all cells are formatted as Text. This means that even numbers will be interpreted as strings - hence the little green triangle warning you of this in your ID and Version cells. Had they been numbers, then you'd have needed to apply a number format to those cells (eg #0.# for the Version). For your spreadsheet, concatenation is no more complicated than joining the two strings as in str = str1 & " " & str2.

    In your second image, it looks as though you have a General cell format (or perhaps some kind of number format) so those values are interpreted as numbers. These would need to be formatted before concatenation, using the NumberFormat() function.

    Splitting the rows

    Splitting cells into rows, although syntactically easy, can be fiddly when you're trying to keep track of which row you're investigating. The way that I do it is to store the pertinent rows in a Collection and I keep referencing those collection objects as I need them. The advantage of this is that the Range reference in the Collection updates itself whenever rows are added.

    All in all, then, your code is relatively straightforward and an example of how it could work is given below. You'll note that I haven't bothered formatting the new rows and columns - that's fairly trivial and is something you could do yourself to suit your own needs. This code should be put in a Module:

    Option Explicit
    
    Private Const ID_IDX As Long = 0
    Private Const VER_IDX As Long = 1
    Private Const RNG_IDX As Long = 2
    
    Private Sub RunMe()
        Dim data As Variant, cols As Variant, items As Variant
        Dim r As Long, c As Long, i As Long, n As Long
        Dim ids() As String, vers() As String
        Dim addItems As Collection, concatItems As Collection
        Dim dataRng As Range, rng As Range
        Dim writeID() As Variant, writeVer() As Variant, writeConcat() As Variant
        Dim dataStartRow As Long
    
        'Define the range we're interested in and read into an array.
        With Sheet1 'adjust for your worksheet object
            Set dataRng = .Range(.Cells(1, "A"), .Cells(.Rows.Count, "A").End(xlUp)) _
                          .Resize(, .Cells(1, .Columns.Count).End(xlToLeft).Column)
           End With
        data = dataRng.Value2
        dataStartRow = 2
    
        'Find the two target columns
        cols = AcquireIdAndVerCol(data, 3, 8)
        If IsEmpty(cols) Then
            MsgBox "Unable to find Id and Ver columns."
            Exit Sub
        End If
    
        With dataRng
            'Add a column next to the version number column.
            .Columns(cols(VER_IDX)).Offset(, 1).Insert Shift:=xlShiftToRight, CopyOrigin:=xlFormatFromLeftOrAbove
    
            'Add a column to our range.
            'This is to cover the case that the rightmost column is the version number column.
            Set dataRng = .Resize(, .Columns.Count + 1)
        End With
    
        'Find the rows that need to be split and concatenate the target strings.
        Set addItems = New Collection
        Set concatItems = New Collection
        For r = dataStartRow To UBound(data, 1)
    
            ids = Split(data(r, cols(ID_IDX)), vbLf)
            vers = Split(data(r, cols(VER_IDX)), vbLf)
            n = IIf(UBound(ids) >= UBound(vers), UBound(ids), UBound(vers))
    
            If n = 0 Then 'it's just one line of text.
    
                'Add concatenated text to list.
                concatItems.Add data(r, cols(ID_IDX)) & " " & data(r, cols(VER_IDX))
    
            ElseIf n > 0 Then 'it's multiple lines of text.
    
                'Transpose the id array.
                ReDim writeID(1 To UBound(ids) + 1, 1 To 1)
                For i = 0 To UBound(ids)
                    writeID(i + 1, 1) = ids(i)
                Next
                'Transpose the version array.
                ReDim writeVer(1 To UBound(vers) + 1, 1 To 1)
                For i = 0 To UBound(ids)
                    writeVer(i + 1, 1) = vers(i)
                Next
    
                'Add concatenated text to list.
                For i = 0 To n
                    concatItems.Add (IIf(UBound(ids) <= n And UBound(vers) <= n, ids(i) & " " & vers(i), Empty))
                Next
    
                'Add the range to be split to the collection.
                addItems.Add Array(writeID, writeVer, dataRng.Rows(r + 1).Resize(n))
    
            Else 'it's an empty cell
    
                'Add empty item to concatenated list in order to keep alignment.
                concatItems.Add Empty
    
            End If
    
        Next
    
        Application.ScreenUpdating = False
    
        'Split the ranges in the list.
        If addItems.Count > 0 Then
            For Each items In addItems
                'Add the rows.
                With items(RNG_IDX)
                    .Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
                    Set rng = .Offset(-.Rows.Count - 1).Resize(.Rows.Count + 1)
                    'Note: format your rng Range obect as desired here.
                End With
                'Write the id and version values.
                rng.Columns(cols(ID_IDX)).Value = items(ID_IDX)
                rng.Columns(cols(VER_IDX)).Value = items(VER_IDX)
            Next
        End If
    
        'Write the concatenated values.
        If concatItems.Count > 0 Then
            ReDim writeConcat(1 To concatItems.Count + dataStartRow - 1, 1 To 1)
            'Header to array.
            writeConcat(1, 1) = "Concat values"
            'Values from the collection to array.
            i = dataStartRow
            For Each items In concatItems
                writeConcat(i, 1) = items
                i = i + 1
            Next
            'Output array to range.
            With dataRng.Columns(cols(VER_IDX) + 1)
                .Value = writeConcat
                .AutoFit
            End With
        End If
    
        Application.ScreenUpdating = True
    End Sub
    
    Private Function AcquireIdAndVerCol(data As Variant, minCol As Long, maxCol As Long) As Variant
        Dim result(1) As Long
        Dim r As Long, c As Long, i As Long
        Dim items() As String
    
        'Check we're not operating outside bounds of data array.
        If minCol < LBound(data, 2) Then minCol = LBound(data, 2)
        If minCol > UBound(data, 2) Then minCol = UBound(data, 2)
        If maxCol < LBound(data, 2) Then maxCol = LBound(data, 2)
        If maxCol > UBound(data, 2) Then maxCol = UBound(data, 2)
    
        'Loop through data to find the two columns.
        'Once found, leave the function.
        For r = 1 To UBound(data, 1)
            For c = minCol To maxCol
                items = Split(data(r, c), vbLf)
                For i = 0 To UBound(items)
                    If result(ID_IDX) = 0 Then
                        If IsDocId(items(i)) Then
                            result(ID_IDX) = c
                            If result(VER_IDX) = 0 Then
                                Exit For
                            Else
                                AcquireIdAndVerCol = result
                                Exit Function
                            End If
                        End If
                    End If
                    If result(VER_IDX) = 0 Then
                        If IsDocVer(items(i)) Then
                            result(VER_IDX) = c
                            If result(ID_IDX) = 0 Then
                                Exit For
                            Else
                                AcquireIdAndVerCol = result
                                Exit Function
                            End If
                        End If
                    End If
                Next
            Next
        Next
    
    End Function
    Private Function IsDocId(val As String) As Boolean
        Dim n As Long
    
        n = TryClng(val)
        IsDocId = (n > 9999 And n <= 999999999)
    End Function
    
    Private Function IsDocVer(val As String) As Boolean
        Dim n As Long, m As Long
        Dim items() As String
    
        items = Split(val, ".")
        If UBound(items) <> 1 Then Exit Function
    
        n = TryClng(items(0))
        m = TryClng(items(1))
    
        IsDocVer = (n > 0 And n <= 99) And (m >= 0 And m <= 9)
    End Function
    
    '-------------------------------------------------------------------
    'Converts a variant to a Long or returns a fail value as a Long
    'if the conversion failed.
    '-------------------------------------------------------------------
    Private Function TryClng(expr As Variant, Optional fail As Long = -1) As Long
        Dim n As Long
    
        n = fail
        On Error Resume Next
        n = CLng(expr)
        On Error GoTo 0
    
        TryClng = n
    End Function