Search code examples
excelvbastring-comparisonstring-concatenation

VBA Code to Concatenate strings from column if first integers, or first and third integers, in another column match


Alright, this is a very specific question. I have an excel macro written that takes a web URL, delimits it, transposes it, and then adds adjacent columns that describe the information in the originally transposed columns. Now, I need to add something to my macro that will loop through and check if the first character of one cell matches one of the first 4 characters of another cell. If it does, I need to concatenate strings from the descriptive columns to new cells. I'll illustrate this below:

enter image description here


3,435,201,0.5,%22type%25202%2520diabetes%22,0   Node    type 2 diabetes
4,165,97,0.5,%22diet%22,0                       Node    diet
5,149,248,0.5,%22lack%2520of%2520exercise%22,2  Node    lack of exercise
6,289,329,0.5,%22genetics%22,3                  Node    genetics
7,300,71,0.5,%22blood%2520pressure%2520%22,5    Node    blood pressure 
7,3,-7,1,0                                      Arrow   +
4,3,-21,1,0                                     Arrow   +
5,3,-22,1,0                                     Arrow   +
6,3,-34,1,0                                     Arrow   +
,7%5D                                           Tail     

I added color to make the concept of the problem more easily visualized. In row one of the first column, we see a red 3 that corresponds to 'type 2 diabetes'. In the fifth row of the first column, we see a blue 7 that corresponds to 'blood pressure'. These are both node objects, as the adjacent column signifies. In the sixth cell of the first column we see a blue 7 and a red 3. This indicates that an arrow (also signified by adjacent column) is connecting blood pressure to diabetes. In the next column over, we see an orange plus sign, which indicates this is a positive relationship.

The goal is to populate the next column over with "blood pressure + type diabetes", as I demonstrated in the image. So, I need some code to check the first characters in each node cell, and then compare them to the first 4 characters of each arrow cell. When an arrow that matches two of the nodes is found, I need the code to populate the row next to the + signs with a concatenated string comprised of the names of the nodes pertaining to that arrow, as well as the + sign between them (it's possible that it could also be a minus sign, but one isn't present in this example). Any pointers? I can't wrap my head around this. Edited to add Data

Here is the code of my current macro:

Sub Delimit_Transpose()
    
    Cells.Replace What:="],[", Replacement:="@", LookAt:=xlPart, SearchOrder _
        :=xlByRows, MatchCase:=False, SearchFormat:=False, ReplaceFormat:=False
    
    ActiveCell.FormulaR1C1 = "=RIGHT(R[-1]C,LEN(R[-1]C)-36)"
    
    Dim i As Long, strTxt As String
    Dim startP As Range
    Dim xRg As Range, yRg As Range
    On Error Resume Next
    Set xRg = Application.InputBox _
    (Prompt:="Range Selection...", _
    Title:="Delimit Transpose", Type:=8)
    i = 1
    Application.ScreenUpdating = False
    For Each yRg In xRg
        If i = 1 Then
            strTxt = yRg.Text
            i = 2
        Else
            strTxt = strTxt & "," & yRg.Text
        End If
    Next
    Application.ScreenUpdating = True
    Set startP = Application.InputBox _
    (Prompt:="Paste Range...", _
    Title:="Delimit Transpose", Type:=8)
    ary = Split(strTxt, "@")
    i = 1
    Application.ScreenUpdating = False
    For Each a In ary
        startP(i, 1).Value = Replace(Replace(a, "[", ""), "]", "")
        i = i + 1
    Next a
    
    i = 1
    For Each a In ary
       If Len(a) > 13 Then
           startP.Offset(i - 1, 1).Value = "Node"
        ElseIf Len(a) < 13 And Len(a) > 6 Then
            startP.Offset(i - 1, 1).Value = "Arrow"
        Else
            startP.Offset(i - 1, 1).Value = "Tail"
        End If
        i = i + 1
    Next a

    Dim openPos As Integer
    Dim closePos As Integer
    Dim midBit As String
    
    i = 1
    n = 5
    For Each a In ary
    openPos = InStr(a, ",%22")
     On Error Resume Next
    closePos = InStr(a, "%22,")
     On Error Resume Next
    midBit = Mid(a, openPos + 1, closePos - openPos - 1)
     On Error Resume Next
        If openPos <> 0 And Len(midBit) > 0 Then
            startP.Offset(i - 1, 2).Value = Replace(Replace(midBit, "%22", ""), "%2520", " ")
        ElseIf Len(a) < 13 And InStr(a, "-") = 4 Then
            startP.Offset(i - 1, 2).Value = "'-"
        ElseIf Len(a) < 7 Then
            startP.Offset(i - 1, 2).Value = " "
        Else
            startP.Offset(i - 1, 2).Value = "+"
        End If
        i = i + 1
        n = n + 1
    Next a

    Application.ScreenUpdating = True
End Sub


Solution

  • This is my approach.

    There's room for a lot of improvements, but is a rough code that should get you started.

    Read the code's comments and adapt it to fit your needs.


    EDIT: I updated the code to match the sample worksheet you uploaded, build the first column range dinamically, validate if commas appear in the first column cell so no error is raised.

    As I said in the comments, it's better easier to debug if you call one procedure from the other, instead of merging them.

    Code:

    Option Explicit
    
    Public Sub StoreConcatenate()
    
        ' Basic error handling
        On Error GoTo CleanFail
        
        ' Define general parameters
        Dim targetSheetName As String
        targetSheetName = "Test space" ' Sheet holding the data
        
        Dim firstColumnLetter As String
        firstColumnLetter = "C" ' First column holding the numbers
        
        Dim firstColumnStartRow As Long
        firstColumnStartRow = 7
        
        ' With these three parameters we'll build the range address holding the first column dynamically
        
        ' Set reference to worksheet
        Dim targetSheet As Worksheet
        Set targetSheet = ThisWorkbook.Worksheets(targetSheetName)
        
        ' Find last row in column (Modify on what column)
        Dim firstColumnlastRow As Long
        firstColumnlastRow = targetSheet.Cells(targetSheet.Rows.Count, firstColumnLetter).End(xlUp).Row
        
        ' Build range of first column dinamically
        Dim firstColumnRange As Range
        Set firstColumnRange = targetSheet.Range(firstColumnLetter & firstColumnStartRow & ":" & firstColumnLetter & firstColumnlastRow)
        
        ' Loop through first column range cells
        Dim valueCell As Range
        For Each valueCell In firstColumnRange
            
            ' Check if cell contains "," in the second position in string
            If InStr(valueCell.Value, ",") = 2 Then
            
                ' Store first digit of cell before ","
                Dim firstDigit As Integer
                firstDigit = Split(valueCell.Value, ",")(0)
                
                
                ' Check if cell contains "," in the fourth position in string
                If InStr(3, valueCell.Value, ",") = 4 Then
                
                    ' Store second digit of cell after ","
                    Dim secondDigit As Integer
                    secondDigit = Split(valueCell.Value, ",")(1)
                
                End If
                
                ' Store second colum type
                Dim secondColumnType As String
                secondColumnType = valueCell.Offset(, 1).Value
                
                ' Store third column value
                Dim thirdColumnValue As String
                thirdColumnValue = valueCell.Offset(, 2).Value
                
                ' Store nodes values (first digit and second column type)
                Select Case secondColumnType
                Case "Node"
                    Dim nodeValues() As Variant
                    Dim nodeCounter As Long
                    ReDim Preserve nodeValues(nodeCounter)
                    
                    nodeValues(nodeCounter) = Array(firstDigit, thirdColumnValue)
                    
                    nodeCounter = nodeCounter + 1
                Case "Arrow"
                    Dim matchedNodeFirstValue As String
                    Dim matchedNodeSecondValue As String
                    matchedNodeFirstValue = IsInArrayReturnItem(firstDigit, nodeValues)(1)
                    matchedNodeSecondValue = IsInArrayReturnItem(secondDigit, nodeValues)(1)
                    If matchedNodeFirstValue <> vbNullString And matchedNodeSecondValue <> vbNullString Then
                        valueCell.Offset(, 3).Value = matchedNodeFirstValue & Space(1) & thirdColumnValue & Space(1) & matchedNodeSecondValue
                    End If
                End Select
            
            End If
        
        Next valueCell
        
    CleanExit:
        Exit Sub
     
    CleanFail:
        Debug.Print "Something went wrong: " & Err.Description
        Resume CleanExit
        
    End Sub
    
    ' Credits: https://stackoverflow.com/a/38268261/1521579
    Public Function IsInArrayReturnItem(stringToBeFound As Integer, arr As Variant) As Variant
        Dim i
        For i = LBound(arr) To UBound(arr)
            If arr(i)(0) = stringToBeFound Then
                IsInArrayReturnItem = arr(i)
                Exit Function
            End If
        Next i
        IsInArrayReturnItem = Array(vbNullString, vbNullString)
    End Function
    

    Let me know if it works