Search code examples
excelvbaexcel-formulaline-breaksstring-matching

Remove strings in a line in multiline Excel cells if they do not start with a certain string


I have a column of Excel cells that follow this format (the random strings are not in a fixed order). Strings that don't start with a certain string need to be removed.

randomstringA text_that_needs_to_be_kept1
text_that_needs_to_be_removed1
randomstringB text_that_needs_to_be_kept2
randomstringA text_that_needs_to_be_kept3
text_that_needs_to_be_removed2

I want the cell's output to be this (linebreak must be kept):

text_that_needs_to_be_kept1

text_that_needs_to_be_kept2
text_that_needs_to_be_kept3

And not this (linebreak got removed):

text_that_needs_to_be_kept1
text_that_needs_to_be_kept2
text_that_needs_to_be_kept3

Solution

  • The following code will go down column A starting in row 1 and remove any line that doesn't start with a value from the array arrToKeep, keeping the linebreaks.

    Option Explicit
    
    Sub RemoveStrings()
    Dim rngData As Range
    Dim arrData As Variant
    Dim arrLines As Variant
    Dim arrToKeep As Variant
    Dim idx1 As Long
    Dim idx2 As Long
    Dim idxRow As Long
    Dim boolKeep As Boolean
    
        arrToKeep = Array("randomstringA", "randomstringB")
    
        Set rngData = Range("A1", Range("A" & Rows.Count).End(xlUp))
    
        arrData = rngData.Value
    
        For idxRow = LBound(arrData, 1) To UBound(arrData, 1)
    
            arrLines = Split(arrData(idxRow, 1), vbLf)
    
            For idx1 = LBound(arrLines) To UBound(arrLines)
            
                boolKeep = False
                
                For idx2 = LBound(arrToKeep) To UBound(arrToKeep)
                    If arrLines(idx1) Like arrToKeep(idx2) & "*" Then
                        boolKeep = True
                        Exit For
                    End If
                Next idx2
                
                If Not boolKeep Then
                    arrLines(idx1) = ""
                End If
                
            Next idx1
    
            arrData(idxRow, 1) = Join(arrLines, vbLf)
    
        Next idxRow
    
        rngData.Value = arrData
    
    End Sub