Search code examples
regexexcelvbaexcel-2007

How to match two patterns with RegExp


I have 50,000 names and addresses all in one cell and I want to extract each element into separate cells HOWEVER one particular area is causing a headache. Some of the address fields contain two numbers for the street address. For instance a cell might contain "Mr Smith 10 Rillington Place Exeter Devon" which is fine as my code splits the name street city and County. But some cells contain "Smith & Sons Ltd. 10 & 12 High Street Sutton Surrey.

I have tried .Pattern = "(\d+|\D+|\[d+ & \d+])" but that does the same as .Pattern = "(\d+|\D+)"

Sub SplitTextNum2()
    Dim r As Range, rC As Range
    Dim Match, Matches
    Dim matchCount As Integer
    Columns("E").SpecialCells (xlCellTypeBlanks)
    On Error Resume Next
    Set r = Range("E1", Range("E50").End(xlDown))


    With CreateObject("VBScript.RegExp")
        .Pattern = "(\d+|\D+)"
        .Global = True
        For Each rC In r
            Set Matches = .Execute(rC.Value)
            For matchCount = 1 To Matches.Count
                rC.Offset(, matchCount).Value = Matches(matchCount - 1)
            Next
        Next rC
    End With
End Sub

My VBA Macro splits out "10" "&" "12" into separate cells. What I want it to do in this circumstance is to keep "10 & 12" together in one cell. The code needs to also cater for the single street number. Any suggestions would be of great help.


Solution

  • You might use the same alternation but start with \d+ & \d+

    You can omit the \[ because that matches a literal [ and that character will first be matched by \D which matches not a digit.

    If you intention was to create a character class by not escaping the first square bracket, the pattern would be the same as [d+ &].

    But the alternation would not get there because all the matches will be matched by either \d which matches a digit or \D which matches not a digit.

    You might use:

    (\d+ & \d+|\d+|\D+)
    

    Regex demo