Search code examples
excelregexvba

Remove a space within a pattern


Data from a chemistry instrument outputs values that are not uniformly delimited and contain special characters.

I am trying to import these values into Excel.
The values are read in line-by-line.
Each line is contained within its own cell in column A.
There can be anywhere from 50 to roughly 1000 columns of data, with the associated identifiers and metadata above.

Copy/paste of the first five lines of data.

1 7.724 1190 1231 1292 PV 4 724391 121434659 49.60% 9.688%
2 9.272 1451 1481 1484 VB 3961552 186833117 76.32% 14.905%
3 10.968 1732 1754 1816 VV 2673526 111034313 45.36% 8.858%
4 15.249 2382 2445 2453 PV 296082 33844178 13.82% 2.700%
5 15.384 2453 2466 2500 VV 219908 14461812 5.91% 1.154%

There are times when there are multiple peaks that make up one value and are recorded as two letters a space and one to two numbers (0-9).
Peak types with only one peak are just two letters.

For an example please look in line 1 where there is "PV 4".

I am trying to use regular expressions to loop through the A column, starting at row 18 and ending around row 1000, to find the letters and associated numbers, and remove the interstitial space so the cell will look like this:

1 7.724 1190 1231 1292 PV4 724391 121434659 49.60% 9.688%

Once it is in that form, I can use the space delimiter to separate the cells without frame shifting the ones that have the multiple peak types.

The code I've written so far:

Sub PKTYRegexRemoveSpace()

Dim StrPattern As String: StrPattern = "[A,B,H,M,N,P,S,T,U,V,X,\+][A,B,H,M,N,P,S,T,U,V,X,\+]\s[0-9]{1,2}\s"
Dim StrInput As String
Dim MyRange As Range
Dim regEx As New RegExp
Dim Cell As Range

Set MyRange = ActiveSheet.Range("A22:A24")

For Each Cell In MyRange
    If StrPattern <> "" Then
        StrInput = Cell.Value

        With regEx
            .Pattern = StrPattern
            .Global = False
            .IgnoreCase = False
        End With
    
        If regEx.Test(StrInput) Then
            MsgBox (regEx.Replace(StrInput, *this is where I need help*))
        Else
            MsgBox ("Not matched")
        End If
    
    End If
Next
End Sub

I am using a MsgBox during development to avoid having to re-import the file for every failed replacement attempt.


Solution

  • I suggest change the regex Pattern to use capturing groups and word boundary tokens

    \b([A,B,H,M,N,P,S,T,U,V,X,\+][A,B,H,M,N,P,S,T,U,V,X,\+])\s([0-9]{1,2})\b
    

    Then, for the replace string:

    $1$2