Folks, there are already billions of questions on "regex: match everything, but not ...", but none seems to fit my simple question.
I have a simple string, "1 Rome, 2 London, 3 Wembley Stadium", and I want to match just the parts after each rank ("Rome,", "London,", "Wembley Stadium"), in order to extract only the names but not the ranks ("Rome, London, Wembley Stadium").
Using a regex tester (https://extendsclass.com/regex-tester.html), I can simply match the opposite by:
([0-9]+\s*)
and it matches the ranks ("1 ", "2 ", "3 ") in "1 Rome, 2 London, 3 Wembley Stadium".
But how do I reverse it? I tried something like:
[^0-9 |;]+[^0-9 |;]
but it also excludes the white spaces that I want to keep (e.g. after the comma and between "Wembley" and "Stadium" in "1 Rome, 2 London, 3 Wembley Stadium"). I guess the "0-9" needs to be treated somehow as one continuous string. I tried various brackets, quotation marks, and \s*, but nothing has worked yet.
Note: I'm working in a Visual Basic environment, which does not allow lookbehinds!
You can use
\d+\s*(.*?)(?=,\s*\d+\s|$)
See the regex demo, and get the values from match.SubMatches(0). Details:
\d+ - one or more digits
\s* - zero or more whitespaces
(.*?) - Group 1: zero or more chars other than line break chars, as few as possible
(?=,\s*\d+\s|$) - a positive lookahead that requires a comma (,), zero or more whitespaces, one or more digits and then a whitespace, OR the end of the string, immediately to the right of the current location.
Here is a demo of how to get all matches:
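Sub TestRegEx()
    ' Late binding: no reference to "Microsoft VBScript Regular Expressions 5.5" is needed
    Dim regex As Object, matches As Object, match As Object
    Dim str As String
    str = "1 Rome, 2 London, 3 Wembley Stadium"
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "\d+\s*(.*?)(?=,\s*\d+\s|$)"
    regex.Global = True   ' find every match, not just the first
    Set matches = regex.Execute(str)
    For Each match In matches
        Debug.Print match.SubMatches(0)   ' Group 1: the name without its rank
    Next match
End Sub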
Output:
Rome
London
Wembley Stadium
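If the end goal is only the cleaned-up string ("Rome, London, Wembley Stadium") rather than the individual names, it may be simpler to delete the ranks than to match around them. The following is a minimal sketch under that assumption, not part of the approach above: it reuses the \d+\s* pattern from the question with the Replace method of VBScript.RegExp. The names StripRanks and TestStripRanks are hypothetical, and the sketch assumes the place names themselves contain no digits.

```vba
' Hypothetical helper: removes every rank (a run of digits plus any
' whitespace that follows it) and leaves the rest of the string untouched.
' Assumption: the names themselves contain no digits.
Function StripRanks(ByVal inputText As String) As String
    Dim regex As Object
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "\d+\s*"
    regex.Global = True   ' replace every rank, not just the first
    StripRanks = regex.Replace(inputText, "")
End Function

Sub TestStripRanks()
    Debug.Print StripRanks("1 Rome, 2 London, 3 Wembley Stadium")
    ' prints: Rome, London, Wembley Stadium
End Sub
```

Because this only removes text, it needs neither lookaheads nor lookbehinds, and the spaces after the commas and inside "Wembley Stadium" are preserved.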