I'd like to extract all the numbers from this string: 's_0a1f2d4e3c10b'. The string must follow the pattern 's_NumberLetterNumberLetter...'. I wrote this regex, which matches the whole string:
/^s_(?:\d+[a-f])+$/
The problem is that I don't know how to capture only the numbers. When I put parentheses around the \d+, the regex captures only the last number (10). Here is the regex with the parentheses:
/^s_(?:(\d+)[a-f])+$/
Of course, I could use preg_match_all('/\d+/', 's_0a1f2d4e3c10b', $matches), but I want the string to begin with 's_' and I'd like to use only one regex (if possible).
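A minimal sketch reproducing the capture behaviour described above, assuming PHP's PCRE functions. A capture group inside a repeated group is overwritten on each repetition, so only the final number survives. The variable names are illustrative.

```php
<?php
$input = 's_0a1f2d4e3c10b';

// The whole string matches, but group 1 is overwritten on every
// repetition of the outer group, so it ends up holding the last number.
preg_match('/^s_(?:(\d+)[a-f])+$/', $input, $m);

var_export($m[1]); // '10'
```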
My desired output from s_0a1f2d4e3c10b:
array(0, 1, 2, 4, 3, 10)
You need the "continue" metacharacter (\G) in your regex to cleanly perform this task in a single preg_ call.
Matching can ONLY begin if the substring starts with s_. After that, matching can ONLY continue while the alternating pattern of a number followed by lowercase letters is upheld.
\G actually allows matching from the start of the string or from where the previous match finished. To disable matching from the start of the string, add a negative lookahead containing a caret symbol ((?!^)).
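To illustrate why the (?!^) guard matters, here is a small sketch comparing the pattern with and without it. The input 'x1y2' is a made-up example with no s_ prefix, so a correct pattern should find nothing.

```php
<?php
$input = 'x1y2'; // hypothetical input without the required 's_' prefix

// Without (?!^): \G is also true at offset 0, so matching can start there.
$looseCount = preg_match_all('~(?:s_|\G[a-z]+)\K\d+~', $input, $loose);

// With (?!^): \G may only continue from the end of a previous match.
$strictCount = preg_match_all('~(?:s_|\G(?!^)[a-z]+)\K\d+~', $input, $strict);

var_export($looseCount);  // 2 (the numbers '1' and '2' are wrongly accepted)
echo "\n";
var_export($strictCount); // 0
```

Without the guard, the second branch behaves as if the string had a valid prefix, which defeats the s_ requirement.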
\K means "restart the full-string match from here" (in other words, "forget" any previously matched characters). This spares the use of capture groups, which would otherwise unnecessarily bloat the output array of matches.
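As a rough comparison, the sketch below runs the same pattern once with a capture group and once with \K on a shortened sample string. The variable names are illustrative only.

```php
<?php
$input = 's_0a1f'; // shortened sample for readability

// Capture-group version: $grouped[0] holds the full matches (prefix or
// letters included) and $grouped[1] holds the numbers.
preg_match_all('~(?:s_|\G(?!^)[a-z]+)(\d+)~', $input, $grouped);

// \K version: the full match itself is just the number, so the result
// contains a single sub-array.
preg_match_all('~(?:s_|\G(?!^)[a-z]+)\K\d+~', $input, $kept);

var_export($grouped); // two sub-arrays: ['s_0', 'a1'] and ['0', '1']
echo "\n";
var_export($kept);    // one sub-array: ['0', '1']
```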
Code: (Demo)
$tests = [
    'This string s_0a1f2d4e3c10b is foo.',
    's_1a23b456c789',
    'b_9d9d9d9d9d',
    's_1e2f3a4b'
];
foreach ($tests as $test) {
    var_export(
        preg_match_all(
            '~(?:s_|\G(?!^)[a-z]+)\K\d+~',
            $test,
            $matches
        )
        ? $matches[0]
        : []
    );
    echo "\n---\n";
}
Output (condensed to short array syntax):
['0', '1', '2', '4', '3', '10']
---
['1', '23', '456', '789']
---
[]
---
['1', '2', '3', '4']
---
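If the string must also begin with s_ and follow the number/letter pattern all the way to the end, as the question asks, one possible variation is to anchor the first branch with ^ and validate the rest of the string in a lookahead. This is only a sketch under those assumptions: the helper name extractNumbers is hypothetical, the letter class is narrowed to the question's [a-f], and the results are cast to integers to match the desired output.

```php
<?php
// Hypothetical helper, not part of the answer above.
function extractNumbers(string $input): array
{
    // ^s_                   the prefix must sit at the very start
    // (?=(?:\d+[a-f])+$)    the rest must be number/letter pairs to the end
    // \G(?!^)[a-f]          continue from the previous match over one letter
    // \K\d+                 report only the digits
    // D modifier            $ matches only at the very end of the string
    $pattern = '~(?:^s_(?=(?:\d+[a-f])+$)|\G(?!^)[a-f])\K\d+~D';

    return preg_match_all($pattern, $input, $matches)
        ? array_map('intval', $matches[0])
        : [];
}

var_export(extractNumbers('s_0a1f2d4e3c10b') === [0, 1, 2, 4, 3, 10]); // true
```

With this variation, inputs such as 'This string s_0a1f2d4e3c10b is foo.' and 's_1a23b456c789' return an empty array, because the whole-string check in the lookahead fails.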