regex, powershell, powershell-5.0

Named capture group within optional non-capturing group


I have the below PowerShell code:

$input = 'ADM:Dev_ControllerStore_103:1:2'
$pattern = '^(?<UID>\S+)\:(?<DB>\S+)\:(?<AppId>\d+)(?:\:(?<LicNr>\d+))?$'
if ( $input -match $pattern ) {
    $Matches
}

This gives the following output:

Name                           Value
----                           -----
DB                             1
AppId                          2
UID                            ADM:Dev_ControllerStore_103
0                              ADM:Dev_ControllerStore_103:1:2

Whilst I'd expect this:

Name                           Value
----                           -----
DB                             1
AppId                          2
UID                            ADM:Dev_ControllerStore_103
LicNr                          2
0                              ADM:Dev_ControllerStore_103:1:2

i.e. for LicNr to be included in the output.

Note: The output I'm getting is the expected output for input string: 'ADM:Dev_ControllerStore_103:1' ... and that works correctly.

If I change the regex to make the last non-capturing group non-optional, or I remove the non-capturing group, everything works correctly for the longer input string; but obviously these changes then don't cater for the version where there are only 3 values rather than 4.

Am I missing something in my understanding, or is this a bug in PowerShell?

Note: I have a workaround (~ $a,$b,$c,$d = $input -split ':'), so this question is just for academic interest.


Solution

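  • \S matches any non-whitespace character, including : and digits. Because \S+ is greedy, the UID group consumes ADM:Dev_ControllerStore_103, DB and AppId then take 1 and 2, and nothing is left for the optional LicNr group, so the match succeeds without it. This is normal regex backtracking behavior, not a PowerShell bug. One fix is to use the lazy \S+? quantifier, '^(?<UID>\S+?):(?<DB>\S+?):(?<AppId>\d+)(?::(?<LicNr>\d+))?$', but you can also use a more precise pattern tailored to your input: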

    ^(?<UID>\w+):(?<DB>\w+):(?<AppId>\d+)(?::(?<LicNr>\d+))?$
    

    Details

    • ^ - start of string
    • (?<UID>\w+) - Group UID: one or more word chars
    • : - a colon
    • (?<DB>\w+) - Group DB: one or more word chars
    • : - a colon
    • (?<AppId>\d+) - Group AppId: one or more digits
    • (?::(?<LicNr>\d+))? - an optional non-capturing group: a colon and then Group LicNr: one or more digits
    • $ - end of string.
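
To see the difference for yourself, here is a minimal sketch that runs the greedy, lazy and more precise patterns against both input shapes from the question. The names $samples, $patterns, $text and $results are chosen only for illustration; $text is used instead of $input because $input is an automatic variable in PowerShell.

```powershell
# Both input shapes from the question: with and without the trailing LicNr.
$samples = 'ADM:Dev_ControllerStore_103:1:2', 'ADM:Dev_ControllerStore_103:1'

# The original (greedy) pattern and the two fixes discussed above.
# The colons are left unescaped; ':' has no special meaning on its own in .NET regex.
$patterns = [ordered]@{
    Greedy  = '^(?<UID>\S+):(?<DB>\S+):(?<AppId>\d+)(?::(?<LicNr>\d+))?$'
    Lazy    = '^(?<UID>\S+?):(?<DB>\S+?):(?<AppId>\d+)(?::(?<LicNr>\d+))?$'
    Precise = '^(?<UID>\w+):(?<DB>\w+):(?<AppId>\d+)(?::(?<LicNr>\d+))?$'
}

$results = foreach ($name in $patterns.Keys) {
    foreach ($text in $samples) {
        if ($text -match $patterns[$name]) {
            [pscustomobject]@{
                Pattern = $name
                Text    = $text
                UID     = $Matches['UID']
                DB      = $Matches['DB']
                AppId   = $Matches['AppId']
                # $Matches only holds groups that took part in the match,
                # so this is $null when the optional group was skipped.
                LicNr   = $Matches['LicNr']
            }
        }
    }
}

$results | Format-Table -AutoSize
```

For the four-part input, the Greedy row should show the colon swallowed into UID and an empty LicNr, while the Lazy and Precise rows split the string into ADM, Dev_ControllerStore_103, 1 and 2. Note that the \w version only fits if your UID and DB values never contain characters outside letters, digits and underscore.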