PowerShell regular expression to interpret time strings 12h34m56s with optional groups

Using PowerShell and regex, I'm trying to interpret time strings in a data file. For example, 12h30m means 12 hours and 30 minutes, 15m means 15 minutes, and other values look like 1m30s or 2h. I figured I could use a regex to split these strings into hours, minutes, and seconds, capturing only the digits (leaving out the h, m, and s), and then do some calculations.

I managed to get a regex working, but only when the captures also include the h, m, and s. When I add a positive lookahead to the regex to exclude them, the result is not what I expected.

Here is the PowerShell script:

$input = "12h34m56s"
#$input = "4h30m"
#$input = "1m15s"
#$input = "14400"

Write-Output "--(test 1)--------------------"
$input -match '(\d*h)?(\d*m)?(\d*s)?(\d*)?'
Write-Output $Matches

Write-Output "--(test 2)--------------------"
$input -match '(\d*(?=h))?(\d*(?=m))?(\d*(?=s))?(\d*)?'
Write-Output $Matches

The output is this:

--(test 1)--------------------
True

Name                           Value
----                           -----
4
3                              56s
2                              34m
1                              12h
0                              12h34m56s
--(test 2)--------------------
True
4
1                              12
0                              12

The first part, "test 1", is what I expected. For the second part, "test 2", however, I was expecting this output:

--(test 2)--------------------
True

Name                           Value
----                           -----
4
3                              56
2                              34
1                              12
0                              12h34m56s

I tested this on regex101, and judging by the syntax coloring it seems correct, but at the top it states "27 matches" and I would expect just 5 matches (because of the 5 lines). So I suspect it has something to do with grouping. I tried adding extra parentheses around the whole expression, but that didn't help. Any help would be appreciated.

Solution

Using [timespan]::ParseExact() as shown in the other, helpful answers, is definitely preferable if/once you have individual tokens representing timespans.
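As a minimal sketch of that approach (the list of formats below is an assumption for illustration and is not exhaustive), each custom format string escapes the literal unit letters with a backslash, so that the unescaped h, m, and s act as the hours, minutes, and seconds specifiers:

```powershell
# Candidate formats, tried in order. '\h', '\m' and '\s' are literal characters;
# the unescaped h, m and s are the TimeSpan format specifiers.
# Assumption: this list is illustrative only (e.g. 'h\hs\s' is not covered).
$formats = [string[]] @('h\hm\ms\s', 'h\hm\m', 'm\ms\s', 'h\h', 'm\m', 's\s')

$ts = [timespan]::ParseExact('12h34m56s', $formats, [cultureinfo]::InvariantCulture)
$ts.TotalSeconds
```

Note that these specifiers are meant for clock-style components, so a value such as 90m should not be expected to parse this way.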

However, at least hypothetically you may (first) have to extract such tokens out of a larger text, in which case a regex is needed - see below.

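The problem with your regex is that using lookahead assertions (e.g., (?=h)) prevents it from recognizing a token such as 12h34m56s as a single match, because lookaround assertions do not consume the substrings they match. After \d*(?=h) has matched 12, the next subexpression has to start matching at the h itself, which \d*(?=m) cannot do, so the remaining optional groups match nothing and the overall match ends at 12.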
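The difference between asserting and consuming can be seen in a small sketch (the variable names are made up for illustration):

```powershell
$sample = '12h34m56s'

# Lookahead: the 'h' is asserted but not consumed, so the match ends before it.
$lookaheadOnly = [regex]::Match($sample, '\d+(?=h)').Value

# Because the 'h' was not consumed, the next \d+ would have to match at 'h',
# so chaining two such subexpressions cannot succeed.
$chained = [regex]::Match($sample, '\d+(?=h)\d+(?=m)').Success

# Consuming the 'h' instead: the overall match includes it,
# while capture group 1 holds only the digits.
$consumed = [regex]::Match($sample, '(\d+)h')

$lookaheadOnly             # 12
$chained                   # False
$consumed.Value            # 12h
$consumed.Groups[1].Value  # 12
```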

Therefore, just match those characters directly, and enclose the subexpression in (?:…), i.e. a non-capturing group to avoid unnecessary capture groups; e.g., instead of (\d*h)?, use (?:(\d+)h)?.

Also note the use of + instead of *, as at least one digit should be present, whereas the subexpression as a whole may be absent ((?:…)?).
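Applied to a single string with -match, as in your script, this could look like the sketch below. The ^ and $ anchors are an addition that assumes the whole string is one time token, and $timeString is a made-up name (chosen to avoid the automatic $input variable):

```powershell
$timeString = '12h34m56s'

# Each unit is optional as a whole; only the digits are captured.
$ok = $timeString -match '^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$'

$ok           # True
$Matches[1]   # 12
$Matches[2]   # 34
$Matches[3]   # 56
```

With an input such as 4h30m, the seconds group does not participate in the match, so $Matches contains no entry with key 3.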

To put it all together, along with using named capture groups, which make it easier to identify which group captured which unit:

$str = @'
12h34m56s545294385
1h2m3s
4h30m
1m15s
14400
'@

[regex]::Matches($str, '(?:(?<hrs>\d+)h)?(?:(?<mins>\d+)m)?(?:(?<secs>\d+)s)?(?<num>\d+)?') |
  ForEach-Object { 
    if ($_.Value) { # Only consider nonempty matches.
      [pscustomobject] @{ 
        Match = $_.Value
        Groups = $_.Groups | Select-Object -Skip 1 | Select-Object Name, Value | Out-String
      } 
    }
  } | Format-Table -Wrap

Output:

Match              Groups
-----              ------
12h34m56s545294385
                   Name Value
                   ---- -----
                   hrs  12
                   mins 34
                   secs 56
                   num  545294385
                  
                  
1h2m3s            
                   Name Value
                   ---- -----
                   hrs  1
                   mins 2
                   secs 3
                   num
                  
                  
4h30m             
                   Name Value
                   ---- -----
                   hrs  4
                   mins 30
                   secs
                   num
                  
                  
1m15s             
                   Name Value
                   ---- -----
                   hrs
                   mins 1
                   secs 15
                   num
                  
                  
14400             
                   Name Value
                   ---- -----
                   hrs
                   mins
                   secs
                   num  14400
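
For the "do some calculations" part of the question, the named groups can be turned into a total number of seconds. The following is only a sketch: ConvertTo-TotalSeconds is a hypothetical helper name, the ^ and $ anchors assume each input is a single token, and treating a bare number such as 14400 as seconds is an assumption about the data.

```powershell
# Hypothetical helper: converts strings such as '12h34m56s', '4h30m' or '14400' to seconds.
function ConvertTo-TotalSeconds {
  param([Parameter(Mandatory)] [string] $Text)

  $pattern = '^(?:(?<hrs>\d+)h)?(?:(?<mins>\d+)m)?(?:(?<secs>\d+)s)?(?<num>\d+)?$'
  $m = [regex]::Match($Text, $pattern)
  if (-not $m.Success -or -not $m.Value) { throw "Unrecognized time string: '$Text'" }

  # A group that did not participate has an empty Value, which [int] converts to 0.
  $hrs  = [int] $m.Groups['hrs'].Value
  $mins = [int] $m.Groups['mins'].Value
  $secs = [int] $m.Groups['secs'].Value
  $num  = [int] $m.Groups['num'].Value   # assumption: a bare number means seconds

  $hrs * 3600 + $mins * 60 + $secs + $num
}

ConvertTo-TotalSeconds '12h34m56s'  # 45296
ConvertTo-TotalSeconds '4h30m'      # 16200
ConvertTo-TotalSeconds '1m15s'      # 75
ConvertTo-TotalSeconds '14400'      # 14400
```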