regex powershell

PowerShell multiline regex matches on regex101.com AND regexr.com, but not in PowerShell code


This regex: (?is)[\s\S]*?\[General\][\s\S]*?SystemMustBeRebooted=(\d)[\s\S]*?\[Install Execution\][\s\S]*?SilentInstall=""(.*?)"".* doesn't match the cut-down excerpt of the file shown below when run in PowerShell. I need to extract:

  1. SystemMustBeRebooted value
  2. The SilentInstall value

It DOES work, however, on regex101.com (with the (?is) removed) using .NET regex, AND on regexr.com set to PowerShell regex, no less!

Here is the result when the relevant lines of code are run in PowerShell:

$CVAFileContents = get-content $($CVAFile).fullname -raw

$RebootNeededandSilentInstall = $CVAFileContents | select-string -pattern '(?s)[\s\S]*?\[General\][\s\S]*?SystemMustBeRebooted=(\d)[\s\S]*?\[Install Execution\][\s\S]*?SilentInstall="(.*?)".*' -allmatches

$RebootNeededandSilentInstall

<back to PS prompt>

If I cut the regex back to just \[General\], it matches the file excerpt below. As soon as I add anything more, I get no results.

[General]
PN=P01759-B2M
Version=24.9764.1433.30
Revision=Q
Pass=5
Type=Driver
Category=Driver-Audio
SystemMustBeRebooted=0

....

[Install Execution]
Install="HPUP.exe"
SilentInstall="HPUP.exe"

What is wrong with my regex??

EDIT: Of course it works when using -match (without the thought-to-be-needed double escaping of "):

$CVAFileContents -match '(?s)[\s\S]*?\[General\][\s\S]*?SystemMustBeRebooted=(\d)[\s\S]*?\[Install Execution\][\s\S]*?SilentInstall="(.*?)".*' | out-null

Produces:

$Matches

Name                           Value                                                                                                                                     
----                           -----                                                                                                                                     
2                              HPUP.exe                                                                                                                                  
1                              0                                                                                                                                         
0                              [CVA File Information]...       

Just out of curiosity, why does the same expression work with -match but not with select-string??


Solution

    "What is wrong with my regex??"

    Nothing, though it can be simplified (see below).

    "why does the same expression work with -match and not select-string??"

    It does work with Select-String too; you just need to extract the results differently:

    $rebootNeeded, $silentInstallExe =
      $CVAFileContents | 
        Select-String '(?s)\[General\].*?SystemMustBeRebooted=(\d).*?\[Install Execution\].*?SilentInstall="(.*?)"' |
        ForEach-Object { ($_.Matches[0].Groups | Select-Object -Skip 1).Value }
    
    • That is, Select-String emits a Microsoft.PowerShell.Commands.MatchInfo instance for each matching input string, from which you can extract the capture-group matches.

    • In your case, there is only one input string, namely the entire content of your input file (due to having read the file with Get-Content's -Raw switch).

    • Note that the -AllMatches switch has been omitted above, because you only need it to find multiple matches of your pattern in each input string, which doesn't apply here.

    • The absence of -AllMatches implies that the $_.Matches collection contains only one System.Text.RegularExpressions.Match instance describing the match.

    • The .Groups collection's first entry ([0]) is always the overall match, whereas the subsequent entries represent the capture-group matches; Select-Object -Skip 1 skips the overall match, and (...).Value uses member-access enumeration to extract the captured text from the groups.

    • Finally, the two pieces of text captured by the capture groups are assigned to two separate output variables, $rebootNeeded and $silentInstallExe, using a multi-assignment.
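To see these pieces one at a time, here is a minimal, self-contained sketch. The here-string in $sample is a hypothetical stand-in for the file content (it assumes only the lines shown in the question matter), and the intermediate variables $matchInfo and $match are introduced purely for illustration.

```powershell
# Hypothetical stand-in for what Get-Content -Raw would return;
# it contains only the lines shown in the question.
$sample = @'
[General]
PN=P01759-B2M
SystemMustBeRebooted=0

[Install Execution]
Install="HPUP.exe"
SilentInstall="HPUP.exe"
'@

$pattern = '(?s)\[General\].*?SystemMustBeRebooted=(\d).*?\[Install Execution\].*?SilentInstall="(.*?)"'

# One input string goes in, so at most one MatchInfo object comes out.
$matchInfo = $sample | Select-String -Pattern $pattern

# Without -AllMatches, .Matches holds a single Match object.
$match = $matchInfo.Matches[0]

# Groups[0] is the overall match; Groups[1] and Groups[2] are the capture groups.
$rebootNeeded     = $match.Groups[1].Value
$silentInstallExe = $match.Groups[2].Value

"SystemMustBeRebooted = $rebootNeeded"
"SilentInstall        = $silentInstallExe"
```

Indexing the groups explicitly is more verbose than Select-Object -Skip 1, but it makes it easier to see which capture group ends up in which variable.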


    That said, if you only need to look for one match and only need the text that the capture groups captured, -match, the regular-expression matching operator, is the simpler and more efficient choice: it reflects the details of the match in the automatic $Matches variable.
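A minimal sketch of the -match approach with the simplified pattern might look like the following. As an assumption for illustration, $sample is a made-up here-string standing in for the file content read with Get-Content -Raw.

```powershell
# Hypothetical stand-in for the file content read with Get-Content -Raw.
$sample = @'
[General]
SystemMustBeRebooted=0

[Install Execution]
SilentInstall="HPUP.exe"
'@

$pattern = '(?s)\[General\].*?SystemMustBeRebooted=(\d).*?\[Install Execution\].*?SilentInstall="(.*?)"'

# With a single string on the left-hand side, -match returns $true or $false.
# On success it fills the automatic $Matches hashtable:
# key 0 is the overall match, keys 1 and 2 are the capture groups.
if ($sample -match $pattern) {
    $rebootNeeded     = $Matches[1]
    $silentInstallExe = $Matches[2]
} else {
    Write-Warning 'Pattern not found.'
}
```

Testing the result in an if statement, rather than piping it to Out-Null, also guards against reading a stale $Matches, because a failed -match leaves the value from an earlier successful match in place.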