Tags: regex, powershell, regex-lookarounds

How to match, but not capture, part of a regex with PowerShell


I have a line in a text file that I'm trying to pull data from.

Example

  • Line I'm searching for in the text file: Valid from: Sun May 17 19:00:00 CDT 1998

I want to find the keyword "Valid from:" and then get only "Sun May 17" and "1998".

So the end result should look like this:

  • Sun May 17 1998

I think I'm close to getting it right. This is the regex I have:

  • (?<=Valid from:)\s+\w+\s+\w+\s+\d+\s+\d+:\d+:\d+\s+\w+\s+\d+

It finds the keyword "Valid from:", but it returns more than I need:

  • Sun May 17 19:00:00 CDT 1998

Thank you in advance for any assistance.


Solution

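  • A regex match is always one contiguous span of text, so a lookaround alone cannot skip the time and time zone in the middle. I would use two capture groups (…) instead of a lookaround construct: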

    $sampleText = 'Valid from: Sun May 17 19:00:00 CDT 1998'
    $regEx = 'Valid from:\s+(\w+\s+\w+\s+\d+)\s+\d+:\d+:\d+\s+\w+\s+(\d+)'
    
    if( $sampleText -match $regEx ) {
        # Combine the matched values of both capture groups into a single string
        $matches[1,2] -join ' '
    }
    

    Output:

    Sun May 17 1998
    
    • If the -match operator successfully matches the pattern on its right-hand side against the input text on its left-hand side, the automatic variable $matches is set.
    • $matches contains the full match at index 0 and the matched values of any capture groups at subsequent indices, in this case 1 and 2.
    • Using the -join operator we combine the matched values of the capture groups into a single string.
    • Demo at regex101.
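
As a variation on the approach above, here is a minimal sketch that uses named capture groups, so the parts can be read from $Matches by name rather than by index. The group names `date` and `year` are illustrative choices, not part of the original answer; the pattern is otherwise unchanged.

```powershell
$sampleText = 'Valid from: Sun May 17 19:00:00 CDT 1998'

# Same pattern as above, with the two groups named via (?<name>...).
# 'date' and 'year' are hypothetical names chosen for this example.
$regEx = 'Valid from:\s+(?<date>\w+\s+\w+\s+\d+)\s+\d+:\d+:\d+\s+\w+\s+(?<year>\d+)'

if ($sampleText -match $regEx) {
    # Index 0 still holds the full match, including the time and time zone
    $Matches[0]

    # Named groups are stored under their names instead of 1 and 2
    $result = '{0} {1}' -f $Matches['date'], $Matches['year']
    $result
}
```

The first line printed is the whole matched text, and the second is the combined string `Sun May 17 1998`. Named groups can make the intent clearer when a pattern has more than a couple of groups, at the cost of a slightly longer regex.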