I've been analyzing log files using PowerShell by searching for any lines containing "error" or "exception" with foreach { $_ -match $searchStrings }, and this works.
However, I ran into log files that also include many lines with "User login error" and "User authentication error", which I want to ignore, but I can't get it to work.
When I try -Match and -NotMatch separately, they work as expected. But when I combine them, I get just one result: true. I've also tried using Where-Object, but it just returns all lines, including the lines without "error".
For example, the log file looks like this:
28-02-2024 02:30:37 - User authentication error ('JohnsenJ') on module LabResultCodeLists
28-02-2024 10:14:27 - LabResults imported (record count: 0)
28-02-2024 10:16:53 - LabResults imported (record count:6754)
28-02-2024 13:19:03 - Server error from remote client (45eaae02-b53b-40b4-ac7f-79926eac12c4)
28-02-2024 22:34:00 - Server error from remote client (fff80a18-8317-4945-9e12-b1cacc4b4419)
29-02-2024 00:59:31 - Query error ('SELECT * FROM LABRESULTS WHERE CODE = ;')
29-02-2024 01:29:51 - LabResults imported (record count:123)
29-02-2024 01:54:35 - Access violation error at dlgLabResultVerification
29-02-2024 15:30:14 - Query error ('SELECT * FROM LABRESULTS WHERE CODE = ;')
29-02-2024 17:09:06 - User login error ('JohnsneJ') not found
29-02-2024 18:59:29 - Connection error from remote client (45eaae02-b53b-40b4-ac7f-79926eac12c4)
And the PowerShell script looks like this:
$searchStrings = "error|exception"
$excludeStrings = "User login error|User authentication error"
$file = "logfile.txt"
$lines = Get-Content $file -ReadCount 1000 |
foreach { $_ -match $searchStrings } # count 8
#foreach { $_ -NotMatch $excludeStrings } # count 9
#foreach { $_ -match $searchStrings -and $_ -NotMatch $excludeStrings} # 1: true ?!
#Where-Object { $_ -Match $searchStrings } # incorrect: count 11
# Any lines?
if ($lines.count -ne 0) {
    Write-Host ("Error count: $($lines.count)")
    Write-Host ($lines -join "`n")
}
How can I find just the lines that match $searchStrings but exclude any that match $excludeStrings?
Let's first analyze your existing code and why it behaves as it does:
$lines = Get-Content $file -ReadCount 1000 |
foreach { $_ -match $searchStrings } # count 8
When Get-Content is used with the parameter -ReadCount 1000, it sends the lines through the pipeline in batches of up to 1000 lines (a single batch here, since the file is shorter). Each batch is passed to the next command in the pipeline all at once, as an array (without -ReadCount, the lines would be passed one by one). So inside the foreach script block, the automatic $_ variable is an array of strings instead of a single string.
Why is this important? Because in PowerShell the comparison operators work differently depending on whether the operand on the left-hand side (LHS) is a collection (such as an array) or a scalar. For a collection, instead of resolving to a Boolean value, the operator acts as a filter that outputs only the elements of the collection that match (see about_Comparison_Operators - Common Features). That is why the snippet above yields the 8 matching lines rather than a list of $true and $false values.
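To see the difference in isolation, here is a minimal sketch. The three sample lines are shortened from the log in the question, and the variable names are only illustrative:

```powershell
# Illustrative sample, shortened from the log in the question
$sample = @(
    '28-02-2024 10:14:27 - LabResults imported (record count: 0)'
    '28-02-2024 13:19:03 - Server error from remote client'
    '29-02-2024 17:09:06 - User login error not found'
)

# Scalar LHS: -match returns a Boolean
$scalarResult = $sample[1] -match 'error'

# Collection LHS: -match acts as a filter and returns the matching elements
$arrayResult = $sample -match 'error'

"Scalar result: $scalarResult"
"Array result count: $($arrayResult.Count)"
```

The same operator and the same pattern give a Boolean in one case and a filtered array in the other; only the type of the left-hand operand differs.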
So why doesn't combining the -match and -NotMatch operators work as expected in the next example?
$lines = Get-Content $file -ReadCount 1000 |
foreach { $_ -match $searchStrings -and $_ -NotMatch $excludeStrings}
Due to operator precedence, PowerShell processes the expression inside foreach in this order:
$temp1 = $_ -match $searchStrings
$temp2 = $_ -NotMatch $excludeStrings
$temp1 -and $temp2
As I've already explained, $temp1 and $temp2 will be arrays of the lines that match or don't match, respectively. What happens when an array is used in a Boolean context, as in $temp1 -and $temp2? An empty array converts to $false, a single-element array converts like its only element, and an array with two or more elements converts to $true. As both arrays contain several lines here, the whole expression produces a single $true.
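The conversion can be checked with a small sketch. The sample strings and variable names below are made up for illustration:

```powershell
# Made-up sample; any array of strings would do
$sample = @('Server error', 'User login error', 'LabResults imported')

$temp1 = $sample -match 'error'                 # filtered array with 2 elements
$temp2 = $sample -notmatch 'User login error'   # filtered array with 2 elements

# Both operands are non-empty arrays, so -and treats each side as $true
$combined = $temp1 -and $temp2

# How arrays convert to Boolean
$emptyAsBool  = [bool]@()                 # empty array
$singleAsBool = [bool]@('')               # one element: converts like '' itself
$multiAsBool  = [bool]@($false, $false)   # two or more elements

"Combined: $combined"
"Empty: $emptyAsBool, single empty string: $singleAsBool, two elements: $multiAsBool"
```

Note that $combined is one Boolean for the whole batch, not one per line, which matches the single result seen in the question.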
To fix your existing code with minimal changes, just chain the -match and -notmatch operators, without using a Boolean operator:
$lines = Get-Content $file -ReadCount 1000 |
foreach { $_ -match $searchStrings -notmatch $excludeStrings }
PowerShell first resolves $_ -match $searchStrings to an array of matching elements, then feeds this filtered array to the -notmatch subexpression to filter it again, excluding the unwanted elements.
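As a self-contained way to try the chained version, the following sketch writes a few sample lines to a temporary file first. The file name and the shortened sample lines are assumptions for illustration; the two patterns are the ones from the question:

```powershell
$searchStrings  = "error|exception"
$excludeStrings = "User login error|User authentication error"

# Hypothetical sample, shortened from the log in the question
$sampleLines = @(
    "28-02-2024 02:30:37 - User authentication error ('JohnsenJ') on module LabResultCodeLists"
    "28-02-2024 10:14:27 - LabResults imported (record count: 0)"
    "28-02-2024 13:19:03 - Server error from remote client"
    "29-02-2024 00:59:31 - Query error"
    "29-02-2024 17:09:06 - User login error ('JohnsneJ') not found"
)

# Write the sample to a temporary file (the file name is an assumption)
$file = Join-Path ([System.IO.Path]::GetTempPath()) "sample-logfile.txt"
Set-Content -Path $file -Value $sampleLines

# $_ is an array here, so both operators act as filters, applied left to right
$lines = Get-Content $file -ReadCount 1000 |
    ForEach-Object { $_ -match $searchStrings -notmatch $excludeStrings }

Remove-Item $file

Write-Host "Error count: $($lines.Count)"
Write-Host ($lines -join "`n")
```

With this sample, only the "Server error" and "Query error" lines should remain.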
While the fix above works, it's not idiomatic PowerShell code. Unless there is a very good reason to use -ReadCount (e.g. if performance is paramount), I'd use the more expressive Where-Object filtering pattern instead:
$lines = Get-Content $file |
Where-Object { $_ -match $searchStrings -and $_ -notmatch $excludeStrings }
Note that in this case using the Boolean operator -and is correct, since without -ReadCount, the Get-Content command passes each line one by one to the next pipeline command. So $_ within the Where-Object filter expression refers to a single string object only, and the -match and -notmatch subexpressions both resolve to a Boolean value, which results in the expected outcome when combined using -and.
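The following sketch contrasts the two situations with in-memory data. The sample lines are made up, and the unary comma is used here only to imitate a -ReadCount batch by sending the whole array as one pipeline object:

```powershell
$searchStrings  = "error|exception"
$excludeStrings = "User login error|User authentication error"

# Made-up sample lines
$sampleLines = @(
    "LabResults imported (record count: 0)"
    "Server error from remote client"
    "User login error ('JohnsneJ') not found"
    "Query error"
)

# One string per pipeline item: both subexpressions are Booleans,
# so -and combines them as intended
$perLine = $sampleLines |
    Where-Object { $_ -match $searchStrings -and $_ -notmatch $excludeStrings }

# One array per pipeline item: the filter expression is a non-empty array,
# which counts as $true, so Where-Object lets the whole batch through
$perBatch = , $sampleLines |
    Where-Object { $_ -match $searchStrings }

"Per line:  $($perLine.Count) lines"
"Per batch: $($perBatch.Count) lines"
```

The second case is probably what happened with the commented-out Where-Object attempt in the question, where all lines came back, including those without "error".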