Thanks to @mklement0 for the help in getting this far, with the answer given in "Powershell search directory for code files with text matching input a txt file".
The PowerShell below works well for finding occurrences of a long list of database field names in a source code folder.
$inputFile = 'C:\DataColumnsNames.txt'
$outputFile = 'C:\DataColumnsUsages.txt'
Get-ChildItem C:\ProjectFolder -Filter *.cs -Recurse -Force -ea SilentlyContinue |
Select-String -Pattern (Get-Content $inputFile) |
Select-Object Path, LineNumber, line |
Export-csv $outputfile
However, many lines of source code have multiple matches, especially ADO.NET SQL statements with a lot of field names on one line. If the matching field name were included in the output, the results would be more directly useful and would need less additional massaging, such as lining everything up with the original field name list. For example, the source line "BatchId = NewId" matches the field name list item "BatchId". Is there an easy way to include both "BatchId" and "BatchId = NewId" in the output?
I played with the Matches object, but it doesn't seem to have the information. I also tried a pipeline variable, like here, but $x is null.
$inputFile = 'C:\DataColumnsNames.txt'
$outputFile = 'C:\DataColumnsUsages.txt'
Get-ChildItem C:\ProjectFolder -Filter *.cs -Recurse -Force -ea SilentlyContinue |
Select-String -Pattern (Get-Content $inputFile -PipelineVariable x) |
Select-Object $x, Path, LineNumber, line |
Export-csv $outputFile
Thanks.
The Microsoft.PowerShell.Commands.MatchInfo instances that Select-String outputs have a Pattern property that reflects the specific pattern, among the (potential) array of patterns passed to -Pattern, that matched on a given line.
The caveat is that if multiple patterns match, .Pattern only reports the one among them that is listed first in the -Pattern argument.
Here's a simple example, using an array of strings to simulate lines from files as input:
'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' |
Select-String -Pattern ('bar', 'foo') |
Select-Object Line, LineNumber, Pattern
The above yields:
Line                         LineNumber Pattern
----                         ---------- -------
A fool and                            1 foo
his barn                              2 bar
foo and bar on the same line          4 bar
Note how 'bar' is listed as the Pattern value for the last line, even though 'foo' appeared first in the input line, because 'bar' comes before 'foo' in the pattern array.
To reflect the actual pattern that appears first on the input line in a Pattern property, more work is needed:
Formulate your array of patterns as a single regex using alternation (|), wrapped as a whole in a capture group ((...)), e.g., '(bar|foo)'.
'({0})' -f ('bar', 'foo' -join '|') constructs this regex dynamically from an array (the array literal 'bar', 'foo' here, but you can substitute any array variable or even (Get-Content $inputFile)); if you want to treat the input patterns as literals and they happen to contain regex metacharacters (such as .), you'll need to escape them with [regex]::Escape() first.
Use a calculated property to define a custom Pattern property that reports the capture group's value, which is the first among the values encountered on each input line:
'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' |
Select-String -AllMatches -Pattern ('({0})' -f ('bar', 'foo' -join '|')) |
Select-Object Line, LineNumber,
@{ n='Pattern'; e={ $_.Matches[0].Groups[1].Value } }
This yields (abbreviated to show only the last match):
Line                         LineNumber Pattern
----                         ---------- -------
...
foo and bar on the same line          4 foo
Now, 'foo' is properly reported as the matching pattern.
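The escaping step mentioned above can be sketched as follows. This is only an illustration: the field names and input lines are made up, and 'Order.Id' is chosen solely because it contains the regex metacharacter '.', which would otherwise match any character.

```powershell
# Hypothetical field names; 'Order.Id' contains the regex metacharacter '.'
$names = 'Order.Id', 'BatchId'

# Escape each name so that it is matched literally, then join the names
# with alternation inside a single capture group.
$regex = '({0})' -f (($names | ForEach-Object { [regex]::Escape($_) }) -join '|')

# Made-up input lines: the first would match an unescaped 'Order.Id',
# but must not match the escaped version.
$results = 'OrderXId = 1', 'Order.Id = BatchId' |
  Select-String -Pattern $regex |
  Select-Object Line, LineNumber,
    @{ n='Pattern'; e={ $_.Matches[0].Groups[1].Value } }

$results
```

Only the second line should be reported, with 'Order.Id' as its Pattern value, because the escaped '.' no longer matches the 'X' in 'OrderXId'.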
To report all patterns found on each line:
Switch -AllMatches is required to tell Select-String to find all matches on each line, represented in the .Matches collection of the MatchInfo output objects.
The .Matches collection must then be enumerated (via the .ForEach() collection method) to extract the capture-group value from each match.
'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' |
Select-String -AllMatches -Pattern ('({0})' -f ('bar', 'foo' -join '|')) |
Select-Object Line, LineNumber,
@{ n='Pattern'; e={ $_.Matches.ForEach({ $_.Groups[1].Value }) } }
This yields (abbreviated to show only the last match):
Line                         LineNumber Pattern
----                         ---------- -------
...
foo and bar on the same line          4 {foo, bar}
Note how both 'foo' and 'bar' are now reported in Pattern, in the order encountered on the line.
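Applied to the file-based scenario from the question, the technique might look like the sketch below. It is not a definitive implementation: a temporary folder with made-up content stands in for C:\DataColumnsNames.txt and C:\ProjectFolder, and the FieldName column name is an assumption. Instead of an array-valued Pattern column, which Export-Csv would not render usefully, it emits one row per field name found on a line.

```powershell
# Hypothetical setup: a temp folder with a name list and one source file,
# standing in for C:\DataColumnsNames.txt and C:\ProjectFolder.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
$null = New-Item -ItemType Directory -Path $dir
$inputFile  = Join-Path $dir 'DataColumnsNames.txt'
$outputFile = Join-Path $dir 'DataColumnsUsages.csv'
Set-Content -Path $inputFile -Value 'BatchId', 'NewId'
Set-Content -Path (Join-Path $dir 'Sample.cs') -Value 'int x = 0;', 'BatchId = NewId;'

# Build a single regex from the field names, treating them as literals.
# Empty lines are skipped, because an empty alternative would match every line.
$regex = '({0})' -f ((Get-Content $inputFile |
  Where-Object { $_ } |
  ForEach-Object { [regex]::Escape($_) }) -join '|')

Get-ChildItem $dir -Filter *.cs -Recurse -Force -ea SilentlyContinue |
  Select-String -AllMatches -Pattern $regex |
  ForEach-Object {
    $matchInfo = $_
    # One output row per field name found on the line.
    foreach ($m in $matchInfo.Matches) {
      [pscustomobject] @{
        FieldName  = $m.Groups[1].Value
        Path       = $matchInfo.Path
        LineNumber = $matchInfo.LineNumber
        Line       = $matchInfo.Line
      }
    }
  } |
  Export-Csv -NoTypeInformation $outputFile

# Read the result back for inspection.
$rows = Import-Csv $outputFile
$rows
```

With the sample content, the source line "BatchId = NewId;" should produce two rows, one for BatchId and one for NewId, which makes it straightforward to line the output up with the original field name list.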