How do I search for multiple strings in a single file using findstr?


I'm trying to search a folder for all files that include two different strings. I'm using PowerShell and the findstr command.

For example, I want to find all files that include BOTH "String: A" and "String: B", but not files that only have "String: A" OR "String: B".

I've tried using findstr /c:"String: A" /c:"String: B" *.txt in the folder, but it ended up giving me all files that had either "String: A" or "String: B", not just the files with both strings in them. findstr /? didn't explain how to essentially do an AND search, so I was wondering if anyone knew how to do such a thing.

I also tried findstr /c:"String: A" *.txt | findstr /c:"String: B" *.txt from this answer, but this ends up with no results (as in, PowerShell sits there for a very long time and never returns).

This answer was closer (I used findstr /r /c:"String: A.*String: B" *.txt), but the command returned nothing (I know from my data that there should be at least one file with both strings in it).

I'm not sure if there are formatting issues with the strings (given that they include multiple words and symbols), which is why I've been using /c: in the string formatting.


Solution

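  • The challenge is that you want to know whether all of the strings are present anywhere in the file, whereas findstr.exe matches patterns line by line. That is also why findstr /r /c:"String: A.*String: B" *.txt returns nothing unless both strings occur on the same line, in that order.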

    PowerShell's more powerful findstr.exe analog, Select-String, can be combined with Group-Object to provide a solution:

    $patterns = 'String: A', 'String: B'
    
    Select-String -Path *.txt -Pattern $patterns -AllMatches | 
      Group-Object Path | # Group matching lines by file of origin
      Where-Object {
        # Does the distinct set of patterns found comprise all input patterns?
        ($_.Group.Pattern | Sort-Object -Unique).Count -eq $patterns.Count
      } |
      ForEach-Object Name
    

    Caveat:

    • This works as intended only if the patterns do not occur exclusively together on the same lines. Even with -AllMatches present, the .Pattern property of the Microsoft.PowerShell.Commands.MatchInfo instances that Select-String outputs reflects only the first matching pattern on a given line; see GitHub issue #7765.

    Note that this only outputs the paths of the matching files.

    To also output the individual lines that contained matches for any of the patterns inside a matching file, replace ForEach-Object Name with ForEach-Object Group.
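
If the patterns may share lines, one way to sidestep the caveat above is to test each file once per pattern instead of grouping match results. The following is a minimal sketch, not a built-in cmdlet: the function name Find-FilesWithAllStrings and its parameter names are made up for illustration. It assumes the search strings are literals (hence -SimpleMatch) and that case-insensitive matching, the Select-String default, is acceptable; add -CaseSensitive to the Select-String call if it is not.

```powershell
# Hypothetical helper (name and parameters are illustrative, not a built-in cmdlet).
# Outputs the full paths of files that contain every string in -Patterns,
# no matter which lines the strings appear on.
function Find-FilesWithAllStrings {
  param(
    [Parameter(Mandatory)] [string]   $Path,      # wildcard path, e.g. *.txt
    [Parameter(Mandatory)] [string[]] $Patterns   # literal strings; all must be present
  )
  Get-ChildItem -Path $Path -File | Where-Object {
    $file = $_.FullName
    # Collect the patterns that are NOT found anywhere in this file.
    $missing = $Patterns | Where-Object {
      -not (Select-String -LiteralPath $file -Pattern $_ -SimpleMatch -Quiet)
    }
    # Keep the file only if no pattern is missing.
    -not $missing
  } | ForEach-Object FullName
}

# Example call for the two strings from the question:
Find-FilesWithAllStrings -Path *.txt -Patterns 'String: A', 'String: B'
```

The trade-off is that each file is read once per pattern, which is slower on large folders than the single Select-String pass above, but the result does not depend on which pattern is reported for a given line.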