I found the code sample below here. It searches for text in files, recursing through sub-directories, but I want to specify a subset of the first level of sub-directories to recurse through.
E.g., suppose I'm in directory C:\, which contains the directories bin, src, and Windows, and I want to recursively search for .h and .c files containing the text "include". I'd run the following with the MWE below, where my code is in textsearch.pl:
perl textsearch.pl include "(\.)(h|c)($)"
How can I modify this program to search only in bin and src but not Windows, while still recursing into the sub-directories of bin and src? I.e., I'd like to be able to do something like the following:
perl textsearch.pl include "(\.)(h|c)($)" src,bin
I thought File::Find::Rule would help, but I'm having trouble figuring out how to apply it here.
Also, if there's another much simpler way to do all this, I'd love to hear it.
MWE I found:
use strict;
use warnings;
use Cwd;
use File::Find;
use File::Basename;
my ($in_rgx,$in_files,$simple,$matches,$cwd);
sub trim($) {
my $string = shift;
$string =~ s/[\r\n]+//g;
$string =~ s/\s+$//;
return $string;
}
# 1: Get input arguments
if ($#ARGV == 0) { # *** ONE ARGUMENT *** (search pattern)
($in_rgx,$in_files,$simple) = ($ARGV[0],".",1);
}
elsif ($#ARGV == 1) { # *** TWO ARGUMENTS *** (search pattern + filename or flag)
if (($ARGV[1] eq '-e') || ($ARGV[1] eq '-E')) { # extended
($in_rgx,$in_files,$simple) = ($ARGV[0],".",0);
}
else { # simple
($in_rgx,$in_files,$simple) = ($ARGV[0],$ARGV[1],1);
}
}
elsif ($#ARGV == 2) { # *** THREE ARGUMENTS *** (search pattern + filename + flag)
($in_rgx,$in_files,$simple) = ($ARGV[0],$ARGV[1],0);
}
else { # *** HELP *** (either no arguments or more than three)
print "Usage: ".basename($0)." regexpattern [filepattern] [-E]\n\n" .
"Hints:\n" .
"*) If you need spaces in your pattern, put quotation marks around it.\n" .
"*) To do a case insensitive match, use (?i) preceding the pattern.\n" .
"*) Both patterns are regular expressions, allowing powerful searches.\n" .
"*) The file pattern is always case insensitive.\n";
exit;
}
if ($in_files eq '.') { # 2: Output search header
print basename($0).": Searching all files for \"${in_rgx}\"... (".(($simple) ? "simple" : "extended").")\n";
}
else {
print basename($0).": Searching files matching \"${in_files}\" for \"${in_rgx}\"... (".(($simple) ? "simple" : "extended").")\n";
}
if ($simple) { print "\n"; } # 3: Traverse directory tree using subroutine 'findfiles'
($matches,$cwd) = (0,cwd);
$cwd =~ s,/,\\,g;
find(\&findfiles, $cwd);
sub findfiles { # 4: Used to iterate through each result
my $file = $File::Find::name; # complete path to the file
$file =~ s,/,\\,g; # substitute all / with \
return unless -f $file; # process files (-f), not directories
return unless $_ =~ m/$in_files/io; # check if file matches input regex
# /io = case-insensitive, compiled
# $_ = just the file name, no path
# 5: Open file and search for matching contents
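# 'print "..." && return' parses as print("..." && return), so the
# sub would return before printing anything; use a do-block instead
open(F, '<', $file) or do { print "\n* Couldn't open ${file}\n\n"; return; };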
if ($simple) { # *** SIMPLE OUTPUT ***
while (<F>) {
if (m/($in_rgx)/o) { # /o = compile regex
# file matched!
$matches++;
print "---" . # begin printing file header
sprintf("%04d", $matches) . # file number, padded with 4 zeros
"--- ".$file."\n"; # file name, keep original name
# end of file header
last; # go on to the next file
}
}
} # *** END OF SIMPLE OUTPUT ***
else { # *** EXTENDED OUTPUT ***
my $found = 0; # used to keep track of first match
my $binary = (-B $file) ? 1 : 0; # don't show contents if file is bin
$file =~ s/^\Q$cwd//g; # remove current working directory
# \Q = quotemeta, escapes string
while (<F>) {
if (m/($in_rgx)/o) { # /o = compile regex
# file matched!
if (!$found) { # first matching line for the file
$found = 1;
$matches++;
print "\n---" . # begin printing file header
sprintf("%04d", $matches) . # file number, padded with 4 zeros
"--- ".uc($file)."\n"; # file name, converted to uppercase
# end of file header
if ($binary) { # file is binary, do not show content
print "Binary file.\n";
last;
}
}
print "[$.]".trim($_)."\n"; # print line number and contents
#last; # uncomment to only show first line
}
}
} # *** END OF EXTENDED OUTPUT ***
# 6: Close the file and move on to the next result
close F;
}
#7: Show search statistics
print "\nMatches: ${matches}\n";
# Search Engine Source: http://www.adp-gmbh.ch/perl/find.html
# Rewritten by Christopher Hilding, Dec 02 2006
# Formatting adjusted to my liking by Rene Nyffenegger, Dec 22 2006
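find() accepts a list of directories to scan after the callback. Replace $cwd with @some_list_of_directories and you should be good to go. Use absolute paths in that list (e.g. built from $cwd), because find() changes into each directory as it goes and the script runs -f on $File::Find::name, which would not resolve if it were a relative path.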
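Here is a minimal sketch of how a comma-separated directory argument such as src,bin could be turned into that list and handed to find(). The helper names parse_dir_list and find_in_dirs are made up for illustration and are not part of the original script. The sketch only lists the matching file names, so the content search from the MWE would still need to be wired in.

```perl
use strict;
use warnings;
use Cwd;
use File::Find;
use File::Spec;

# Hypothetical helper: turn "src,bin" into absolute directory paths under
# $base, dropping empty entries and names that are not existing directories.
sub parse_dir_list {
    my ($list, $base) = @_;
    return grep { -d $_ }
           map  { File::Spec->rel2abs($_, $base) }
           grep { length }
           split /,/, $list;
}

# Hypothetical helper: collect files below @dirs whose base name matches
# $file_rx. find() recurses into every sub-directory of each entry.
sub find_in_dirs {
    my ($file_rx, @dirs) = @_;
    my @found;
    return @found unless @dirs;    # nothing to search
    find(sub {
        # find() has chdir'ed into the current directory, so $_ is the
        # bare file name and can be tested directly
        return unless -f $_;
        return unless $_ =~ $file_rx;
        push @found, $File::Find::name;
    }, @dirs);
    return @found;
}

# Example invocation (assumed): perl sketch.pl "(\.)(h|c)($)" src,bin
if (@ARGV == 2) {
    my ($in_files, $dir_list) = @ARGV;
    my @dirs = parse_dir_list($dir_list, cwd());
    print "$_\n" for find_in_dirs(qr/$in_files/i, @dirs);
}
```

In textsearch.pl itself, the equivalent change would be to build @dirs this way and call find(\&findfiles, @dirs) instead of find(\&findfiles, $cwd). Note that the script already treats a third argument as the -E flag, so the argument handling would have to be adjusted to accept a directory list as well.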