
Using chop in grep expression


My Perl script searches a directory of file names, using grep to output only file names without the numbers 2-9 in their names. That means, as intended, that file names ending with the number "1" will also be returned. However, I want to use the chop function to output these file names without the "1", but can't figure out how. Perhaps the grep and chop functions can be combined in one line of code to achieve this? Please advise. Thanks.

Here's my Perl script:

#!/usr/bin/perl
use strict;
use warnings;

my $dir = '/Users/jdm/Desktop/xampp/htdocs/cnc/images/plants';
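opendir(DIR, $dir) or die "Can't open $dir: $!";
# Keep only .png names that contain none of the digits 2-9
my @files = grep (/^[^2-9]*\.png\z/,readdir(DIR));
closedir(DIR);

foreach my $file (@files) {
   print "$file\n";
}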

Here's the output:

Ilex_verticillata.png
Asarum_canadense1.png
Ageratina_altissima.png
Lonicera_maackii.png
Chelone_obliqua1.png

Here's my desired output with the number "1" removed from the end of file names:

Ilex_verticillata.png
Asarum_canadense.png
Ageratina_altissima.png
Lonicera_maackii.png
Chelone_obliqua.png

Solution

  • The 1 to remove sits at the end of the name, just before the extension. That is a different task from filtering out names that contain the digits 2-9, and I wouldn't try to fit both into one operation.

    Instead, once you have your filtered list (no 2-9 in names), clip off that 1. Since all names of interest end in .png, you can simply use a regex substitution

    $filename =~ s/1\.png\z/.png/;
    

    and if there is no 1 right before .png, the string is unchanged. If other extensions could be involved, you should use a module to break up the filename instead.
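
    As a rough sketch of that module-based approach, the core File::Basename module can split off the extension so the 1 is removed from the base name whatever the suffix is. The helper name strip_trailing_one and the .jpg sample name are made up for illustration, and the directory part is discarded on the assumption that the names come straight from readdir (which returns bare file names).

    ```perl
    use strict;
    use warnings;
    use File::Basename qw(fileparse);

    # Hypothetical helper: drop a single trailing "1" from the base name,
    # keeping whatever extension the file has.
    sub strip_trailing_one {
        my ($filename) = @_;
        # fileparse returns (base name, directory part, matched suffix);
        # the directory part is ignored because readdir gives bare names
        my ($base, undef, $ext) = fileparse($filename, qr/\.[^.]*/);
        $base =~ s/1\z//;
        return $base . $ext;
    }

    # Illustrative names only
    print strip_trailing_one($_), "\n"
        for qw(Asarum_canadense1.png Chelone_obliqua1.jpg Ilex_verticillata.png);
    ```

    This should print Asarum_canadense.png, Chelone_obliqua.jpg and Ilex_verticillata.png, one per line.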

    To incorporate the substitution, you can pass grep's output through a map

    opendir my $dfh, $dir  or die "Can't open $dir: $!";
    
    my @files = 
        map { s/1\.png\z/.png/r } 
        grep { /^[^2-9]*\.png\z/ } 
        readdir $dfh;
    

    where I've also introduced a lexical directory handle instead of a bareword glob, and added a check on whether opendir worked. The /r modifier on the substitution in map (available since Perl 5.14) is needed so that the resulting string is returned (changed, or unchanged if the regex didn't match) rather than the original being modified in place.
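
    To make the role of /r concrete, here is a small sketch comparing the two forms on a pair of sample names. The variable names are only for illustration; the point is that without /r the substitution edits the element in place and hands map its success count rather than the string.

    ```perl
    use strict;
    use warnings;

    my @names = ('Asarum_canadense1.png', 'Ilex_verticillata.png');

    # With /r each block returns the resulting string; @names is untouched
    my @with_r = map { s/1\.png\z/.png/r } @names;

    # Without /r the elements of @copy are edited in place, and map collects
    # the substitution's return value (1 for the first name) instead
    my @copy = @names;
    my @without_r = map { s/1\.png\z/.png/ } @copy;

    print "with /r:    @with_r\n";
    print "in place:   @copy\n";
    print "first s///: $without_r[0]\n";
    ```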

    This passes over the list of filenames twice, though, whereas a plain loop would do everything in one pass. In principle that may affect performance; however, here all operations are simple per-element work on a list, so any difference in performance is minimal.
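
    For comparison, a single-pass version might look like the sketch below. The sub name clean_png_names is made up, and the sample list (including Rosa_rugosa2.png and notes.txt) only stands in for what readdir would return.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: filter and rename in a single pass over the names.
    sub clean_png_names {
        my @names = @_;
        my @files;
        for my $name (@names) {
            # Skip anything that is not a .png or has a digit 2-9 in its name
            next unless $name =~ /^[^2-9]*\.png\z/;
            # Drop a "1" sitting right before the extension, if there is one
            $name =~ s/1\.png\z/.png/;
            push @files, $name;
        }
        return @files;
    }

    # Sample names standing in for the readdir output
    print "$_\n" for clean_png_names(qw(
        Ilex_verticillata.png Asarum_canadense1.png Rosa_rugosa2.png notes.txt
    ));
    ```

    With the sample list this should print Ilex_verticillata.png and Asarum_canadense.png. In the real script the call would be clean_png_names(readdir $dfh).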