Tags: bash, file, perl, directory-listing

How can I list files with find and Perl?


I'm working on an HP-UX machine and need to list the log files in a folder, one per line, as the file's date and its name separated by a ";". The result should be sorted by date in descending order and stored in a text file, so that the file's contents look like this:

2019-02-02;/home/user/Documents/imthelog03.log
2019-02-01;/home/user/Documents/imthelog02.log
2019-01-29;/home/user/Documents/imthelog01.log

I've tried this:

find /home/user/Documents/*.log* exec perl -MPOSIX -e 'print POSIX::strftime "%Y%m%d\n", localtime((stat $ARGV[0])[9])'

but I can't get what I need. I can't use stat, and I'm using a for loop to read line by line. How can I get the date and the path/filename separated by a ";" in a text file, sorted by date in descending order, using bash and possibly Perl? Thanks!


Solution

  • Can do it all in Perl

    perl -MPOSIX=strftime -wE'$d = shift || ".";
        say strftime("%Y-%m-%d", localtime((stat $_)[9])), ";$_"
            for sort { (stat $b)[9] <=> (stat $a)[9] } glob "$d/*log*"
    ' dir-name
    

    Pass the directory name (dir-name) as the argument to the one-liner; without an argument it uses ., the current directory.

    Note, I don't see a need for find as you're getting a listing of (log) files from a directory.

    This can be optimized, to not run stat repeatedly, but I doubt that it matters in expected use. I would recommend putting this in a nice little script though.


    Still, stat isn't cheap, so if this regularly processes long file lists, use

    perl -MPOSIX=strftime -wE'$d = shift || ".";
        say strftime("%Y-%m-%d", localtime($_->[1])), ";$_->[0]"
            for
                sort { $b->[1] <=> $a->[1] }
                    map { [$_, (stat $_)[9]] } glob "$d/*log*"
    ' dir-name
    

    where I've split the statement into yet more lines to emphasize the change.

    The input file list from glob is first used to build another list, using map, with an arrayref for each filename: the name itself and that file's timestamp. Then the pair-wise comparisons in sort don't have to run stat every time through; they use timestamps precomputed once. This is called a Schwartzian transform. Additionally, strftime need not run stat again, either.

    Note that the optimization comes with an overhead, so use this only when it is indeed expected to be needed. See, for example, this post (last section) for a discussion and links.