perl iterator introspection directory-traversal

Iterate directories in Perl, getting introspectable objects as result


I'm about to start a script that will involve some file lookups and manipulation, so I thought I'd look into packages that could help. Mostly, I'd like the results of the iteration (or search) to be returned as objects that expose (base)name, path, file size, uid, modification time, etc. as properties.

The thing is, I don't do this all that often, and tend to forget APIs; when that happens, I'd rather let the code run on an example directory, and dump all of the properties in an object, so I can remind myself what is available where (obviously, I'd like to "dump", in order to avoid having to code custom printouts). However, I'm aware of the following:

list out all methods of object - perlmonks.org
"Out of the box Perl doesn't do object introspection. Class wrappers like Moose provide introspection as part of their implementation, but Perl's built in object support is much more primitive than that."

Anyways, I looked into:

... and started looking into the libraries referred there (also related link: rjbs's rubric: the speed of Perl file finders).

So, for one, File::Find::Object seems to work for me; this snippet:

use strict;
use warnings;
use Data::Dumper;
use File::Find::Object;

my @targetDirsToScan = ("./");

# Walk the tree and dump each result object as it is returned.
my $tree = File::Find::Object->new({}, @targetDirsToScan);
while (my $robh = $tree->next_obj()) {
  #print $robh ."\n"; # prints File::Find::Object::Result=HASH(0xa146a58)
  print Dumper($robh) ."\n";
}

... prints this:

# $VAR1 = bless( {
#                  'stat_ret' => [
#                                  2054,
#                                  429937,
#                                  16877,
#                                  5,
#                                  1000,
#                                  1000,
#                                  0,
#                                  '4096',
#                                  1405194147,
#                                  1405194139,
#                                  1405194139,
#                                  4096,
#                                  8
#                                ],
#                  'base' => '.',
#                  'is_link' => '',
#                  'is_dir' => 1,
#                  'path' => '.',
#                  'dir_components' => [],
#                  'is_file' => ''
#                }, 'File::Find::Object::Result' );
# $VAR1 = bless( {
#                  'base' => '.',
#                  'is_link' => '',
#                  'is_dir' => '',
#                  'path' => './test.blg',
#                  'is_file' => 1,
#                  'stat_ret' => [
#                                  2054,
#                                  423870,
#                                  33188,
#                                  1,
#                                  1000,
#                                  1000,
#                                  0,
#                                  '358',
#                                  1404972637,
#                                  1394828707,
#                                  1394828707,
#                                  4096,
#                                  8
#                                ],
#                  'basename' => 'test.blg',
#                  'dir_components' => []

... which is mostly what I wanted, except that the stat results are an array, and I'd have to know its layout, ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks) (see stat - perldoc.perl.org), to make sense of the printout.
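
The positional list can be given names with a hash slice. This is only a minimal sketch: it assumes the array follows the 13-element layout of Perl's built-in stat (which the dump above appears to match), and name_stat_fields is a hypothetical helper, not part of File::Find::Object.

```perl
use strict;
use warnings;
use Data::Dumper;

# Field order as documented for Perl's built-in stat().
my @STAT_FIELDS = qw(dev ino mode nlink uid gid rdev size
                     atime mtime ctime blksize blocks);

# Hypothetical helper: pair each positional stat value with its name.
sub name_stat_fields {
    my @values = @_;
    my %named;
    @named{@STAT_FIELDS} = @values;   # hash slice assignment
    return \%named;
}

# Works on any stat list, here the current directory.
$Data::Dumper::Sortkeys = 1;
print Dumper(name_stat_fields(stat('.')));
```

Inside the loop above, something like name_stat_fields(@{ $robh->stat_ret }) should give the same named view, assuming the stat_ret accessor returns the array reference shown in the dump.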

Then I looked into IO::All, which I like because of its UTF-8 handling (and also, say, its socket functionality, which would be useful to me for an unrelated task in the same script), so I was thinking I'd use this package instead. The problem is that I have a very hard time discovering which fields are available in the returned object; e.g. with this code:

use strict;
use warnings;
use Data::Dumper;
use IO::All -utf8;

my @targetDirsToScan = ("./");

my $io = io(@targetDirsToScan);
my @contents = $io->all(0);
for my $contentry ( @contents ) {
  #print Dumper($contentry) ."\n";
  # $VAR1 = bless( \*Symbol::GEN298, 'IO::All::File' );
  # $VAR1 = bless( \*Symbol::GEN307, 'IO::All::Dir' ); ...
  #print $contentry->uid . " -/- " . $contentry->mtime . "\n";
  # https://stackoverflow.com/q/24717210/printing-ret-of-ioall-w-datadumper
  print Dumper \%{*$contentry}; # doesn't list uid
}

... I get a printout like this:

# $VAR1 = {
#           '_utf8' => 1,
#           'constructor' => sub { "DUMMY" },
#           'is_open' => 0,
#           'io_handle' => undef,
#           'name' => './test.blg',
#           '_encoding' => 'utf8',
#           'package' => 'IO::All'
#         };
# $VAR1 = {
#           '_utf8' => 1,
#           'constructor' => sub { "DUMMY" },
#           'mode' => undef,
#           'name' => './testdir',
#           'package' => 'IO::All',
#           'is_absolute' => 0,
#           'io_handle' => undef,
#           'is_open' => 0,
#           '_assert' => 0,
#           '_encoding' => 'utf8'

... which clearly doesn't show attributes like mtime, etc. - even if they exist (which you can see if you uncomment the respective print line).

I've also tried the p() function from Data::Printer (see "How can I perform introspection in Perl?"); it prints exactly the same fields as Dumper. I also tried print Dumper \%{ref ($contentry) . "::"}; (from "list out all methods of object" - perlmonks.org), and this prints entries like:

'O_SEQUENTIAL' => *IO::All::File::O_SEQUENTIAL,
'mtime' => *IO::All::File::mtime,
'DESTROY' => *IO::All::File::DESTROY,
...
'deep' => *IO::All::Dir::deep,
'uid' => *IO::All::Dir::uid,
'name' => *IO::All::Dir::name,
...

... but only if you run the print $contentry->uid ... line beforehand; otherwise they are not listed! I guess that relates to this:

introspection - How do I list available methods on a given object or package in Perl? #911294
In general, you can't do this with a dynamic language like Perl. The package might define some methods that you can find, but it can also make up methods on the fly that don't have definitions until you use them. Additionally, even calling a method (that works) might not define it. That's the sort of things that make dynamic languages nice. :)

Still, that prints the name and type of the field - I'd want the name and value of the field instead.
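
The "only listed after first use" behaviour can be reproduced in isolation. This is a minimal sketch under stated assumptions: Lazy::Demo is a hypothetical class that generates its uid method through AUTOLOAD on the first call (whether IO::All uses this exact mechanism is not confirmed here), and defined_subs is a hypothetical helper that lists the subs currently present in a package's symbol table.

```perl
use strict;
use warnings;

{
    # Hypothetical class: 'uid' does not exist until it is first called.
    package Lazy::Demo;
    our $AUTOLOAD;
    sub new  { return bless {}, shift }
    sub name { return 'demo' }
    sub AUTOLOAD {
        my $self = shift;
        (my $method = $AUTOLOAD) =~ s/.*:://;
        return if $method eq 'DESTROY';
        die "no such method: $method" unless $method eq 'uid';
        no strict 'refs';
        *{$AUTOLOAD} = sub { return 1000 };   # install the real method
        return $self->$method(@_);
    }
}

# Hypothetical helper: names of subs currently defined in a package.
sub defined_subs {
    my ($class) = @_;
    no strict 'refs';
    return sort grep { defined &{"${class}::$_"} } keys %{"${class}::"};
}

my $obj    = Lazy::Demo->new;
my @before = defined_subs('Lazy::Demo');   # 'uid' not yet installed
$obj->uid;                                 # first call triggers AUTOLOAD
my @after  = defined_subs('Lazy::Demo');   # 'uid' now in the symbol table

print "before: @before\n";
print "after:  @after\n";
```

Either way, the symbol table only yields names and code references, so getting values still means calling the methods.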

So, I guess my main question is - how can I dump an IO::All result, so that all fields (including stat ones) are printed out with their names and values (as is mostly the case with File::Find::Object)?

(I noticed that IO::All results can be of type, say, IO::All::File, but its docs defer to "See IO::All", which barely discusses IO::All::File explicitly. I thought that if I could "cast" \%{*$contentry} to an IO::All::File, maybe then the mtime etc. fields would be printed; but is such a "cast" possible at all?)

If that is problematic, are there other packages that would allow an introspective printout of directory iteration results, but with named fields for the individual stat properties?


Solution

  • As I said in my answer to your previous question, it is not a good idea to rely on the internals of objects in Perl. Instead, just call methods.

    If IO::All doesn't offer a method that gives you the information that you need, you might be able to write your own method for it that assembles that information using just the documented methods provided by IO::All...

    use IO::All;
    
    # Define a new method for IO::All::Base to use, but
    # define it in a lexical variable!
    #
    my $dump_info = sub {
       use Data::Dumper ();
       my $self = shift;
       local $Data::Dumper::Terse    = 1;
       local $Data::Dumper::Sortkeys = 1;
       return Data::Dumper::Dumper {
          name    => $self->name,
          mtime   => $self->mtime,
          mode    => $self->mode,
          ctime   => $self->ctime,
       };
    };
    
    my $io = io('/tmp');
    for my $file ( $io->all(0) ) {
       print $file->$dump_info();
    }
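
The same "just call methods" idea can be made generic. This is a minimal sketch, not part of IO::All: fields_via_methods is a hypothetical helper that calls each named method and collects name/value pairs. It is demonstrated on a core File::stat object, so it runs without IO::All installed. It uses eval rather than can(), on the assumption that lazily generated methods may not be visible to can() before their first call.

```perl
use strict;
use warnings;
use Data::Dumper ();
use File::stat ();

# Hypothetical helper: call each named method on $obj and collect
# name => value pairs; methods that die (e.g. do not exist) are skipped.
sub fields_via_methods {
    my ($obj, @methods) = @_;
    my %fields;
    for my $method (@methods) {
        my $value;
        next unless eval { $value = $obj->$method(); 1 };
        $fields{$method} = $value;
    }
    return \%fields;
}

# Demonstrated on a core File::stat object, whose accessors mirror
# the names of the built-in stat() fields.
my $st = File::stat::stat('.') or die "cannot stat '.': $!";
$Data::Dumper::Sortkeys = 1;
print Data::Dumper::Dumper(
    fields_via_methods($st, qw(mode uid size mtime ctime))
);
```

With IO::All, a call such as fields_via_methods($file, qw(name mtime mode ctime)) would presumably mirror the $dump_info closure above, using the same method names.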