When calling an external program from a Perl script, does Capture::Tiny avoid the disk I/O required when using system()? I get essentially the same performance with either. A colleague is using my code and told me that it was hammering his disks. I (perhaps) don't have this problem when running on my local machine and writing to local disks.
I was previously doing this:
open($fhStdin, ">stdin.txt");
print $fhStdin "some text\n";
close($fhStdin);
system("cmd < stdin.txt 1> stdout.txt 2> stderr.txt");
# open and read stdout.txt
# open and read stderr.txt
And changed to this:
($stdout, $stderr, $exit) = capture {
open($fhStdin, '| cmd');
print $fhStdin "some text\n";
close($fhStdin);
};
But NYTProf tells me that they take essentially the same amount of time to run (although NYTProf removes disk I/O overhead from subroutine times). So I wondered whether capture() is writing to temporary files under the hood. (I tried reading the Tiny.pm source code but am ashamed to say I couldn't tell from that.)
Thanks for any tips.
The documentation for Capture::Tiny::capture states that files are indeed used:
Captures are normally done to an anonymous temporary filehandle.
This can be seen in the source for the _capture_tee sub, used as a generic routine for all methods. About halfway through this sub we find a call to File::Temp->new, unless named files are to be used (see below). The rest of the processing can be traced with some care.†
The docs go on to offer a way to monitor all this via named files instead:
To capture via a named file (e.g. to externally monitor a long-running capture), provide custom filehandles as a trailing list of option pairs:
my $out_fh = IO::File->new("out.txt", "w+");
my $err_fh = IO::File->new("err.txt", "w+");
capture { ... } stdout => $out_fh, stderr => $err_fh;
The filehandles must be read/write and seekable. Modifying the files or filehandles during a capture operation will give unpredictable results. Existing IO layers on them may be changed by the capture.
(If this is done, the call to File::Temp is skipped, as mentioned above. See the source.)
If this disk activity is a problem you can use a piped open to read cmd's output (write its input to a file first), or use qx (backticks). But then you'd have to merge or redirect STDERR and jump through more hoops to check for and handle errors.
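A minimal sketch of both approaches, assuming a Unix-like shell. The `sh -c '...'` string is only a stand-in for cmd (it copies STDIN to STDOUT and writes one line to STDERR), and merging STDERR with `2>&1` is just one way to deal with it:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for "cmd" (an assumption for this sketch): copies STDIN to
# STDOUT, then writes one line to STDERR.
my $cmd = q{sh -c 'cat; echo oops 1>&2'};

# The input still goes through a file, as noted above.
my ($in_fh, $in_name) = tempfile(UNLINK => 1);
print {$in_fh} "some text\n";
close $in_fh or die "Cannot close input file: $!";

# Piped open: the command's STDOUT, with STDERR merged into it, is read
# through a pipe rather than from a file.
open my $out_fh, '-|', "$cmd < $in_name 2>&1"
    or die "Cannot start command: $!";
my @lines = <$out_fh>;
close $out_fh;          # waits for the command and sets $?
my $exit = $? >> 8;     # exit code of the command

# The same with backticks (qx); $? is set here as well.
my $output = qx{$cmd < $in_name 2>&1};
```

With STDERR merged there is no way to tell the two streams apart afterwards; keeping them separate means redirecting STDERR somewhere else, typically back to a file.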
Another option is to use IPC::Run3. While it also uses files, it offers far more options, which may be leveraged to lessen the disk I/O or perhaps avoid the disk altogether. (Invoking it with a filehandle opened to an in-memory scalar doesn't work, since that isn't a real filehandle.‡)
The "nuclear" option is the more complex IPC::Run which can take output without using disk.
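A minimal sketch of the IPC::Run route, assuming the module is installed from CPAN. The command is a hypothetical stand-in for cmd: a Perl one-liner that upper-cases its STDIN and writes one line to STDERR. Input and both output streams are passed as scalar references:

```perl
use strict;
use warnings;
use IPC::Run qw(run);

# Stand-in for "cmd" (an assumption for this sketch).
my @cmd = ($^X, '-e', 'print uc while <STDIN>; print STDERR "warned\n"');

my $in = "some text\n";
my ($out, $err);

# run() returns true when the command exits with status 0; $? holds the
# wait status afterwards.
run \@cmd, \$in, \$out, \$err
    or warn "Command failed with exit code ", $? >> 8, "\n";
```

Here STDOUT and STDERR stay separate and the exit status is still available, which is what the piped open and qx variants make awkward.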
† A crude sketch
The "dispatch" of all methods to _capture_tee is done at the beginning, where a set of flags is unshifted onto @_ before goto &func takes over, to distinguish the methods. For capture this is 1,1,0,0, which sets up the variables $do_stdout and $do_stderr in _capture_tee. These are then used to set up the %do hash, whose keys are iterated over to set up $stash.
If extra arguments were passed to capture (for named files) then $stash->{capture} is set; otherwise a File::Temp object is assigned.
The $stash is later passed to _open_std, where the redirection happens.
A lot more is happening, but mostly related to manipulation of localized globs and layers.
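The dispatch pattern described in this footnote can be mimicked in a few lines. This is only an illustration of the unshift-then-goto idiom, not Capture::Tiny's code: the names my_capture, my_capture_stdout and _worker are hypothetical, the meaning given to the third and fourth flags is an assumption, and a placeholder string stands where the real code would create a File::Temp object.

```perl
use strict;
use warnings;

# Hypothetical generic routine playing the role of _capture_tee.
sub _worker {
    # The first two flags are named as in the text; the other two are
    # assumed here to be "merge" and "tee" switches.
    my ($do_stdout, $do_stderr, $do_merge, $do_tee, $code) = @_;

    my %do = (
        ($do_stdout ? (stdout => 1) : ()),
        ($do_stderr ? (stderr => 1) : ()),
    );

    # One stash entry per stream that is to be captured.
    my $stash = {};
    $stash->{capture}{$_} = "placeholder for $_" for sort keys %do;
    return $stash;
}

# Each public method prepends its flags and jumps to the generic routine;
# goto &sub reuses the current @_ and replaces the current call frame.
sub my_capture        { unshift @_, 1, 1, 0, 0; goto &_worker }
sub my_capture_stdout { unshift @_, 1, 0, 0, 0; goto &_worker }

my $both     = my_capture(sub { });
my $out_only = my_capture_stdout(sub { });
```

After this runs, $both->{capture} has entries for stdout and stderr, while $out_only->{capture} has one for stdout only.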
‡ The most common invocation writes to scalars
run3 \@cmd, \my $in, \my $out, \my $err;
but this uses files, as explained in docs under How it works.
An attempt to trick it into not using files, by writing to a filehandle which is opened to a scalar
my @cmd = qw(ls -l .);
open my $fh, '>', \my $cmd_out; # not a real filehandle ...
run3 \@cmd, \undef, $fh; # ... so this won't work
aborts with
run3(): Invalid argument redirecting STDOUT at ...
This is because an open to a scalar doesn't set up a real filehandle. See this post.
If the filehandle is opened to a file, this works as intended, writing to that file. This may well result in more efficient disk I/O than what Capture::Tiny does.
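The difference between the two kinds of filehandle can be checked with fileno, which returns -1 when a handle has no file descriptor at the OS level. A small sketch using core modules only (the variable names are illustrative):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Filehandle opened to a scalar: it lives inside perl only.
open my $mem_fh, '>', \my $buffer or die "Cannot open in-memory handle: $!";

# Filehandle opened to an actual (temporary) file.
my ($file_fh) = tempfile(UNLINK => 1);

my $mem_fd  = fileno $mem_fh;     # -1: no OS-level descriptor
my $file_fd = fileno $file_fh;    # a real descriptor number

print "in-memory fd: $mem_fd, file fd: $file_fd\n";
```

A child process can only inherit real descriptors, so there is nothing for the external command's STDOUT to be redirected to in the in-memory case.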