A Perl program uses IPC::Run
to pipe a file through a series of commands determined at runtime and into another file, like this small test excerpt demonstrates:
#!/usr/bin/perl
use IO::File;
use IPC::Run qw(run);
open (my $in, 'test.txt');
my $out = IO::File->new_tmpfile;
my @args = ( [ split / /, shift ], "<", $in); # this code
while ($#ARGV >= 0) { # extracted
push @args, "|", [ split / /, shift ]; # verbatim
} # from the
push @args, ">pipe", $out; # program
print "Running...";
run @args or die "command failed ($?)";
print "Done\n";
It builds the pipeline from commands given as arguments, the test file is hard-coded. The problem is that the pipeline hangs if the file is bigger than 64KiB. Here is a demonstration that uses cat
in the pipeline to keep things simple. First a 64KiB (65536 bytes) file works as expected:
$ dd if=/dev/urandom of=test.txt bs=1 count=65536
65536 bytes (66 kB, 64 KiB) copied, 0.16437 s, 399 kB/s
$ ./test.pl cat
Running...Done
Next, one byte more. The call to run
never returns...
$ dd if=/dev/urandom of=test.txt bs=1 count=65537
65537 bytes (66 kB, 64 KiB) copied, 0.151517 s, 433 kB/s
$ ./test.pl cat
Running...
With IPCRUNDEBUG
enabled, plus a few more cats you can see it's the last child that doesn't end:
$ IPCRUNDEBUG=basic ./test.pl cat cat cat cat
Running...
...
IPC::Run 0000 [#1(3543608)]: kid 1 (3543609) exited
IPC::Run 0000 [#1(3543608)]: 3543609 returned 0
IPC::Run 0000 [#1(3543608)]: kid 2 (3543610) exited
IPC::Run 0000 [#1(3543608)]: 3543610 returned 0
IPC::Run 0000 [#1(3543608)]: kid 3 (3543611) exited
IPC::Run 0000 [#1(3543608)]: 3543611 returned 0
(with a file under 64KiB you see all four exit normally)
How can this be made to work for files of any size ?
(Perl 5, version 30, subversion 3 (v5.30.3) built for x86_64-linux-thread-multi, tried on Alpine Linux, the target platform, and Arch Linux to rule out Alpine as a cause)
You have a deadlock:
Consider using one of the following instead:
run [ 'cat' ], '<', $in_fh, '>', \my $captured;
# Do something with the captured output in $captured.
or
my $receiver = sub {
# Do something with the chunk in $_[0].
};
run [ 'cat' ], '<', $in_fh, '>', $receiver;
For example, the following "receiver" processes each line as they come in:
my $buffer = '';
my $receiver = sub {
$buffer .= $_[0];
while ($buffer =~ s/^(.*)\n//) {
process_line("$1");
}
};
run [ 'cat' ], '<', $in_fh, '>', $receiver;
die("Received partial line") if length($buffer);