perl ipc pipe popen binutils

How to filter a lot of data with IPC::Open2?


My task is to filter some data from a Perl script through an external utility (addr2line). The data set is quite large: I need to write a lot of data to the program's stdin and read a lot of data back from its stdout into my script.

Currently I do this with IPC::Open2, but I don't interleave reading and writing: I write everything first and only then start reading. Is this legal? Will open2 buffer any amount of data in the pipe?

My code:

my $cmd="addr2line -e $prog_name ";
use IPC::Open2;
local (*Reader, *Writer);
my $pid = open2(\*Reader, \*Writer, $cmd);
for(@requests) {  # this array is HUGE, 100s of thousands of entries
    print Writer "$_\n";
}
close Writer;  
for(@requests) {
    $function_name = <Reader>;
    $filesource = <Reader>;
    # ... store ..
}
close Reader;
waitpid($pid,0);

Solution

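  • Yes, the way your program is written you will run into buffer capacity constraints. A pipe holds only a limited amount of data (typically 64 KiB on Linux). Once your input buffer (Reader) fills up, the external program blocks on its next write and stops reading its own input; the pipe behind Writer then fills up as well, your print blocks, and the two processes wait on each other forever.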

    Mixing reading and writing would help, because you would be emptying Reader at about the same rate as the external program fills it.

    Another thing that would help is using files for interprocess communication instead of pipes (which IPC::Open2 uses) or sockets. Then you would be limited only by the amount of free disk space. You could implement this yourself, but Forks::Super already uses files for IPC by default.

    use Forks::Super 'open2';
    
    ...
    my ($Reader,$Writer,$pid) = open2(@command);
    for (@requests) { print $Writer "$_\n" }
    close $Writer;
    for (@requests) { ... read ... }
    close $Reader;
    waitpid $pid,0;
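
If you would rather stay with IPC::Open2, one common way to read and write at the same time is to fork a helper process that feeds the pipe while the parent drains the output. This is a different technique from interleaving in a single process or using files, and the sketch below is only illustrative: the helper name filter_through is hypothetical, and cat stands in for addr2line so that the example runs without a binary to inspect. With addr2line you would presumably pass something like ['addr2line', '-e', $prog_name] instead.

```perl
use strict;
use warnings;
use IPC::Open2;
use POSIX ();

# Hypothetical helper: send every request line through an external filter
# and return the filter's output lines. A forked feeder process does the
# writing, so the parent can read concurrently and neither pipe stays full.
sub filter_through {
    my ($cmd, $requests) = @_;

    my $pid = open2(my $reader, my $writer, @$cmd);

    my $feeder = fork();
    die "fork failed: $!" unless defined $feeder;

    if ($feeder == 0) {
        # Feeder child: only writes, then leaves without running END blocks.
        close $reader;
        print {$writer} "$_\n" for @$requests;
        close $writer;              # flushes buffered data, signals EOF
        POSIX::_exit(0);
    }

    # Parent: close its copy of the write end so the filter can see EOF
    # once the feeder has finished.
    close $writer;

    my @lines = <$reader>;
    chomp @lines;
    close $reader;

    waitpid $feeder, 0;
    waitpid $pid, 0;
    return @lines;
}

# Stand-in demo (assumption: `cat` is available). It echoes its input,
# so the output should mirror the requests.
my @requests = map { sprintf '0x%x', $_ } 1 .. 200_000;
my @answers  = filter_through(['cat'], \@requests);
printf "%d requests sent, %d lines received\n",
    scalar @requests, scalar @answers;
```

Because the feeder and the parent run independently, this does not depend on the external program flushing its output after every line. The loop in the question reads two lines per request, so with addr2line the caller would regroup @answers accordingly.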