How do I do nested reads from <STDIN> in perl?


I'm writing a script to parse thread dumps from Java. For some reason, when I try to read from the filehandle within a subroutine, or inside a nested loop, it doesn't enter the nested loop at all. Ideally I want to be able to read from STDIN in nested loops; otherwise I'll have to write some ugly state-transition code.

Originally I was reading from STDIN directly, but to make sure that my subroutine didn't have an independent pointer to STDIN, I opened it as $in and passed that to the subroutine.

When I run it, the output looks like this. You can see that it never enters the nested loop, even though there is more input on STDIN left to read.

~/$ cat catalina.out-20160* | thread.dump.find.all.pl
in is GLOB(0x7f8d440054e8)
found start of thread dump at 2016-06-17 13:38:23 saving to tdump.2016.06.17.13.38.23.txt
in is GLOB(0x7f8d440054e8)
BEFORE NESTED STDIN
BUG!!!!
found start of thread dump at 2016-06-17 13:43:05 saving to tdump.2016.06.17.13.43.05.txt
in is GLOB(0x7f8d440054e8)
BEFORE NESTED STDIN
BUG!!!!
...

The code:

#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
use DateTime::Format::Strptime;
use DateTime::Format::Duration;
use Data::Dumper;
# DO NOT touch ARGV!
Getopt::Long::Configure("pass_through");

# cat catalina.out-* | thread.dump.find.all.pl



sub processThreadDump {
    my $in=$_[0];
    my $currentLine=$_[1];
    my $prevLine=$_[2];
    my $parsedDatetime=$_[2];

    # 2016-09-28 09:27:34
    $parsedDatetime=~ s/[ \-\:]/./g;
    my $outfile="tdump.$parsedDatetime.txt";
    print " saving to $outfile\n";
    print " in is $in\n";
    open(my $out, '>', $outfile);
    print $out "$prevLine\n";
    print $out "$currentLine\n";
    print "BEFORE NESTED STDIN\n";
    foreach my $line ( <$in> ) {
        print "INSIDE NESTED STDIN\n";
        $line =~ s/\R//g; #remove newlines
        print $out "$line\n";
        if( $line =~ m/JNI global references:/ ) {
            print "PROPERLY LEFT NESTED STDIN\n";
            close($out);
            return;
        } elsif( $line =~ m/Found \d+ deadlock\./ ) {
            print "PROPERLY LEFT NESTED STDIN\n";
            close($out);
            return;
        }
    }
    print "BUG!!!!\n";
    close($out);
}

open(my $in, '<-');
print "in is $in\n";
my $prevLine;
# read from standard in
foreach my $line ( <$in> ) {
    $line =~ s/\R//g; #remove newlines
    if( $line =~ m/Full thread dump OpenJDK 64-Bit Server VM/ ) {
        # we found the start of a thread dump
        print "found start of thread dump at ${prevLine}";
        processThreadDump($in, $line, $prevLine);
    } else {
        #print "setting prev line to $line\n";
        $prevLine=$line;
    }
}
close($in);

Solution

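  • When you say foreach my $line ( <$in> ), perl evaluates <$in> in list context, so it reads the entire $in filehandle into a list before the loop starts. By the time processThreadDump is called, the handle is already at end-of-file, so the nested <$in> returns an empty list and the inner loop body never runs. What you probably want is more like this: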

    while (defined(my $line = <$in>))
    

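    This reads only one line per iteration and discards it once you are done with it, so the lines the outer loop has not reached yet are still on the filehandle for the nested loop to consume. Make the same change to the loop inside processThreadDump; otherwise that foreach will itself read everything up to end-of-file and leave nothing for the outer loop to continue with.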
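
    As a rough illustration of the nested reads, here is a minimal sketch with both loops written as while loops sharing one filehandle. The sample input, the read_dump helper, and the in-memory filehandle are assumptions made so the example runs on its own; they are not part of the original script, which would read from STDIN and write each dump to a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input standing in for the piped catalina.out data.
my $sample = join "\n",
    '2016-06-17 13:38:23',
    'Full thread dump OpenJDK 64-Bit Server VM',
    '"main" prio=10 runnable',
    'JNI global references: 123',
    'unrelated log line',
    '2016-06-17 13:43:05',
    'Full thread dump OpenJDK 64-Bit Server VM',
    '"worker" prio=10 waiting',
    'Found 1 deadlock.',
    '';

# In-memory handle so the sketch is self-contained; pass \*STDIN for real input.
open(my $in, '<', \$sample) or die "cannot open sample: $!";

# Inner reader (hypothetical helper): pulls lines from the shared handle
# one at a time until it sees an end-of-dump marker, then stops.
sub read_dump {
    my ($fh, $first_line) = @_;
    my @body = ($first_line);
    while (defined(my $line = <$fh>)) {
        $line =~ s/\R//g;    # remove newlines
        push @body, $line;
        last if $line =~ /JNI global references:/
             || $line =~ /Found \d+ deadlock\./;
    }
    return \@body;
}

my @dumps;
my $prev_line = '';
# Outer reader: scalar-context read, so unread lines stay on the handle.
while (defined(my $line = <$in>)) {
    $line =~ s/\R//g;
    if ($line =~ /Full thread dump/) {
        push @dumps, { started => $prev_line, lines => read_dump($in, $line) };
    } else {
        $prev_line = $line;
    }
}
close($in);

printf "%s: %d lines\n", $_->{started}, scalar @{ $_->{lines} } for @dumps;
```

    Because each <$fh> in scalar context consumes exactly one line, the outer loop resumes at the line right after the end marker once read_dump returns, and the second dump is found as well.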