
Perl Filehandle Open Truncates the File


I am writing my first Perl program and it's a doozy. I'm happy to say that everything has been working for the most part, and searching this website has helped with most of my problems.

I am working with a large file of space-separated values. I filter the file down to only the lines with a certain value in one of the columns, and write the filtered data to a new file. I then try to push all of the lines of that new file into an array to loop over. Here's some code:

my @orig_file_lines = <ORIG_FILE>;
open MAKE_NEW_FILE, '>', 'newfile.dat' or die "Couldn't open newfile.dat!";
&make_new_file(\@orig_file_lines);   ##Creates a new, filtered newfile.dat
open NEW, "newfile.dat" or die "Couldn't open newfile.dat!";
my @lines;
while(<NEW>){
 push(@lines,$_);
}
printf("%s\n", $lines[$#lines]);  ##Should print entirety of last line of newfile.dat

The problem is twofold: 1. $#lines is 24500 here, even though the newly created file (newfile.dat) actually has 24503 lines (so it should be 24502); 2. the printf statement prints a truncated line 24500, cutting that line off about two columns early.

Every other line, i.e. $lines[0] through $lines[24499], prints in full even when it is wider than $lines[24500], so the length of that particular line (they're all long) is not the problem. It is almost as if the array has gotten too large somehow, since part of one line was cut off along with the two lines after it. If so, how do I combat this?


Solution

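  • It looks like you forgot to close MAKE_NEW_FILE before opening the same file with NEW. Output to a filehandle is buffered, so until the handle is closed (or flushed) the last chunk of data may not have been written to disk yet. That is why the file appears to end partway through a line, with the final lines missing, when you read it back.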

    Some other points to look at:

    • Calling a sub with the &function(...) syntax is discouraged in modern Perl because it bypasses prototype checking; a plain function(...) call is preferred.

    • I trust that you are using use warnings; and use strict;.

    • I notice that you have both a two-argument open and a three-argument open. Although both are legal, they follow different conventions, and mixing them makes the code confusing to read. I would stay with the three-argument open because I think it is easier to understand (unless you are playing code golf).
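
Putting those points together, here is a minimal sketch of how the write-then-read sequence might look. The body of make_new_file, its extra arguments, and the sample data are hypothetical stand-ins, since the original sub and input file are not shown; passing the filehandle into the sub is also an assumption made so the example is self-contained. The part that matters is the close before the file is reopened for reading.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the asker's make_new_file(): writes only the
# lines whose given column equals the wanted value to the output handle.
sub make_new_file {
    my ($out_fh, $lines, $column, $wanted) = @_;
    for my $line (@$lines) {
        my @fields = split ' ', $line;
        print {$out_fh} $line
            if defined $fields[$column] && $fields[$column] eq $wanted;
    }
    return;
}

# Made-up sample data standing in for the contents of the original file.
my @orig_file_lines =
    map { "row$_ " . ($_ % 2 ? 'keep' : 'drop') . " tail\n" } 1 .. 10;

my $new_name = 'newfile.dat';

# Three-argument open with a lexical filehandle.
open my $out_fh, '>', $new_name
    or die "Couldn't open $new_name for writing: $!";
make_new_file($out_fh, \@orig_file_lines, 1, 'keep');

# Closing flushes the output buffer, so every line is on disk
# before the file is read back.
close $out_fh or die "Couldn't close $new_name: $!";

open my $in_fh, '<', $new_name
    or die "Couldn't open $new_name for reading: $!";
my @lines = <$in_fh>;    # list context slurps all remaining lines
close $in_fh;

print $lines[-1];        # last line of newfile.dat, no longer cut short
```

Reading the handle in list context replaces the while/push loop, and $lines[-1] is the usual way to get the last element without going through $#lines.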