Tags: perl, foreach, while-loop, truncate, truncated

truncate all lines in a file while preserving whole words


I'm trying to shorten each line of a file to 96 characters while preserving whole words. If a line is 96 characters or fewer, I want to leave it alone. If it is over 96 characters, I want to cut it down to the longest length under 96 that still ends on a whole word. When I run this code, I get a blank file.

use Text::Autoformat;

use strict;
use warnings;

#open the file
my $filename = $ARGV[0]; # store the 1st argument into the variable
open my $file, '<', $filename;
open my $fileout, '>>', $filename.96;

my @file = <$file>;  #each line of the file into an array

while (my $line = <$file>) {
  chomp $line;
  foreach (@file) {
#######
sub truncate($$) {
    my ( $line, $max ) = @_;

    # always do nothing if already short enough 
    ( length( $line ) <= $max ) and return $line;

    # forced to chop a word anyway
    if ( $line =~ /\s/ ) {
       return substr( $line, 0, $max );
    }
    # otherwise truncate on word boundary 
    $line =~ s/\S+$// and return $line;

    die; # unreachable
}
####### 

my $truncated  = &truncate($line,96);

print $fileout "$truncated\n";

  }
}       
close($file);
close($fileout);

Solution

  • You have no output because you have no input.

    1. my @file = <$file>;  #each line of the file into an array
    2. while (my $line = <$file>) { ...
    

    The <$file> operation in line 1 is in list context, so it consumes all the input and loads it into @file. The <$file> operation in line 2 then has no more input to read, so the body of the while loop never executes.
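
    The effect can be reproduced in isolation. The following is a minimal sketch, assuming an in-memory filehandle as a stand-in for the real input file; the sample data and the variable names ($data, $fh, @lines, $count) are illustrative and not from the original post.

    ```perl
    use strict;
    use warnings;

    # In-memory filehandle standing in for the real input file (illustrative data).
    my $data = "first line\nsecond line\nthird line\n";
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

    my @lines = <$fh>;          # list context: reads every remaining line
    my $count = 0;
    while (my $line = <$fh>) {  # scalar context: the handle is already at end of file
        $count++;
    }
    close $fh;

    # Prints: array holds 3 lines, while loop ran 0 times
    printf "array holds %d lines, while loop ran %d times\n", scalar(@lines), $count;
    ```
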

    You either want to stream from the filehandle:

    # don't call @file = <$file>
    while (my $line = <$file>) {
        chomp $line; 
        my $truncated = &truncate($line, 96);
        ...
    }
    

    Or read from the array of file contents:

    my @file = <$file>;
    foreach my $line (@file) {
        chomp $line; 
        my $truncated = &truncate($line, 96);
        ...
    }
    

    If the input is large, the former approach has the advantage of loading only a single line into memory at a time.
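
Once the loop actually runs, the truncate sub from the question may still not do what is wanted: its whitespace test looks inverted, and the substitution strips only the final word rather than cutting down to the limit. The following is a rough sketch of the streaming variant with one possible word-preserving truncation. The name truncate_words is hypothetical (chosen to avoid clashing with Perl's built-in truncate), the in-memory input stands in for the real file, and a limit of 11 is used instead of 96 only to keep the sample short.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): shorten $line to at most
# $max characters, cutting at the last whitespace that fits.
sub truncate_words {
    my ($line, $max) = @_;

    # Already short enough: return unchanged.
    return $line if length($line) <= $max;

    # Keep one extra character so a word ending exactly at $max survives.
    my $head = substr($line, 0, $max + 1);

    # Drop the trailing partial word together with the whitespace before it.
    if ($head =~ s/\s+\S*\z// && length($head)) {
        return $head;
    }

    # No usable word boundary: forced to chop inside a word.
    return substr($line, 0, $max);
}

# Illustrative input; a real script would open the file named in $ARGV[0].
my $input = "hello world foo\nshort\nabcdefghijklmnop\n";
open my $in, '<', \$input or die "Cannot open in-memory input: $!";

my @out;
while (my $line = <$in>) {   # stream one line at a time
    chomp $line;
    push @out, truncate_words($line, 11);
}
close $in;

# Prints "hello world", "short" and "abcdefghijk", one per line.
print "$_\n" for @out;
```

Reading one character past the limit before looking for whitespace is a design choice: it lets a line whose last fitting word ends exactly at the limit keep that word instead of losing it.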