
Perl truncation is off by 2 when outputting the line number of a truncated string


I am outputting the line numbers from a text file whenever truncation occurs. For most truncated lines I get the expected output.

However, the truncated line output is off by 2. Here is what is happening in my code:

Rain, a string, is on line 1 of the input text file (see below). I applied the regex s/.{4}\K.*//s to truncate to 4 characters, and Rain is reported as truncated even though it was not (Rain is 4 characters, so there is no need to shorten it). The same happens with 5, s/.{5}\K.*//s.

The code correctly reports the truncated line when truncating to 3 characters or fewer.

How can I make the code report NO truncation when using s/.{4}\K.*//s and s/.{5}\K.*//s? In other words, when I truncate to 4 or 5, the line number for Rain should not appear in the output.

My text file - weather.txt:

Rain
Snow

Here is my code:

#!/usr/bin/perl
use strict;
use warnings;

my $input = 'weather.txt';

open my $fhIn, '<', $input or die qq(Unable to open "$input" for input: $!);

my @lines;

while( <$fhIn>) {
    chomp(@lines);
    push @lines, $. if s/.{5}\K.*//s;
}

my $max = @lines;
my $none = '-';

my $fmt = "%-20s\n";

print sprintf($fmt, "Column 1");

foreach my $i (0..$max-1) {
    print sprintf($fmt, ($lines[$i] or $none), ($lines[$i] or $none));
}

Solution

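  • Most likely, your text file contains a carriage return and a linefeed character at the end of each line. A chomp call only removes the linefeed character, leaving you with 5 characters in your lines. Note also that chomp(@lines) in your loop chomps the @lines array rather than $_, so the current line keeps its whole line ending, and with the /s modifier the dot matches those characters too.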

    A good approach is to print your input with some delimiters around it to inspect it:

    print "<<$_>>\n";
    

    Alternatively, you can use Data::Dumper to inspect your data:

    use Data::Dumper;
    $Data::Dumper::Useqq = 1;
    print Dumper $_;
    

    Personally, I really like to remove all whitespace from the end of input lines, as keeping it is rarely wanted anyway:

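    while ( <$fhIn> ) {
        s/\s+$//;    # strip the line ending and any other trailing whitespace
        # .+ rather than .* so the match succeeds only if something is removed
        push @lines, $. if s/.{5}\K.+//s;
    }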
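
As a minimal sketch of the whitespace-stripping approach, the check can be wrapped in a small helper. The name truncate_line and the sample input lines are hypothetical and not part of the original post. One detail worth noting: a pattern ending in .* also matches when nothing follows the first N characters, so a line of exactly N characters would still be reported as truncated. This sketch uses .+ so that the substitution succeeds only when at least one character is actually cut.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): strip the line ending,
# truncate to $max characters, and report whether anything was removed.
sub truncate_line {
    my ($line, $max) = @_;
    $line =~ s/\s+\z//;    # removes "\n", "\r\n", and other trailing whitespace
    # .+ requires at least one character beyond the first $max,
    # so the substitution is true only when text is really cut off.
    my $was_truncated = ($line =~ s/.{$max}\K.+//s) ? 1 : 0;
    return ($line, $was_truncated);
}

# Simulated file contents with Windows-style line endings (assumed input).
my @input = ("Rain\r\n", "Snow\r\n", "Thunderstorm\r\n");

my $line_no = 0;
for my $raw (@input) {
    $line_no++;
    my ($text, $cut) = truncate_line($raw, 4);
    print "line $line_no: <<$text>>", ($cut ? " (truncated)" : ""), "\n";
}
```

With a limit of 4, Rain and Snow come back unchanged and unflagged, while Thunderstorm is cut to Thun and flagged. The \K escape needs Perl 5.10 or later.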