
Perl Text::CSV_XS treats comment as data when it is the last line


I wonder whether this is a bug or a feature of Text::CSV_XS. When the last line of my CSV file is a comment, it is parsed as a line containing data (yielding undef or ""). To reproduce:

$ cat test.csv
id | name
#
42 | foo
#

This is my perl script:

#!/usr/bin/env perl

use warnings;
use diagnostics;
use strict;

use Text::CSV_XS qw(csv);
use Data::Dumper;

# Returns array of hashrefs for argument CSV file.
sub load_csv_file {
  my $filename = shift;
  return csv (
    in               => $filename,
    sep_char         => '|',
    headers          => 'auto',
    allow_whitespace => 1,
    comment_str      => "#"
  );
}
my $table = load_csv_file("test.csv");
print Dumper($table);

When run it dumps this:

$VAR1 = [
          {
            'id' => '42',
            'name' => 'foo'
          },
          {
            'id' => '',             # WHY IS THIS ENTRY HERE?
            'name' => undef
          }
        ];

I would have expected the last line to be treated as a comment and ignored. When I remove the comment in the last line from test.csv, I get what I expect, just one row:

$VAR1 = [
          {
            'id' => '42',
            'name' => 'foo'
          }
        ];

What am I missing? I'm using Text::CSV_XS version 1.47 on Ubuntu Jammy.


Solution

  • After it was reported (RT#150501), this was fixed in version 1.53. Changelog entry:

    1.53    - 2023-11-22, H.Merijn Brand
        * Two casts for -Wformat (issue 50)
        * Add --skip-empty to csv2xlsx
        * Add --font and --font-size to csv2xlsx
        * Fix skip_empty_rows ("skip") and trailing newlines (Corey Hickey, PR#52)
        * Fix comment in last line (RT#150501)
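
If upgrading to 1.53 or later is not an option, one possible workaround is to strip the comment lines yourself before handing the data to csv(). The following is only a sketch under assumptions: the helper name load_csv_without_comments is made up for illustration, and it assumes that no quoted field contains an embedded newline followed by the comment string, since the filtering is done line by line without any CSV awareness.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Text::CSV_XS qw(csv);
use Data::Dumper;

# Hypothetical helper (not part of Text::CSV_XS): removes comment lines
# before parsing, so the result does not depend on how the installed
# version handles a comment in the last line.
sub load_csv_without_comments {
  my ($filename, $comment_str) = @_;

  open my $fh, '<', $filename or die "Cannot open $filename: $!";
  # Keep only the lines that do not start with the comment string.
  my $data = join '', grep { index ($_, $comment_str) != 0 } <$fh>;
  close $fh;

  # A scalar reference as "in" is parsed as in-memory CSV data.
  return csv (
    in               => \$data,
    sep_char         => '|',
    headers          => 'auto',
    allow_whitespace => 1,
  );
}

# Report the installed version; the fix is listed for 1.53.
print "Text::CSV_XS version: ", Text::CSV_XS->VERSION, "\n";

# Usage: perl script.pl test.csv
if (@ARGV) {
  print Dumper (load_csv_without_comments ($ARGV[0], '#'));
}
```

With the test.csv from the question, this should yield a single row for id 42, because the trailing "#" line never reaches the parser.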