regex perl csv

Split function using Text::CSV_XS


I am trying to parse log files and convert them into .csv files, and I am having trouble with the split function. For example, the log file contains the following: 21a94551,00:00:59.643;ERROR; . When I split on the comma (,) and semicolon (;), I lose the .643 from my timestamp in the output CSV file. I would like to keep the time (00:00:59.643) intact. The log file has many such lines, each with different numbers, so I cannot hard-code the values.

When I print the values right after the split, they appear on the screen correctly, but in the CSV file the .643 is missing.

I am new to Perl. Can someone please explain what I am doing wrong? I think the issue might be with how the string is handled.

use strict;
use Cwd;
use Excel::Writer::XLSX;
use Text::CSV_XS;
use Spreadsheet::Read;

my $dirname = getcwd;               # Set the directory to current working directory.
opendir (DIR, $dirname) || die;     # Open the current directory
my @FileNameList = readdir(DIR);    # Load the names of files in to an array

foreach (@FileNameList)             #Read each of the file names
{
    my $FileName = $_;
    my $Output;

    if ($FileName =~ m/iusp_\d+.log/)
    {
        print ("\n". $FileName." \n Correct Log File Found");

        open (my $file, "<", $FileName);

        while (<$file>) {
            chomp;    # Remove the \n from the last field
            my $Line = $_;    # Copy the current line into $Line

            if ( $Line =~ m/ERROR/ )    # Select any line that has "ERROR" inside it.
            {
                my @fields = split /[,;]/, $Line;    # Split $Line on "," and ";"
                my $csv = Text::CSV_XS->new();       # Create new CSV
                $csv->combine(@fields);
                my $csvLine = $csv->string();
                print $csvLine, "\n";
                {
                    $Output = $csvLine . "\n";
                }
                my $OutputFileName = $FileName . ".csv";
                print( "\n Saving File:" . $OutputFileName );
                open( MyOutputFile, ">>$OutputFileName" );
                print MyOutputFile $Output;
            }    #End of IF Statement
        }    #End of while statement
    }    #End of the file name check
}    #End of foreach

Solution

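  • Simplify your regex. You don't need the .* (see perldoc -f split). Inside the square brackets of a character class, the dot is a literal character rather than a wildcard, so split treats every "." as a delimiter and cuts 00:00:59.643 into 00:00:59 and 643. Split on the comma and semicolon only: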

    use warnings;
    use strict;
    use Data::Dumper;
    
    my $Line = '21a94551,00:00:59.643;ERROR;';
    my @fs = split /[,;]/, $Line;
    print Dumper(\@fs);
    

    Output:

    $VAR1 = [
              '21a94551',
              '00:00:59.643',
              'ERROR'
            ];
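
To make the difference concrete, here is a minimal sketch that runs both patterns on the same sample line. The pattern [,;.*] is only an assumption about what the original split looked like (the answer mentions a .* inside the brackets); the variable names @broken and @fixed are made up for illustration.

```perl
use strict;
use warnings;

my $line = '21a94551,00:00:59.643;ERROR;';

# Assumed original pattern: "." and "*" inside [] are literal characters,
# so every dot in the line is treated as a field separator.
my @broken = split /[,;.*]/, $line;

# Corrected pattern: only the comma and the semicolon are separators.
my @fixed = split /[,;]/, $line;

print scalar(@broken), " fields: @broken\n";   # 4 fields: 21a94551 00:00:59 643 ERROR
print scalar(@fixed),  " fields: @fixed\n";    # 3 fields: 21a94551 00:00:59.643 ERROR
```

In both cases split drops the empty field after the final semicolon, so the only difference is whether the timestamp survives as a single field.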