How to split a line in CSV with multiple delimiters into different hashes?


File zone.csv looks like below:

===========================================================

zone0,primaryserver1,primaryserver2,secondaryserver1|secondaryserver2

zone1,primaryserver1,primaryserver2,primaryserver3,secondaryserver1|secondaryserver2|secondaryserver3

zone2,primaryserver1,secondaryserver1

zone3,primaryserver1,primaryserver2,secondaryserver1|secondaryserver2

===========================================================

Each line starts with the zone, followed by its primary servers (comma-separated) and then its secondary servers (pipe-separated, all in the last comma-separated field).

I need to fill two hashes, both keyed by zone: one holding all the primary servers and one holding all the secondary servers. I need help using the split function with two delimiters to do this. Your help will be much appreciated.

        my %zoneP;
        my %zoneS;
        my $file1 = "zone.csv";
        open (IN1, "$file1") || die "Cannot open $file1\n";

        # Read File
        while (<IN1>)
        {
                chomp($_);
                # all regex - avoid comments and empty lines
                if ( ($_ !~ /^(#)+/) && ( $_ =~ /[a-z A-Z 1-9 \. \[ \] \* \+ ]/) )
                {
                        print "$_\n" if($debug > -1);
                        my ($zone, $serverP, $serverS) = split(",",$_);
                        $zoneP{$zone} = $serverP;
                        $zoneS{$zone} = $serverS;

                }
        }
        close IN1;

I wrote the code above, but I need help with the split call: how do I handle the two delimiters and store the values in separate hashes?


Solution

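  • You can use split twice: first on the comma to separate the zone from the server fields, then on the pipe to break the last field into the secondary servers. Because the code uses push, each hash value is an array holding one array reference per input line for that zone.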

    use warnings;
    use strict;
    use Data::Dumper;
    $Data::Dumper::Sortkeys = 1;
    
    my %zoneP;
    my %zoneS;
    
    # Read File
    while (<DATA>) {
        chomp($_);
        # all regex - avoid comments and empty lines
        if ( ($_ !~ /^(#)+/) && ( $_ =~ /[a-z A-Z 1-9 \. \[ \] \* \+ ]/) )
        {
            # First split on commas: the zone, then every server field
            my ($zone, @servers) = split(/,/, $_);
            # The last field holds the pipe-separated secondary servers
            my $server_s = pop @servers;
            my @servers_s = split(/\|/, $server_s);
            # Whatever remains in @servers is the primary servers
            push @{ $zoneP{$zone} }, [@servers];
            push @{ $zoneS{$zone} }, [@servers_s];
        }
    }
    
    print "zoneP\n";
    print Dumper(\%zoneP);
    print "zoneS\n";
    print Dumper(\%zoneS);
    
    __DATA__
    zone0,primaryserver1,primaryserver2,secondaryserver1|secondaryserver2
    
    zone1,primaryserver1,primaryserver2,primaryserver3,secondaryserver1|secondaryserver2|secondaryserver3
    
    zone2,primaryserver1,secondaryserver1
    
    zone3,primaryserver1,primaryserver2,secondaryserver1|secondaryserver2
    

    Output:

    zoneP
    $VAR1 = {
              'zone0' => [
                           [
                             'primaryserver1',
                             'primaryserver2'
                           ]
                         ],
              'zone1' => [
                           [
                             'primaryserver1',
                             'primaryserver2',
                             'primaryserver3'
                           ]
                         ],
              'zone2' => [
                           [
                             'primaryserver1'
                           ]
                         ],
              'zone3' => [
                           [
                             'primaryserver1',
                             'primaryserver2'
                           ]
                         ]
            };
    zoneS
    $VAR1 = {
              'zone0' => [
                           [
                             'secondaryserver1',
                             'secondaryserver2'
                           ]
                         ],
              'zone1' => [
                           [
                             'secondaryserver1',
                             'secondaryserver2',
                             'secondaryserver3'
                           ]
                         ],
              'zone2' => [
                           [
                             'secondaryserver1'
                           ]
                         ],
              'zone3' => [
                           [
                             'secondaryserver1',
                             'secondaryserver2'
                           ]
                         ]
            };
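
    If each zone appears on only one line, the extra level of nesting shown in the output may not be needed. The following is a minimal sketch of that variant, not part of the original answer: it assumes every line has a zone followed by at least one field, and the helper name parse_zone_line is hypothetical. It stores one flat array reference per zone and shows how to read the values back.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): parse one line
# into the zone name, a reference to the primary servers, and a
# reference to the secondary servers.
sub parse_zone_line {
    my ($line) = @_;
    chomp $line;

    # Split on commas: the first field is the zone, the rest are servers.
    my ($zone, @fields) = split /,/, $line;

    # The last comma-separated field holds the pipe-separated secondaries.
    my $last = pop @fields;
    my @secondaries = defined $last ? split(/\|/, $last) : ();

    # Whatever is left in @fields is the list of primary servers.
    return ($zone, \@fields, \@secondaries);
}

my (%zoneP, %zoneS);
for my $line (
    'zone0,primaryserver1,primaryserver2,secondaryserver1|secondaryserver2',
    'zone2,primaryserver1,secondaryserver1',
) {
    my ($zone, $primaries, $secondaries) = parse_zone_line($line);
    $zoneP{$zone} = $primaries;      # one flat array ref per zone
    $zoneS{$zone} = $secondaries;
}

# Dereference the array refs to read the servers back.
print "zone0 primaries:   @{ $zoneP{zone0} }\n";
print "zone0 secondaries: @{ $zoneS{zone0} }\n";
print "zone2 first secondary: $zoneS{zone2}[0]\n";
```

    With direct assignment, a zone that appears on a second line would overwrite the earlier entry, which is the case the push-based version above handles.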