perl, hash, perl-data-structures

Reading columns from a hash of arrays


I'm new to Perl and have a question about using a hash of arrays to retrieve specific columns. My code is the following:

my %hash = ( name1 => ['A', 'A', 'B', 'A', 'A', 'B'],
             name2 => ['A', 'A', 'D', 'A', 'A', 'B'],
             name3 => ['A', 'A', 'B', 'A', 'A', 'C'],
             );

# the values of %hash are returned as arrays, not as strings (which is what I want)

foreach my $name (sort keys %hash ) {
    print "$name: ";
    print "$hash{$name}[2]\n";
}

for (my $i=0; $i<$length; $i++) {
        my $diff = "no";
        my $letter = '';
        foreach $name (sort keys %hash) {
            if (defined $hash{$name}[$i]) {
                if ($hash{$name}[$i] =~ /[ABCD]/) {
                    $letter = $hash{$name}[$i];
                }
                elsif ($hash{$name}[$i] ne $letter) { 
                    $diff = "yes";
                }
            }
            if ( $diff eq "yes" ) {
                foreach $name (sort keys %hash) {
                    if (defined $hash{$name}[$i]) { $newhash{$name} .= $hash{$name}[$i]; }  
                }
            }
        }
    }
    foreach $name (sort keys %newhash ) {
        print "$name: $newhash{$name} \n";
    }

I want the output of this program to be something like a new hash with only the variable columns:

my %newhash = ( name1 => 'BB',
                name2 => 'DB',
                name3 => 'BC',
              );

but all I get is this warning: Use of uninitialized value $letter in string ne at test_hash.pl line 31.

Does anyone have ideas about this? Cheers

EDIT:

Many thanks for your help in this question.

I edited my post to conform to the suggestions of frezik, Dan1111, and Jean. You're right, the warnings are gone now, but I also don't get any output from the print statement, and I have no idea why.

@TLP: OK, I just generated a random set of columns with no particular order in mind. What I really care about is how the letters vary: if the letters at a given array index are the same for every key in the hash, that column should be discarded, but if the letters differ between keys, I want to store that column in a new hash.

Cheers.


Solution

  • I think it's a mistake to check the letters one by one. It seems easier to collect all the letters of a column and check them at once. In scalar context, the uniq function from the List::MoreUtils module returns the number of distinct values, so it can quickly determine whether the letters vary, and the varying columns can then be transposed into the resulting hash.

    use strict;
    use warnings;
    use Data::Dumper;
    use List::MoreUtils qw(uniq);
    
    my %hash = ( name1 => ['A', 'A', 'B', 'A', 'A', 'B'],
                 name2 => ['A', 'A', 'D', 'A', 'A', 'B'],
                 name3 => ['A', 'A', 'B', 'A', 'A', 'C'],
    );
    my @keys = keys %hash;
    my $len = $#{ $hash{$keys[0]} };   # max index
    my %new;
    
    for my $i (0 .. $len) {
        my @col;
        for my $key (@keys) {
            push @col, $hash{$key}[$i];
        }
        if (uniq(@col) != 1) {     # check for variation
            for (0 .. $#col) {
                $new{$keys[$_]} .= $col[$_];
            }
        }
    }
    print Dumper \%new;
    

    Output (the key order may differ between runs, since hash keys are unordered):

    $VAR1 = {
              'name2' => 'DB',
              'name1' => 'BB',
              'name3' => 'BC'
            };
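
If List::MoreUtils is not installed, the same column check can be sketched with core Perl only, counting the distinct letters with a hash instead of uniq. The helper name variable_columns is made up for this illustration and is not part of the answer above. It also assumes the rows may differ in length and skips undefined cells, which is an extra assumption rather than something the question requires. As a side note, in the question's code as posted $length is never assigned, which would keep the outer loop from running at all and may explain the missing output.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): takes a reference to
# a hash of arrays and returns a hash ref that maps each key to the
# concatenated letters of the columns in which the rows disagree.
sub variable_columns {
    my ($data) = @_;
    my @keys = sort keys %$data;

    # Highest index over all rows (assumption: rows may differ in length).
    my $max = -1;
    for my $key (@keys) {
        my $last = $#{ $data->{$key} };
        $max = $last if $last > $max;
    }

    my %new;
    for my $i (0 .. $max) {
        # Only consider the keys that actually have a value in column $i.
        my @present = grep { defined $data->{$_}[$i] } @keys;

        # Count the distinct letters in this column.
        my %seen;
        $seen{ $data->{$_}[$i] } = 1 for @present;
        next if scalar(keys %seen) <= 1;    # constant or empty column: skip

        # The column varies, so append its letter to every row's string.
        $new{$_} .= $data->{$_}[$i] for @present;
    }
    return \%new;
}

my %hash = ( name1 => ['A', 'A', 'B', 'A', 'A', 'B'],
             name2 => ['A', 'A', 'D', 'A', 'A', 'B'],
             name3 => ['A', 'A', 'B', 'A', 'A', 'C'],
);

my $new = variable_columns(\%hash);
print "$_: $new->{$_}\n" for sort keys %$new;
```

Sorting the keys before printing gives a stable order, which Data::Dumper does not guarantee by default.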