Tags: perl, hash, perl-data-structures

How can I create a hash of hashes from an array of hashes in Perl?


I have an array of hashes, all with the same set of keys, e.g.:

my $aoa = [
 {NAME=>'Dave', AGE=>12, SEX=>'M', ID=>123456, NATIONALITY=>'Swedish'},
 {NAME=>'Susan', AGE=>36, SEX=>'F', ID=>543210, NATIONALITY=>'Swedish'},
 {NAME=>'Bart', AGE=>120, SEX=>'M', ID=>987654, NATIONALITY=>'British'},
];

I would like to write a subroutine that will convert this into a hash of hashes using a given key hierarchy:

my $key_hierarchy_a = ['SEX', 'NATIONALITY'];
sub aoh_to_hoh {
 my ($aoa, $key_hierarchy_a) = @_;
 ...
}

should return

{M=>
  {Swedish=>{{NAME=>'Dave', AGE=>12, ID=>123456}},
   British=>{{NAME=>'Bart', AGE=>120, ID=>987654}}},
 F=>
  {Swedish=>{{NAME=>'Susan', AGE=>36, ID=>543210}}}
}

Note that this not only creates the correct key hierarchy but also removes the now-redundant keys.

I'm getting stuck at the point where I need to create the new, innermost hash in its correct hierarchical location.

The problem is that I don't know the "depth" (i.e. the number of keys) in advance. If I had a constant number, I could do something like:

$h{$inner_hash->{$PRIMARY_KEY}}{$inner_hash->{$SECONDARY_KEY}}{...} = filter_copy($inner_hash, [$PRIMARY_KEY, $SECONDARY_KEY]);

So perhaps I can write a loop that adds one level at a time, removes that key from the hash, and then adds the remaining hash at the "current" location. That seems a bit cumbersome, though, and I'm not sure how to keep track of a "location" in a hash of hashes...


Solution

    use strict;
    use warnings;
    use Data::Dumper;
    
    my $aoa= [
     {NAME=>'Dave', AGE=>12, SEX=>'M', ID=>123456, NATIONALITY=>'Swedish'},
     {NAME=>'Susan', AGE=>36, SEX=>'F', ID=>543210, NATIONALITY=>'Swedish'},
     {NAME=>'Bart', AGE=>120, SEX=>'M', ID=>987654, NATIONALITY=>'British'},
    ];
    
    sub aoh_to_hoh {
      my ($aoa, $key_hierarchy_a) = @_;
      my $result = {};
      my $last_key = $key_hierarchy_a->[-1];
      foreach my $orig_element (@$aoa) {
        my $cur = $result;
        # shallow-copy the record so the caller's data is left untouched
        my %element = %$orig_element;
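        foreach my $key (@$key_hierarchy_a) {
          # remove the grouping key from the copy; its value becomes the hash key
          my $value = delete $element{$key};
          if ($key eq $last_key) {
            # deepest level: collect all matching records in an array
            $cur->{$value} ||= [];
            push @{$cur->{$value}}, \%element;
          } else {
            # intermediate level: create the nested hash if needed, then descend
            $cur->{$value} ||= {};
            $cur = $cur->{$value};
          }
        }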
      }
      return $result;
    }
    
    my $key_hierarchy_a = ['SEX', 'NATIONALITY'];
    print Dumper(aoh_to_hoh($aoa, $key_hierarchy_a));
    

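    As per @FM's comment, you really want an extra array level in there: several records can share the same SEX and NATIONALITY, and without the array each one would overwrite the previous one.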

    The output:

    $VAR1 = {
              'F' => {
                       'Swedish' => [
                                      {
                                        'ID' => 543210,
                                        'NAME' => 'Susan',
                                        'AGE' => 36
                                      }
                                    ]
                     },
              'M' => {
                       'British' => [
                                      {
                                        'ID' => 987654,
                                        'NAME' => 'Bart',
                                        'AGE' => 120
                                      }
                                    ],
                       'Swedish' => [
                                      {
                                        'ID' => 123456,
                                        'NAME' => 'Dave',
                                        'AGE' => 12
                                      }
                                    ]
                     }
            };
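
    To make the point about the extra array level concrete, here is a minimal sketch with two records that share both grouping keys. The sample data (including the name 'Olof') and the variable names %flat and %grouped are made up for illustration and are not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical sample data: both records share SEX and NATIONALITY.
my @records = (
    { NAME => 'Dave', SEX => 'M', NATIONALITY => 'Swedish' },
    { NAME => 'Olof', SEX => 'M', NATIONALITY => 'Swedish' },
);

# Without an array level, the second record overwrites the first.
my %flat;
for my $r (@records) {
    my %copy = %$r;                       # shallow copy of the record
    my $sex  = delete $copy{SEX};
    my $nat  = delete $copy{NATIONALITY};
    $flat{$sex}{$nat} = \%copy;
}

# With an array level, both records are kept.
my %grouped;
for my $r (@records) {
    my %copy = %$r;
    my $sex  = delete $copy{SEX};
    my $nat  = delete $copy{NATIONALITY};
    push @{ $grouped{$sex}{$nat} }, \%copy;   # autovivifies the array
}

print "flat keeps:    $flat{M}{Swedish}{NAME}\n";
print "grouped keeps: ", scalar @{ $grouped{M}{Swedish} }, " records\n";
```

    In the flat version only 'Olof' survives, while the grouped version holds both records under M/Swedish.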
    

    EDIT: Oh, BTW - if anyone knows how to elegantly clone contents of a reference, please teach. Thanks!

    EDIT EDIT: @FM helped. All better now :D
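
    On the cloning question: the copy used in the code above, my %element = %$orig_element;, is a shallow copy. That is enough here because the records are flat, but if a record held nested references, those would still be shared with the original. A rough sketch of the difference, using dclone from the core Storable module; the record with a TAGS array is a made-up example, not data from the question.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical record with a nested array, to make the difference visible.
my $orig = { NAME => 'Dave', TAGS => ['a', 'b'] };

# Shallow copy: a new top-level hash, but nested references are shared.
my $shallow = { %$orig };

# Deep copy: nested structures are duplicated as well.
my $deep = dclone($orig);

delete $shallow->{NAME};            # does not affect $orig
push @{ $shallow->{TAGS} }, 'c';    # DOES affect $orig (shared array)
push @{ $deep->{TAGS} }, 'd';       # does not affect $orig

print exists $orig->{NAME} ? "NAME still present\n" : "NAME gone\n";
print "orig TAGS: @{ $orig->{TAGS} }\n";
```

    Deleting a top-level key from the shallow copy leaves the original alone, but pushing onto the shared TAGS array changes it; the deep copy is fully independent.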