Perl: How to make hash of hash where keys come from an array


I want to make a hash of hash using the structure of an array. Each array element should be a subkey of the preceding array element. For example using the following arrays:

@array1 = ("animal","dog","sparky");
@array2 = ("animal","cat","felix");
@array3 = ("animal","bird","penguin","skipper");

I want to make a hash that is structured like this:

$hash{"animal"}{"dog"}{"sparky"} = 1;

%hash = ( 
  "animal" => { 
     "dog" => {
         "sparky" => "1", 
      }, 
     "cat" => {
         "felix" => "1", 
      }, 
     "bird" => {
         "penguin" => {
             "skipper" => "1", 
          }, 
      }, 
  }, 
);

The arrays will not always have the same number of elements. But it should build the structure just the same.

Thanks for your help.


Solution

  • This sounds like an XY problem to me - I'm suspicious that you have 3 separate, numbered arrays.

    But I'll answer on the off chance you're seeing a more general case - the trick to doing this sort of thing is to use a hash reference as a cursor: traverse down the structure with it, then reset it to the top.

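        use strict;
        use warnings;
        use Data::Dumper;

        my %hash;
        my @array1 = ( "animal", "bird", "penguin", "skipper" );
        my $cursor = \%hash;    # start at the top of the structure

        foreach my $element (@array1) {
            # create a subhash for this key unless one already exists (//= needs Perl 5.10+)
            $cursor->{$element} //= {};
            # step down one level
            $cursor = $cursor->{$element};
        }
        $cursor = 1;    # only reassigns $cursor; the hash itself is left unchanged

        print Dumper \%hash;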
    

    So we walk down your data structure one key at a time, using //= to create a new subhash if - and only if - there isn't one defined already.

    So for your whole set:

    use strict;
    use warnings;
    
    use Data::Dumper;
    my %hash;
    my @array1 = ( "animal", "dog",  "sparky" );
    my @array2 = ( "animal", "cat",  "felix" );
    my @array3 = ( "animal", "bird", "penguin", "skipper" );
    
    my $cursor = \%hash;
    
    foreach my $array ( \@array1, \@array2, \@array3 ) {
        foreach my $element (@$array) {
            $cursor->{$element} //= {};
            $cursor = $cursor->{$element};
        }
        $cursor = 1;         # only reassigns $cursor; the leaf stays an empty hash
        $cursor = \%hash;    # reset to the top for the next array
    }
    print Dumper \%hash;
    

    Now note - this doesn't have quite the desired outcome. The final $cursor = 1 only reassigns the cursor variable, it doesn't store anything in the hash, so the bottom level is still the {} created by the //= line - an empty hash - not the 1 you're seeking.

    $VAR1 = {
              'animal' => {
                            'dog' => {
                                       'sparky' => {}
                                     },
                            'bird' => {
                                        'penguin' => {
                                                       'skipper' => {}
                                                     }
                                      },
                            'cat' => {
                                       'felix' => {}
                                     }
                          }
            };
    

    But hopefully this gives you an idea how the problem can be solved?

    It's worth looking at what autovivification is, and what it's doing - usually it's helpful, but for building this sort of data structure it may not be. We've explicitly created an empty subhash below each of your keys - but only if one doesn't exist already.
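
    To make autovivification concrete, here is a minimal sketch (the hash names %auto and %probe are made up for illustration). Assigning to a nested key creates the intermediate hashes for you, and so does merely testing a nested key with exists:

    ```perl
    use strict;
    use warnings;

    my %auto;
    # assigning through keys that don't exist yet creates the intermediate hashes
    $auto{animal}{dog}{sparky} = 1;

    my %probe;
    # even a test that looks read-only autovivifies the intermediate level
    my $found = exists $probe{animal}{cat};

    print exists $probe{animal} ? "animal was autovivified\n" : "animal is absent\n";
    ```

    That is why a literal $hash{...}{...} assignment just works when the keys are known in advance; the cursor approach is only needed because the depth varies from array to array.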

    So in order to accomplish what you're trying to do, we actually need to handle the last element differently - we're not trying to create an empty subhash, we're trying to set a value.

    Thus:

    use strict;
    use warnings;
    
    use Data::Dumper;
    my %hash;
    my @array1 = ( "animal", "dog",  "sparky" );
    my @array2 = ( "animal", "cat",  "felix" );
    my @array3 = ( "animal", "bird", "penguin", "skipper" );
    
    my $cursor = \%hash;
    
    foreach my $array ( \@array1, \@array2, \@array3 ) {
        # remove the last value from the array (note: this modifies the original array)
        my $last = pop @$array;
        foreach my $element (@$array) {
            $cursor->{$element} //= {};
            $cursor = $cursor->{$element};
        }
        #set the last value to be '1' instead of a subhash.
        #Otherwise it'll be created by the //= line above, and be an empty hash. 
        $cursor -> {$last} = 1;
        $cursor  = \%hash;
    }
    print Dumper \%hash;
    

    This gives us the desired result:

    $VAR1 = {
              'animal' => {
                            'dog' => {
                                       'sparky' => 1
                                     },
                            'bird' => {
                                        'penguin' => {
                                                       'skipper' => 1
                                                     }
                                      },
                            'cat' => {
                                       'felix' => 1
                                     }
                          }
            };
    

    Or you can look at Data::Diver, which accomplishes approximately the same thing.
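
    If you would rather stay with core Perl, the final loop can also be wrapped in a small helper. This is only a sketch: the name add_path and the variables %tree and @paths are made up for illustration, and it works on a copy of the key list so the caller's arrays are left intact (unlike pop on the original array).

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    # add_path is a hypothetical helper name, not part of any module
    sub add_path {
        my ( $root, @keys ) = @_;    # @keys is a copy, so the caller's array is untouched
        return unless @keys;         # nothing to do for an empty key list
        my $last   = pop @keys;
        my $cursor = $root;
        foreach my $key (@keys) {
            $cursor->{$key} //= {};       # create the subhash only if it is missing
            $cursor = $cursor->{$key};    # step down one level
        }
        $cursor->{$last} = 1;             # the final key holds the value
        return;
    }

    my %tree;
    my @paths = (
        [ "animal", "dog",  "sparky" ],
        [ "animal", "cat",  "felix" ],
        [ "animal", "bird", "penguin", "skipper" ],
    );
    add_path( \%tree, @$_ ) for @paths;

    print Dumper \%tree;
    ```

    Like the loop above, this assumes no key list is a prefix of another one. If "animal", "dog" were added after "animal", "dog", "sparky", the existing subhash would be overwritten with 1, and in the opposite order strict refs would make the script die when it tries to descend into the 1.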