Search code examples
arraysperlhashperl-data-structures

How to expand a Hash of Arrays?


I don't know if "expand" is the right word, but here is what I would like to do =)

This script

#!/usr/bin/perl

use warnings; 
use strict;

my %HoA = (
    group1 => [ "user1", "user2" ],
    group2 => [ "group1", "user3" ],
    group3 => [ "group1", "group2" ],
    group4 => [ "group3", "user2" ],
    );

foreach my $group ( keys %HoA ) {
    print "$group: @{ $HoA{$group} }\n"
}

outputs

group1: user1 user2
group2: group1 user3
group3: group1 group2
group4: group3 user4

What I would like is to replace the groups in the array with the members. I.e. so the output and $HoA becomes

group1: user1 user2
group2: user1 user2 user3
group3: user1 user2 user3
group4: user1 user2 user3 user4

Perhaps "search and replace" and remove duplicates would be a better explanation of what I would like to do?


Solution

  • Assuming the data you have given, the following loop will create a new hash with the expanded arrays. This algorithm assumes that groups will resolve in sorted order (group2 will only depend on group1, group3 on 1 / 2, ...).

    my %expanded;
    for my $group (sort keys %HoA) {
        my %seen;
        $expanded{$group} = [
            grep {not $seen{$_}++}
            map {exists $expanded{$_} ? @{$expanded{$_}} : $_}
            @{$HoA{$group}}
        ];
        print "$group: @{ $expanded{$group} }\n"
    }
    

    which prints:

    group1: user1 user2
    group2: user1 user2 user3
    group3: user1 user2 user3
    group4: user1 user2 user3
    

    If you can not assume a resolution order, the following is a bit brute force, but should work:

    my %HoA = (
        group1 => [ "user1", "user2" ],
        group2 => [ "group1", "user3" ],
        group3 => [ "group1", "group2" ],
        group4 => [ "user5", "group5" ],
        group5 => [ "group3", "user2" ],
        );
    
    my @to_expand = keys %HoA;
    
    my %final;
    my $tries = @to_expand;
    to_expand: while (@to_expand and $tries) {
        my $next = shift @to_expand;
    
        my (@users, @groups);
        for (@{ $HoA{$next} }) {
            if (/^group/) {
                push @groups, $_;
            } else {
                push @users, $_;
            }
        }
        for my $group (@groups) {
            if (exists $final{$group}) {
                push @users, @{$final{$group}}
            } else {
                $tries--;
                push @to_expand, $next;
                next to_expand;
            }
        }
        $tries++;
        my %seen;
        $final{$next} = [grep {not $seen{$_}++} @users];
    }
    if (@to_expand) {
        print "error with groups: @to_expand\n";
    }
    
    for (sort keys %final) {
        print "$_: @{$final{$_}}\n";
    }
    

    which prints:

    group1: user1 user2
    group2: user3 user1 user2
    group3: user1 user2 user3
    group4: user5 user2 user1 user3
    group5: user2 user1 user3
    

    if there is an error (say group3 depends on group5), then you will get this output:

    error with groups: group4 group5 group3
    group1: user1 user2
    group2: user3 user1 user2
    

    There's probably a better algorithm out there for this.