
Splitting lines of a file into a hash


I am writing a Perl script that needs to perform some validations. I have a file whose lines have the following format:

patchname subpatch1,subpatch2,subpatch3...

The contents of the file are to be read and the subpatches placed in an array. These subpatches may themselves contain more subpatches, which will later need to be traversed using either DFS or BFS. First, though, the subpatches need to be mapped to a hash as they are read from the file, i.e.:

$list->{subpatch1} = \subpatch1
$list->{subpatch2} = \subpatch2
$list->{subpatch3} = \subpatch3 ....

Using split does not help.

($a1,$a2)=split;

gives me $a2=subpatch1,subpatch2,... which needs to be split further. It's getting really confusing, especially since I am new to Perl.

Is there a better way of doing this, or is there a module with which it can be achieved?


Solution

  • I think you are confusing lists and hashes. A list is basically an array: an ordered collection of elements indexed by number. Its elements are accessed by index:

    my $list = [ 'foo', 'bar', 'baz' ];
    print $list->[0], "\n";   #prints foo
    print $list->[2], "\n";   #prints baz
    

    A hash is a collection that's indexed by a key, which you decide. Elements are looked up by this key (as opposed to an index, like in a list):

    my $hash = { fookey => "foo", barkey => "bar", bazkey => "baz" };
    print $hash->{'fookey'}, "\n";  # prints foo
    print $hash->{'barkey'}, "\n";  # prints bar
    

    If I understand your requirement correctly, you're looking for a way to store data in the following form:

    patchname1 --relies on--> patchname2, patchname3, ...
    

    So what you really want is one hash, where the keys are the patch names and the values are lists of patchnames:

    my $patch_hash = {
        patchname1 => [ 'patchname2', 'patchname4' ],
        patchname2 => [ 'patchname3', 'patchname4' ],
        patchname3 => [ 'patchname4' ],
        patchname4 => [],
    };
    

    Each unique patchname is a key in $patch_hash whose value is a list of its dependencies.
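
    As a minimal sketch of how such a structure is read back, the dependencies of one patch are reached by dereferencing the array stored under its key. The patch names are the placeholders from above, and the helper name deps_of is made up for illustration:

```perl
use strict;
use warnings;

# Placeholder data, same shape as the hash above.
my $patch_hash = {
    patchname1 => [ 'patchname2', 'patchname4' ],
    patchname2 => [ 'patchname3', 'patchname4' ],
    patchname3 => [ 'patchname4' ],
    patchname4 => [],
};

# Hypothetical helper: return the list of dependencies for one patch
# (an empty list if the patch is unknown).
sub deps_of {
    my ( $hash, $name ) = @_;
    return @{ $hash->{$name} || [] };
}

# Print every patch with its direct dependencies, in sorted key order.
for my $name ( sort keys %$patch_hash ) {
    my @deps = deps_of( $patch_hash, $name );
    print "$name: ", ( @deps ? join( ', ', @deps ) : '(none)' ), "\n";
}
```

    Falling back to an empty array reference for unknown names keeps the lookup from dying under strict when a patch is mentioned only as a dependency.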

    As toolic suggests, the actual parsing can be done with split if you tell it what delimiters to split on. From there, you can populate your hash with something like:

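    my $patch_hash = {};
    # FILE is assumed to be a filehandle already opened on the patch file.
    while ( my $line = <FILE> ) {
        # Split on runs of whitespace and/or commas: the first field is the patch name.
        my ( $name, @subs ) = split /[\s,]+/, $line;
        next unless defined $name && length $name;    # skip blank lines
        push @{ $patch_hash->{$name} }, @subs;
    }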
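
    Since the question mentions a later BFS or DFS over the subpatches, here is a rough, self-contained sketch of the same parsing step followed by a breadth-first traversal. The sample lines, the patch names, and the function names parse_patches and bfs are all invented for illustration; a real script would read the lines from the file instead:

```perl
use strict;
use warnings;

# Hypothetical sample input in the "patchname sub1,sub2,..." format.
my @lines = (
    "patchA patchB,patchC\n",
    "patchB patchD\n",
    "patchC patchD\n",
    "patchD\n",
);

# Build the hash of arrays: patch name => list of subpatches.
sub parse_patches {
    my @input = @_;
    my %patches;
    for my $line (@input) {
        my ( $name, @subs ) = split /[\s,]+/, $line;
        next unless defined $name && length $name;    # skip blank lines
        push @{ $patches{$name} }, @subs;
    }
    return \%patches;
}

# Breadth-first traversal: return every patch reachable from $start,
# in the order visited, without visiting any patch twice.
sub bfs {
    my ( $patches, $start ) = @_;
    my %seen  = ( $start => 1 );
    my @queue = ($start);
    my @order;
    while (@queue) {
        my $current = shift @queue;
        push @order, $current;
        for my $sub ( @{ $patches->{$current} || [] } ) {
            push @queue, $sub unless $seen{$sub}++;
        }
    }
    return @order;
}

my $patches = parse_patches(@lines);
print join( ' ', bfs( $patches, 'patchA' ) ), "\n";    # prints: patchA patchB patchC patchD
```

    The %seen hash is what keeps a patch shared by several parents (patchD here) from being visited more than once. Replacing shift with pop would turn the queue into a stack and give a depth-first order instead.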
    

    This is a nice article on Perl arrays.

    This is a nice list of the functions for Perl arrays.

    This is a nice article on Perl hashes.

    And this is a nice summary of Perl regex.