
Perl - create file name from column names


I am new to Perl and I would like to create the name of the output file based on the column names present in the input file. Say that my input file header is the following:

#identifier    (%)composition

and I would like my output file name to be identifier_composition. The identifier and the composition can each be any sequence of alphanumeric characters, such as #E2FAR4 for the identifier or (%)MhDE4 for the composition. For this example, the output file name should be E2FAR4_MhDE4. So far, I am able to get the identifier but not the composition. This is the code I have tried:

if ($line =~ /^#\s*(\S+)\t\(%)s*(\S+)/){
    my $ID = $1;
    my $comp = $2;
    my $out_file = "${ID}_${comp}"
}

but the second capture also gives me the identifier. Any help would be appreciated.


Solution

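  • Use the regex below. Compared with your pattern, it escapes the closing parenthesis after % (so the literal "(%)" is matched) and drops the stray s* that followed it: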

    ^#\s*(\S+)\t\(%\)(\S+)
    

    Example code:

    #!/usr/bin/perl
    use strict;
    use warnings;
    while(<DATA>){
        my $line = $_;
        chomp $line;
        # \t requires a literal tab between the two header columns
        if ($line =~ /^#\s*(\S+)\t\(%\)(\S+)/){
            my $ID = $1;
            my $comp = $2;
            my $out_file = "${ID}_${comp}";
            print "Filename: $out_file\n";
        }
    }
    
    __DATA__
    #identifier	(%)composition
    

    Output:

    Filename: identifier_composition
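
If the header columns might be separated by spaces rather than exactly one tab, a slightly more tolerant variant may help. The following is only a sketch under that assumption: the helper name name_from_header is hypothetical (it is not part of the answer above), and \s+ replaces \t so that any run of whitespace between the two columns is accepted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): builds the output
# file name from a header line such as "#E2FAR4\t(%)MhDE4".
sub name_from_header {
    my ($line) = @_;
    # Assumption: columns may be separated by tabs or spaces, so \s+ is
    # used where the original pattern requires a single tab (\t).
    if ($line =~ /^#\s*(\S+)\s+\(%\)\s*(\S+)/) {
        return "${1}_${2}";
    }
    return;    # no match: undef in scalar context
}

# Try one tab-separated and one space-separated header.
for my $header ("#identifier\t(%)composition", "#E2FAR4    (%)MhDE4") {
    my $name = name_from_header($header);
    print defined $name ? "Filename: $name\n" : "No match: $header\n";
}
```

With these two sample headers the loop should print identifier_composition and E2FAR4_MhDE4 as the file names. Returning undef on a failed match lets the caller decide what to do with lines that are not headers, instead of silently building a name from stale $1 and $2 values.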