
How do I assign many values to a particular Perl variable?


I am writing a Perl script that searches for a motif (substring) in a protein sequence (string). The motif to search for is hhhDDDssEExD, where:

  • h is any hydrophobic amino acid
  • s is any small amino acid
  • x is any amino acid
  • h, s, and x can each stand for more than one amino acid

Can more than one value be assigned to one variable? If yes, how should I do that? I want to assign a list of multiple values to a variable.


Solution

  • I am no great expert in Perl, so there is quite possibly a quicker way to do this, but it seems like the match operator "//" in list context is what you need. When you assign the result of a match operation to a list, the match operator takes on list context and returns a list of the substrings captured by each parenthesized sub-expression. If you specify global matching with the "g" flag, it returns a list of all the matches of each sub-expression. Example:

    # print a list of each match for "x" in "xxx"
    @aList = ("xxx" =~ /(x)/g);
    print(join(".", @aList));
    

    Will print out

    x.x.x
    

    I'm assuming you have a regular expression for each of those 5 types h, D, s, E, and x. You didn't say whether each of these parts is a single character or multiple, so I'm going to assume they can be multiple characters. If so, your solution might be something like this:

    $h = ""; # Insert regex to match "h"
    $D = ""; # Insert regex to match "D"
    $s = ""; # Insert regex to match "s"
    $E = ""; # Insert regex to match "E"
    $x = ""; # Insert regex to match "x"
    
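    # Capture each whole run: a quantified group like "($h){3}" keeps only its last repetition
    $sequenceRE = "((?:$h){3})((?:$D){3})((?:$s){2})((?:$E){2})($x)($D)";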
    
    if ($line =~ /$sequenceRE/) {
        $hPart = $1;
        $sPart = $3;
        $xPart = $5;
    
        @hValues = ($hPart =~ /($h)/g);
        @sValues = ($sPart =~ /($s)/g);
        @xValues = ($xPart =~ /($x)/g);
    }
    

    I'm sure there is something I've missed, and there are some subtleties of Perl that I have overlooked, but this should get you most of the way there. For more information, read up on Perl's match operator (perlop) and regular expressions (perlre).
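
    To address the original question more directly: one variable can hold several values if it is an array, and an array of single-letter codes can be joined into a regex character class. The following is only a minimal sketch under assumptions. The residue sets for "hydrophobic" and "small" and the test sequence are hypothetical and not from the question, so substitute your own definitions. It also assumes that D and E are the literal residues aspartate and glutamate.

    ```perl
    use strict;
    use warnings;

    # ASSUMPTION: illustrative residue groups (one-letter codes); replace with your own.
    my @hydrophobic = qw(A V L I M F W Y);
    my @small       = qw(G A S C T P);
    my @any         = qw(A C D E F G H I K L M N P Q R S T V W Y);

    # Turn each list of alternatives into a character class such as "[AVLIMFWY]".
    my $h = '[' . join('', @hydrophobic) . ']';
    my $s = '[' . join('', @small) . ']';
    my $x = '[' . join('', @any) . ']';

    # hhhDDDssEExD: capture the whole h run, the whole s run, and the x residue.
    # (?:...) is needed because "$h{3}" would be read as a hash element lookup.
    my $motif = qr/((?:$h){3})D{3}((?:$s){2})E{2}($x)D/;

    my $protein = 'MKTAVLDDDGAEEKDQR';   # hypothetical test sequence

    my (@hValues, @sValues, @xValues);
    if ($protein =~ $motif) {
        @hValues = split //, $1;   # individual residues matched by h
        @sValues = split //, $2;   # individual residues matched by s
        @xValues = ($3);           # the single residue matched by x
        print "h: @hValues\n";     # h: A V L
        print "s: @sValues\n";     # s: G A
        print "x: @xValues\n";     # x: K
    }
    ```

    Keeping the alternatives in arrays means a group can be changed in one place, and the character classes are rebuilt from them each time the script runs.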