Perl mismatch string search


How can I find whether a string is present in another string with one or two mismatches?

my $find = "MATCH";
my $search = "stringisMATTHhere";

# $search contains MATTH, which differs from $find in one position.
# For an exact match, this seems to work:
if ($search =~ /$find/) {
    print "String found\n";
}
else {
    print "String not found\n";
}

How can I extend this to allow one mismatch (MSTCH, AATCH, MACCH, etc.) or two mismatches (ATTCH, MGGCH, etc.)?


Solution

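  • So you want a pattern in which any two positions are wildcards. Since a wildcard can also match the original letter, this covers the exact and one-mismatch cases too: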

    /
       ..TCH | .A.CH | .AT.H | .ATC. |
       M..CH | M.T.H | M.TC. | 
       MA..H | MA.C. |
       MAT..
    /x
    

    or, if a mismatched position must still be a word character:

    /
       \w\wTCH | \wA\wCH | \wAT\wH | \wATC\w |
       M\w\wCH | M\wT\wH | M\wTC\w | 
       MA\w\wH | MA\wC\w |
       MAT\w\w
    /x
    

    That is easy enough to generate:

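    # One alternative per pair of positions ($i, $j): both become a
    # wildcard, and the rest of $find is kept as escaped literal text.
    my @subpats;
    for my $i (0..length($find)-1) {
       for my $j ($i+1..length($find)-1) {
          my $subpat = join('',
             quotemeta(substr($find, 0, $i)),
             '.',  # or '\\w'
             quotemeta(substr($find, $i+1, $j-$i-1)),
             '.',  # or '\\w'
             quotemeta(substr($find, $j+1)),
          );
          push @subpats, $subpat;
       }
    }
    
    my $pat = join('|', @subpats);
    
    print "String found\n" if $search =~ /$pat/;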
    

    Perl 5.10+ compiles alternations into a trie where it can, which should optimize the common leading prefixes into something efficient. That saves us the trouble of generating (?:.…|M…) by hand.
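
To try the approach end to end, here is a minimal sketch that wraps the loop above in a sub and runs it against the strings from the question. The name `two_mismatch_regex` is hypothetical, not part of the original answer. The sketch assumes `$find` has at least two characters, escapes the literal parts with `quotemeta`, and uses `/s` so that `.` also matches a newline.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): returns a compiled
# regex that matches $find with any two positions replaced by a wildcard,
# i.e. with at most two mismatches.
sub two_mismatch_regex {
    my ($find) = @_;
    die "need at least two characters\n" if length($find) < 2;

    my $last = length($find) - 1;
    my @subpats;
    for my $i (0 .. $last) {
        for my $j ($i + 1 .. $last) {
            push @subpats, join('',
                quotemeta(substr($find, 0, $i)),               # literal before $i
                '.',                                           # wildcard at $i
                quotemeta(substr($find, $i + 1, $j - $i - 1)), # literal between
                '.',                                           # wildcard at $j
                quotemeta(substr($find, $j + 1)),              # literal after $j
            );
        }
    }

    my $pat = join('|', @subpats);
    return qr/$pat/s;    # /s lets '.' match a newline as well
}

my $re = two_mismatch_regex('MATCH');
for my $search (qw(stringisMATTHhere xxATTCHxx xxMGGCHxx xxABCDExx)) {
    printf "%-18s %s\n", $search, $search =~ $re ? 'found' : 'not found';
}
```

The number of alternatives grows quadratically with the length of `$find` (10 for a five-letter word), so this is probably best suited to short search strings.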