I have two strings A and B, let's say
A = AATCGGATATAG
B = CGATA
Some of you may know the two usual types of alignment, global and local.
But I would like to implement an alignment that uses all of B and aligns it against whichever substring of A gives the best score.
For example:
A,B -- Alignment algorithm --> AATCGGATATAG
                                  CG-ATA
So far I've been using the Smith-Waterman algorithm.
Does anyone have any suggestions for solving this problem?
Thanks in advance!
Smith-Waterman is still the algorithm you should use. To get the full sequence aligned, change your gap penalty to 0. This makes S-W favor gaps over mismatches and add as many gaps as are needed to include the whole sequence.
For example, setting the gap penalty to 0 with the standard nucleotide substitution matrix (NUC.4.4) produces this alignment:
A = AATCGGATATAG
B =    C-GATA
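If it helps to experiment with this, here is a minimal sketch of Smith-Waterman with a configurable linear gap penalty. It is only an illustration, not a reference implementation: the function name `smith_waterman` is made up for this example, and the default scores of +5 for a match and -4 for a mismatch are assumed to mirror the NUC.4.4 values for unambiguous bases. Ties in the traceback are broken in favor of the diagonal move, so a different implementation may return an equally scoring but different alignment.

```python
def smith_waterman(a, b, match=5, mismatch=-4, gap=0):
    """Local alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). The scoring defaults are
    assumptions chosen to resemble NUC.4.4 with a gap penalty of 0.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at cell (i, j)
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                        # start a new local alignment
                H[i - 1][j - 1] + sub,    # match or mismatch
                H[i - 1][j] - gap,        # gap in b
                H[i][j - 1] - gap,        # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, aligned_a, aligned_b = smith_waterman("AATCGGATATAG", "CGATA")
print(score)      # 25
print(aligned_a)  # CGGATA
print(aligned_b)  # C-GATA
```

With these settings the sketch reproduces the alignment shown above for the example strings. Note that a zero gap penalty only encourages S-W to cover all of B; if some character of B cannot be matched anywhere in A, the local alignment may still leave it out.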