Search code examples
algorithmfortranstring-search

Morris Pratt table in Fortran


I have been tried to do the Morris Pratt table and the code is basically this one in C:

void preMp(char *x, int m, int mpNext[]) {
int i, j;

i = 0;
j = mpNext[0] = -1;
while (i < m) {
   while (j > -1 && x[i] != x[j])
      j = mpNext[j];
   mpNext[++i] = ++j;
 }
}

and here is where i get so far in Fortran

program MP_ALGORITHM
implicit none
integer, parameter :: m=4
character(LEN=m) :: x='abac'
integer, dimension(4) :: T
integer :: i, j

i=0
T(1)=-1
j=-1

do while(i < m)
  do while((j > -1) .AND. (x(i+1:i+1) /= (x(j+i+1:j+i+1))))
    j=T(j)
  end do
  i=i+1
  j=j+1
  T(i)=j
end do
print *, T(1:)
end  program MP_ALGORITHM

and the problem is i think i am having the wrong output.

for x=abac it should be (?):

a   b  a  c

-1  0  1  0

and my code is returning 0 1 1 1

so, what i've done wrong?


Solution

  • The problem here is that C indices start from zero, but Fortran indices start from one. You can try to adjust the index for every array acces by one, but this will get unwieldy.

    The Morris-Pratt table itself is an array of indices, so it should look different in C and Fortran: The Fortran array should have one-based indices and it should use zero as invalid index.

    Together with the error that chw21 pointed out, your function might look like this:

    subroutine kmp_table(x, t)
        implicit none
    
        character(*), intent(in) :: x
        integer, dimension(:), intent(out) :: t
    
        integer m
        integer :: i, j
    
        m = len(x)
    
        i = 1
        t(1) = 0
        j = 0
    
        do while (i < m)
            do while(j > 0 .and. x(i:i) /= x(j:j))
                j = t(j)
            end do
            i = i + 1
            j = j + 1
            t(i) = j
        end do
    end subroutine 
    

    You can then use it in the Morris-Pratt algorithm as taken straight from the Wikipedia page with adjustment for Fortran indices:

    function kmp_index(S, W) result(res)
        implicit none
    
        integer :: res
    
        character(*), intent(in) :: S       ! text to search
        character(*), intent(in) :: W       ! word to find
    
        integer :: m                        ! zero-based offset in S
        integer :: i                        ! one-based offset in W and T
    
        integer, dimension(len(W)) :: T     ! KMP table
    
    
    
        call kmp_table(W, T)
    
        i = 1
        m = 0
    
        do while (m + i <= len(S))
            if (W(i:i) == S(m + i:m + i)) then
                if (i == len(W)) then
                    res = m + 1
                    return
                end if
                i = i + 1
            else
                if (T(i) > 0) then
                    m = m + i - T(i)
                    i = T(i)
                else
                    i = 1
                    m = m + 1
                end if
            end if
        end do
    
        res = 0
    
    end function
    

    (The index m is zero-based here, because t is only ever used in conjunction with i in S(m + i:m + i). Adding two one-based indices will yield an offset of one, whereas keeping m zero-based makes this a neutral addition. m is a local variable that isn't exposed to code from the outside.)

    Alternatively, you could make your Fortran arrays zero-based by specifying a lower bound of zero for your string and array. That will clash with the useful character(*) notation, though, which always uses one-based indexing. In my opinion, it is better to think about the whole algorithm in the typical one-based indexing scheme of Fortran.