Search code examples
bashshellmergesort

Merge sort recursive algorithm in Bash cannot exit and return value


I'm implementing a merge sort algorithm in bash, but looks like it loops forever and gives error on m1 and m2 subarrays. It's a bit hard to stop loop in conditions since I have to use echo and not return. Anyone have any idea why this happens?

MergeSort (){
  
  local a=("$@")

  if [ ${#a[@]} -eq 1 ]
  then
    echo ${a[@]}
  elif [ ${#a[@]} -eq 2 ]
  then
    if [ ${a[0]} -gt ${a[1]} ]
    then
      local t=(${a[0]} ${a[1]})
      echo ${t[@]}
    else
      echo ${a[@]}
    fi
  else

    local p=($(( ${#a[@]} / 2 )))
    local m1=$(MergeSort "${a[@]::p}")
    local m2=$(MergeSort "${a[@]:p}")
    local ret=()

    while true
    do
      if [ "${#m1[@]}" > 0 ] && [ "${#m2[@]}" > 0 ]
      then
        if [ ${m1[0]} <= ${m2[0]} ]
        then
          ret+=(${m1[0]})
          m1=${m1[@]:1}
        else
          ret+=(${m2[0]})
          m2=${m2[@]:1}
        fi
      elif [ ${#m1[@]} > 0 ]
      then
        ret+=(${ret[@]} ${m1[@]})
        unset m1
      elif [ ${#m2[@]} > 0 ]
      then
        ret+=(${ret[@]} ${m2[@]})
        unset m2
      else
        break
      fi
    done
  fi

  echo ${ret[@]}
}

a=(6 5 6 4 2)
b=$(MergeSort "${a[@]}")
echo ${b[@]}

Solution

  • There are multiple issues in your shell script:

    • you should use -gt instead of > for numeric comparisons on array lengths
    • <= is not a supported string comparison operator. You should use < and quote it as '<', or better use '>' and transpose actions to preserve sort stability.
    • there is no need for local t, and your code does not swap the arguments. Just use echo ${a[1]} ${a[0]}
    • you must parse the result of recursive calls to MergeSort as arrays: local m1=($(MergeSort "${a[@]::p}"))
    • when popping initial elements from m1 and m2, you must reparse as arrays: m1=(${m1[@]:1})
    • instead of ret+=(${ret[@]} ${m1[@]}) you should just append the elements with ret+=(${m1[@]}) and instead of unset m1, you should break from the loop. As a matter of fact, if either array is empty you should just append the remaining elements from both arrays and break.
    • furthermore, the while true loop should be simplified as a while [ ${#m1[@]} -gt 0 ] && [ ${#m2[@]} -gt 0 ] loop followed by the tail handling.
    • the final echo ${ret[@]} should be moved inside the else branch of the last if
    • to handle embedded spaces, you should stringize all expansions but as the resulting array is expanded with echo embedded spaces that appear in the output are indistinguishable from word breaks. There is no easy workaround for this limitation.

    Here is a modified version:

    #!/bin/bash
    
    MergeSort (){
    
      local a=("$@")
    
      if [ ${#a[@]} -eq 1 ]; then
        echo "${a[@]}"
      elif [ ${#a[@]} -eq 2 ]; then
        if [ "${a[0]}" '>' "${a[1]}" ]; then
          echo "${a[1]}" "${a[0]}"
        else
          echo "${a[@]}"
        fi
      else
    
        local p=($(( ${#a[@]} / 2 )))
        local m1=($(MergeSort "${a[@]::p}"))
        local m2=($(MergeSort "${a[@]:p}"))
        local ret=()
    
        while [ ${#m1[@]} -gt 0 ] && [ ${#m2[@]} -gt 0 ]; do
          if [ "${m1[0]}" '>' "${m2[0]}" ]; then
            ret+=("${m2[0]}")
            m2=("${m2[@]:1}")
          else
            ret+=("${m1[0]}")
            m1=("${m1[@]:1}")
          fi
        done
        echo "${ret[@]}" "${m1[@]}" "${m2[@]}"
      fi
    }
    
    a=(6 5 6 4 2 a c b c aa 00 0 000)
    b=($(MergeSort "${a[@]}"))
    echo "${b[@]}"
    

    Output: 0 00 000 2 4 5 6 6 a aa b c c