Search code examples
c#.netvb.netperformancestringbuilder

Fastest search method in StringBuilder


I have a StringBuilder named stb_Swap_Tabu used to store Course's Names, I am using the following method to find a course:

stb_Swap_Tabu.ToString.Contains("CourseName")

in my case, Performance is the most important issue. Is there any faster way?


Solution

  • StringBuilder wasn't really intended for all string purposes. If you really need to search one, you have to write your own method.

    There are several string-searching algorithms suited to different cases.

    The following is a simple implementation of the Knuth–Morris–Pratt algorithm that only cares about ordinal matches (no case-folding, no culture-related collation, just a plain codepoint to codepoint match). It has some initial Θ(m) overhead where m is the length of the word sought, and then finds in Θ(n) where n is the distance to the word sought, or the length of the whole string-builder if it isn't there. This beats the simple char-by-char compare which is Θ((n-m+1) m) (Where O() notation describes upper-bounds, Θ() describes both upper and lower bounds).

    All this said, it does sound like creating a list might be a better approach to the task in hand.

    public static class StringBuilderSearching
    {
      public static bool Contains(this StringBuilder haystack, string needle)
      {
        return haystack.IndexOf(needle) != -1;
      }
      public static int IndexOf(this StringBuilder haystack, string needle)
      {
        if(haystack == null || needle == null)
          throw new ArgumentNullException();
        if(needle.Length == 0)
          return 0;//empty strings are everywhere!
        if(needle.Length == 1)//can't beat just spinning through for it
        {
          char c = needle[0];
          for(int idx = 0; idx != haystack.Length; ++idx)
            if(haystack[idx] == c)
              return idx;
          return -1;
        }
        int m = 0;
        int i = 0;
        int[] T = KMPTable(needle);
        while(m + i < haystack.Length)
        {
          if(needle[i] == haystack[m + i])
          {
            if(i == needle.Length - 1)
              return m == needle.Length ? -1 : m;//match -1 = failure to find conventional in .NET
            ++i;
          }
          else
          {
            m = m + i - T[i];
            i = T[i] > -1 ? T[i] : 0;
          }
        }
        return -1;
      }      
      private static int[] KMPTable(string sought)
      {
        int[] table = new int[sought.Length];
        int pos = 2;
        int cnd = 0;
        table[0] = -1;
        table[1] = 0;
        while(pos < table.Length)
          if(sought[pos - 1] == sought[cnd])
            table[pos++] = ++cnd;
          else if(cnd > 0)
            cnd = table[cnd];
          else
            table[pos++] = 0;
        return table;
      }
    }