Search code examples
c#.netarraysalgorithmstrstr

strstr() equivalent in C#


I have two byte[] and I want to find the first occurrence of the second byte[] in the first byte[] (or a range in it).

I don't want to use strings for efficiency (translating the first byte[] to a string will be inefficient).

Basically I believe that's what strstr() does in C.

What is the best way to do that (so it be efficient and easy to use)?

This is how it should look like:

int GetOffsetOfArrayInArray(byte[] bigArray, int bigArrayOffset, int bigArrayCount, byte[] smallArray);

Thanks!

UPDATE:

I want a solution that would be more efficient than a simple search. This means that using the fact that comparing buffers can be more efficient should be used - memcmp() is more efficient than iterating over the bytes.

Also, I know there are algorithms that optimize scenarios like this one:

  • big array: "12312351231234"
  • small array: "1231234"
  • Naive algorithm: 7 compares to find that "1231235" is different than "1231234", 2 compares to find the next "1", 4 compares to find that "1235" is different than "1231", 3 compares to find the next "1", 7 compares to find match. A total of 7+2+4+3+7=23 compares.
  • Smart algorithm: 7 compares to find that "1231235" is different than "1231234", directly jumps to the next "1" (without comparing), 4 compares to find that "1235" is different than "1231", directly jumps beyond the "5", 7 compares to find the match. A total of 7+4+7=18 compares.

Solution

  • I don't have any code for you but the name of the fastest solution you will find is the Boyer-Moore algorithm. It can do better than O(n).

    Here is an implementation for strings on CodeProject. Looks like a conversion to byte[] should not be too difficult.