Search code examples
c#arrayssub-array

Find the first occurrence/starting index of the sub-array in C#


Given two arrays as parameters (x and y) and find the starting index where the first occurrence of y in x. I am wondering what the simplest or the fastest implementation would be.

Example:

when x = {1,2,4,2,3,4,5,6}
     y =       {2,3}
result
     starting index should be 3

Update: Since my code is wrong I removed it from the question.


Solution

  • Here is a simple (yet fairly efficient) implementation that finds all occurances of the array, not just the first one:

    static class ArrayExtensions {
    
      public static IEnumerable<int> StartingIndex(this int[] x, int[] y) {
        IEnumerable<int> index = Enumerable.Range(0, x.Length - y.Length + 1);
        for (int i = 0; i < y.Length; i++) {
          index = index.Where(n => x[n + i] == y[i]).ToArray();
        }
        return index;
      }
    
    }
    

    Example:

    int[] x = { 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4 };
    int[] y = { 2, 3 };
    foreach (int i in x.StartingIndex(y)) {
      Console.WriteLine(i);
    }
    

    Output:

    1
    5
    9
    

    The method first loops through the x array to find all occurances of the first item in the y array, and place the index of those in the index array. Then it goes on to reduce the matches by checking which of those also match the second item in the y array. When all items in the y array is checked, the index array contains only the full matches.

    Edit:
    An alternative implementation would be to remove the ToArray call from the statement in the loop, making it just:

    index = index.Where(n => x[n + i] == y[i]);
    

    This would totally change how the method works. Instead of looping through the items level by level, it would return an enumerator with nested expressions, deferring the search to the time when the enumerator was iterated. That means that you could get only the first match if you wanted:

    int index = x.StartingIndex(y).First();
    

    This would not find all matches and then return the first, it would just search until the first was found and then return it.