Tags: c#, arrays, iterator, enumerable

Elegant transformation between an IEnumerable<object[]> and an IEnumerable<object>[]


Is there a way to transform an IEnumerable<int[]> into an IEnumerable<int>[]? Concretely, I have the following IEnumerable:

    IEnumerable<int[]> data = new List<int[]>()
    {
    new int[]{1,11},
    new int[]{2,22},
    new int[]{3,33},
    new int[]{4,44},
    // ...
    // ...
    };

and I want to turn it into the following shape:

    IEnumerable<int>[] data = new List<int>[] 
    { 
    new List<int>(){1,2,3,4 },
    new List<int>(){11,22,33,44}
    };

The only solution I have come up with so far is the following:

    public static IEnumerable<int>[] Convert(IEnumerable<int[]> data)
    {
        var length = data.First().Length;
        var output = new List<int>[length];
        for (int i = 0; i < length; i++)
            output[i] = new List<int>();
        foreach (var entry in data)
        {
            for (int i = 0; i < length; i++)
            {
                output[i].Add(entry[i]);
            }
        }
        return output;
    }

This isn't ideal, though, because it has to iterate over the whole data set up front. I would prefer a solution that uses LINQ or the built-in iterator features (yield return). Is there a better way to solve this?


Solution

  • If you don't mind data being enumerated multiple times, you can do this:

    public static IEnumerable<int>[] Convert(IEnumerable<int[]> data)
    {
        var length = data.First().Length;
        var output = new IEnumerable<int>[length];
        for (int i = 0; i < length; i++)
        {
            output[i] = CustomEnumerable(i);
        }
    
        return output;
    
        IEnumerable<int> CustomEnumerable(int index)
        {
            foreach (var entry in data) 
            {
                yield return entry[index];
            }
        }
    }
    

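    As you can see, instead of populating lists, I'm returning custom IEnumerable<int>s, which are created with a local iterator function (using yield return). Nothing is read until one of them is iterated, and each of them enumerates data again when that happens.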
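
Since the question asks about LINQ, the same lazy, multiple-enumeration behaviour can also be written without a local iterator function. The following is a minimal sketch, not part of the original answer; the names ColumnSplitter and ConvertWithLinq are hypothetical, and it assumes every row has at least as many elements as the first one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnSplitter
{
    // Hypothetical name: a LINQ-only variant of the Convert method.
    public static IEnumerable<int>[] ConvertWithLinq(IEnumerable<int[]> data)
    {
        // Reads only the first row to learn the column count
        // (First() throws InvalidOperationException on empty input).
        var length = data.First().Length;

        // One lazy projection per column; each projection enumerates
        // data again whenever it is iterated.
        return Enumerable.Range(0, length)
            .Select(index => data.Select(row => row[index]))
            .ToArray();
    }
}
```

With the rows from the question, ColumnSplitter.ConvertWithLinq(data)[0] yields 1, 2, 3, 4 and ColumnSplitter.ConvertWithLinq(data)[1] yields 11, 22, 33, 44.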

    If you care about iterating data only once, you could do something like this:

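    IEnumerable<int>[] Convert(IEnumerable<int[]> data)
    {
        var found = new List<int[]>();   // cache of the rows already read from data
        // Not wrapped in "using": the lazy sequences returned below keep reading
        // from this enumerator after Convert returns, so it is disposed only
        // once the source is exhausted.
        var enumerator = data.GetEnumerator();
        var exhausted = false;
        var proxy = ProxyEnumerable();

        // Reads (and caches) the first row to learn the column count.
        var length = proxy.First().Length;
        var output = new IEnumerable<int>[length];
        for (int i = 0; i < length; i++)
        {
            output[i] = CustomEnumerable(i);
        }

        return output;

        IEnumerable<int> CustomEnumerable(int index)
        {
            foreach (var entry in proxy)
            {
                yield return entry[index];
            }
        }

        IEnumerable<int[]> ProxyEnumerable()
        {
            // Index-based on purpose: a foreach over found would throw as soon as
            // another, interleaved enumeration adds a row to the cache.
            for (int position = 0; ; position++)
            {
                if (position == found.Count)
                {
                    if (exhausted) yield break;
                    if (!enumerator.MoveNext())
                    {
                        exhausted = true;
                        enumerator.Dispose();
                        yield break;
                    }
                    found.Add(enumerator.Current);
                }
                yield return found[position];
            }
        }
    }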
    

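    Here I added another IEnumerable<int[]> that fills a cache List<int[]> as it is iterated, so data is enumerated only once and subsequent iterations read from the cache. The trade-off is memory: every row that has been read stays in the cache for as long as the returned sequences are alive.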
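
One way to check the single-enumeration claim is to feed the conversion a source that counts how many rows it has produced. The sketch below is self-contained and is not the answer's exact code: SingleEnumerationCheck, RowsProduced, CountingSource and Cached are hypothetical names, and Cached is an index-based variant of the same caching idea. It is not thread-safe, and the source enumerator is only disposed once the source has been read to the end.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleEnumerationCheck
{
    // Number of rows the source has handed out so far (demonstration only).
    public static int RowsProduced;

    // Hypothetical lazy source producing the rows from the question.
    public static IEnumerable<int[]> CountingSource()
    {
        for (int i = 1; i <= 4; i++)
        {
            RowsProduced++;
            yield return new[] { i, i * 11 };
        }
    }

    // Wraps a sequence so that each element is pulled from it at most once.
    public static IEnumerable<T> Cached<T>(IEnumerable<T> source)
    {
        var cache = new List<T>();
        var enumerator = source.GetEnumerator();
        var done = false;
        return Read();

        IEnumerable<T> Read()
        {
            for (int position = 0; ; position++)
            {
                if (position == cache.Count)
                {
                    // Cache is used up: pull one more element, or stop.
                    if (done) yield break;
                    if (!enumerator.MoveNext())
                    {
                        done = true;
                        enumerator.Dispose();
                        yield break;
                    }
                    cache.Add(enumerator.Current);
                }
                yield return cache[position];
            }
        }
    }

    public static IEnumerable<int>[] Convert(IEnumerable<int[]> data)
    {
        var rows = Cached(data);
        var length = rows.First().Length;   // pulls exactly one row
        return Enumerable.Range(0, length)
            .Select(index => rows.Select(row => row[index]))
            .ToArray();
    }
}
```

After Convert(CountingSource()) returns, RowsProduced is 1, because only the first row was needed for the column count. Reading both columns afterwards, even interleaved through Zip, brings it to 4 and no higher.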
