Tags: c#, generics, collections, tuples, deconstructor

How to convert a list of tuples to a tuple of lists?


For example, I have an IEnumerable<(int, char)> called list. How can I convert it into (IEnumerable<int>, IEnumerable<char>)?

Is there a fast way to do this? Ideally the solution would use System.Linq.


Solution

  • There are two issues to consider:

    1. You don't want to iterate over the input more than once.
    2. You want to size the returned lists to the correct length when creating them, if possible, to avoid repeated list resizing.
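
    To illustrate the first point, here is a minimal sketch of the naive LINQ approach with two Select projections. The class DoubleEnumerationDemo, the Source() iterator and the Enumerations counter are hypothetical names introduced only for this demonstration; the counter shows that the input is run from the start twice.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class DoubleEnumerationDemo
    {
        // Counts how many times the source below has been enumerated from the start.
        public static int Enumerations;

        // Hypothetical lazy source: each new enumeration increments the counter.
        public static IEnumerable<(int, char)> Source()
        {
            Enumerations++;
            yield return (1, 'a');
            yield return (2, 'b');
        }

        public static void Main()
        {
            var sequence = Source();

            // Naive split: two independent projections over the same input.
            var ints = sequence.Select(t => t.Item1).ToList();
            var chars = sequence.Select(t => t.Item2).ToList();

            Console.WriteLine(Enumerations); // prints 2: the source ran twice
        }
    }
    ```

    For an in-memory list the second pass is cheap, but for a sequence backed by a database query or file read it may repeat expensive work or even return different data.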

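    To find the length of an IEnumerable<T> without enumerating it, you can use Enumerable.TryGetNonEnumeratedCount(), available from .NET 6.

    Note that it returns false for sequences whose count cannot be determined without enumerating them (an iterator method, for example), but it succeeds in many common cases, such as when the input is a collection.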

    Also note that for small list sizes, calling Enumerable.TryGetNonEnumeratedCount() will likely make things slower, since a default-sized list would probably already be big enough to prevent resizing.
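
    As a quick sketch of that behaviour, the snippet below calls TryGetNonEnumeratedCount() on a List<T> and on a lazy iterator. The class CountDemo and the Lazy() method are hypothetical names used only for illustration.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class CountDemo
    {
        // Hypothetical lazy source; its length is unknown without running it.
        public static IEnumerable<(int, char)> Lazy()
        {
            yield return (1, 'a');
            yield return (2, 'b');
        }

        public static void Main()
        {
            IEnumerable<(int, char)> list = new List<(int, char)> { (1, 'a'), (2, 'b') };

            // A List<T> knows its Count, so no enumeration is needed.
            bool known = list.TryGetNonEnumeratedCount(out int count);
            Console.WriteLine($"{known} {count}"); // True 2

            // A compiler-generated iterator exposes no count; count is set to 0.
            bool lazyKnown = Lazy().TryGetNonEnumeratedCount(out int lazyCount);
            Console.WriteLine($"{lazyKnown} {lazyCount}"); // False 0
        }
    }
    ```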

    A method using TryGetNonEnumeratedCount() would look something like this:

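    // Requires: using System.Collections.Generic; using System.Linq; (.NET 6 or later)
    // Splits a sequence of pairs into two lists in a single pass.
    public static (IEnumerable<T>, IEnumerable<U>) Deconstruct<T,U>(IEnumerable<(T,U)> sequence)
    {
        List<T> listT;
        List<U> listU;

        // Pre-size the lists when the count is available without enumerating.
        if (sequence.TryGetNonEnumeratedCount(out int count))
        {
            listT = new List<T>(count);
            listU = new List<U>(count);
        }
        else
        {
            // Count unknown: fall back to default-capacity lists.
            listT = new List<T>();
            listU = new List<U>();
        }

        // Single pass over the input, filling both lists.
        foreach (var item in sequence)
        {
            listT.Add(item.Item1);
            listU.Add(item.Item2);
        }

        return (listT, listU);
    }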
    

    This code isn't very elegant because there's no short way of writing the code to initialise the lists to the correct size. But it is probably about as efficient as you are likely to get.
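
    As a usage sketch, the single-pass method can be called and its result deconstructed into two variables. The wrapper class TupleSplitter and the sample data are assumptions made for this example. The method body is a condensed variant that passes a capacity of 0 when the count is unknown, which behaves like the parameterless List<T> constructor; treat it as one possible shortening rather than the definitive form.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class TupleSplitter
    {
        // Same single-pass approach, condensed for this sketch.
        public static (IEnumerable<T>, IEnumerable<U>) Deconstruct<T, U>(IEnumerable<(T, U)> sequence)
        {
            // Pre-size when the count is cheaply available; otherwise start empty.
            int capacity = sequence.TryGetNonEnumeratedCount(out int count) ? count : 0;
            var listT = new List<T>(capacity);
            var listU = new List<U>(capacity);

            // One pass over the input, filling both lists.
            foreach (var (first, second) in sequence)
            {
                listT.Add(first);
                listU.Add(second);
            }

            return (listT, listU);
        }

        public static void Main()
        {
            var pairs = new List<(int, char)> { (1, 'a'), (2, 'b'), (3, 'c') };

            // Tuple deconstruction splits the result into two variables.
            var (numbers, letters) = Deconstruct(pairs);

            Console.WriteLine(string.Join(",", numbers)); // 1,2,3
            Console.WriteLine(string.Join(",", letters)); // a,b,c
        }
    }
    ```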

    You could possibly make it slightly more performant by returning arrays rather than lists if you know the count:

    // Requires: using System.Collections.Generic; using System.Linq; (.NET 6 or later)
    public static (IEnumerable<T>, IEnumerable<U>) Deconstruct<T,U>(IEnumerable<(T,U)> sequence)
    {
        if (sequence.TryGetNonEnumeratedCount(out int count))
        {
            // Count known up front: fill exactly-sized arrays.
            var arrayT = new T[count];
            var arrayU = new U[count];

            int i = 0;

            foreach (var item in sequence)
            {
                arrayT[i] = item.Item1;
                arrayU[i] = item.Item2;
                ++i;
            }

            return (arrayT, arrayU);
        }
        else
        {
            // Count unknown: fall back to growable lists.
            var listT = new List<T>();
            var listU = new List<U>();

            foreach (var item in sequence)
            {
                listT.Add(item.Item1);
                listU.Add(item.Item2);
            }

            return (listT, listU);
        }
    }
    

    I would only go to such lengths if performance testing indicated that it's worth it!