
Avoiding copies of large struct parameter when composing iterator methods


I have a large struct that profiling shows is expensive to copy. I pass instances of this struct around using the in keyword to great effect.

Now I want to pass this as a parameter to an iterator method, which itself passes the value to other iterator methods - but in is not allowed, which means the value gets copied each time it's passed to a method.

The context here is a struct containing save state for a video game, and the iterator method is a 'load' method which spreads processing of the save data over several frames, in the Unity game engine (which implements coroutines using iterators). The load method is complex so needs to be factored into several methods.

Example:

struct SaveData{
    // large data
}

// Async loading - can spread processing across frames (yay!) but copies lots of data (boo!)

IEnumerator LoadAsync(SaveData saveData) {// wish I could use 'in' here!
    // use some part of saveData
    yield return null;
    // use more of saveData
    yield return InnerLoadAsync(saveData); // wish I could use 'in' here!
}

IEnumerator InnerLoadAsync(SaveData saveData) {// wish I could use 'in' here!
    // use saveData
    yield return null;
}


// Synchronous loading - very efficient (yay!) but blocks, causing an unacceptably long delay (boo!)

void LoadSynchronous(in SaveData saveData){
    // use some part of saveData
    // use more of saveData
    InnerLoadSynchronous(in saveData);
}

void InnerLoadSynchronous(in SaveData saveData){
    // use saveData
}

I understand why, in general, in is not allowed for iterators (e.g. what if the iterator/coroutine outlasted the owner of the value?), so I can see why a copy is needed for the outermost iterator function. For the inner calls though, since they're called with yield return, the inner iterator won't outlast the outer, so it seems like there should be some way to use in.

Are there any language features I'm missing here, or perhaps a nice pattern I can use to work around it? I think wrapping the type with an outer class would work, but it seems a little messy and of course still requires one copy since I can't have a ref or in member.
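For reference, here is a minimal sketch of the wrapper idea. SaveDataBox, Use and Drive are hypothetical names made up for illustration, the four long fields stand in for the real data, and Drive is only a simplified stand-in for whatever actually runs the coroutine (it is not Unity's scheduler). The struct is copied once into the holder, after which the iterators pass only a reference, and synchronous helpers can still take the struct by in.

```csharp
using System;
using System.Collections;

struct SaveData
{
    public long A, B, C, D; // stand-in for the large payload
}

// Hypothetical holder: iterators capture this reference, not the struct itself.
sealed class SaveDataBox
{
    public SaveData Value;
    public SaveDataBox(in SaveData value) { Value = value; } // the single copy
}

static class Loader
{
    public static IEnumerator LoadAsync(SaveDataBox box)
    {
        Use(in box.Value);                 // passes a reference to the field, no copy
        yield return null;
        yield return InnerLoadAsync(box);  // only the object reference is copied
    }

    static IEnumerator InnerLoadAsync(SaveDataBox box)
    {
        Use(in box.Value);
        yield return null;
    }

    // Synchronous helpers can still take the struct by 'in'.
    static void Use(in SaveData data) => Console.WriteLine(data.A);

    // Simplified stand-in for a coroutine runner: drives nested iterators to completion.
    public static void Drive(IEnumerator routine)
    {
        while (routine.MoveNext())
            if (routine.Current is IEnumerator nested) Drive(nested);
    }

    static void Main()
    {
        var data = new SaveData { A = 42 };
        Drive(LoadAsync(new SaveDataBox(in data)));
    }
}
```

The cost is one heap allocation and one struct copy per load, plus the usual caveat that anything holding the box can mutate Value.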


Solution

  • For the inner calls though, since they're called with yield return, the inner iterator won't outlast the outer, so it seems like there should be some way to use in.

    You're missing something. Let's take a simple example:

    using System;
    using System.Collections;

    public class C
    {
        public static void Main()
        {
            var enumerator = Outer(3);
    
            Console.WriteLine("Enumerating 1");
            enumerator.MoveNext();
    
            Console.WriteLine("Enumerating 2");
            enumerator.MoveNext();
    
            var innerEnumerator = (IEnumerator)enumerator.Current;
    
            Console.WriteLine("Enumerating Inner 1");
            innerEnumerator.MoveNext();
        }
        
        public static IEnumerator Outer(int i)
        {
            yield return null;
            Console.WriteLine("Yielding Inner");
            yield return Inner(i);
        }
        
        public static IEnumerator Inner(int i)
        {
            Console.WriteLine($"Inner {i}");
            yield break;   
        }
    }
    

    This prints:

    Enumerating 1
    Enumerating 2
    Yielding Inner
    Enumerating Inner 1
    Inner 3
    

    (SharpLab).

    As you can see, the body of Inner doesn't run straight away. Calling Inner only creates the compiler-generated IEnumerator, which Outer yields to its own caller, and it isn't until that caller explicitly calls MoveNext on it that the body of Inner executes.

    But Inner was invoked much earlier: the compiler-generated implementation of Inner ran in full and returned the generated IEnumerator just after "Yielding Inner" was printed above. So Inner has to store the parameter i in a field of a compiler-generated class until its body runs, which is why i can't be an in parameter.
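To make that concrete, here is a rough hand-written approximation of what the compiler produces for Inner. This is a sketch only: InnerIterator, _i and _state are illustrative names, and the real generated class differs in naming and detail. The point it shows is that the method call itself just constructs an object and copies the argument into a field, while the body runs later inside MoveNext.

```csharp
using System;
using System.Collections;

// Hand-written approximation of the class generated for Inner(int i).
// Names are illustrative; the real generated type differs in detail.
sealed class InnerIterator : IEnumerator
{
    private readonly int _i; // the parameter, hoisted into a field (a copy)
    private int _state;      // 0 = not started, -1 = finished

    public InnerIterator(int i) { _i = i; }

    public object Current => null;

    public bool MoveNext()
    {
        if (_state != 0) return false;
        _state = -1;
        Console.WriteLine($"Inner {_i}"); // the method body runs only here
        return false;                     // corresponds to 'yield break'
    }

    public void Reset() => throw new NotSupportedException();
}

static class Program
{
    // Equivalent of the iterator method: none of the body runs here.
    static IEnumerator Inner(int i) => new InnerIterator(i);

    static void Main()
    {
        IEnumerator e = Inner(3);     // only stores i
        Console.WriteLine("Created");
        e.MoveNext();                 // prints "Inner 3"
    }
}
```

Because the value has to live in a field of a class between the call and the first MoveNext, and a class field can't be a reference to the caller's variable, the argument ends up copied.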