In this answer and this GitHub issue (top item) there is a description of foreach
optimization used by C# compiler.
Basically, instead of allocating IEnumerable<T>
, the generated code calls GetEnumerator()
and then MoveNext()
on the returned object, always using a direct call
and therefore avoiding boxing and virtual calls.
Is it possible to write the same logic in intermediary language? I am quite a beginner with IL, however am familiar with the Unsafe package and the way it works. I wonder if it is possible to write an unsafe method in IL that accepts some object and calls it methods and properties directly?
(Also, could someone kindly give a link to the line in the Roslyn repo where this foreach
optimization happens? The repo is so big and complex, I am lost there so far.)
Update:
Here is a method template
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[ILSub(@"
.. IL code here to be replaced by ilasm.exe
.. Is there a way to do the same without boxing and virtual calls?
")]
public T CallIEnumerableMoveNextViaIL<T>(IEnumerable<T> enumerable)
{
// I know that the `enumerable` returns an enumerator that is a struct, but its type could be custom
// Next two calls are virtual via an interface, and enumerator is boxed
var enumerator = enumerable.GetEnumerator();
enumerator.MoveNext();
return enumerator.Current;
}
And here IL for that method:
IL_0000: ldarg.1
IL_0001: callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<!!T>::GetEnumerator()
IL_0006: dup
IL_0007: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
IL_000c: pop
IL_000d: callvirt instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<!!T>::get_Current()
IL_0012: ret
From the comments it looks like it is impossible to call the methods such as get_Current()
without knowing the type.
Let's have a couple of examples of how foreach
gets compiled.
First, on a regular IEnumerator-returning GetEnumerator:
public class A : IEnumerable
{
public IEnumerator GetEnumerator()
{
throw new NotImplementedException();
}
}
foreach(object o in new A())
{
}
.locals init (class [mscorlib]System.Collections.IEnumerator V_0,
class [mscorlib]System.IDisposable V_1)
IL_0000: newobj instance void Testing.Program/A::.ctor()
IL_0005: call instance class [mscorlib]System.Collections.IEnumerator Testing.Program/A::GetEnumerator()
IL_000a: stloc.0
.try
{
IL_000b: br.s IL_0014
IL_000d: ldloc.0
IL_000e: callvirt instance object [mscorlib]System.Collections.IEnumerator::get_Current()
IL_0013: pop
IL_0014: ldloc.0
IL_0015: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
IL_001a: brtrue.s IL_000d
IL_001c: leave.s IL_002f
} // end .try
finally
{
IL_001e: ldloc.0
IL_001f: isinst [mscorlib]System.IDisposable
IL_0024: stloc.1
IL_0025: ldloc.1
IL_0026: brfalse.s IL_002e
IL_0028: ldloc.1
IL_0029: callvirt instance void [mscorlib]System.IDisposable::Dispose()
IL_002e: endfinally
} // end handler
Nothing wondrous here, just calling the methods on IEnumerator. Notice that it also implements IDisposable, hence it uses its pattern.
Next, let's have a value type-returning GetEnumerator:
public class B
{
public BE GetEnumerator()
{
return new BE();
}
public struct BE
{
public object Current {
get {
throw new NotImplementedException();
}
}
public bool MoveNext()
{
throw new NotImplementedException();
}
public void Reset()
{
throw new NotImplementedException();
}
}
}
.locals init (class [mscorlib]System.Collections.IEnumerator V_0,
class [mscorlib]System.IDisposable V_1,
valuetype Testing.Program/B/BE V_2,
object[] V_3,
int32 V_4)
IL_002f: newobj instance void Testing.Program/B::.ctor()
IL_0034: call instance valuetype Testing.Program/B/BE Testing.Program/B::GetEnumerator()
IL_0039: stloc.2
IL_003a: br.s IL_0044
IL_003c: ldloca.s V_2
IL_003e: call instance object Testing.Program/B/BE::get_Current()
IL_0043: pop
IL_0044: ldloca.s V_2
IL_0046: call instance bool Testing.Program/B/BE::MoveNext()
IL_004b: brtrue.s IL_003c
Notice that nothing here has to implement IEnumerable/IEnumerator interfaces. This method is sometimes called duck-typing. This implementation internally stores the enumerator in a variable and then calls the methods on its address (ldloca
). One value type copying happens here, when the enumerator is returned from GetEnumerator.
A third example is almost a completely different thing, foreach
on an array:
foreach(object o in new object[0])
{
}
.locals init (class [mscorlib]System.Collections.IEnumerator V_0,
class [mscorlib]System.IDisposable V_1,
valuetype Testing.Program/B/BE V_2,
object[] V_3,
int32 V_4) IL_004d: ldc.i4.0
IL_004e: newarr [mscorlib]System.Object
IL_0053: stloc.3
IL_0054: ldc.i4.0
IL_0055: stloc.s V_4
IL_0057: br.s IL_0064
IL_0059: ldloc.3
IL_005a: ldloc.s V_4
IL_005c: ldelem.ref
IL_005d: pop
IL_005e: ldloc.s V_4
IL_0060: ldc.i4.1
IL_0061: add
IL_0062: stloc.s V_4
IL_0064: ldloc.s V_4
IL_0066: ldloc.3
IL_0067: ldlen
IL_0068: conv.i4
IL_0069: blt.s IL_0059
This uses no GetEnumerator, and simply traverses the array in the old-fashioned index way (higher performance).
You cannot use this exact optimization in CIL, because CIL has no duck-typing; you have to write all the signatures and method calls yourself.
However, if you need to have this optimization for any type, and you can modify the types you want to use, you can use generic interfaces in a code similar to this:
public class B : IStructEnumerable<object, BE>
{
public BE GetEnumerator()
{
return new BE();
}
}
public struct BE : IStructEnumerator<object>
{
public object Current {
get {
throw new NotImplementedException();
}
}
public bool MoveNext()
{
throw new NotImplementedException();
}
public void Reset()
{
throw new NotImplementedException();
}
}
public interface IStructEnumerable<TItem, TEnumerator> where TEnumerator : struct, IStructEnumerator<TItem>
{
TEnumerator GetEnumerator();
}
public interface IStructEnumerator<TItem>
{
TItem Current {get;}
bool MoveNext();
void Reset();
}
public static void TestEnumerator<TEnumerable, TEnumerator>(TEnumerable b) where TEnumerable : IStructEnumerable<object, TEnumerator> where TEnumerator : struct, IStructEnumerator<object>
{
foreach(object obj in b)
{
}
}