I'm working on an implementation of IQueryable, but before I jump in at the deep end I want to make sure I fully understand what the expression trees I need to evaluate will look like. In particular, I was curious about how LINQ query syntax is converted to method syntax during compilation.
I'm using LINQPad to see the methods generated by the compiler. I noticed that in nested iterations a temporary variable name is generated to store the state of the upper level iterations. Here is an example:
from Event in EventQueue
from Ack in Event.Acknowledgements
where Ack.User == User.Name
select Event
This is equivalent to:
EventQueue
    .SelectMany(
        Event => Event.Acknowledgements,
        (Event, Ack) =>
            new
            {
                Event = Event,
                Ack = Ack
            }
    )
    .Where(temp0 => (temp0.Ack.User == User.Name))
    .Select(temp0 => temp0.Event)
Of course, my first instinct was to try to break this and see what happened. So I wrote the following query:
from Event in EventQueue
from Ack in Event.Acknowledgements
let temp0 = Ack.User
where Ack.User == temp0
select Event
This is pretty much a "WHERE 1 = 1" and returns all events; however, I don't understand how it's working because the method chain that I'm given would never compile:
EventQueue
    .SelectMany(
        Event => Event.Acknowledgements,
        (Event, Ack) =>
            new
            {
                Event = Event,
                Ack = Ack
            }
    )
    .Select(
        temp0 =>
            new
            {
                temp0 = temp0,
                temp0 = temp0.Ack.User // Anonymous object with identically-named properties
            }
    )
    .Where(temp1 => (temp1.temp0.Ack.User == temp1.temp0))
    .Select(temp1 => temp1.temp0.Event)
This has led me to the conclusion that LINQPad is not pulling these method chains from the compiler, because the query works while this method chain clearly wouldn't. LINQPad is most likely generating the method chain on its own.
How does the C# compiler (Roslyn, in this case) handle naming conflicts with generated code?
This has led me to the conclusion that LINQPad is not pulling these method chains from the compiler.
It's precisely because it pulls it from what the compiler did that you are seeing this.
You took some C# code, compiled it, and then used a tool to give you a view on that code again.
If we were to manually translate it from query-syntax C# code into extension method calls in C#, we'd likely come up with something like:
EventQueue.SelectMany(
        Event => Event.Acknowledgements,
        (Event, Ack) => new { Event = Event, Ack = Ack }
    )
    .Select(x => new { x = x, temp0 = x.Ack.User })
    .Where(y => (y.x.Ack.User == y.temp0))
    .Select(y => y.x.Event)
Now, in doing that there were two places where I had to come up with a name for a lambda argument. I went with x and y here. We could just as well go with foo and bar, or theUnbearableLightnessOfBeing and forgettingWhatYouCameForTheMomentYouSetFootInAShop, or whatever.
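To check that a hand-written chain like the one above really behaves the same as the query, here is a minimal sketch that runs both over in-memory data. The EventItem and Acknowledgement records and the sample data are made up for illustration (the question does not show its types), and the records need C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's types.
public record Acknowledgement(string User);
public record EventItem(string Name, List<Acknowledgement> Acknowledgements);

public static class TranslationDemo
{
    // The query from the question, unchanged apart from the source name.
    public static List<EventItem> QuerySyntax(IEnumerable<EventItem> eventQueue) =>
        (from Event in eventQueue
         from Ack in Event.Acknowledgements
         let temp0 = Ack.User
         where Ack.User == temp0
         select Event).ToList();

    // The manual translation, with x and y as the made-up lambda names.
    public static List<EventItem> MethodSyntax(IEnumerable<EventItem> eventQueue) =>
        eventQueue
            .SelectMany(Event => Event.Acknowledgements,
                        (Event, Ack) => new { Event, Ack })
            .Select(x => new { x, temp0 = x.Ack.User })
            .Where(y => y.x.Ack.User == y.temp0)
            .Select(y => y.x.Event)
            .ToList();

    public static List<EventItem> SampleData() => new()
    {
        new EventItem("A", new() { new Acknowledgement("ann"), new Acknowledgement("bob") }),
        new EventItem("B", new()),                               // no acknowledgements
        new EventItem("C", new() { new Acknowledgement("cat") }),
    };

    public static void Main()
    {
        var data = SampleData();
        // Both forms yield one result per (event, acknowledgement) pair.
        Console.WriteLine(QuerySyntax(data).SequenceEqual(MethodSyntax(data)));
    }
}
```

The user-chosen temp0 is harmless here because the lambda parameters were given names that don't collide with it.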
The tool you were using did a similar job when trying to turn the output of the C# compiler back into C#, and chose a naming scheme that starts with temp0, then temp1, and so on. This is unfortunate because you had something explicitly called temp0 and it didn't account for this case. Really, since temp0 is a bad name anyway, if I was involved in building this tool it would not be a high priority for me to fix that.
How does the C# compiler (Roslyn, in this case) handle naming conflicts with generated code?
Two ways. The first is that many names from the source never need to appear in the compiled output at all.
Consider:
public int DoSum()
{
    int x = 2;
    int y = 3;
    int z = x * y + 2;
    return z - 2;
}
The IL of this is going to be something like:
ldc.i4.2
ldc.i4.3
mul
ldc.i4.2
add
ldc.i4.2
sub
ret
Note that there is no x, y or z in there. Something going from the IL back to the C# is going to have to make up names there.
If something does need a name in the produced IL and that name is not present in the source, the C# compiler uses a name that is valid as a .NET identifier but not valid as a C# identifier. The .NET rules for allowed identifiers are much looser than the C# rules.
So it can use parameter names like <>h__TransparentIdentifier0, <>h__TransparentIdentifier1 and so on, which aren't allowed as C# variable names but are perfectly okay by .NET rules generally. It then need only keep track of the names it created itself: since those names aren't valid in C#, they can't conflict with anything the author put into the C#. (This is also why, if you use yield, the enumerable type the compiler creates won't clash with any classes you create, and so on.)
Again, something going from the IL back to C# is going to have to make up new names here, to attempt to produce valid C#.
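Since you are implementing IQueryable, you can see these generated names directly in the expression tree your provider will receive. The following is a rough sketch rather than anything definitive: GeneratedNames and its methods are hypothetical names, and the exact spelling of the generated identifiers is a compiler implementation detail that could differ between compiler versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class GeneratedNames
{
    // An iterator method: the compiler generates a hidden class to implement it.
    public static IEnumerable<int> Count()
    {
        yield return 1;
        yield return 2;
    }

    // Returns the name the compiler gave to the lambda parameter of the
    // Where call that a query with a let clause is translated into.
    public static string TransparentIdentifierName()
    {
        IQueryable<string> source = new[] { "a", "bb" }.AsQueryable();
        var query = from s in source
                    let len = s.Length
                    where len > 1
                    select s;

        // The outermost call is Select; its first argument is the Where call.
        var selectCall = (MethodCallExpression)query.Expression;
        var whereCall = (MethodCallExpression)selectCall.Arguments[0];
        // Lambdas passed to Queryable methods are wrapped in a Quote node.
        var quote = (UnaryExpression)whereCall.Arguments[1];
        var lambda = (LambdaExpression)quote.Operand;
        return lambda.Parameters[0].Name;
    }

    public static void Main()
    {
        Console.WriteLine(TransparentIdentifierName());    // generated parameter name
        Console.WriteLine(Count().GetType().Name);         // generated iterator class
        Console.WriteLine(new { A = 1 }.GetType().Name);   // generated anonymous type
    }
}
```

With Roslyn I would expect all three printed names to contain angle brackets, which is exactly what makes them impossible to write in C# source, and why a decompiler has to substitute something else.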
You might complain that the tool is doing something wrong in using temp0, but while it might be nice for it to check against clashes with the user-defined names, it's not a bad go for the general task of "give me this back in C# from what the compiler did". If you want what the compiler really did, use the IL tab.