Tags: performance, programming-languages, machine-code

Why do compiled languages not perform equally if they eventually become machine code?


If C#, Java, or C++, for example, all compile to machine code, why are they not equally performant?

My understanding is that such languages are an abstraction of machine code, which is what they all eventually compile to. Shouldn't the processor determine performance?


Solution

  • For one thing, C++ optimizers are much more mature. For another, performance has always been an overarching goal of the C++ language designers ("you don't pay for what you don't use" is the mantra, which clearly can't be said of Java's every-method-is-virtual policy; see the first sketch after this list).

    Beyond that, C++ templates are far more optimization-friendly than Java or C# generics. Although JITs are often praised for their ability to optimize across module boundaries, generics stop this dead in its tracks. The CLR (the .NET runtime) generates only one version of machine code for a generic, and that single version covers all reference types. The C++ optimizer, on the other hand, runs for each combination of template parameters and can inline dependent calls (second sketch below).

    Next, with C# and Java you have very little control over memory layout. Parallel algorithms can suffer an order-of-magnitude performance degradation from false sharing of cache lines, and there is almost nothing the developer can do about it. C++, on the other hand, provides tools to place objects at specific offsets relative to memory pages and cache-line boundaries (third sketch below).
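
First sketch, illustrating the "don't pay for what you don't use" point. This is a minimal, made-up example (the `Point` and `Shape` types are hypothetical): a C++ member function is non-virtual by default, so the compiler can resolve and inline the call, whereas opting into `virtual` buys dynamic dispatch at the cost of a vtable pointer per object and an indirect call per invocation, which is roughly the default situation for every Java instance method.

```cpp
#include <cstdio>

// Non-virtual by default: the compiler can resolve and usually inline this
// call at compile time -- you only pay for dispatch if you ask for it.
struct Point {
    double x, y;
    double norm2() const { return x * x + y * y; }  // eligible for inlining
};

// Opting in to dynamic dispatch costs a vtable pointer per object and an
// indirect call per invocation.
struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() = default;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

int main() {
    Point p{3.0, 4.0};
    std::printf("%f\n", p.norm2());   // direct call, typically inlined

    Shape* s = new Square(2.0);
    std::printf("%f\n", s->area());   // indirect call through the vtable
    delete s;
    return 0;
}
```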
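Second sketch, on templates versus generics. The function template and comparators below are toy names chosen for illustration: each combination of template arguments gets its own machine code, so the dependent call to the comparator is visible to the optimizer per instantiation and can be inlined, unlike the single shared code body the CLR emits for all reference-type generic arguments.

```cpp
#include <cstdio>
#include <string>

// The comparator is a template parameter: the call to cmp(a, b) is a
// dependent call that the optimizer sees separately in every instantiation.
template <typename T, typename Compare>
const T& pick(const T& a, const T& b, Compare cmp) {
    return cmp(a, b) ? a : b;
}

int main() {
    // One instantiation, specialized for int and this lambda; the comparison
    // is typically inlined straight into the generated code.
    auto less = [](int a, int b) { return a < b; };
    std::printf("%d\n", pick(3, 7, less));

    // A different instantiation, specialized for std::string and a different
    // comparator -- separate machine code, separately optimized.
    auto shorter = [](const std::string& a, const std::string& b) {
        return a.size() < b.size();
    };
    std::string a = "long string", b = "hi";
    std::printf("%s\n", pick(a, b, shorter).c_str());
    return 0;
}
```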
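Third sketch, on memory layout and false sharing. This is one possible approach, assuming a 64-byte cache line and a made-up `PaddedCounter` type: `alignas` pads each per-thread counter out to its own cache line, so threads incrementing adjacent counters do not keep invalidating each other's lines. Nothing comparable is expressible in standard C# or Java source.

```cpp
#include <cstdio>
#include <thread>
#include <vector>

// Each counter occupies a full cache line (64 bytes is a common size), so
// two threads writing neighboring counters never share a line.
struct alignas(64) PaddedCounter {
    long value = 0;
};

int main() {
    constexpr int kThreads = 4;
    constexpr long kIters = 10'000'000;
    PaddedCounter counters[kThreads];

    std::vector<std::thread> workers;
    for (int t = 0; t < kThreads; ++t) {
        workers.emplace_back([&counters, t] {
            for (long i = 0; i < kIters; ++i)
                counters[t].value++;   // each thread touches only its own line
        });
    }
    for (auto& w : workers) w.join();

    long total = 0;
    for (const auto& c : counters) total += c.value;
    std::printf("total = %ld\n", total);
    return 0;
}
```

Dropping the `alignas(64)` packs the counters into one or two cache lines, and the same loop can run several times slower on a multicore machine purely because of coherence traffic.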