I am seeing a small performance difference between the following two functionally similar pieces of code, and I hope someone can help me understand why.
// Case 1: faster
Parallel.ForEach(data, x => func(x));
// Case 2: slower
Parallel.ForEach(Partitioner.Create(data), x => func(x));
Here data is of type List<double>.
As far as I understand, the default partitioning in the first case would also be similar to Partitioner.Create(data), so there should be no difference in performance.
Is there a way to figure out how the partitioning is done at runtime?
I am answering my own question, in case someone is wondering about the same thing.
I wrote a new MyList class that implements IList, with every member written as a wrapper around an inner list instance plus a Console.WriteLine call for debugging.
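A minimal sketch of such a wrapper, for anyone who wants to repeat the experiment. The names LoggingList, IndexerCalls and EnumeratorCalls are made up for this illustration, and it counts calls instead of printing each one with Console.WriteLine so that the output stays short. It is not the exact class I used.

```csharp
using System;
using System.Collections;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper: forwards every member to an inner List<T> and
// counts how often the indexer getter and GetEnumerator are called.
public sealed class LoggingList<T> : IList<T>
{
    private readonly List<T> _inner;
    public int IndexerCalls;     // incremented on every indexer read
    public int EnumeratorCalls;  // incremented on every GetEnumerator call

    public LoggingList(IEnumerable<T> items) { _inner = new List<T>(items); }

    public T this[int index]
    {
        get { Interlocked.Increment(ref IndexerCalls); return _inner[index]; }
        set { _inner[index] = value; }
    }

    public IEnumerator<T> GetEnumerator()
    {
        Interlocked.Increment(ref EnumeratorCalls);
        return _inner.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    // The remaining members are plain pass-throughs.
    public int Count { get { return _inner.Count; } }
    public bool IsReadOnly { get { return false; } }
    public void Add(T item) { _inner.Add(item); }
    public void Clear() { _inner.Clear(); }
    public bool Contains(T item) { return _inner.Contains(item); }
    public void CopyTo(T[] array, int arrayIndex) { _inner.CopyTo(array, arrayIndex); }
    public int IndexOf(T item) { return _inner.IndexOf(item); }
    public void Insert(int index, T item) { _inner.Insert(index, item); }
    public bool Remove(T item) { return _inner.Remove(item); }
    public void RemoveAt(int index) { _inner.RemoveAt(index); }
}

public static class Program
{
    public static void Main()
    {
        double[] values = { 1, 2, 3, 4, 5, 6, 7, 8 };

        // Case 1: pass the wrapper directly.
        var direct = new LoggingList<double>(values);
        Parallel.ForEach(direct, x => { });
        Console.WriteLine("Case 1: indexer={0}, enumerator={1}",
            direct.IndexerCalls, direct.EnumeratorCalls);

        // Case 2: wrap it in Partitioner.Create first.
        var partitioned = new LoggingList<double>(values);
        Parallel.ForEach(Partitioner.Create(partitioned), x => { });
        Console.WriteLine("Case 2: indexer={0}, enumerator={1}",
            partitioned.IndexerCalls, partitioned.EnumeratorCalls);
    }
}
```

Each case gets its own wrapper instance so the counters do not mix.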
Interestingly, in the first case, even if I pass the list typed as IEnumerable, Parallel.ForEach detects that it is a list underneath and calls the list's indexer. In the second case it calls GetEnumerator instead, which I suppose is slower because of the extra function calls and the synchronization needed to share an enumerator between threads. The first case also falls back to GetEnumerator if I pass data.Select(x => x) instead of data.
I guess the implementation of Parallel.ForEach checks whether the IEnumerable is actually a list underneath and, if so, uses index-based access instead of the enumerator.
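The kind of runtime type check I am guessing at can be sketched as follows. This is only an illustration of the idea, not the actual library code: DispatchSketch, Describe and Lazy are hypothetical names, and Lazy is a stand-in for a lazily evaluated sequence such as data.Select(x => x).

```csharp
using System;
using System.Collections.Generic;

public static class DispatchSketch
{
    // Hypothetical helper (not part of the TPL): reports which access
    // strategy a type check of this kind would select for a source.
    public static string Describe<T>(IEnumerable<T> source)
    {
        if (source is T[]) return "array: index-based";
        if (source is IList<T>) return "IList<T>: index-based";
        return "other: enumerator-based";
    }

    // A compiler-generated iterator: it exposes only IEnumerable<T>,
    // so it hides the list it reads from.
    public static IEnumerable<T> Lazy<T>(IEnumerable<T> source)
    {
        foreach (T item in source) yield return item;
    }

    public static void Main()
    {
        var data = new List<double> { 1.0, 2.0, 3.0 };

        // The static type is IEnumerable<double>, but the runtime type is
        // still List<double>, so the IList<T> check succeeds.
        IEnumerable<double> asEnumerable = data;
        Console.WriteLine(Describe(asEnumerable)); // prints "IList<T>: index-based"

        // The iterator object is not an IList<double>, so only the
        // enumerator path is left.
        Console.WriteLine(Describe(Lazy(data)));   // prints "other: enumerator-based"
    }
}
```

The point of the sketch is that the check looks at the runtime type of the object, not at the static type of the variable it was passed through.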