I have to apply simple formatting to 10,000+ Excel files. I already have a multicore program working. It opens a new Excel instance for every Excel file. If it matters, this code is currently inside of my Form code.
I was hoping to use one Excel instance with many workbooks. Is it possible to make use of multicore capabilities if I only use one instance of Excel? How?
-If the answer to the above is no, a more complicated question might be: How many instances of Excel should I spawn, and how can I split the workbooks between each instance?
Current code here:
private void SelectFilesButtonClick(object sender, EventArgs e)
{
var listOfExcelFiles = OpenExcel.FileNames.ToList();
Parallel.ForEach(listOfExcelFiles, TrivialExcelEditFunction);
}
private void TrivialExcelEditFunction(string file)
{
//Open instance of Excel
//Do processing
//Close instance of Excel
}
Updated code below, but still doesn't limit itself to the proper number of cores. Not sure why.
private void SelectFilesButtonClick(object sender, EventArgs e)
{
var listOfExcelFiles = OpenExcel.FileNames.ToList();
int cores = Environment.ProcessorCount;
//Split one list into list of lists. Number of lists based on number of cpu cores
List<List<object>> listOfLists = Split(listOfExcelFiles, cores);
//Limits number of threads to number of cores
Parallel.ForEach(listOfLists, new ParallelOptions { MaxDegreeOfParallelism = cores }, EditExcel);
}
private void TrivialExcelEditFunction(string file)
{
//Open instance of Excel
foreach (string file in files)
{
//Do processing
}
//Close instance of Excel
}
Lets say I have 4 cores. My thinking was to divide list of files into 4 equal lists, limit threads to 4, then I'd be able to process files in 4 instances of Excel. I thought that would mean TrivialExcelEditFunction would only run 4 times. Instead this function is being run anywhere between 14 and 27 times. Please tell me where I went wrong.
I think this is what @Servy is referring to above. I use this daily at the moment in a piece of code I am working with which does handle Excel and most certainly hasn't failed yet. Also make sure that you marshal you COM object correctly. A little extra info as well, Excel 2010 uses multi-core internally, watch out for performance (with large file especially).
var tasks = new Task[Environment.ProcessorCount];
for (int i = 0; i< Environment.ProcessorCount; i++)
{
tasks [i] = Task.Factory.StartNew(() =>
{
// your Excel code here.
});
}
Task.WaitAll(tasks);