Search code examples
performanceexcelepplusspreadsheetgear

Performance difference when reading/writing many files with EPPlus versus Spreadsheet Gear


I've made a simple performance test between EPPlus and Spreadsheet Gear to see if there is any significant difference that would justify buying Spreadsheet Gear.

I am no expert at either application so it's possible the tests aren't written the most efficient way.

The test does the following: 1. Opens an existing Excel-file with 1000 rows and 3 columns. Saves the three values into an entity that is saved into a List<>. 2. Open a new Excel-object 3. Create a header row (bold) with the title of each column. 4. Write back the 1000 entities. 5. Save the new Excelfile.

If I run this test once EPPlus comes out the winner (approx times are EPPlus = 280ms, SG = 500ms). If I run the test 10 times in a row instead (a for-loop opening, copying, saving 10 seperate times) Spreadsheet Gear is faster instead (approx times per file: EPPlus = 165ms, SG = 95ms). For 20 tests the approx times are EPPlus = 160ms / file and SG = 60ms / file.

It seems like (to a certain extent at least) Spreadsheet Gears gets faster and faster the more files I create. Could anyone explain why EPPlus is the slower one when running consecutive tests? And can I make changes to the code to change this?

EPPlus test function:

var timer = new Stopwatch();
  timer.Start();
  var data = new List<Item>();
  using (var excelIn = new ExcelPackage(new FileInfo(folder + fileIn)))
  {
    var sheet = excelIn.Workbook.Worksheets[1];
    var row = 2;
    while (sheet.Cells[row, 1].Value != null)
    {
      data.Add(new Item()
        {
          Id = int.Parse(sheet.Cells[row, 1].Text),
          Title =  sheet.Cells[row, 2].Text,
          Value = int.Parse(sheet.Cells[row, 3].Text)
        });

      row++;
    }
  }

  using (var excelOut = new ExcelPackage())
  {
    var sheet = excelOut.Workbook.Worksheets.Add("Out");
    sheet.Cells.LoadFromCollection(data);
    sheet.InsertRow(1, 1);
    sheet.Cells[1, 1, 1, 3].Style.Font.Bold = true;
    sheet.Cells[1, 1].Value = "Id";
    sheet.Cells[1, 2].Value = "Title";
    sheet.Cells[1, 3].Value = "Value";

    excelOut.SaveAs(new FileInfo(folder + "EPPlus_" + Guid.NewGuid() + ".xlsx"));
  }

  timer.Stop();
  return timer.ElapsedMilliseconds;

Spreadsheet Gear:

var timer = new Stopwatch();
  timer.Start();

  var data = new List<Item>();
  var excelIn = Factory.GetWorkbook(folder + fileIn);
  var sheetIn = excelIn.Worksheets[0];
  var rowIn = 1;
  while (sheetIn.Cells[rowIn, 0].Value != null)
  {
    data.Add(new Item()
    {
      Id = int.Parse(sheetIn.Cells[rowIn, 0].Text),
      Title = sheetIn.Cells[rowIn, 1].Text,
      Value = int.Parse(sheetIn.Cells[rowIn, 2].Text)
    });

    rowIn++;
  }
  excelIn.Close();

  var excelOut = Factory.GetWorkbook();
  var sheetOut = excelOut.Worksheets.Add();
  sheetOut.Name = "Out";
  var rowOut = 0;
  sheetOut.Cells[rowOut, 0, rowOut, 2].Font.Bold = true;
  sheetOut.Cells[rowOut, 0].Value = "Id";
  sheetOut.Cells[rowOut, 1].Value = "Title";
  sheetOut.Cells[rowOut++, 2].Value = "Value";

  foreach (var item in data)
  {
    sheetOut.Cells[rowOut, 0].Value = item.Id;
    sheetOut.Cells[rowOut, 1].Value = item.Title;
    sheetOut.Cells[rowOut++, 2].Value = item.Value;
  }

  excelOut.SaveAs(folder + "SpreadsheetGear_" + Guid.NewGuid() + ".xlsx", FileFormat.OpenXMLWorkbook);
  excelOut.Close();

  timer.Stop();
  return timer.ElapsedMilliseconds;

Main function

var runs = 1;
  var testerG = new TestSpreadsheetGear();
  var testerE = new TestEpPlus();
  var msE = 0.0;
  var msG = 0.0;
  var i = 0;
  for (i = 0; i < runs; ++i)
  {
    msG += new TestSpreadsheetGear().Run(folder, originalFile);
  }

  for(i = 0; i < runs; ++i)
  {
    msE += new TestEpPlus().Run(folder, originalFile);
  }

  Console.WriteLine("Spreadsheet time: " + msG + ". Per file: " + msG / runs);
  Console.WriteLine("EP Plus time: " + msE + ". Per file: " + msE / runs);
  Console.ReadKey();

Solution

  • I believe that the reason for the results you are seeing is the fact that on the first run the .NET CLR must JIT the code. Since SpreadsheetGear is a complete spreadsheet engine under the hood (as opposed to a read / write library) there is more code to JIT - thus the first run is taking longer for SpreadsheetGear than EPPlus (I am speculating here but have a great deal of experience in benchmarking .NET code over the last 10 years).

    I do not have EPPlus installed but I did write a test which tries to do the same thing you are doing. with SpreadsheetGear 2012 Since I don't have your starting workbook I first build the workbook. Then, I used more optimal SpreadsheetGear APIs. The first time I run I get 141 milliseconds for SpreadsheetGear 2012. After the first run I get 9 or 10 milliseconds for each run on an overclocked Core i7-980x running Win7 x86 and a release build run without debugger.

    I have pasted my code below (just paste it into a .NET 4.0 C# console application).

    One more thought I have is that this is a very small test case. To really see the performance of SpreadsheetGear 2012 try this with 100,000 rows or even 1 million rows.

    Disclaimer: I own SpreadsheetGear LLC

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using SpreadsheetGear;
    
    namespace SGvsEPPlus
    {
        class Program
        {
            internal struct Item
            {
                internal Item(int id, string title, int value)
                {
                    Id = id;
                    Title = title;
                    Value = value;
                }
    
                internal int Id;
                internal string Title;
                internal int Value;
            }
    
            static void Test(int rows)
            {
                string filename = @"C:\tmp\MyWorkbook.xlsx";
                Console.Write("Test({0})...", rows);
                var timer = new Stopwatch();
                // Create workbook since we don't have poster's original workbook.
                timer.Restart();
                var workbook = Factory.GetWorkbook();
                var values = (SpreadsheetGear.Advanced.Cells.IValues)workbook.Worksheets[0];
                for (int row = 1; row <= rows; row++)
                {
                    values.SetNumber(row, 0, row);
                    values.SetText(row, 1, "Title " + row);
                    values.SetNumber(row, 2, row * 10);
                }
                Console.Write("Create workbook={0:0}...", timer.Elapsed.TotalMilliseconds);
                // Save workbook
                timer.Restart();
                workbook.SaveAs(filename, FileFormat.OpenXMLWorkbook);
                Console.Write("Save workbook={0:0}...", timer.Elapsed.TotalMilliseconds);
                // Track total time of original test.
                var totalTimer = Stopwatch.StartNew();
                // Open workbook
                timer.Restart();
                var excelIn = Factory.GetWorkbook(filename);
                Console.Write("Open excelIn={0:0}...", timer.Elapsed.TotalMilliseconds);
                // Copy workbook to list
                timer.Restart();
                var sheetIn = excelIn.Worksheets[0];
                var valuesIn = (SpreadsheetGear.Advanced.Cells.IValues)sheetIn;
                var rowIn = 1;
                var data = new List<Item>(rows);
                while (valuesIn[rowIn, 0] != null)
                {
                    data.Add(new Item(
                        (int)valuesIn[rowIn, 0].Number,
                        valuesIn[rowIn, 1].Text,
                        (int)valuesIn[rowIn, 2].Number));
                    rowIn++;
                }
                excelIn.Close(); // Not necessary but left for consistency.
                Console.Write("excelIn->data={0:0}...", timer.Elapsed.TotalMilliseconds);
                timer.Restart();
                var excelOut = Factory.GetWorkbook();
                var sheetOut = excelOut.Worksheets[0];
                var valuesOut = (SpreadsheetGear.Advanced.Cells.IValues)sheetOut;
                sheetOut.Name = "Out";
                var rowOut = 0;
                sheetOut.Cells[rowOut, 0, rowOut, 2].Font.Bold = true;
                sheetOut.Cells[rowOut, 0].Value = "Id";
                sheetOut.Cells[rowOut, 1].Value = "Title";
                sheetOut.Cells[rowOut++, 2].Value = "Value";
                foreach (var item in data)
                {
                    valuesOut.SetNumber(rowOut, 0, item.Id);
                    valuesOut.SetText(rowOut, 1, item.Title);
                    valuesOut.SetNumber(rowOut, 2, item.Value);
                    rowOut++;
                }
                Console.Write("data->excelOut={0:0}...", timer.Elapsed.TotalMilliseconds);
                timer.Restart();
                excelOut.SaveAs(@"C:\tmp\SpreadsheetGear_" + Guid.NewGuid() + ".xlsx", FileFormat.OpenXMLWorkbook);
                excelOut.Close(); // Again - not necessary.
                Console.WriteLine("Save excelOut={0:0}...", timer.Elapsed.TotalMilliseconds);
                Console.WriteLine("    Total={0:0}", totalTimer.Elapsed.TotalMilliseconds);
            }
            static void Main(string[] args)
            {
                // Do it three times with 1000 rows. Note that the first
                // time takes longer because code must be JITted.
                Test(1000);
                Test(1000);
                Test(1000);
            }
        }
    }