Tags: c#, .net, export-to-excel, epplus

EPPlus multiple saves


I'm using the EPPlus library in ASP.NET + C# to create an .xlsx file. My file may contain a few million records, and I'm creating one ExcelWorksheet per 100K records. The problem is that a 40 MB .xlsx file containing almost 1.5 million records can take multiple GBs of main memory, after which I get an OutOfMemoryException. My question is: how can I reduce the memory taken (several GBs for a 40 MB file), or save the ExcelPackage gradually and deallocate memory? (It seems that I cannot save more than once; after saving, the whole package is disposed.)

If there is no solution, can anybody suggest an alternative to EPPlus (with almost all of its features, such as styling, multiple worksheets, etc.)?


Solution

  • This issue seems to be solved in the latest version of EPPlus (4.0.x)


    EDIT: Adding links to pages that describe the improved memory management in EPPlus 4.0.4.

    https://epplus.codeplex.com/releases/view/118053#ReviewsAnchor contains user reviews reporting improved memory performance in the 4.x versions compared to 3.x.

    https://epplus.codeplex.com/wikipage?title=Roadmap lists for Version 4.0: "New cellstore to improve insert, delete performance and memory consumption".

    This link explains how to ensure that loading a huge number of cells is optimized:

    http://epplus.codeplex.com/wikipage?title=FAQ&referringTitle=Documentation (refer to the section "I have a lot of data I want to load. What should I think of to get the best performance?")

    Also, I personally tested EPPlus 4.0.4 today by writing out 1.5 million records in one go, each with 5 numeric columns and 1 DateTime column, and the peak working set reported by Windows Task Manager was just 711 MB. The non-paged pool shown by Task Manager was only around 75 K. Of course, I'm not sure whether these numbers capture the full memory footprint, but they are indicative. The output Excel file was around 59 MB (possibly because I had more columns than the sample data you mentioned in your original post).

    NOTE: I did get an "OutOfMemoryException" when I tried to write 4.5 million records with 7 columns each in one go!

    Is my test rigorous enough? Maybe not, but it works well for me.
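
    For reference, a minimal sketch of the kind of bulk-write test described above might look like the following. The worksheet name, output file name, and generated values are illustrative assumptions; the cell-by-cell, row-by-row write order follows the performance advice in the FAQ linked above.

        using System;
        using System.IO;
        using OfficeOpenXml; // EPPlus 4.x

        class BulkWriteTest
        {
            static void Main()
            {
                const int rows = 1500000; // 1.5 million records

                using (var package = new ExcelPackage())
                {
                    var ws = package.Workbook.Worksheets.Add("Data");

                    // Write left-to-right, top-to-bottom; the EPPlus FAQ
                    // recommends this access pattern for large data sets.
                    for (int r = 1; r <= rows; r++)
                    {
                        for (int c = 1; c <= 5; c++)
                            ws.Cells[r, c].Value = r * c;    // 5 numeric columns
                        ws.Cells[r, 6].Value = DateTime.Now; // 1 DateTime column
                    }

                    // Apply number formats per column, not per cell.
                    ws.Column(6).Style.Numberformat.Format = "yyyy-mm-dd hh:mm";

                    package.SaveAs(new FileInfo("bulk-test.xlsx"));
                }
            }
        }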


    However, one workaround I could think of to get around the large memory requirements of earlier versions is to split the output, saving a separate .xlsx file for every 100K records. After saving one file, start a new file (incrementing a counter in the file name) for the next 100K records.

    At the end of the operation, you'd end up with 10 files of 100K records each for, say, a total of 1 million records.

    It might seem a bit of a hack, but hey, it might be better than having to rewrite your code base to use some other library (free or commercial).
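
    A rough sketch of that chunked-save workaround is below. The 100K chunk size, the export-N.xlsx naming scheme, and the IEnumerable<object[]> record source are illustrative assumptions, not anything prescribed by EPPlus; the key point is disposing each package after saving so its memory can be reclaimed before the next chunk begins.

        using System.Collections.Generic;
        using System.IO;
        using OfficeOpenXml; // EPPlus

        static class ChunkedExport
        {
            const int ChunkSize = 100000; // records per file (assumption)

            public static void Export(IEnumerable<object[]> records)
            {
                int fileIndex = 0, rowInFile = 0;
                ExcelPackage package = null;
                ExcelWorksheet ws = null;

                foreach (var record in records)
                {
                    if (package == null)
                    {
                        // Start a new package (and output file) for this chunk.
                        package = new ExcelPackage();
                        ws = package.Workbook.Worksheets.Add("Data");
                        rowInFile = 0;
                    }

                    rowInFile++;
                    for (int c = 0; c < record.Length; c++)
                        ws.Cells[rowInFile, c + 1].Value = record[c];

                    if (rowInFile == ChunkSize)
                    {
                        // Save and dispose to release this chunk's memory
                        // before starting the next file.
                        package.SaveAs(new FileInfo("export-" + (++fileIndex) + ".xlsx"));
                        package.Dispose();
                        package = null;
                    }
                }

                // Flush any remaining partial chunk.
                if (package != null)
                {
                    package.SaveAs(new FileInfo("export-" + (++fileIndex) + ".xlsx"));
                    package.Dispose();
                }
            }
        }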