Detect and/or fail fast when attempting to load a docx package with EPPlus


I'm using EPPlus v6.1 and am adding detection for failure scenarios such as non-Excel files. This works fine for non-Office files, but not for non-Excel files packaged in the Office Open XML format, such as Word documents. I would have expected either the new ExcelPackage(file) constructor to fail, or the loaded package to expose a property that clearly indicates the file is a Word document (or at least not an Excel document).

As far as I can see, there is no way to fail gracefully in this scenario. Some of the exposed properties throw an exception, but the only non-exception check that seems reasonable is testing whether the Workbook.Worksheets count is zero. That, however, seems to be a valid Excel scenario, however unlikely or useless it might be.

Is there a way to check for non-Excel files that doesn't rely on Exceptions, file extensions, testing against potentially valid Excel scenarios, or cracking open the raw XML?


Solution

  • Since the XML-based Office files are just zip archives, and only xlsx files have a top-level folder named "xl", this should work:

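    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Linq;

    // Open read-only; the check never modifies the file.
    using FileStream fs = new FileStream(@"C:\tmp\excel.xlsx", FileMode.Open, FileAccess.Read);
    using ZipArchive za = new ZipArchive(fs, ZipArchiveMode.Read);
    if (!za.Entries.Any(e => e.FullName.StartsWith("xl/", StringComparison.Ordinal)))
    {
        // not an Excel file
    }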
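
    The same check can be wrapped in a small reusable helper so it can run before the package is handed to EPPlus. This is only a sketch: the class and method names (OfficePackageSniffer, LooksLikeExcelPackage) are made up for illustration and are not part of EPPlus. It assumes a seekable stream, and it catches InvalidDataException only to cover input that is not a zip archive at all; a Word document is rejected by the folder check, without any exception.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper, not part of EPPlus.
public static class OfficePackageSniffer
{
    // Returns true when the stream is a zip archive that contains
    // at least one entry under a top-level "xl/" folder.
    public static bool LooksLikeExcelPackage(Stream stream)
    {
        try
        {
            // leaveOpen: true keeps the caller's stream usable afterwards.
            using var archive = new ZipArchive(stream, ZipArchiveMode.Read, leaveOpen: true);
            return archive.Entries.Any(e =>
                e.FullName.StartsWith("xl/", StringComparison.Ordinal));
        }
        catch (InvalidDataException)
        {
            // The stream is not a zip archive (for example a plain text file).
            return false;
        }
    }
}
```

    If the same stream is passed on to EPPlus afterwards, it would need to be rewound first (stream.Position = 0), since reading the archive moves the stream position.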