Search code examples
c#excelxlsexceldatareader

Weird XLS file failing when using ExcelDataReader


I'm working with a XLS spreadsheet that is rendered from a third-party software. If I open the file in a text editor it's obvious that it's HTML saved as an XLS file. It appears that MS used to allow this pattern and has recently stopped in Excel per a couple of articles I found:

So, now to the actual tech question. If I send the file (as a stream) directly into the ExcelDataReader using the CreateReader method it throws this error:

"Invalid file signature."

IExcelDataReader excelReader;
excelReader = ExcelReaderFactory.CreateReader(stream);

If on the other hand, I open it in Excel and do a "Save As" and save the XLS as an XLSX, then send it into my code; it imports fine.

My preference would be to not force my user to do the "Save As" step. And getting the third-party to change their format is not a realistic option (as we all know).

Any recommendations or maybe tricks to make this work with the ExcelDataReader?

Here's a screen shot of what I see if I open the XLS in Notepad++ (notice the xls extension):enter image description here


Solution

  • My preference would be to not force my user to do the "Save As" step. And getting the third-party to change their format is not a realistic option (as we all know).

    So, I can suggest you to use Microsoft.Office.Interop.Excel package to save it as a supported format before reading it by ExcelDataReader like this:

    var app = new Microsoft.Office.Interop.Excel.Application();
    var workbook = app.Workbooks.Open(htmlFileName);
    
    if (File.Exists(excelFileName))
    {
        File.Delete(excelFileName);
    }
    
    workbook.SaveAs(
        excelFileName,
        Microsoft.Office.Interop.Excel.XlFileFormat.xlWorkbookDefault,
        Type.Missing, Type.Missing,  
        false, false,
        Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
    
    workbook.Close();
    app.Quit();
    app = null;
    workbook = null;
    
    using (var stream = File.Open(excelFileName, FileMode.Open, FileAccess.Read))
    {
        var reader = ExcelReaderFactory.CreateReader(stream);
        // ...
    }