Search code examples
c#excelopenxml-sdk

OpenXML C# - What determines the name of "sheet.xml" in open office file strucure?


I am working on a project using C# and OpenXML. I get an error saying "Excel was able to open the file by repoaring or removing the unreadable content".

I know why the error is happening (merged cells are added with conflicting ranges), I am however confused on the naming of /xl/worksheets/sheets__.xml. Most are named appropriately sheet1, sheet2, etc. Sometimes they are named "sheet1e.xml" or "sheet1d.xml".

If I save the file after excel makes the repair I am able to open it in Open XML SDK Productivity Tool. I don't see the sheets named that way though.

Renaming the file to ".zip" extension I am able to see the sheets appropriately named "sheet1e.xml" as the error stated.

I thought naming them like this would preserve that in the file structure:

        document.WorkbookPart.Workbook.Sheets.AppendChild(new Sheet()
        {
            Id = document.WorkbookPart.GetIdOfPart(worksheetPart),
            SheetId = (UInt32)counter,
            Name = "Sheet_" + counter
        }); 

I must be missing something.. Is there a way to set the names of the .xml files to match what is said in the error; I was hoping there would be so the user at least knows what sheet the problem happened on and can correct the overlapping data in the database. Maybe I shouldn't be setting the name at all?

Update

I believe I've found the reason but don't understand the why. Would be nice to hear from someone with more knowledge than myself. It seems that after sheet 25 they were being named sheet1A, 1B, etc. Removing the "_" from my sheet name fixed the issue.


Solution

  • The OpenXML format is a zip structure of xml files and some additional files. All those files have relations defined between them.

    For example Sheet1.xml must be present in:

    [Content_Types].xml

    Override ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml" PartName="/xl/worksheets/sheet1.xml"

    and

    xl/workbook.xml

    sheet r:id="rId1" sheetId="1" name="SheetName"

    The corespondency for r:id="rId1" can be found in:
    xl/_rels/workbook.xml.rels

    Relationship Target="worksheets/sheet1.xml" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Id="rId1"

    Relation target shows us that xl/worksheets/sheet1.xml is the sheet definition.

    The sheet1.xml name should not be changed. You should be also be careful if you change the sheetID.

    Furthermore, the number 1 from the name, sheet1.xml, does not mean that it is the first sheet in the excel file. The sheet order is given by something else.