I'm attempting to clear out rows from an excel template. The code before this goes through and creates the workbook based off a template. That code produces fine excel files with no errors. It's only when adding this part do I run into problems:
Sheet theSheet = workbookPart.Workbook.Descendants<Sheet>()
.Where(s => s.Name == task).FirstOrDefault();
if (theSheet != null)
{
WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(theSheet.Id);
SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
var rows = sheetData.Elements<Row>().Where(r => r.RowIndex > 1).ToArray();
for (int x = 0; x < rows.Count(); x++)
{
((Row)rows[x]).Remove();
}
worksheetPart.Worksheet.Save();
}
It clears out the rows successfully. However when I open the file in excel I receive the following error:
Excel found unreadable content in 'excel.xlsx'. Do you want to recover the contents of this workbook?...
Clicking yes gives the following details:
<repairedRecords summary="Following is a list of repairs:">
<repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1c.xml part</repairedRecord>
<repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1d.xml part</repairedRecord>
<repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1b.xml part</repairedRecord>
<repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1a.xml part</repairedRecord>
<repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet26.xml part</repairedRecord>
<repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1f.xml part</repairedRecord>
<repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1e.xml part</repairedRecord>
</repairedRecords>
If I open the Excel File in "Open XML SDK 2.5 Productivity Tool" and validate it. It gives a bit more info:
Error Node Type: Worksheet
Error Part: /xl/worksheets/sheet1a.xml (this is the only line chat changes and it corresponds to the above errors)
Error Node Path: /x:worksheet[1]
Related Node Type: OpenXmlUnknownElement
Related Part:
Description: The element has invalid child element 'http://schemas.openxmlformats.org/sheadsheetml/2006/main:row'.
If I open the original Excel file that this code is modifying, sheet1a/sheet1b, etc. do not exist. Where are they coming from? Is there something I'm missing? How can these sheets contain invalid row elements when all I'm doing is removing rows? Thank you for any suggestions.
Edit: Trimmed down form of sheet1a.xml:
<?xml version="1.0" encoding="utf-8"?>
<x:worksheet xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:dimension ref="A1:AK180" />
<x:sheetViews>
<x:sheetView workbookViewId="0" />
</x:sheetViews>
<x:sheetFormatPr defaultRowHeight="15" />
<x:cols>
<x:col min="1" max="1" width="13.85546875" bestFit="1" customWidth="1" />
</x:cols>
<x:sheetData>
<x:row>
<x:c r="A1" t="inlineStr">
<x:is>
<x:t>TestResultFileId</x:t>
</x:is>
</x:c>
</x:row>
<x:row r="2">
<x:c r="A2" t="inlineStr">
<x:is>
<x:t>6F2DFA01-27EE-E211-8250-0025906392BB</x:t>
</x:is>
</x:c>
</x:row>
</x:sheetData>
<x:row r="1" spans="1:37">
<x:c r="A1" t="s">
<x:v />
</x:c>
</x:row>
<x:conditionalFormatting sqref="A1:AK1048576">
<x:cfRule type="expression" dxfId="7" priority="1">
<x:formula />
</x:cfRule>
</x:conditionalFormatting>
<x:pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3" />
</x:worksheet>
Based on the XML of the worksheet, this part:
<x:row r="1" spans="1:37">
<x:c r="A1" t="s">
<x:v />
</x:c>
</x:row>
should not exist outside of the SheetData element. In fact, there appears to be a duplicate, since
<x:row>
<x:c r="A1" t="inlineStr">
<x:is>
<x:t>TestResultFileId</x:t>
</x:is>
</x:c>
</x:row>
also exist. Note that the "actual" header Row has no RowIndex assigned, but according to the CellReference of "A1", this particular row is on row 1.
Note that this:
var rows = sheetData.Elements<Row>().Where(r => r.RowIndex > 1).ToArray();
will probably ignore any Row objects without the RowIndex assigned (I didn't test this though...). Which can happen. Excel should have assigned a value, but any third-party software isn't bound to do so (because the Open XML specs state that RowIndex is an optional attribute).
I don't know why there's a Row object outside of SheetData. Check that the original template Excel file doesn't have this "Row object outside of SheetData object" case. If it does, then the original template file was faulty in the first place.
You might want to consider the option of storing the 1st Row in a separate variable first. Then wipe out all the child elements of SheetData. Then Append() that 1st Row in. This might be easier. You can wipe out children like so (insert bad parenting joke here):
sheetData.RemoveAllChildren();