Search code examples
c#excelsaxopenxmlopenxml-sdk

Using a template with OpenXML and SAX


I'm creating a large XLSX file from a datatable, using the SAX method proposed in Parsing and Reading Large Excel Files with the Open XML SDK. I'm using an XLSX file as a template.

The method described in that post works fine to substitute a new sheet in for an existing one, but I want to copy the header row from the sheet in the template (string values, formatting, etc), instead of just using the header row from the datatable as the original code does.

I've tried the code below, but the XLSX file ends up with no text in the header row - the formatting is copied, just not the text. I've looked in the XML file for the sheet and it looks OK to me (referencing the sharedStrings.xml file, which still has the definition of the strings). The reflected code from the Open XML SDK 2.0 Productivity Tool shows a slightly odd result though: the cells don't appear to have a text value set:

cellValue1.Text = "";

even though the XML says:

<x:c r="A1" s="4" t="s">

The main code used by the OpenXmlReader is below:

while (reader.Read())
{
    if (reader.ElementType == typeof(SheetData))
    {
        if (reader.IsEndElement)
            continue;

        // Write sheet element
        writer.WriteStartElement(new SheetData());

        // copy header row from template
        reader.Read();
        do
        {
            if (reader.IsStartElement)
            {
                writer.WriteStartElement(reader);
                        }
            else if (reader.IsEndElement)
            {
                writer.WriteEndElement();
            }
            reader.Read();
        } while (!(reader.ElementType == typeof(Row) && reader.IsEndElement));
        writer.WriteEndElement();

        // Write data rows
        foreach (DataRow dataRow in resultsTable.Rows)
        {
            // Write row element
            Row r = new Row();
            writer.WriteStartElement(r);

            foreach (DataColumn dataCol in resultsTable.Columns)
            {
                Cell c = new Cell();
                c.DataType = CellValues.String;
                CellValue v = new CellValue(dataRow[dataCol].ToString());
                c.Append(v);

                // Write cell element
                writer.WriteElement(c);
            }

            // End row
            writer.WriteEndElement();
        }

        // End sheet
        writer.WriteEndElement();
    }
    else
    {
        if (reader.IsStartElement)
        {
            writer.WriteStartElement(reader);
        }
        else if (reader.IsEndElement)
        {
            writer.WriteEndElement();
        }
    }
}

Solution

  • The clue was that the Productivity Tool was showing blank values for the header cells on the generated sheet, and also that the validation formulae from the template were missing. These are both text, which wasn't copied from the template sheet to the new sheet using the combination of OpenXmlReader.Read() and OpenXmlReader.WriteStartElement().

    When the element is an OpenXmlLeafTextElement then the OpenXmlReader.GetText() method will return the text - this works for both text values in cells and for formulae.

    The working code is shown below:

    while (openXmlReader.Read())
    {
        if (openXmlReader.ElementType == typeof(SheetData))
        {
            if (openXmlReader.IsEndElement)
                continue;
    
            // write sheet element
            openXmlWriter.WriteStartElement(new SheetData());
    
            // read first row from template and copy into the new sheet
            openXmlReader.Read();
    
            do
            {
                if (openXmlReader.IsStartElement)
                {
                    openXmlWriter.WriteStartElement(openXmlReader);
                    
                    // this bit is needed to get cell values
                    if (openXmlReader.ElementType.IsSubclassOf(typeof(OpenXmlLeafTextElement)))
                    {
                        openXmlWriter.WriteString(openXmlReader.GetText());
                    }
                }
                else if (openXmlReader.IsEndElement)
                {
                    openXmlWriter.WriteEndElement();
                }
    
                openXmlReader.Read();
    
            } while (!(openXmlReader.ElementType == typeof(Row) && openXmlReader.IsEndElement));
    
            openXmlWriter.WriteEndElement();
    
            // write data rows
            foreach (DataRow dataRow in resultsTable.Rows)
            {
                // write row element
                Row r = new Row();
    
                openXmlWriter.WriteStartElement(r);
    
                foreach (DataColumn dataCol in resultsTable.Columns)
                {
                    Cell c = new Cell();
                    c.DataType = CellValues.String;
                    CellValue v = new CellValue(dataRow[dataCol].ToString());
                    c.Append(v);
    
                    // write cell element
                    openXmlWriter.WriteElement(c);
                }
    
                // end row
                openXmlWriter.WriteEndElement();
            }
    
            // end sheet
            openXmlWriter.WriteEndElement();
        }
        else
        {
            if (openXmlReader.IsStartElement)
            {
                openXmlWriter.WriteStartElement(openXmlReader);
    
                // this bit is needed to get formulae and that kind of thing
                if (openXmlReader.ElementType.IsSubclassOf(typeof(OpenXmlLeafTextElement)))
                {
                    openXmlWriter.WriteString(openXmlReader.GetText());
                }
            }
            else if (openXmlReader.IsEndElement)
            {
                openXmlWriter.WriteEndElement();
            }
        }
    }