Search code examples
c#excelspreadsheetopenxmlopenxml-sdk

Find linked formula values from worksheets and replace with actual cell value


In a OOXML spreadsheet .xlsx you can through a linking formula fecth values from another spreadsheet and have them in your worksheet as values, that will always be updated when those values in another spreadsheet are updated.

I am using Open Xml SDK and I basically want to do what this does: https://www.e-iceblue.com/Tutorials/Spire.XLS/Spire.XLS-Program-Guide/Formula/Remove-Formulas-from-Cells-but-Keep-Values-in-Excel-in-C.html

How do I:

  • Find a value that has formula linking value to a cell in another spreadsheet
  • Replace the formula value with the actual cell value
  • Do this foreach cell in each worksheet in a spreadsheet

I have tried this so far: https://learn.microsoft.com/en-us/office/open-xml/how-to-retrieve-the-values-of-cells-in-a-spreadsheet

But I am recieving a NullRefereceneException each time the cell does not contain a formula or just any value. I have tried try-catch and several other ways to escape this exception, but it is not working.

But back to the challenge as outlined above; can anyone help me out?

Basic stuff such as using SOME DIRECTIVE, foreach loop, Open(), Save() I know how to do.


Solution

  • This worked for me:

    public void Remove_CellReferences(string filepath)
    {
        using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(filepath, true))
        {
            // Delete all cell references in worksheet
            List<WorksheetPart> worksheetparts = spreadsheet.WorkbookPart.WorksheetParts.ToList();
            foreach (WorksheetPart part in worksheetparts)
            {
                Worksheet worksheet = part.Worksheet;
                var rows = worksheet.GetFirstChild<SheetData>().Elements<Row>(); // Find all rows
                foreach (var row in rows)
                {
                    var cells = row.Elements<Cell>();
                    foreach (Cell cell in cells)
                    {
                        if (cell.CellFormula != null)
                        {
                            string formula = cell.CellFormula.InnerText;
                            if (formula.Length > 0)
                            {
                                string hit = formula.Substring(0, 1); // Transfer first 1 characters to string
                                if (hit == "[")
                                {
                                    CellValue cellvalue = cell.CellValue; // Save current cell value
                                    cell.CellFormula = null; // Remove RTD formula
                                                             // If cellvalue does not have a real value
                                    if (cellvalue.Text == "#N/A")
                                    {
                                        cell.DataType = CellValues.String;
                                        cell.CellValue = new CellValue("Invalid data removed");
                                    }
                                    else
                                    {
                                        cell.CellValue = cellvalue; // Insert saved cell value
                                    }
                                }
                            }
                        }
                    }
                }
            }
            // Delete all external link references
            List<ExternalWorkbookPart> extwbParts = spreadsheet.WorkbookPart.ExternalWorkbookParts.ToList();
            if (extwbParts.Count > 0)
            {
                foreach (ExternalWorkbookPart extpart in extwbParts)
                {
                    var elements = extpart.ExternalLink.ChildElements.ToList();
                    foreach (var element in elements)
                    {
                        if (element.LocalName == "externalBook")
                        {
                            spreadsheet.WorkbookPart.DeletePart(extpart);
                        }
                    }
                }
            }
            // Delete calculation chain
            CalculationChainPart calc = spreadsheet.WorkbookPart.CalculationChainPart;
            spreadsheet.WorkbookPart.DeletePart(calc);
        }
    }