Search code examples
openxmlepplus

How do I insert data into an existing Excel spreadsheet from data table by mapping Excel sheet column header?


I have macro Excel spreadsheet with column header and some columns are computed columns.

What I need to do is get the data from database and populate this file.

DataTable I retrieved after query does not have all columns as in Excel file as Excel file has computed columns.

So I need to map the column from the data table to Excel file and load the data. I need to make sure to delete existing data from the file before I load the new data as this file needs to be created every week.

I have never worked in OpenXML Document and EPPlus.

Attempt #1 : using EPPlus

private static void OtehrMethod(DataTable dataTable, string filePath)
{
    // using EPPlus
    var package = new ExcelPackage(new FileInfo(filePath));

    ExcelWorksheet workSheet = package.Workbook.Worksheets["MySheet"];

    foreach (DataRow row in dataTable.Rows)
    {
        int i = 1;
        object cellValue = workSheet.Cells[2, i].Value;

        workSheet.Cells[1, 1].Value = Conver.ToInt(row["Id"]);
        // break;
        //workSheet.Cells[2, i].Value =row["First_Name"].ToString();
        //workSheet.Cells[3, i].Value = row["Last_Name"].ToString();
        //workSheet.Cells[4, i].Value = row["Job_Title"].ToString();
        //workSheet.Cells[5, i].Value = row["Skills"].ToString();
        i++;
    }

    package.Save();
}

Attempt #2: using Open XML

private static void SomeMethod()
{
    string filePath = ConfigurationManager.AppSettings["ExcelFilePath"];
    string workingSheetName = ConfigurationManager.AppSettings["WorkingSheetName"];

    using (SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, true))
    {
        // WorkbookPart workbook = document.WorkbookPart;
        WorkbookPart workbookPart = document.WorkbookPart;
        Workbook workbook = document.WorkbookPart.Workbook;

        int sheetIndex = 0;

        foreach (WorksheetPart worksheetpart in workbook.WorkbookPart.WorksheetParts)
        {
            Worksheet worksheet = worksheetpart.Worksheet;

            string sheetName = workbookPart.Workbook.Descendants<Sheet>().ElementAt(sheetIndex).Name;

            if (sheetName.ToUpper() == workingSheetName.ToUpper())
            {
                IEnumerable<Row> rows = worksheet.GetFirstChild<SheetData>().Descendants<Row>();

                foreach (Row row in rows)
                {
                    // How do I map Excel sheet column with data table column and insert the values into Excel ?

                    // Column["FirstName"] = DTRow["FirstName"]
                    //Column["LastName"] = DTRow["LastName"]
                }
            }

            sheetIndex++;
        }
    }
    // throw new NotImplementedException();
}

Solution

  • First, EPPlus is a wrapper around OpenXML. The concepts below would therefore apply to both. The learning curve can be steep so I will start with a Vincent Tan program for you to examine:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using System.Drawing;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;
    using DocumentFormat.OpenXml.Drawing.Spreadsheet;
    
    namespace ExcelOpenXmlSetCellValue
    {
        class Program
        {
            static void Main(string[] args)
            {
                string sFile = "ExcelOpenXmlSetCellValue.xlsx";
                if (File.Exists(sFile))
                {
                    File.Delete(sFile);
                }
                try
                {
                    BuildWorkbook(sFile);
                    Console.WriteLine("Program end");
                    Console.ReadLine();
                }
                catch (Exception e)
                {
                    Console.WriteLine(e.ToString());
                    Console.ReadLine();
                }
            }
    
            private static void BuildWorkbook(string filename)
            {
                using (SpreadsheetDocument xl = SpreadsheetDocument.Create(filename, SpreadsheetDocumentType.Workbook))
                {
                    WorkbookPart wbp = xl.AddWorkbookPart();
                    WorksheetPart wsp = wbp.AddNewPart<WorksheetPart>();
                    Workbook wb = new Workbook();
                    FileVersion fv = new FileVersion();
                    fv.ApplicationName = "Microsoft Office Excel";
                    Worksheet ws = new Worksheet();
                    SheetData sd = new SheetData();
    
                    Row r;
                    Cell c;
    
                    r = new Row();
                    // For the 2nd row. I know, it's obvious, but it's recommended
                    // that you assign a value. While Excel might be able to handle
                    // a Row class without the RowIndex assigned (in a certain case),
                    // other applications might find that difficult, in particular,
                    // your own application.
                    r.RowIndex = 2;
                    c = new Cell();
                    // "D" for 4th column. The "2" is equal to the RowIndex of the Row
                    // that the Cell class is in. Note that the number part has to be
                    // equal to the RowIndex or disaster will happen. Or worse, Excel
                    // spits out the "file corrupted" error and your user sees it.
                    c.CellReference = "D2";
                    // by default, an unassigned DataType means CellValues.Number
                    //c.DataType = CellValues.Number;
                    c.CellValue = new CellValue("9.81");
                    r.Append(c);
                    sd.Append(r);
    
                    // What if you need more Cell's in a Row? Blank lines added for readability.
                    r = new Row();
                    r.RowIndex = 5;
    
                    c = new Cell();
                    // this way, the number part is always right, but you need to
                    // assign the RowIndex first. This will make sense when you have
                    // to append many Cell classes. This is just an alternative to
                    // assigning the cell reference.
                    c.CellReference = "F" + r.RowIndex;
                    // There are 2 other string types, the SharedString and InlineString.
                    // They are discussed in another chapter. The CellValues.String type
                    // provides the most straightforward way of including text strings.
                    c.DataType = CellValues.String;
                    c.CellValue = new CellValue("Is");
                    r.Append(c);
    
                    c = new Cell();
                    c.CellReference = "G" + r.RowIndex;
                    c.DataType = CellValues.String;
                    c.CellValue = new CellValue("that in");
                    r.Append(c);
    
                    c = new Cell();
                    c.CellReference = "H" + r.RowIndex;
                    c.DataType = CellValues.String;
                    c.CellValue = new CellValue("metres per second squared?");
                    r.Append(c);
    
                    // This is for row 5
                    sd.Append(r);
    
                    // And now, to show you sometimes the code can be rearranged a little.
                    // But because you need to append the Cell class to the Row class,
                    // it makes sense (logically and conceptually) to initialise a new Row first.
                    c = new Cell();
                    c.CellReference = "F7";
                    c.DataType = CellValues.String;
                    c.CellValue = new CellValue("It'd better be. Just got used to the metric system.");
                    r = new Row();
                    r.RowIndex = 7;
                    r.Append(c);
                    sd.Append(r);
    
                    ws.Append(sd);
                    wsp.Worksheet = ws;
    
                    wsp.Worksheet.Save();
                    Sheets sheets = new Sheets();
                    Sheet sheet = new Sheet();
                    sheet.Name = "Sheet1";
                    sheet.SheetId = 1;
                    sheet.Id = wbp.GetIdOfPart(wsp);
                    sheets.Append(sheet);
                    wb.Append(fv);
                    wb.Append(sheets);
    
                    xl.WorkbookPart.Workbook = wb;
                    xl.WorkbookPart.Workbook.Save();
                    xl.Close();
                }
            }
        }
    }
    

    Then an excerpt from his book may help conceptually:

    So the SheetData class contains Row classes as children. Each Row class contains Cell classes as children. A Row class represents a row in the spreadsheet. A Cell class represents a cell in the spreadsheet. Sometimes, obvious things still need to be said.

    It is recommended that you assign a value for the RowIndex property of the Row class. This is basically the row number for that Row class. If it's the 4th row, assign 4. The RowIndex property is optional under Open XML specifications, but spreadsheet applications might not have a way to deal with this. Even Excel handles a blank RowIndex property correctly under special conditions (which we'll talk about later in this chapter).

    As for the Cell class, you need to take note of these properties: 

    • DataType - The data type of the cell value. You will most probably deal with CellValues.Number and CellValues.String.
    • CellReference - The cell reference in "A1 format". That means the column letter followed by the row index. For example, "C5" means the 3rd column, 5th row. There's the "R1C1 format" ("R" then row index, then "C" then column index, giving us "R5C3" for the previous example), but we're not covering that. "A1 format" is the default in Excel.
    • CellValue - Where the action happens. You might notice that CellValue is actually a class, and its main value is a string. This means you store number values in string form too (the DataType property determines the, well, data type).
    • CellFormula - Where you have a cell formula. But you can read about it in another chapter. The above 3 properties are the usual ones to handle.

    One more thing. The Row classes must be appended to the SheetData class in ascending RowIndex order. The Cell classes must be appended to its Row class in ascending column order. This is not negotiable. Excel will choke on its error vomit otherwise.

    Hopefully, this is enough to get you on track. If not, I'll check back for further questions.

    In answer to your other question, for the computed columns, build your calculations on the right hand side for the cell you build for the calculated column.