Search code examples
c#.net.net-coremappingspreadsheetgear

Read .xlsx to dto via SpreadSheetGear


I would like to read exel file to custom model (map it!) using spreadsheetgear in .Net Core 3.1 console app. My excel file looks like:

| Customers | Sales Item | Sale Date  | Contact | Quantity |
| IBM       | Keyboard   | 28-10-2011 |         | 2        |
| Logitech  | Mouse      | 27-09-2011 | joe     | 5        |

The DTO

public class CustomersDto
{
    public string Customers { get; set; }

    public string SalesItem { get; set; }

    public DateTime Date { get; set; }

    public string Contact  { get; set; }

    public int Quantity{ get; set; }
}

General code:

private static void Main(string[] args)
    {
        var workbook =
            Factory.GetWorkbook(
                @"D:\MyFiles\SellersCollections.xlsx");
        var worksheet = workbook.Worksheets[0];

        var cells = worksheet.UsedRange;
        List<CustomersDto> test = new List<CustomersDto>();
        CustomersDto dto = new CustomersDto();

        for (var i = 0; i < cells.RowCount; i++)
        {
            if (i == 1) // skip header (0)
            {
                for (var j = 0; j < cells.ColumnCount; j++)
                {
                   Console.WriteLine(cells[i,j].Value);

                   // What should I do HERE to MAP to DTO
                   // Something like:
                   // dto.Customers = cells[i, j]; ???
                }
            }
        }
        
        Console.ReadLine();
    }

Are there any ways to map cells[i, j] or IRange to model class?


Solution

  • No sort of built-in mapping or binding feature exists in SpreadsheetGear to do this for you, so you'll need to build your own routine to do this. If your workbook and DTO are well-defined and known beforehand, I would keep the outer "row loop" but remove remove the inner "column loop" and manually index over each column cell value and place it into the corresponding DTO property as needed. Of course, this means your code has some knowledge about the information contained in each column and what its corresponding property is in the DTO. If you need something that's a bit more dynamic, you'd have to build a more elaborate system, which gets out of the scope of answering here.

    In terms of getting cell values, there are probably two main IRange properties you'll want to look at:

    • IRange.Value - this property is of type object and returns the "raw" value of a cell. So for instance a double of 1.23 or a bool of true or the string of "abc".
    • IRange.Text - this is of type string and returns the "formatted" value of a cell--meaning it takes IRange.Value and applies that cell's Number Format (IRange.NumberFormat) to the value and returns the resultant string. In other words, this returns what you see from Excel's UI and / or SpreadsheetGear's WorkbookView UI control. So if a cell has an IRange.Value of double 1.23 but is formatted to display 4 decimal places (IRange.NumberFormat == "0.0000"), then IRange.Text would return the string "1.2300". Assuming a General NumberFormat for a cell that has a bool of true, IRange.Text would return the string "TRUE". And so on.

    Special consideration will need to be taken for dates. This is because dates, times and datetimes in Excel are internally stored as serial numeric double values, with the value 1.0 being January 1, 1900, 2.0 being January 2, 1900, today (March 23, 2021) as 44278, etc.; and the fractional portion of the value represents the time of that day (i.e., 0.5 is Noon, 0.625 is 3:00 PM). The fact that a cell shows up as a "date" or "time" or "datetime" is purely a function of the IRange.NumberFormat applied to the cell ("m/d/yyyy" or "h:mm:ss", etc). If you want to get an actual DateTime object from a cell, you'll need to use the IWorkbook.NumberToDateTime(double serialDate) helper method to do this.

    So bottom-line, your routine might look something like the following:

    private static void Main(string[] args)
    {
        var workbook =
        Factory.GetWorkbook(
            @"D:\MyFiles\SellersCollections.xlsx");
        var worksheet = workbook.Worksheets[0];
    
        var cells = worksheet.UsedRange;
        List<CustomersDto> dtoList = new List<CustomersDto>();
    
        for (var i = 0; i < cells.RowCount; i++)
        {
            if (i == 1) // skip header (0)
            {
                CustomersDto dto = new CustomersDto();
    
                // First two columns appear to be text, using IRange.Text should be fine
                // but could probably use IRange.Value as well.
                dto.Customers = cells[i, 0].Text;
                dto.SalesItem = cells[i, 1].Text;
    
                // Need to convert serial date stored in cell values to actual DateTimes.
                // Use IRange.Value and cast to double (may need to introduce some value 
                // type checking here if other values or empty cell values are allowed.  
                // See IRange.ValueType to detect what kind of value is stored in a given 
                // cell).
                double serialDate = (double)cells[i, 2].Value;
                // Convert serial date to a true DateTime object
                DateTime dateTime = workbook.NumberToDateTime(serialDate);
                dto.Date = dateTime;
    
                // Contact column appears to be text, so IRange.Text should suffice.
                dto.Contact = cells[i, 3].Text;
    
                // Appears to be a number so cast to double (see note above about possible
                // value type checking) and then cast again to int.
                dto.Quantity = (int)(double)cells[i, 4].Value;
    
                // Add to list
                dtoList.Add(dto);
            }
        }
    
        Console.ReadLine();
    }