Search code examples
c#exceldatatable

C# Error getting row from data table when it contains mixed characters - any advice or an alternative route appreciated


I get an excel sheet into a data table as follows

string excelAddressListPath = basePath + "/PROPERTIES.xlsx";
if (!File.Exists(excelAddressListPath))
{
    Console.WriteLine("Missing dependency address file DB, exiting....");
    System.Environment.Exit(-1);
}

DataTable dt;

System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
using (var stream = File.Open(excelAddressListPath, FileMode.Open, FileAccess.Read))
{
    using (var reader = ExcelReaderFactory.CreateReader(stream)) 
    {
        var conf = new ExcelDataSetConfiguration
        {
            ConfigureDataTable = _ => new ExcelDataTableConfiguration
            {
                UseHeaderRow = true
            }
        };

        var dataSet = reader.AsDataSet(conf);
        dt = dataSet.Tables[0];  
    }  
}

I then have the retrieve data in a function

string GetAddressFromDataTable(DataTable dt, string uprn)
{
    string address = "";

    Console.WriteLine(uprn);
    foreach (DataRow row in dt.Select($"[Property Reference] = '{uprn}'"))
    {
        address = row["Property Address"].ToString();
    }

    return address;

}

Which i call as follows


string address = GetAddressFromDataTable(dt, uprn.ToUpper());
Console.WriteLine($"From DB: {address}");

All works well when the property reference is a number 12345 but fails if it has characters such as B12345

Ive tried various ways but as i think everything is already a string i cant understand my error

error below


Cannot perform '=' operation on System.Double and System.String.


Solution

  • We don't use DataTables so much these days; the modern route would perhaps be to find a library that will bind an excel file to a list of objects, and make a strongly typed class to represent our file data then work with that

    I'll show you an example using the Sylvan.Data.Excel library - this is actively developed and the author is reachable here and on github where the library is hosted. This can often be a deciding factor in favour of using a library, as bugs can be rectified and the best support is available from an active author.

    We use Nuget Package Manager to add Sylvan.Data and Sylvan.Data.Excel to our app:

    enter image description here

    We have an Excel file that w want to read. Let's also assume we don't have any interest in the Year Built column:

    enter image description here

    We have a class to represent a house, called Property (not a great name given that a property in C# is something else, but it'll work. Just remember that when I say prop or property with a lower case p I'm referring to a C# property like int Blah {get;set;}, not a house):

    class Property
        {
            [DataMember(Name = "Property Reference")]
            public string PropertyReference { get; set; } = "";
    
            public string Name { get; set; } = "";
    
            [DataMember(Name = "Purchase Date")]
            public DateTime PurchaseDate { get; set; }
    
            [DataMember(Name = "SQFT")]
            public int SquareFeet { get; set; }
    
            [IgnoreDataMember]
            public bool Active { get; set; }
    
        }
    

    You can see that the class has properties like public int SquareFeet whose names don't match the columns of the file. We use the DataMember attribute to specify which column in the file maps to this prop. Where the name aligns (e.g. "Name") we can skip this

    We have a prop that we are going to use for some internal purpose and won't be read from or written to a file. We annotate this with IgnoreDataMember otherwise Sylvan will complain it couldn't find it in the file

    We also have a column we aren't interested in, and there is no property for this, so it won't be read


    Now we can read the excel file in and query it:

    
    using Sylvan.Data;
    using Sylvan.Data.Excel;
    
    
    namespace ConsoleApp1
    {
        class Program
        {
    
            static async Task Main()
            {
    
                using var edr = ExcelDataReader.Create("c:\\temp\\property.xlsx");
    
                var properties = edr.GetRecords<Property>().ToArray();
    
                var b12345 = properties.FirstOrDefault(p => p.PropertyReference == "B12345");
    
                if (b12345 != default)
                {
                    Console.WriteLine($"Name of prop with ref B12345 is {b12345.Name}");
                }
                else
                {
                    Console.WriteLine($"Prop with ref B12345 is not found");
                }
    
            }
        }
    }
    

    The first line of Main establishes a reader to the file (don't have it open in Excel)

    The second does the read of all 3 properties into an array

    The third runs a query using LINQ, for the first property with a ref of B12345. If you're new to C# you may not have seen this, so here's a quick breakdown:

    p => p.PropertyReference == "B12345"
    

    This is what we call a lambda, or a delegate. It's like a method, code that takes arguments and does something, but it has no name - it doesn't need one because it's only used in one place

    FirstOrDefault is a method that is specified to run on a collection of objects. Its only argument is that you, the programmer, shall pass to it is a tiny method that takes an argument of the type of object in the colelction, and returns a boolean. FirstOrDefault will loop over your collection, calling your lambda once for each element in the collection, and passing in the current element. If your lambda returns true, that is the element that FirstOrDefault will return

    This lambda thus will result in FirstOrDefault inspecting the reference for each object and returning the proeprty with reference B12345; p is the currenty Property in the array, and FirstOrDefault goes sequentially from start to finish, as if like this pseudocode:

    Property FirstOrDefault(properties, lambdaThatTakesAPropertyAndReturnsABool){
    
        foreach(var property in properties){
            if(lambdaThatTakesAPropertyAndReturnsABool(property) == true)
                return property;
        }
        return null; //didn't find
    }
    

    LINQ is a huge topic, out of scope for here. You can make a reasonable start from the docs but they're very confusing to read. Better to find a nice tutorial