Search code examples
c#.net-corememorystreamopenxlsx

phone numbers loaded to my c# .net core web app from an excel file has encoding issues


I'm loading a collection of rows from excel into a MemoryStream. the excel contains a phone number column. for example:

phoneNumber="0511234567"

up untill now I had no issues with my code. I got an excel file that causes the folling problem: on the example above, if I will run the follwing command I will get:

phoneNumber.Length() ==> 11 // it should be 10

phoneNumber.Substring(0,2) ==> "0"

phoneNumber.Substring(0,1) ==> ""

phoneNumber.SrartsWith("05") ==> true

in my code I check if phoneNumber.Substring(0,2) == "05" and in this excel file I get false though when I look at the actual value it is starting with "05". As you can see above it seems like the first char is nothing and is counted as if there is a value.

I understand this relates to something with encoding. I tried parsing to bytes and decoding and then encoding to no success. Will appriciate any input.


Solution

  • I could reproduce the behavior by adding a null to the beginning of the phone number:

    var phoneNumber = "\00511234567";
    

    So you probably have a non-printing char at the beginning of the string. To trim control characters from the string you can use the following:

    phoneNumber = new string(phoneNumber.Where(c => !char.IsControl(c)).ToArray());
    

    Update:

    Since the first char has char code 8207 which is not a control character. You could use IsDigit instead and if the phone numbers are allowed to have a hyphen you could add that as a filter condition:

    phoneNumber = new string(phoneNumber.Where(c => char.IsDigit(c) || c == '-').ToArray());