Search code examples
c#csvdatetimelinq-to-excel

LinqToExcel C# Automatically casting CSV values to DATETIME


I am currently using LinqToExcel to parse through a valid CSV file, but when I reach a particular column (which is not a date), it gets automatically casted into a datetime variable. Example:

2/010/114

will get casted to

cc = "2/10/0114 12:00:00 AM"

This is the code I am using:

var csv = new ExcelQueryFactory(filepath);

var records = from c in csv.Worksheet(0)
              let rows = new 
              {
                  cc = c[9].ToString(),
              }
              select rows;

I have also tried grabbing the results in the most raw format with the same issue:

var rawrecords = from c in csv.Worksheet()
                 select c;

Is this by design? Or am I doing something wrong? How can I ensure the original string is preserved? Seems like a bit of a weird design..


Solution

  • I've not used Linq2Excel before, but based on the example usage info on their GitHub I'd say the following ought to work:

    var csv = new ExcelQueryFactory(filepath);
    
    var records = from c in csv.Worksheet(0)
                  let rows = new 
                  {
                      cc = c[9].Cast<string>(),
                  }
                  select rows;
    

    Update

    After trying this code in LinqPad, I can confirm that it doesn't work, and that's because there doesn't appear to be any way of forcing it to interpret that cell as string, it looks like a DateTime so it's interpreting it that way.

    The only way I can see to get it to do what you want is to en-quote those values, e.g.:

    A,B
    12,"2/010/114"
    

    This does then force it to read in correctly.

    On a simpler note, I'd wonder whether you really need to complexity of LinqToExcel and could just perhaps get away with reading the file yourself manually?


    Example Reader

    I've put together the following very simple and hacky reader:

    void Main()
    {
        var reader = new CsvReader();
        reader.Read(@"C:\users\clint\desktop\test.csv", 10, 5);
        reader.GetDataAtPosition(1,1).Dump();
        reader.GetDataAtPosition(2,2).Dump();
        reader.GetDataAtPosition(2,2, s => s.Split('/')).Dump();
    }
    
    // Define other methods and classes here
    public class CsvReader
    {
        private string[,] _data;
        
        // Take a file, and estimated col and row counts (over-inflate these if needed to ensure the file can be read)
        public void Read(string file, int cols, int rows)
        {
            _data = new string[rows,cols];
            GC.Collect(2);
            var line = 0;
            var col = 0;
            using (var stream = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                using (var reader = new StreamReader(stream))
                {
                    while (!reader.EndOfStream)
                    {
                        var lineIn = reader.ReadLine();
                        var inQuotes = false;
                        var thisCellRaw = "";
                        foreach (var ch in lineIn.TrimStart().TrimEnd())
                        {
                            if (ch == '"')
                            {
                                inQuotes = !inQuotes;
                                continue;
                            }
    
                            if (ch == ',' && !inQuotes)
                            {
                                _data[line, col] = thisCellRaw;
                                thisCellRaw = "";
                                col++;
                                continue;
                            }
                            
                            thisCellRaw += ch;
                        }
                        if (!string.IsNullOrEmpty(thisCellRaw))
                        {
                            _data[line, col] = thisCellRaw;
                        }
                        line++;
                        col = 0;
                    }
                }
            }
        }
    
        public string GetDataAtPosition(int row, int col)
        {
            return GetDataAtPosition<string>(row,col);
        }
        
        public T GetDataAtPosition<T>(int row, int col, Func<string,T> transform = null)
        {
            row = row - 1;
            col = col - 1;
            var item = _data[row,col];
            if (item == null) throw new KeyNotFoundException("No data at that position");
            return (transform ?? ((s) => (T)Convert.ChangeType(item, typeof(T))))(item);
        }
    }
    

    It's not the most efficient, and shouldn't be used in production code without some serious cleanup and error handling, but this should help get you there; you could even use some form of inference e.g. int.TryParse to test whether something should be seen as an int etc.