Search code examples
c#exceloledboledbdataadapter

How to avoid non-float values removed from DataTable with data from Excel file?


I'm using the following code for get a DataTable variable with the information of a Sheet from a Excel file:

// Just a few examples about connectionString and Excel's file path:
string pathFile = @"C:\Windows\MyFolder\myExcelSample.xlsx";
string excelConnString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + pathFile + ";Extended Properties='Excel 12.0;HDR=YES;IMEX=1;';";

using (OleDbConnection objConn = new OleDbConnection(cadenaConexion))
{
    objConn.Open();
    OleDbCommand cmd = new OleDbCommand();
    OleDbDataAdapter oleda = new OleDbDataAdapter();
    DataSet ds = new DataSet();
    DataTable dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
    string sheetName = string.Empty;

    if (dt != null)
    {
        var tempDataTable = (from dataRow in dt.AsEnumerable()
                             where !dataRow["TABLE_NAME"].ToString().Contains("FilterDatabase")
                             select dataRow).CopyToDataTable();
        dt = tempDataTable;
        sheetName = dt.Rows[TABLE_ROW]["TABLE_NAME"].ToString();
    }

    cmd.Connection = objConn;
    cmd.CommandType = CommandType.Text;
    cmd.CommandText = "SELECT * FROM [" + sheetName + "]";
    oleda = new OleDbDataAdapter(cmd);
    oleda.Fill(ds, "Fact_TEMP");
    tbl_temporal = ds.Tables["Fact_TEMP"];
    objConn.Close();
}

The Excel file has a column called "Document No#" which this code says is float type, but, this column has values that are not float.

Here are a few examples:

444036
CO27_009734
CO31_050656
444041
444041
CO24_102377
CO64_000021
444043
CO24_102378
444044
444044
CO24_102380
CO24_102381
444046
444046444049
444050
CO24_102384

And the values that are not float-type are removed in the tbl_temporal variable.

Which other ways are for solve this situation that doesn't envolve user update the type of the column (which by default is General) in the Excel file?


A few information I have to share:

  • Excel file has a ".xlsx" extension and has 40340 rows.
  • Excel file cannot be modified. This Excel file is supplied by a user which just upload his/her Excel file to our System and our System has to solve the issues and it shouldn't modify the schema or its data.

Solution

  • After following this comment of the accepted answer:

    • In your connection string, change the value of HDR=YES to HDR=NO.

    I have change the way I get the Excel info in the DataTable variable for use the first row of the DataTable as the name of the columns in the Excel file.

    This is the code I use for it:

    // Add columns to "tbl_result" DataTable.
    for (int colCount = 0; colCount < tbl_excel.Columns.Count; colCount++)
    {
        tbl_result.Columns.Add(new DataColumn()
        {
            DataType = tbl_excel.Columns[colCount].DataType,
            ColumnName = tbl_excel.Rows[0][colCount].ToString(),
            AllowDBNull = true
        });
    }
    
    // Remove row "which is actually the header  in the Excel file".
    tbl_excel.Rows.RemoveAt(0);
    
    // Set the name of the table.
    tbl_result.TableName = tbl_excel.TableName;
    
    // Import rows.
    foreach (DataRow row in tbl_excel.Rows)
    {
        tbl_result.Rows.Add(row.ItemArray);
    }
    

    I have the check a few times the Excel file because I was getting this error:

    The given value of type String from the data source cannot be converted to type float of the specified target column.

    I created the table in the SQL Server Database using the "Import Data" feature using the Excel file, but, what I didn't know is that some columns in the Excel file has values that doesn't correspond with the data type of the columns migrated in the SQL Server table.

    So, I changed these columns (which are the problematic ones):

    -- [Document No#] was float before execute this line.
    ALTER TABLE Fact_TEMP ALTER COLUMN [Document No#] NVARCHAR(255)
    
    -- [G/L Account No#] was float before execute this line.
    ALTER TABLE Fact_TEMP ALTER COLUMN [G/L Account No#] NVARCHAR(255)
    

    And after try again uploading the Excel file (which has 40340 rows), the upload worked without any problem.


    The TL;DR version is:

    • Change the value of HDR=YES to HDR=NO in your connection string.
    • Check the values of the Excel file for check whether contains invalid data (i. e. a float column which has NULL or other DataType values).
    • Check that the DataType of the SQL Server Database table has the same DataType expected.