Search code examples
sql-serverscientific-notationopenrowset

Convert scientific notation to float when using OpenRowSet to import a .CSV file


I am using openrowset to import a csv file into SQL Server. One of the columns in the csv file contains numbers in scientific notation (1.08E+05) and the column in the table it is being inserted

By default it is importing the value as 1 and ignoring the .08E+05.

I have tried using cast() and convert() to convert the value directly when the query is executed as well as setting up the datatype in the table as a character string and importing it as such. All of these methods have the same behavior where the .08E+05 is ignored.

Is there a way to have the value imported as 108000 instead of 1 without the .08E+05 without having to change the csv file itself?

Setting up the datatype as a varchar and reading in the csv file appears to have the same effect with the following code:

CREATE TABLE #dataTemp (StartDate datetime, Value varchar(12))

SET @insertDataQuery = 'SELECT Date, CSVValue from OpenRowset(''MSDASQL'', ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=' 
SET @insertDataQuery = @insertDataQuery + 'C:\Data\;'',''SELECT * FROM '+ '11091800.csv' + ''')'

INSERT INTO #dataTemp EXEC(@insertDataQuery)

SELECT * FROM #dataTemp

Not all of the values in the CSV file have the scientific notation and the value without it, e.g. 81000 come across without issue.


Solution

  • For BULK INSERT methodologies I've often found it simpler to first move the data into a table of all varchars, then get rid of extraneous things like quoted delimiters and fix formatting. I remember having a heck of a time getting rid of the scientific notation, you can just play with the varchar table until you get it right. I remember attempting all kinds of precision/scale combinations until I finally found one that was compatible. I think for me it was FLOAT then DECIMAL(24,12)...

    SELECT CONVERT(DECIMAL(24, 12), CONVERT(FLOAT, '1.08E+05'));

    EDIT adding what I did to try to repro and/or demonstrate a less convoluted way.

    I created a very simple CSV file:

    StartDate,Value
    20110808,81000
    20110808,1.08E+05
    

    Then I ran the following code (for some reason I can't get MSDASQL to run on my machine to save my life):

    CREATE TABLE #dataTemp(StartDate DATETIME, Value VARCHAR(32));
    
    BULK INSERT #dataTemp FROM 'C:\data\whatever.csv' 
        WITH (ROWTERMINATOR='\n', FIELDTERMINATOR=',', FIRSTROW = 2);
    
    SELECT * FROM #dataTemp
    GO
    SELECT StartDate, CONVERT(INT, CONVERT(FLOAT, Value)) FROM #dataTemp;
    GO
    DROP TABLE #dataTemp;
    

    Results:

    StartDate               Value
    ----------------------- --------
    2011-08-08 00:00:00.000 81000
    2011-08-08 00:00:00.000 1.08E+05
    
    StartDate               (No column name)
    ----------------------- ----------------
    2011-08-08 00:00:00.000 81000
    2011-08-08 00:00:00.000 108000