Search code examples
sqlsql-serverpython-3.xfloating-accuracypypyodbc

SQL Server float unstable last digits


In SQL Server I create an aggregated column (a combination of other columns that I add, multiple, sum etc) which is of SQL datatype float.

However, when I run the same query multiple times, the last 2 digits of my float are unstable and keep changing.

Below the floats I get with the random last two digits - I try to convert to decimal and then chop off the last two digits.

select round(convert(decimal(20,19), 0.0020042890676442646), 17,1)
select round(convert(decimal(20,19), 0.0020042890676442654), 17,1)

In SSMS the result for both is: 0.0020042890676442600 as expected.

Mind you, the input constants here i took from python, so they might have been modified already. I can't take them from sql directly, as it is incredibly rare to get the calculation anomaly and i don't know how to reproduce it.

But running this via pypyodbc to python, sometimes the result is a python decimal.Decimal type with value 0.0020042890676442700 for the second statement, so it does seem to do rounding rather than truncation.

I have also noticed that the result of the calculation in sql is not always the same, and there is instability there in the last digit of the float - not sure how to test this sytematically though.

The constants casted to floats give:

select convert(float,0.0020042890676442646)
select convert(float,0.0020042890676442654)

Result: 0.00200428906764427.

Wrapped in decimals and rounded:

select round(convert(decimal(20,19), convert(float,0.0020042890676442646)), 17,1)
select round(convert(decimal(20,19), convert(float,0.0020042890676442654)), 17,1)

The result in SSMS is: 0.0020042890676442700 in both cases.

I tried sending back the floats directly instead of casting to decimal, but it seems the two unstable digits are always added at the end when they reach python. Even truncating doesn't help, other random numbers are then added.

It almost seems as if python modifies both float and Decimal during transport in a random manner, or that the instability is in sql already or both.

I tried truncating the np.float64 on the python side like this: Truncating decimal digits numpy array of floats

but as the last float digit in sql can be between e15 and e19 I can't set a consistent truncate level unless i floor everything at e15.


Solution

  • The order of processing of an aggregate is undefined, in the same way that the order of the results of any query are undefined unless you use an ORDER BY clause. In the case of floats, order matters. Order of aggregate processing can be forced using an OVER clause. Here's some code to demonstrate:

     -- demonstrate that order matters when adding floats
    
    declare @a float
    declare @b float
    declare @c float
    declare @d float
    declare @e float
    
    set @a = 1
    set @b = 1
    set @c = 9024055778268167
    
    -- add A to B, and then add C
    -- result is 9024055778268170
    
    set @d = @a + @b
    
    set @e = @d + @c
    
    select cast( @e as decimal(38,0) )
    
    -- add C to B, and then add A
    -- result is 9024055778268168
    
    set @d = @c + @b
    
    set @e = @d + @a
    
    select cast( @e as decimal(38,0) )
    
    -- put these values into a table
    
    create table OrderMatters ( x float )
    
    insert into OrderMatters ( x ) values ( @a )
    insert into OrderMatters ( x ) values ( @b )
    insert into OrderMatters ( x ) values ( @c )
    
    declare @x float
    
    -- add them in ascending order
    -- result is 9024055778268170
    
    select @x = sum(x) over (order by x asc ) from OrderMatters
    
    select cast(@x as decimal(38,0))
    
    -- add them in descending order
    -- result is 9024055778268168
    
    select @x = sum(x) over (order by x desc ) from OrderMatters
    
    select cast(@x as decimal(38,0))