Find least significant digit in a double in Python

I have a lot of financial data stored as floating-point doubles, and I'm trying to find the least significant digit so that I can convert the data to integers with an exponent.

All the data is finite, e.g. 1234.23 or 0.0001234, but because it's stored in doubles it can show up as 123.23000000001 or 0.00012339999999, etc.

Is there an easy or proper approach to this or will I just have to botch it?

Solution

You have a couple of options.

First, and preferably, use the standard library's Decimal rather than the built-in float.

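This avoids most float-related errors, as long as you construct the Decimal from a string. A Decimal built from a float inherits the float's binary error, which is why the infamous 0.1 + 0.2 still misbehaves in the second print below:

from decimal import Decimal

print(0.1 + 0.2)  # 0.30000000000000004
print(Decimal(0.1) + Decimal(0.2))  # 0.3000000000000000166533453694
print(Decimal("0.1") + Decimal("0.2"))  # 0.3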
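
To get the integer-plus-exponent form the question asks for, something along these lines may work. This is only a sketch: it assumes each stored double is the closest double to the intended decimal value (so that str() recovers the original digits) and that every value is finite. The helper name to_int_and_exponent is made up for illustration.

```python
from decimal import Decimal
from typing import Tuple

def to_int_and_exponent(value: float) -> Tuple[int, int]:
    """Return (mantissa, exponent) so that value equals mantissa * 10**exponent."""
    # str() yields the shortest decimal text that round-trips to the same double,
    # so 1234.23 becomes "1234.23" rather than its full binary expansion.
    d = Decimal(str(value)).normalize()  # normalize() strips trailing zeros
    sign, digits, exponent = d.as_tuple()
    mantissa = int("".join(map(str, digits)))
    return (-mantissa if sign else mantissa), exponent

print(to_int_and_exponent(1234.23))    # (123423, -2)
print(to_int_and_exponent(0.0001234))  # (1234, -7)
```

The exponent returned is the position of the least significant digit, so -2 means the value is exact to hundredths.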

If that isn't possible, an alternative is to set a tolerance for the number of repeated digits after the decimal point.

For example:

import re

repeated_digit_tolerance = 8  # Change to an appropriate value for your dataset
# Float noise shows up as a long run of 0s or 9s; match any run longer than the tolerance.
repeated_digit_pattern = re.compile(r"([09])\1{%d,}" % repeated_digit_tolerance)

def fix_rounding(num: float) -> float:
    num_str = str(num)
    if "e" in num_str or "." not in num_str:
        return num  # scientific notation, inf and nan are left untouched

    post_dp = num_str.partition(".")[2]
    match = repeated_digit_pattern.search(post_dp)
    if match is None:
        return num  # no suspicious run of digits, nothing to fix

    # Round at the position where the run starts, so a tail of 9s rounds up
    # instead of being truncated.
    return round(num, match.start())

print(0.1 + 0.2) # 0.30000000000000004
print(0.2 + 0.4) # 0.6000000000000001

print(fix_rounding(0.1 + 0.2))  # 0.3
print(fix_rounding(0.2 + 0.4))  # 0.6

This works for the cases above, but Decimal is practically always the better option of the two, provided you construct it from strings rather than floats.
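
If the stored values already carry accumulated error (as in 123.23000000001), the two ideas can be combined: round to a fixed number of significant digits while formatting, then hand the text to Decimal. The sketch below is one possible way to do that. The name clean_decimal and the default of 12 significant digits are assumptions, and the right cutoff depends on how many real digits your data has.

```python
from decimal import Decimal

def clean_decimal(value: float, sig_digits: int = 12) -> Decimal:
    """Round value to sig_digits significant digits and return it as a Decimal."""
    # The "g" format rounds to the requested number of significant digits
    # and drops trailing zeros, which discards the float noise at the tail.
    text = format(value, f".{sig_digits}g")
    return Decimal(text)

print(clean_decimal(0.1 + 0.2))        # 0.3
print(clean_decimal(0.2 + 0.4))        # 0.6
print(clean_decimal(123.23000000001))  # 123.23
```

Any genuine digits beyond the chosen cutoff are lost as well, so pick sig_digits larger than the longest real value in the dataset.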