Tags: python, performance, floating-point-conversion

Extracting significand and exponent for base-10 representation from decimal formatted string


I am looking for an efficient Python implementation of a function that takes a decimal-formatted string, e.g.

2.05000
200
0.012

and returns a tuple of two integers representing the significand and exponent of the input in base-10 floating point format, e.g.

(205,-2)
(2,2)
(12,-3)

List comprehension would be a nice bonus.

I have a gut feeling that there exists an efficient (and possibly Pythonic) way of doing this but it eludes me...


Solution applied to pandas

import pandas as pd
import numpy as np
ser1 = pd.Series(['2.05000', '- 2.05000', '00 205', '-205', '-0', '-0.0', '0.00205', '0', np.nan])

# remove all white spaces
ser1 = ser1.str.replace(' ', '')
parts = ser1.str.split('.', expand=True)

# strip leading zeros (even those after a minus sign)
parts[0] = parts[0].str.replace(r'^(-?)0+', r'\1', regex=True)

parts[1] = parts[1].fillna('')                # fill non-existent decimal places
exponents = -parts[1].str.len()
parts[0] += parts[1]                          # append decimal places to digits before the decimal point

parts[1] = parts[0].str.rstrip('0')           # strip trailing zeros

exponents = exponents + parts[0].str.len() - parts[1].str.len()

parts.loc[parts[1].isin(['', '-']), 1] = '0'  # all-zero inputs collapse to '' or '-'
significands = parts[1].astype(float)

df2 = pd.DataFrame({'exponent': exponents, 'significand': significands})
df2

Input:

0      2.05000
1    - 2.05000
2       00 205
3         -205
4           -0
5         -0.0
6      0.00205
7            0
8          NaN
dtype: object

Output:

   exponent  significand
0        -2          205
1        -2         -205
2         0          205
3         0         -205
4         0            0
5         0            0
6        -5          205
7         0            0
8       NaN          NaN

[9 rows x 2 columns]

Solution

  • Here's a straightforward string-processing solution.

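    def sig_exp(num_str):
        # split into integer and fractional digits at the first decimal point
        parts = num_str.split('.', 1)
        decimal = parts[1] if len(parts) > 1 else ''
        exp = -len(decimal)                # each fractional digit lowers the exponent by one
        digits = parts[0].lstrip('0') + decimal
        trimmed = digits.rstrip('0')
        exp += len(digits) - len(trimmed)  # each stripped trailing zero raises it by one
        sig = int(trimmed) if trimmed else 0
        return sig, exp
    
    >>> for x in ['2.05000', '200', '0.012', '0.0']:
    ...     print(sig_exp(x))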
    
    (205, -2)
    (2, 2)
    (12, -3)
    (0, 0)
    

    I'll leave the handling of negative numbers as an exercise for the reader.
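
For anyone who wants that exercise filled in, here is one possible sketch. It is not part of the original answer: the name `sig_exp_signed` is made up for illustration, and it assumes the input is a plain decimal string with at most one leading sign and no exponent notation. It peels off the sign, runs the same strip-and-count logic, and reapplies the sign at the end. A list comprehension then covers the "bonus" from the question.

```python
def sig_exp_signed(num_str):
    """Return (significand, exponent) for a plain decimal string.

    Hypothetical helper, assuming input like '-2.05000' or '+0.012':
    at most one leading sign, no exponent notation, no inner spaces.
    """
    s = num_str.strip()
    negative = s.startswith('-')
    if s[:1] in ('+', '-'):
        s = s[1:]                          # drop the sign; reapplied below
    whole, _, frac = s.partition('.')
    exp = -len(frac)                       # fractional digits lower the exponent
    digits = whole.lstrip('0') + frac
    trimmed = digits.rstrip('0')
    exp += len(digits) - len(trimmed)      # stripped trailing zeros raise it
    if not trimmed:                        # every digit was zero
        return 0, 0
    sig = int(trimmed)
    return (-sig if negative else sig), exp


inputs = ['2.05000', '-2.05000', '200', '-205', '0.012', '-0.0', '0.00205']
results = [sig_exp_signed(x) for x in inputs]
print(results)
```

Negative zero comes back as `(0, 0)` because Python integers have no signed zero. Malformed input such as `'1.2.3'` raises `ValueError` from `int()` rather than being silently accepted.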