I am looking for an efficient Python implementation of a function that takes a decimal-formatted string, e.g.
2.05000
200
0.012
and returns a tuple of two integers representing the significand and exponent of the input in base-10 floating point format, e.g.
(205,-2)
(2,2)
(12,-3)
List comprehension would be a nice bonus.
I have a gut feeling that there exists an efficient (and possibly Pythonic) way of doing this but it eludes me...
import pandas as pd
import numpy as np
ser1 = pd.Series(['2.05000', '- 2.05000', '00 205', '-205', '-0', '-0.0', '0.00205', '0', np.nan])
# remove all white spaces
ser1 = ser1.str.replace(' ', '', regex=False)
# split at the decimal point; the resulting columns are labelled 0 and 1
parts = ser1.str.split('.', expand=True)
# strip leading zeros (even those after a minus sign)
parts[0] = parts[0].str.replace(r'^(-?)0+', r'\1', regex=True)
parts[1] = parts[1].fillna('') # fill non-existent decimal places
exponents = -parts[1].str.len()
parts[0] += parts[1] # append decimal places to the digits before the decimal point
parts[1] = parts[0].str.rstrip('0') # strip trailing zeros
exponents += parts[0].str.len() - parts[1].str.len()
parts.loc[parts[1].isin(['', '-']), 1] = '0' # nothing left but a sign means zero
significands = parts[1].astype(float)
df2 = pd.DataFrame({'exponent': exponents, 'significand': significands})
df2
Input:
0 2.05000
1 - 2.05000
2 00 205
3 -205
4 -0
5 -0.0
6 0.00205
7 0
8 NaN
dtype: object
Output:
   exponent  significand
0        -2          205
1        -2         -205
2         0          205
3         0         -205
4         0            0
5         0            0
6        -5          205
7         0            0
8       NaN          NaN
[9 rows x 2 columns]
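As a cross-check on the table above, here is a minimal per-string sketch that uses decimal.Decimal from the standard library instead of pandas string methods. The name sig_exp_decimal is hypothetical and not part of the code above. It assumes that removing all spaces is acceptable, as the pandas code does, and that the input has no more significant digits than the current decimal context allows (28 by default), because normalize() rounds to that precision.

```python
from decimal import Decimal


def sig_exp_decimal(num_str):
    """Hypothetical helper: return (significand, exponent) for a decimal string."""
    # Remove spaces, parse, and let normalize() drop trailing zeros.
    d = Decimal(num_str.replace(' ', '')).normalize()
    sign, digits, exp = d.as_tuple()
    # digits is a tuple of ints, e.g. (2, 0, 5); join it into one integer.
    sig = int(''.join(map(str, digits)))
    return (-sig if sign else sig), exp


inputs = ['2.05000', '- 2.05000', '00 205', '-205', '-0', '-0.0', '0.00205', '0']
results = [sig_exp_decimal(s) for s in inputs]
print(results)
```

Missing values such as NaN are not handled here; they would have to be filtered out before the list comprehension.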
Here's a straightforward string-processing solution.
def sig_exp(num_str):
    parts = num_str.split('.', 2)
    decimal = parts[1] if len(parts) > 1 else ''
    exp = -len(decimal)
    digits = parts[0].lstrip('0') + decimal
    trimmed = digits.rstrip('0')
    exp += len(digits) - len(trimmed)
    sig = int(trimmed) if trimmed else 0
    return sig, exp
>>> for x in ['2.05000', '200', '0.012', '0.0']:
...     print(sig_exp(x))
...
(205, -2)
(2, 2)
(12, -3)
(0, 0)
I'll leave the handling of negative numbers as an exercise for the reader.
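One possible take on that exercise, as a sketch rather than a definitive answer: strip an optional leading sign, run the same string processing on the rest, and negate the significand afterwards. The wrapper name signed_sig_exp is hypothetical, and the sketch assumes the sign, if present, is the first non-blank character. sig_exp is repeated so the snippet runs on its own.

```python
def sig_exp(num_str):
    # Same string processing as above, for unsigned input.
    parts = num_str.split('.', 2)
    decimal = parts[1] if len(parts) > 1 else ''
    exp = -len(decimal)
    digits = parts[0].lstrip('0') + decimal
    trimmed = digits.rstrip('0')
    exp += len(digits) - len(trimmed)
    sig = int(trimmed) if trimmed else 0
    return sig, exp


def signed_sig_exp(num_str):
    """Hypothetical wrapper: accept an optional leading '+' or '-'."""
    s = num_str.strip()
    negative = s.startswith('-')
    if s[:1] in ('+', '-'):
        s = s[1:]  # drop exactly one sign character
    sig, exp = sig_exp(s)
    return (-sig if negative else sig), exp


# The list comprehension asked for in the question.
values = [signed_sig_exp(x) for x in ['2.05000', '-2.05000', '200', '-0.012', '-0.0', '+12.50']]
print(values)
```

Negative zero comes out as (0, 0), since an integer significand has no signed zero.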