Using pandas to deal with timestamps, I am concatening two columns and then convert the result in floating. It appears that when I display the two columns I observe two different results. How can the conversion from string to float can affect the value? Thanks for your help.
Here is the content of the data.csv file
epoch_day,epoch_ns
1533081601,224423000
Here is my test program:
import pandas as pd
pd.options.display.float_format = '{:.10f}'.format
df_mid = pd.read_csv("data.csv")
df_mid['result_1']=df_mid['epoch_day'].astype(str).str.cat(df_mid['epoch_ns'].astype(str), sep =".")
df_mid['result_2'] = df_mid['epoch_day'].astype(str).str.cat(df_mid['epoch_ns'].astype(str), sep =".").astype(float)
print(df_mid)
The result is :
epoch_day epoch_ns result_1 result_2
0 1533081601 224423000 1533081601.224423000 1533081601.2244229317
Thanks for your help
FX
Floating-point numbers are represented in computer hardware as base 2 (binary) fractions. Most decimal fractions cannot be represented exactly as binary fractions.
When you convert your string, python creates a float which is the closest binary fraction for your input.
You can actually see to which decimal number this corresponds by running the following:
from decimal import Decimal
Decimal(1533081601.224423000)
OUTPUT: Decimal('1533081601.224422931671142578125')
You can see the Python documentation for more info https://docs.python.org/2/tutorial/floatingpoint.html