I imported an Excel file using pandas.read_excel() in Python.
Then I want to apply a math calculation to each number in a specific column and store the result in a new column, but I get an error:
TypeError: cannot convert the series to
How can I solve this? Below is my code.
import pandas as pd
import math
N_DATA=pd.read_excel(r"path\datajl.xls",index_col='R')
rchdecay=N_DATA['column_name']
rchdcayf=math.exp(-rchdecay*0.008)
I think you need numpy.exp, which works element-wise on a Series (math.exp accepts only a single number):
import numpy as np
rchdecay=N_DATA['column_name']
rchdcayf=np.exp(-rchdecay*0.008)
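To see why the original call fails, here is a minimal sketch using a small made-up Series in place of the Excel column. math.exp tries to convert its argument to a single float, which pandas refuses to do for a Series with more than one element, while np.exp is applied to every element:

```python
import math

import numpy as np
import pandas as pd

# Made-up stand-in for the column read from the Excel file
s = pd.Series([1, 2, 3], name="column_name")

try:
    # math.exp expects one number, so pandas is asked to turn the
    # whole Series into a single float and raises TypeError
    math.exp(-s * 0.008)
    math_exp_failed = False
except TypeError as err:
    math_exp_failed = True
    print("math.exp failed:", err)

# np.exp is vectorized and returns a Series of the same length
result = np.exp(-s * 0.008)
print(result)
```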
Sample:
import pandas as pd
import numpy as np
N_DATA = pd.DataFrame({'column_name':[1,2,3]})
print (N_DATA)
   column_name
0            1
1            2
2            3
rchdcayf=np.exp(-N_DATA['column_name']*0.008)
print (rchdcayf)
0 0.992032
1 0.984127
2 0.976286
Name: column_name, dtype: float64
Or use apply with math.exp, but it is slower:
rchdcayf1=(-N_DATA['column_name']*0.008).apply(math.exp)
print (rchdcayf1)
0 0.992032
1 0.984127
2 0.976286
Name: column_name, dtype: float64
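Since the question asks for a new column, either result can be assigned straight back to the DataFrame. A short sketch with the same sample data; the new column name 'rchdcayf' is only an illustrative choice:

```python
import numpy as np
import pandas as pd

N_DATA = pd.DataFrame({'column_name': [1, 2, 3]})

# Assigning to a new key adds the computed values as a column
# ('rchdcayf' is a hypothetical name chosen for this example)
N_DATA['rchdcayf'] = np.exp(-N_DATA['column_name'] * 0.008)
print(N_DATA)
```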
Timings:
len(df)=3
In [61]: %timeit (-N_DATA['column_name']*0.008).apply(math.exp)
The slowest run took 5.46 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 209 µs per loop
In [62]: %timeit np.exp(-N_DATA['column_name']*0.008)
The slowest run took 4.59 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 168 µs per loop
len(df)=3k:
In [64]: %timeit np.exp(-N_DATA['column_name']*0.008)
1000 loops, best of 3: 214 µs per loop
In [65]: %timeit (-N_DATA['column_name']*0.008).apply(math.exp)
1000 loops, best of 3: 873 µs per loop
Code for timings:
import pandas as pd
import numpy as np
import math
N_DATA = pd.DataFrame({'column_name':[1,2,3]})
N_DATA = pd.concat([N_DATA]*1000).reset_index(drop=True)
rchdcayf=np.exp(-N_DATA['column_name']*0.008)
print (rchdcayf)
rchdcayf1=(-N_DATA['column_name']*0.008).apply(math.exp)
print (rchdcayf1)
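%timeit is an IPython magic, so outside IPython a rough comparison can be made with the standard timeit module instead. This is only a sketch: the helper names with_numpy and with_apply and the repeat count are arbitrary choices, and the absolute numbers will differ by machine and library version.

```python
import math
import timeit

import numpy as np
import pandas as pd

# Same 3k-row frame as in the timing code above
N_DATA = pd.DataFrame({'column_name': [1, 2, 3]})
N_DATA = pd.concat([N_DATA] * 1000).reset_index(drop=True)


def with_numpy():
    # Vectorized: one call over the whole column
    return np.exp(-N_DATA['column_name'] * 0.008)


def with_apply():
    # Calls math.exp once per element
    return (-N_DATA['column_name'] * 0.008).apply(math.exp)


n = 100  # arbitrary number of repetitions
t_numpy = timeit.timeit(with_numpy, number=n) / n
t_apply = timeit.timeit(with_apply, number=n) / n

print(f"np.exp:           {t_numpy * 1e6:.0f} µs per call")
print(f".apply(math.exp): {t_apply * 1e6:.0f} µs per call")
```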