
How to do math calculation on each number in a specific column


I imported an Excel file using pandas.read_excel() in Python.
I then want to apply a math calculation to each number in a specific column and store the result in a new column, but I get an error:

TypeError: cannot convert the series to

How can I solve this? Below is my code.

import pandas as pd
import math

N_DATA=pd.read_excel(r"path\datajl.xls",index_col='R')
rchdecay=N_DATA['column_name']
rchdcayf=math.exp(-rchdecay*0.008)
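The error can be reproduced without the Excel file. The following is a minimal sketch, assuming a small stand-in DataFrame (the values 1, 2, 3 are hypothetical, not the real data): passing a whole Series to math.exp raises the TypeError, because math.exp expects a single number.

```python
import math
import pandas as pd

# Hypothetical stand-in for the Excel data; the real file is not available here.
N_DATA = pd.DataFrame({'column_name': [1, 2, 3]})
rchdecay = N_DATA['column_name']

try:
    # math.exp needs one float, but it receives a Series with three values
    math.exp(-rchdecay * 0.008)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```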

Solution

  • I think you need numpy.exp, which works element-wise on a Series (math.exp accepts only a single number):

    import numpy as np
    
    rchdecay=N_DATA['column_name']
    rchdcayf=np.exp(-rchdecay*0.008)
    

    Sample:

    import pandas as pd
    import numpy as np
    
    N_DATA = pd.DataFrame({'column_name':[1,2,3]})
    print (N_DATA)
       column_name
    0            1
    1            2
    2            3
    
    rchdcayf=np.exp(-N_DATA['column_name']*0.008)
    print (rchdcayf)
    0    0.992032
    1    0.984127
    2    0.976286
    Name: column_name, dtype: float64
    

    Or apply math.exp, but it is slower:

    rchdcayf1=(-N_DATA['column_name']*0.008).apply(math.exp)
    print (rchdcayf1)
    0    0.992032
    1    0.984127
    2    0.976286
    Name: column_name, dtype: float64
    

    Timings:

    len(df)=3:

    In [61]: %timeit (-N_DATA['column_name']*0.008).apply(math.exp)
    The slowest run took 5.46 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000 loops, best of 3: 209 µs per loop
    
    In [62]: %timeit np.exp(-N_DATA['column_name']*0.008)
    The slowest run took 4.59 times longer than the fastest. This could mean that an intermediate result is being cached.
    10000 loops, best of 3: 168 µs per loop
    

    len(df)=3k:

    In [64]: %timeit np.exp(-N_DATA['column_name']*0.008)
    1000 loops, best of 3: 214 µs per loop
    
    In [65]: %timeit (-N_DATA['column_name']*0.008).apply(math.exp)
    1000 loops, best of 3: 873 µs per loop
    

    Code for timings:

    import pandas as pd
    import numpy as np
    import math
    
    N_DATA = pd.DataFrame({'column_name':[1,2,3]})
    N_DATA = pd.concat([N_DATA]*1000).reset_index(drop=True)
    
    rchdcayf=np.exp(-N_DATA['column_name']*0.008)
    print (rchdcayf)
    
    rchdcayf1=(-N_DATA['column_name']*0.008).apply(math.exp)
    print (rchdcayf1)
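Since the question asks for a new column, the result of np.exp can be assigned straight back to the DataFrame. A minimal sketch, assuming the same three-row sample data as above; the new column name 'rchdcayf' is only an illustrative choice borrowed from the variable name in the question:

```python
import numpy as np
import pandas as pd

# Sample data standing in for the Excel file from the question
N_DATA = pd.DataFrame({'column_name': [1, 2, 3]})

# np.exp is applied element-wise; assigning the result creates the new column.
# The column name 'rchdcayf' is an assumption made for this example.
N_DATA['rchdcayf'] = np.exp(-N_DATA['column_name'] * 0.008)

print(N_DATA)
```

The original column is left unchanged, and the new column keeps the same index, so it also lines up with an index read via index_col.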