Following is what my dataframe looks like:
symbol time open high low close
0 AAPL 09:35:00 219.19 219.67 218.38 218.64
1 AAPL 09:40:00 218.63 219.55 218.62 218.93
2 AAPL 09:45:00 218.91 219.09 218.27 218.44
3 AAPL 09:50:00 218.44 218.90 218.01 218.65
4 AAPL 09:55:00 218.67 218.79 218.08 218.59
5 AAPL 10:00:00 218.59 219.20 218.16 219.01
I am trying to apply a function from the talib package that takes two arguments, high and low. Following is my attempt, which returns all NaN:
import pandas as pd
import numpy as np
import talib as ta
def f(x):
    return ta.SAR(df.high, df.low, acceleration=0.05, maximum=0.2)
df['PSAR1'] = df.groupby(['symbol']).apply(f)
However, the function works fine without a groupby clause and returns numeric values for the following:
df['PSAR2'] = ta.SAR(df.high, df.low, acceleration=0.05, maximum=0.2)
symbol time open high low close PSAR1 PSAR2
0 AAPL 09:35:00 219.190 219.670 218.380 218.640 NaN NaN
1 AAPL 09:40:00 218.630 219.550 218.620 218.930 NaN 218.380000
2 AAPL 09:45:00 218.910 219.090 218.270 218.440 NaN 219.550000
3 AAPL 09:50:00 218.440 218.900 218.010 218.650 NaN 219.550000
4 AAPL 09:55:00 218.670 218.790 218.080 218.590 NaN 219.396000
5 AAPL 10:00:00 218.590 219.200 218.160 219.010 NaN 219.257400
What am I doing wrong with apply with multiple arguments & groupby?
EDIT: With @bsmith89's help, the following worked.
def f(df):
    return pd.DataFrame(ta.SAR(df.high, df.low, acceleration=0.05, maximum=0.2), columns=['PSAR'])
y = df.groupby(['symbol']).apply(f)
df['PSAR'] = y.PSAR.reset_index(drop=True)
You have written your function to take x as an argument, but then you operate on df instead. Inside f, df refers to the global dataframe, so the group that apply passes in as x is never used.
I haven't tested it, but try rewriting as
def f(df):
    return ta.SAR(df.high, df.low, acceleration=0.05, maximum=0.2)
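To illustrate the pattern without needing talib installed, here is a minimal sketch. The function running_range is a hypothetical stand-in for ta.SAR (two arrays in, one array out), the column name IND is arbitrary, and the rows are made up. The function returns a one-column DataFrame that keeps the group's index, similar to the version in the question's edit, so the result can be assigned back by index instead of relying on reset_index(drop=True), which assumes the rows are already sorted by symbol.

```python
import numpy as np
import pandas as pd


def running_range(high, low):
    """Hypothetical stand-in for ta.SAR: takes two arrays, returns one array."""
    return np.maximum.accumulate(high) - np.minimum.accumulate(low)


def f(group):
    # Operate on the group passed in by apply, not on the outer df.
    values = running_range(group["high"].to_numpy(), group["low"].to_numpy())
    # Keep the group's index so the result lines up with the original rows.
    return pd.DataFrame({"IND": values}, index=group.index)


# Made-up rows for two symbols, interleaved on purpose.
df = pd.DataFrame({
    "symbol": ["AAPL", "XYZ", "AAPL", "XYZ"],
    "high": [10.0, 50.0, 12.0, 49.0],
    "low": [9.0, 48.0, 9.5, 47.0],
})

# group_keys=False keeps the original row index on the combined result.
result = df.groupby("symbol", group_keys=False)[["high", "low"]].apply(f)
df["IND"] = result["IND"]  # column assignment aligns on the index
```

With the real library, the body of f would presumably call ta.SAR(group["high"], group["low"], acceleration=0.05, maximum=0.2) in place of running_range; that substitution is untested here.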