python pandas-groupby ta-lib

Group-by for function that generates more than one variable


I am trying to calculate some per-group indicators using the technical analysis library TA-Lib: https://mrjbq7.github.io/ta-lib/

Some of the functions such as AROON will generate two variables, AR_UP and AR_DOWN.

Without doing group-by, I would use the following:

dft['AR_UP'], dft['AR_DOWN'] = ta.AROON(dft['High'], dft['Low'], 14)

And it would generate AR_UP and AR_DOWN in dft

However, when I try to apply a group-by:

grouped=dft.groupby(["StockCode"]).apply(lambda x: (ta.AROON(x['High'], x['Low'], 14)))

This gives me grouped as:

StockCode
ABA    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
ABP    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
ABW    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
ACQ    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
ACU    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
                             ...                        
WTL    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
WZR    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
XPL    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
YBR    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
Z1P    ([nan, nan, nan, nan, nan, nan, nan, nan, nan,...
Length: 282, dtype: object

Could anyone tell me where I'm going wrong?

I would also like to reassign this back to the original dataframe, so something like:

grouped=(grouped.reset_index()
.groupby("StockCode",as_index=False)
.apply(lambda x: x.assign (AR_UP, AR_DOWN=(ta.AROON(x['High'], x['Low'], 14))))
.set_index('index') )

Would it be possible?

Thanks!


Solution

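  • I couldn't install TA-Lib, so I tested this with a stand-in function that also returns two values; the same pattern may work for ta.AROON too. The problem is that groupby followed by apply doesn't handle multiple return values well: a returned tuple is stored as one object per group, which is the object-dtype Series you are seeing. Wrapping the two outputs in a DataFrame inside the applied function avoids this.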

    import numpy as np
    import pandas as pd
    
    dft={'High': range(11,21),
         'Low': range(1,11),
         'StockCode':np.append(np.repeat("A", 6), np.repeat("B", 4))}
    
    dft=pd.DataFrame(dft)
    
    dft=dft.sort_values("StockCode")
    
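    # Stand-in for ta.AROON: any function that returns two outputs
    def aroon(x, y, n):
        return x/sum(x)*n, y/sum(y)*n
    
    def my(x):
        # Wrap both outputs in one DataFrame that keeps the group's row index
        a, b=aroon(x["High"], x["Low"], 2)
        return pd.DataFrame({"UP":a, "LOW":b})
    
    # group_keys=False keeps the original row index (needed on pandas >= 2.0)
    grp=dft.groupby("StockCode", group_keys=False).apply(my)
    
    # Align on the row index and append the new columns
    dat=pd.concat([dft, grp], axis=1)
    
    # Column order in the printout depends on the Python version
    print(dat)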
    
       High  Low StockCode       LOW        UP
    0    11    1         A  0.095238  0.271605
    1    12    2         A  0.190476  0.296296
    2    13    3         A  0.285714  0.320988
    3    14    4         A  0.380952  0.345679
    4    15    5         A  0.476190  0.370370
    5    16    6         A  0.571429  0.395062
    6    17    7         B  0.411765  0.459459
    7    18    8         B  0.470588  0.486486
    8    19    9         B  0.529412  0.513514
    9    20   10         B  0.588235  0.540541
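
For the second part of the question (writing both outputs back into the original frame), here is a minimal sketch that loops over the groups instead of using apply, so it does not depend on how a given pandas version treats the grouping column inside apply. `two_outputs` is a hypothetical stand-in for `ta.AROON`, built from rolling min/max only so the example runs without TA-Lib; it is not the Aroon formula. It returns its outputs as (down, up) because, as far as I recall, the TA-Lib docs list AROON's outputs as `aroondown, aroonup`, so it may be worth checking the order of the assignment in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for ta.AROON: returns two arrays, one value per row.
# It is NOT the Aroon indicator, just something with the same output shape.
def two_outputs(high, low, n):
    down = low.rolling(n).min().to_numpy()
    up = high.rolling(n).max().to_numpy()
    return down, up

dft = pd.DataFrame({
    "High": range(11, 21),
    "Low": range(1, 11),
    "StockCode": ["A"] * 6 + ["B"] * 4,
})

parts = []
for code, group in dft.groupby("StockCode"):
    # Unpack the two outputs and attach them as columns of this group
    down, up = two_outputs(group["High"], group["Low"], 2)
    parts.append(group.assign(AR_DOWN=down, AR_UP=up))

# Stitch the groups back together and restore the original row order
out = pd.concat(parts).sort_index()
print(out)
```

With the real library, the call inside the loop would presumably become `ta.AROON(group["High"], group["Low"], 14)`, assuming each group is already sorted by date.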