Search code examples
pythonnumpydataframelogarithmarithmetic-expressions

What's fastest way to perform calculation based on values of other columns in a dataframe?


I have df, and I have to apply this formula:

enter image description here

to every row, then add the new series (as a new column).

Right now my code is :

 new_col = deque()
    for i in range(len(df)):
        if i < n:
            new_col.append(0)
        else:
            x = np.log10(np.sum(ATR[i-n:i])/(max(high[i-n:i])-min(low[i-n:i])))
            y = np.log10(n)
            new_col.append(100 * x/y)
    df['new_col'] = pd.DataFrame({"new_col" : new_col})

ATR, high, low are obtained from columns of my existing df. But this method is very slow. Is there a faster way to perform the task? Thanks.


Solution

  • Without sample data, I can't test the following, but it should work:

    tmp_df = df.rolling(n).agg({'High':'max', 'Low':'min', 'ATR':'sum'})
    
    df['new_col'] = (100*np.log10(tmp_df['ATR'])) / (tmp_df['High'] - tmp_df['Low']) / np.log10(n)
    
    df['new_col'] = df['new_col'].shift().fillna(0)