Search code examples
pythonpandasfinancequantitative-finance

Condition for certain values in a for loop (Python pandas)


I have a dataframe of stock returns for several companies on the S&P. I have caluclate their returns.

I want to have a for loop which gets the top 5 (for instance) companies with the highest return values (on average over the period) and lowest 5 companies return values from a large dataframe. Then I want to plot it.

I have a code which gives me all the companies' returns and plots them.

graph_df = pd.DataFrame()

for name in df['Name'].unique():
    print(name) # keeping track of which tickers are included
    name_df = df.copy()[df['Name']==name] # creating a new df, copying in the name columns
    name_df[f"{name}_factor"] = name_df["Difference"]
    if graph_df.empty:
        graph_df = name_df[[f"{name}_factor"]]
    else:
        graph_df = graph_df.join(name_df[f"{name}_factor"])

Plot

This is a snippet from print(graph_df):

             StockRet1   StockRet2   StockRet3   StockRet4   StockRet5
Dates                                                                  
1990-01-31   0.000000    0.000000    0.000000    0.000000    0.000000  
1990-02-28  -0.098820    0.000000   -0.076981    0.019755   -0.012170  
1990-03-30   0.023653    0.000000   -0.011115    0.044565   -0.042621  
1990-04-30   0.010303    0.000000   -0.074512    0.067319   -0.003569  

The plot is very messy as you can see, so I want to only show the highest and the lowest. I appreciate all help!


Solution

  • when you have the entire dataFrame:

    df = df[np.concatenate((df.sum().sort_values().index[:5],df.sum().sort_values().index[-5:]))]
    

    this gives you the top 5 and bottom 5. you just have to plot them.

    to avoid any overlapping between array you could wrap the concatenate in a set() to get unique values.