python, pandas, loops, top-n

Multiply the top h values by k for each row in a pandas DataFrame (Python)


I have a dataframe with dates as the row index and values in the columns. To give an idea, the df looks like this:

            c1  c2  c3  c4
12/12/2016  38  10   1   8
12/11/2016  44  12  17  46
12/10/2016  13   6   2   7
12/09/2016   9  16  13  26

I am trying to find a way to go through each row and multiply only the top 2 values by k = 3. The result should go in a new column of the existing df. Any suggestion or hint is highly appreciated!

Thanks!


Solution

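Both options below sum the two largest values in each row and multiply that sum by 3.

  • sorted (after sorting each row, the top 2 values are the last 2 columns; result_type='expand' is needed on pandas 0.23+ so that apply returns a DataFrame rather than a Series of lists)

    df.assign(newcol=df.apply(sorted, axis=1, result_type='expand').iloc[:, -2:].sum(axis=1) * 3)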
    
                c1  c2  c3  c4  newcol
    12/12/2016  38  10   1   8     144
    12/11/2016  44  12  17  46     270
    12/10/2016  13   6   2   7      60
    12/09/2016   9  16  13  26     126
    

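  • partition (requires import numpy as np; np.partition moves the 2 largest values of each row into the last 2 positions without fully sorting)

    df.assign(newcol=np.partition(df, -2)[:, -2:].sum(axis=1) * 3)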
    
                c1  c2  c3  c4  newcol
    12/12/2016  38  10   1   8     144
    12/11/2016  44  12  17  46     270
    12/10/2016  13   6   2   7      60
    12/09/2016   9  16  13  26     126
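
For arbitrary h and k, the partition idea can be wrapped in a small function. The following is only a sketch: the name top_h_times_k and its parameters are hypothetical, not part of pandas or NumPy, and it assumes every column is numeric and that 1 <= h <= number of columns. The sample frame is rebuilt from the question so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "c1": [38, 44, 13, 9],
        "c2": [10, 12, 6, 16],
        "c3": [1, 17, 2, 13],
        "c4": [8, 46, 7, 26],
    },
    index=["12/12/2016", "12/11/2016", "12/10/2016", "12/09/2016"],
)


def top_h_times_k(frame, h=2, k=3):
    """Hypothetical helper: sum of the h largest values per row, times k."""
    values = frame.to_numpy()
    # np.partition puts the h largest values of each row in the last h
    # positions (in no particular order), which is all a sum needs.
    top = np.partition(values, -h, axis=1)[:, -h:]
    return pd.Series(top.sum(axis=1) * k, index=frame.index)


result = df.assign(newcol=top_h_times_k(df, h=2, k=3))

# Cross-check with Series.nlargest, which is slower row by row
# but states the intent more directly.
check = df.apply(lambda row: row.nlargest(2).sum(), axis=1) * 3

print(result)
```

On large frames the partition version should generally be faster than a row-wise apply, since it works on the whole array at once; the nlargest line is kept here only as a readable cross-check.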