How to put column names into data frame cells based on a condition in pandas


I have a dataframe like this:

         ADR     WD      EF    INF    SSI   DI
0        1.0    NaN     NaN    NaN    NaN  NaN
1        NaN    NaN     1      1      NaN  NaN
2        NaN    NaN     NaN    NaN    1    NaN
3        NaN    1       1      1      NaN  NaN
4        NaN    1.0     NaN    NaN    NaN  NaN

I want the result to be like this:

[["ADR"],["EF","INF"],["SSI"],["WD","EF","INF"],["WD"]]

As you can see, each 1 has been replaced by the name of its column, and the names for each row have been collected into a list.

I have looked at this post, but it did not help me because the names there are changed statically.

Thanks:)


Solution

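  • Use stack to reshape to long format, keep the cells equal to 1, then collect the column names per row:

    # Long format: one row per (row index, column name, value) cell
    df1 = df.stack().reset_index()
    df1.columns = ['a','b','c']  # a = original row index, b = column name, c = value
    # Keep only the cells that contain 1
    df1 = df1[df1['c'] == 1]

    # Collect the column names of each original row into a list
    a = df1.groupby('a')['b'].apply(list).tolist()
    print(a)
    # [['ADR'], ['EF', 'INF'], ['SSI'], ['WD', 'EF', 'INF'], ['WD']]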
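  • For comparison, here is a self-contained sketch of a different approach (not the one above): build a boolean mask and index the column names row by row. The frame is rebuilt from the table in the question, and the names mask and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (NaN marks the empty cells)
df = pd.DataFrame(
    {
        "ADR": [1.0, np.nan, np.nan, np.nan, np.nan],
        "WD": [np.nan, np.nan, np.nan, 1.0, 1.0],
        "EF": [np.nan, 1.0, np.nan, 1.0, np.nan],
        "INF": [np.nan, 1.0, np.nan, 1.0, np.nan],
        "SSI": [np.nan, np.nan, 1.0, np.nan, np.nan],
        "DI": [np.nan, np.nan, np.nan, np.nan, np.nan],
    }
)

# True where a cell equals 1, False elsewhere (including NaN)
mask = df.eq(1)

# For each row, select the column names where the mask is True
result = [list(df.columns[row]) for row in mask.to_numpy()]
print(result)
```

    One difference worth knowing: a row that contains no 1 at all yields an empty list here, while the stack/groupby version leaves such a row out of the result entirely. Which behaviour is preferable depends on whether the output has to stay aligned with the original rows.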