
Combine DataFrame row values into a list without NaN


I want to combine the values in each row into a list and add that list as a new column on the DataFrame. Some columns contain NaN, though. How can I leave NaN out of the lists?

df

A  B  C
2  3  NaN
1  3  1
3  4  6

Expected output

A  B  C    D
2  3  NaN  [2, 3]
1  3  1    [1, 3, 1]
3  4  6    [3, 4, 6]

I know we can use dropna() on a DataFrame, but how can it be used when combining row values like this?

My code is below. I tried calling dropna() after tolist() and on df, but neither gave the correct output.

df['D'] = df.values.tolist()
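
For reference, here is a minimal sketch that reproduces the problem. The frame is rebuilt from the table above, and `rows` is only an illustrative name. It shows that the NaN in column C is carried straight into the list. As far as I can tell, the two dropna() attempts fail for different reasons: tolist() returns plain Python lists, which have no dropna() method, and df.dropna() removes the whole first row instead of the single missing value.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({"A": [2, 1, 3], "B": [3, 3, 4], "C": [np.nan, 1, 6]})

# .values is a float array here (NaN forces float), so every row keeps
# three entries and the missing value stays in the list as nan.
rows = df.values.tolist()
print(rows[0])

# df.dropna() works on whole rows: the first row is dropped entirely.
print(len(df.dropna()))
```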

Solution

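  • Stack (this eliminates NaN), group by the index of the original frame, and then aggregate to list. Because column C holds NaN, the stacked values are floats, so the lists contain 2.0 rather than 2.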

    df['D'] = df.stack().groupby(level=0).agg(list)

       A  B    C                D
    0  2  3  NaN       [2.0, 3.0]
    1  1  3  1.0  [1.0, 3.0, 1.0]
    2  3  4  6.0  [3.0, 4.0, 6.0]
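
A self-contained sketch of the same idea, for anyone who wants to run it. Two things here are my additions rather than part of the answer: the explicit `.dropna()` after `stack()`, which I believe keeps the result the same on pandas versions where `stack()` no longer drops NaN by default, and the `D_apply` column (a hypothetical name), which shows a row-wise alternative using `apply`.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({"A": [2, 1, 3], "B": [3, 3, 4], "C": [np.nan, 1, 6]})

# stack() gives one value per (row label, column) pair. The explicit
# dropna() is an assumption-proofing step so NaN is removed either way.
stacked = df.stack().dropna()

# Group by the original row label (index level 0) and collect each
# group into a list; the assignment aligns on the row index.
df["D"] = stacked.groupby(level=0).agg(list)

# Alternative, not from the answer: drop NaN one row at a time.
df["D_apply"] = df[["A", "B", "C"]].apply(
    lambda row: row.dropna().tolist(), axis=1
)

print(df)
```

One difference to be aware of: a row that is entirely NaN disappears from the stacked result, so `D` ends up NaN for that row, whereas the `apply` version gives an empty list.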