Search code examples
python-3.xpandasconcatenation

Concat values on dataframe columns excluding NaN's


I have a dataframe with n store columns, here I'm just showing the first 2:

     ref_id  store_0     store_1
0       100        c           b
1       300        d         NaN

I want a way to concat only the non-NaN values from store columns into a new column adding a comma between each value, and finally drop those columns. Desired output is:

     ref_id  stores  
0       100     c,b
1       300       d

Right now I've tried df['stores'] = df['store_0'] + ',' + df['store_1'] with this result:

     ref_id  store_0  store_1  stores    
0       100        c        b     c,b
1       300        d      NaN     NaN

Solution

  • You can use:

    cols = df.filter(like='store_').columns
    
    df2 = (df
     .drop(columns=cols)
     .assign(stores=df[cols].agg(lambda s: s.dropna()
                                            .str.cat(sep=','), 
                                 axis=1))
     )
    

    Or, for in place modification:

    cols = df.filter(like='store_').columns
    
    df['stores'] = df[cols].agg(lambda s: s.dropna().str.cat(sep=','), axis=1)
    df.drop(columns=cols, inplace=True)
    

    Output:

       ref_id stores
    0     100    c,b
    1     300      d