Merge multiple column values into one column in python pandas


I have a pandas data frame like this:

   Column1  Column2  Column3  Column4  Column5
 0    a        1        2        3        4
 1    a        3        4        5
 2    b        6        7        8
 3    c        7        7        

What I want now is a new dataframe containing Column1 and a new column, ColumnA. ColumnA should contain all the values from Column2 through the last populated column of each row, joined with commas, like this:

  Column1  ColumnA
0   a      1,2,3,4
1   a      3,4,5
2   b      6,7,8
3   c      7,7

How could I best approach this issue?


Solution

  • You can call apply and pass axis=1 to apply a function row-wise, then convert the values to str and join them:

    In [153]:
    df['ColumnA'] = df[df.columns[1:]].apply(
        lambda x: ','.join(x.dropna().astype(int).astype(str)),
        axis=1
    )
    df
    
    Out[153]:
      Column1  Column2  Column3  Column4  Column5  ColumnA
    0       a        1        2        3        4  1,2,3,4
    1       a        3        4        5      NaN    3,4,5
    2       b        6        7        8      NaN    6,7,8
    3       c        7        7      NaN      NaN      7,7
    

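    Here I call dropna to get rid of the NaN values. Because the NaN entries force the numeric columns to float, the remaining values need to be cast back to int (astype(int)) before converting to str; otherwise the result contains floats as strings, such as '1.0'.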
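
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same approach. The construction of the frame is an assumption (the question does not show how it was built, so the empty cells are taken to be NaN), and `join_row` is a hypothetical helper name used only for readability.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame: empty cells are NaN,
# so each row is seen as float values when applied row-wise.
df = pd.DataFrame({
    "Column1": ["a", "a", "b", "c"],
    "Column2": [1, 3, 6, 7],
    "Column3": [2, 4, 7, 7],
    "Column4": [3, 5, 8, np.nan],
    "Column5": [4, np.nan, np.nan, np.nan],
})

def join_row(row):
    # Hypothetical helper: drop NaN, cast back to int so 3.0 becomes 3,
    # then convert to str and join with commas.
    return ",".join(row.dropna().astype(int).astype(str))

# Apply row-wise (axis=1) to every column except the first.
df["ColumnA"] = df[df.columns[1:]].apply(join_row, axis=1)

# Keep only the two columns the question asks for.
result = df[["Column1", "ColumnA"]]
print(result)
```

Note that the astype(int) step only makes sense if every non-missing value is a whole number; with genuinely fractional data it would truncate the values, so it would have to be dropped or replaced.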