python pandas string-concatenation

Combine columns, ignoring NaN


I want to create a concatenated column, separated by commas, in the most efficient and generalisable way:

df:
   fruit  colour
0  apple   green
1    NaN  orange
2    NaN     NaN

listy = ["fruit", "colour"]
df[listy].apply(lambda x: ",".join(x.dropna()), axis=1)
TypeError: sequence item 3: expected str instance, float found

Expected output

0 apple, green
1 orange
2 NaN

Solution

  • I think some of the values are numeric; convert them to strings before joining:

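    import numpy as np

    listy = ["fruit", "colour"]
    # Drop NaN in each row, cast what is left to str, join; empty results become NaN
    out = df[listy].apply(lambda x: ", ".join(x.dropna().astype(str)), axis=1).replace('', np.nan)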
    
    print (out)
    0    apple, green
    1          orange
    2             NaN
    dtype: object
    

    Or:

    listy = ["fruit", "colour"]
    # Relies on stack() dropping NaN values; reindex restores rows that were entirely NaN
    out = df[listy].stack().astype(str).groupby(level=0).agg(', '.join).reindex(df.index)
    print (out)
    0    apple, green
    1          orange
    2             NaN
    dtype: object
    

    To find the rows that hold a numeric value in at least one column, try converting to numeric and filtering:

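    listy = ["fruit", "colour"]
    # Coerce each column to numeric (non-numeric becomes NaN), then keep rows with any numeric value
    mask = df[listy].apply(pd.to_numeric, errors='coerce').notna().any(axis=1)
    print(df.loc[mask, listy])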
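
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the first approach together with the numeric-row check. The sample data is an assumption: the question's frame contains only strings and NaN, so a hypothetical fourth row with a float (`7.0`) is added to reproduce the `TypeError` scenario. The helper name `join_non_null` is also made up for illustration, and it returns NaN directly instead of replacing empty strings afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the 7.0 in "colour" stands in for the stray float
df = pd.DataFrame({
    "fruit": ["apple", np.nan, np.nan, "pear"],
    "colour": ["green", "orange", np.nan, 7.0],
})
listy = ["fruit", "colour"]


def join_non_null(row, sep=", "):
    """Join the non-missing values of a row as strings; NaN if none remain."""
    parts = row.dropna().astype(str)
    return sep.join(parts) if len(parts) else np.nan


# Row-wise join over the selected columns
out = df[listy].apply(join_non_null, axis=1)
print(out)

# Boolean mask of rows with a numeric value in at least one selected column
numeric_rows = df[listy].apply(pd.to_numeric, errors="coerce").notna().any(axis=1)
print(df.loc[numeric_rows, listy])
```

Row-wise `apply` is easy to read but runs a Python function per row, so on large frames the `stack`/`groupby` variant may be worth timing against it.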