Search code examples
pythonpandascelladdition

Add two columns values with ',' between them only if none of the cells is null


I have the following dataframe:

>>>name   breakfast  lunch   dinner
0 Zoey    apple      egg     noodels
1 Rena    pear               pasta
2 Shila             tomato  potatoes
3 Daphni coffee             soup 
4 Dufi                  

I would like to create new column which will have all the food values that each name ate on the same day. I have tried to do it using '+' and seperate between the words with ',' as following:

df['food']=df['breakfast']+','+df['lunch']+','+df['dinner']

but then if I have null values I have ',' in the middle:


>>>name   breakfast  lunch   dinner     food
0 Zoey    apple      egg     noodels    apple,egg,noodels
1 Rena    pear               pasta      pear,,pasta
2 Shila             tomato  potatoes    ,tmatoe,potatoes
3 Daphni coffee             soup       coffee,,soupp
4. Dufi                                ,,

and I would like to get it clean with ',' in the right places, e.g not to put , if there is null:

>>>name   breakfast  lunch   dinner     food
0 Zoey    apple      egg     noodels    apple,egg,noodels
1 Rena    pear               pasta      pear,pasta
2 Shila             tomato  potatoes    tmatoe,potatoes
3 Daphni coffee             soup       coffee,soup
4 Dufi                  

is there any way to do this ? to define that if there is null cell do not add it/ don't put , in wrong places


Solution

  • Solution if no missing values, only empty strings is joined only values filtering empty strings:

    cols = ['breakfast','lunch','dinner']
    df['food'] = df[cols].apply(lambda x: ','.join(y for y in x if y != ''), axis=1)
    print (df)
         name breakfast   lunch    dinner               food
    0    Zoey     apple     egg   noodels  apple,egg,noodels
    1    Rena      pear             pasta         pear,pasta
    2   Shila            tomato  potatoes    tomato,potatoes
    3  Daphni    coffee              soup        coffee,soup
    4   Dufi                                                
    

    Or with list comprehension:

    cols = ['breakfast','lunch','dinner']
    df['food'] = [','.join(y for y in x if y != '') for x in df[cols].to_numpy()]
    print (df)
         name breakfast   lunch    dinner               food
    0    Zoey     apple     egg   noodels  apple,egg,noodels
    1    Rena      pear             pasta         pear,pasta
    2   Shila            tomato  potatoes    tomato,potatoes
    3  Daphni    coffee              soup        coffee,soup
    4   Dufi                                                
    

    Solution if missing values is similar only use NaN != NaN:

    cols = ['breakfast','lunch','dinner']
    df['food'] = [','.join(y for y in x if y == y) for x in df[cols].to_numpy()]
    print (df)
         name breakfast   lunch    dinner               food
    0    Zoey     apple     egg   noodels  apple,egg,noodels
    1    Rena      pear     NaN     pasta         pear,pasta
    2   Shila       NaN  tomato  potatoes    tomato,potatoes
    3  Daphni    coffee     NaN      soup        coffee,soup
    4   Dufi        NaN     NaN       NaN