python pandas replace duplicates fillna

Need to add a row based off criteria from another column


I need to add rows to my df containing certain text, based on criteria from other columns. When a row meets the criteria, a new row should be added directly below it.

ID     Name      Order    Children   Pet  
12     Joe       Parent     yes      dog
13     Mary      Parent     yes      cat
14     Paul      Parent     yes      dog
15     Emma      Parent     yes      cat

In my example dataframe, I want to insert a row if a parent has a child, and another row if a parent has a pet (dogs only in this case). I want the following end result:

ID     Name         Order    Children   Pet  
12     Parent; Joe  Parent     yes      dog
12     Child; Joe
12     Dog; Joe
13     Parent; Mary Parent     yes      cat
13     Child; Mary
14     Parent; Paul Parent     yes      dog
14     Child; Paul
14     Dog; Paul
15     Parent; Emma Parent     yes      cat
15     Child; Emma

To add the prefix, I use:

df['Name'] = 'Parent; ' + df['Name'].astype(str)

After that, I tried duplicating the df and then stacking the copies. It works, but it's an exhausting method, so I'd like to use an if statement instead. If Children == 'yes', add a row below the appropriate parent that reads 'Child; ' + parent name. Then, if Pet == 'dog', add a row below the child that reads 'Dog; ' + parent name.

I will use df.drop later to remove the Order, Children and Pet columns.


Solution

  • Does something like this work for you, or do you need your data model to be different?

    In [1]:
    import pandas as pd

    # Create the DataFrame
    cols = ['Name', 'Order', 'Children', 'Pet']
    data = [['Joe', 'Parent', 'yes', 'dog'],
            ['Mary', 'Parent', 'yes', 'cat'],
            ['Paul', 'Parent', 'yes', 'dog'],
            ['Emma', 'Parent', 'yes', 'cat']]
    df = pd.DataFrame(data=data, columns=cols)

    # Build one Series per row type; filtered rows keep their original index
    Parent = 'Parent; ' + df['Name']
    Child = 'Child; ' + df.loc[df['Children'] == 'yes', 'Name']
    Dog = 'Dog; ' + df.loc[df['Pet'] == 'dog', 'Name']

    # Concatenate them, then sort on the index with a stable sort so each
    # parent's rows stay together in Parent, Child, Dog order
    df_final = pd.concat([Parent, Child, Dog])
    df_final.to_frame().sort_index(kind='mergesort')

    Out [1]:
        Name
    0   Parent; Joe
    0   Child; Joe
    0   Dog; Joe
    1   Parent; Mary
    1   Child; Mary
    2   Parent; Paul
    2   Child; Paul
    2   Dog; Paul
    3   Parent; Emma
    3   Child; Emma
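
The sample data in the answer leaves out the ID column from the question. Below is a minimal sketch of the same concat-and-sort idea that carries ID along, so the result resembles the table the question asks for. The helper name `add_rows` is hypothetical, and the ID values are copied from the question's example; this is one possible variant, not necessarily what the answerer intended.

```python
import pandas as pd

# Sample data copied from the question, including the ID column.
df = pd.DataFrame(
    {
        "ID": [12, 13, 14, 15],
        "Name": ["Joe", "Mary", "Paul", "Emma"],
        "Order": ["Parent", "Parent", "Parent", "Parent"],
        "Children": ["yes", "yes", "yes", "yes"],
        "Pet": ["dog", "cat", "dog", "cat"],
    }
)


def add_rows(df):
    """Hypothetical helper: add Child/Dog rows below each matching parent."""
    # Parent rows keep every column; only the Name gets a prefix.
    parent = df.assign(Name="Parent; " + df["Name"].astype(str))

    # Child and Dog rows keep only ID and Name; other columns become NaN.
    child = df.loc[df["Children"] == "yes", ["ID", "Name"]]
    child = child.assign(Name="Child; " + child["Name"].astype(str))

    dog = df.loc[df["Pet"] == "dog", ["ID", "Name"]]
    dog = dog.assign(Name="Dog; " + dog["Name"].astype(str))

    # Filtered rows keep the original index, so a stable sort on the index
    # groups each parent with its extra rows in Parent, Child, Dog order.
    out = pd.concat([parent, child, dog])
    return out.sort_index(kind="mergesort").reset_index(drop=True)


result = add_rows(df)
print(result[["ID", "Name"]])
```

The Order, Children and Pet columns can still be dropped afterwards with df.drop, as the question plans to do.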