python pandas dataframe

Make a new column based on values in other columns in pandas


I want to add a new column, c4, to df based on the values in the other columns. If columns c1-c3 contain only one unique value in a row, that value should go into c4. If they contain two different values, "both" should go into c4. NaN should not be counted as a valid value. Only c2 and c3 contain a few NaNs.

Minimal example:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "c1": ["left", "right", "right", "left", "left", "right"],
    "c2": ["left", "right", "right", "right", np.nan, "right"],
    "c3": [np.nan, np.nan, "left", np.nan, "left", "right"],
})

Required df:

answerdf = pd.DataFrame({
    "c1": ["left", "right", "right", "left", "left", "right"],
    "c2": ["left", "right", "right", "right", np.nan, "right"],
    "c3": [np.nan, np.nan, "left", np.nan, "left", "right"],
    "c4": ["left", "right", "both", "both", "left", "right"],
})

Solution

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame({
        "c1": ["left", "right", "right", "left", "left", "right"],
        "c2": ["left", "right", "right", "right", np.nan, "right"],
        "c3": [np.nan, np.nan, "left", np.nan, "left", "right"]
    })
    
    def worker(row):
        if "left" in row.values and "right" in row.values:
            return "both"
        if "left" in row.values:
            return "left"
        if "right" in row.values:
            return "right"
        return np.nan
    
    df["c4"] = df[["c1", "c2", "c3"]].apply(worker, axis=1)
    

    This returns NaN when a row contains neither "left" nor "right", and it might be easier to understand.

    Output

              c1     c2     c3     c4
        0   left   left    NaN   left
        1  right  right    NaN  right
        2  right  right   left   both
        3   left  right    NaN   both
        4   left    NaN   left   left
        5  right  right  right  right
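
The function above hardcodes the labels "left" and "right". As a possible alternative, here is a minimal sketch that applies the rule from the question directly: count the distinct non-NaN values in each row, keep the value when there is exactly one, and write "both" otherwise. It assumes c1-c3 hold only strings and NaN. The names `cols`, `n_unique`, and `first_valid` are introduced here for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "c1": ["left", "right", "right", "left", "left", "right"],
    "c2": ["left", "right", "right", "right", np.nan, "right"],
    "c3": [np.nan, np.nan, "left", np.nan, "left", "right"],
})

cols = ["c1", "c2", "c3"]

# Number of distinct values per row; nunique ignores NaN by default.
n_unique = df[cols].nunique(axis=1)

# First non-NaN value per row: back-fill across the columns,
# then take the leftmost column.
first_valid = df[cols].bfill(axis=1).iloc[:, 0]

# Keep the single value where a row has at most one distinct value
# (an all-NaN row stays NaN); everywhere else write "both".
df["c4"] = first_valid.where(n_unique <= 1, "both")

print(df)
```

Because this avoids a row-wise `apply`, it may scale better on large frames, but it has not been benchmarked here. Note that it writes "both" for any row with more than one distinct value, so it only matches the original function when the data contains exactly two labels.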