Search code examples
pandasif-statementlambda

How to populate a column with a value if condition is met across 2+ other colums


My dataframe is similar to the table below. I have 6 columns, each with 'Yes' or 'No' if a specific antibiotic was given.

AZITH CLIN CFTX METRO CFTN DOXY TREATED
Yes Yes No No No No
No Yes No Yes No No
Yes Yes No No No No
No No No No No No
Yes Yes Yes Yes Yes Yes
No Yes Yes Yes No Yes
No No No No No Yes
No No No No No No
Yes Yes Yes No No No
Yes No Yes Yes No No
Yes No No No Yes No
No No Yes Yes No Yes
No No No No Yes Yes
No No No Yes No Yes

I want to fill the column 'TREATED' with 'True' if specific combinations of antibiotic columns contain 'Yes.' If the conditions aren't met, then I would like to fill the 'TREATED' column with a 'False' value.

If ['AZITH'] & ['CLIN'] == 'Yes' |

['AZITH'] & ['CFTX'] & ['CLIN'] == 'Yes' |

['AZITH'] & ['CFTX'] & ['METRO']== 'Yes' |

['AZITH'] & ['CFTN'] == 'Yes' |

['CFTX'] & ['DOXY'] & ['METRO']== 'Yes' |

['CFTN'] & ['DOXY'] == 'Yes' |

['DOXY'] & ['METRO']== 'Yes' ,

Then return 'True' in column 'TREATED'

Else 'False'

What I had in mind was some sort of if statement or use of lambda function, however, I am having trouble.

This must not be exclusive to the above combinations but also include for example if all 6 medications were given. If that's the case, then 'True' should be returned because the condition has been met to give at least 2 of the treatment medications.

The desired output is below:

AZITH CLIN CFTX METRO CFTN DOXY TREATED
Yes Yes No No No No Yes
No Yes No Yes No No No
Yes Yes No No No No Yes
No No No No No No No
Yes Yes Yes Yes Yes Yes Yes
No Yes Yes Yes No Yes Yes
No No No No No Yes No
No No No No No No No
Yes Yes Yes No No No Yes
Yes No Yes Yes No No Yes
Yes No No No Yes No Yes
No No Yes Yes No Yes Yes
No No No No Yes Yes Yes
No No No Yes No Yes Yes

Solution

  • With the dataframe you provided:

    import pandas as pd
    
    df = pd.DataFrame(
        {
            "AZITH": [
                "Yes",
                "No",
                "Yes",
                "No",
                "Yes",
                "No",
                "No",
                "No",
                "Yes",
                "Yes",
                "Yes",
                "No",
                "No",
                "No",
            ],
            "CLIN": [
                "Yes",
                "Yes",
                "Yes",
                "No",
                "Yes",
                "Yes",
                "No",
                "No",
                "Yes",
                "No",
                "No",
                "No",
                "No",
                "No",
            ],
            "CFTX": [
                "No",
                "No",
                "No",
                "No",
                "Yes",
                "Yes",
                "No",
                "No",
                "Yes",
                "Yes",
                "No",
                "Yes",
                "No",
                "No",
            ],
            "METRO": [
                "No",
                "Yes",
                "No",
                "No",
                "Yes",
                "Yes",
                "No",
                "No",
                "No",
                "Yes",
                "No",
                "Yes",
                "No",
                "Yes",
            ],
            "CFTN": [
                "No",
                "No",
                "No",
                "No",
                "Yes",
                "No",
                "No",
                "No",
                "No",
                "No",
                "Yes",
                "No",
                "Yes",
                "No",
            ],
            "DOXY": [
                "No",
                "No",
                "No",
                "No",
                "Yes",
                "Yes",
                "Yes",
                "No",
                "No",
                "No",
                "No",
                "Yes",
                "Yes",
                "Yes",
            ],
        }
    )
    

    Here is one way to do it:

    mask = (
        ((df["AZITH"] == "Yes") & (df["CLIN"] == "Yes"))
        | ((df["AZITH"] == "Yes") & (df["CLIN"] == "Yes") & (df["CFTX"] == "Yes"))
        | ((df["AZITH"] == "Yes") & (df["CFTX"] == "Yes") & (df["METRO"] == "Yes"))
        | ((df["AZITH"] == "Yes") & (df["CFTN"] == "Yes"))
        | ((df["CFTX"] == "Yes") & (df["DOXY"] == "Yes") & (df["METRO"] == "Yes"))
        | ((df["CFTN"] == "Yes") & (df["DOXY"] == "Yes"))
        | ((df["DOXY"] == "Yes") & (df["METRO"] == "Yes"))
    )
    df.loc[mask, "TREATED"] = "Yes"
    df = df.fillna("No")
    

    Then:

    print(df)
    # Output
       AZITH CLIN CFTX METRO CFTN DOXY TREATED
    0    Yes  Yes   No    No   No   No     Yes
    1     No  Yes   No   Yes   No   No      No
    2    Yes  Yes   No    No   No   No     Yes
    3     No   No   No    No   No   No      No
    4    Yes  Yes  Yes   Yes  Yes  Yes     Yes
    5     No  Yes  Yes   Yes   No  Yes     Yes
    6     No   No   No    No   No  Yes      No
    7     No   No   No    No   No   No      No
    8    Yes  Yes  Yes    No   No   No     Yes
    9    Yes   No  Yes   Yes   No   No     Yes
    10   Yes   No   No    No  Yes   No     Yes
    11    No   No  Yes   Yes   No  Yes     Yes
    12    No   No   No    No  Yes  Yes     Yes
    13    No   No   No   Yes   No  Yes     Yes