Convert binary columns into columns with string values based on their column headers?


This is my dataset:

|Dept   |Cell culture|Bioinfo|Immunology|Trigonometry|Algebra|Microbio|Optics|
|Biotech|1           |1      |1         |0           |0      |0       |0     |
|Biotech|1           |0      |1         |0           |0      |0       |0     |
|Math   |0           |0      |0         |1           |1      |0       |0     |
|Biotech|0           |0      |0         |0           |0      |1       |0     |
|Physics|0           |0      |0         |0           |0      |0       |1     |

This is the result I want:

|Biotech|Cell culture|Bioinfo   |Immunology|
|Biotech|Cell culture|Immunology|          |
|Math   |Trigonometry|Algebra   |          |
|Biotech|Microbio    |          |          |
|Physics|Optics      |          |          |

Solution

  • Set the Dept column as the index, then for each row collect the names of the columns whose value is 1 and build a new DataFrame from those lists:

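    import pandas as pd

    # Move Dept into the index so only the 0/1 columns remain
    df2 = df.set_index('Dept')

    # For each row, pick the column names where the value is 1
    cols = df2.columns.to_numpy()
    df2 = pd.DataFrame([cols[m] for m in df2.eq(1).to_numpy()], index=df2.index)
    print(df2)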
                        0           1           2
    Dept                                         
    Biotech  Cell culture     Bioinfo  Immunology
    Biotech  Cell culture  Immunology        None
    Math     Trigonometry     Algebra        None
    Biotech      Microbio        None        None
    Physics        Optics        None        None
    

    If performance is not important or the data is small, a row-wise apply also works:

    df2 = df.set_index('Dept')

    cols = df2.columns.to_numpy()
    # apply with axis=1 passes each row's boolean mask to the lambda
    df2 = pd.DataFrame(df2.eq(1).apply(lambda x: cols[x], axis=1).tolist(), index=df2.index)
    print(df2)
                        0           1           2
    Dept                                         
    Biotech  Cell culture     Bioinfo  Immunology
    Biotech  Cell culture  Immunology        None
    Math     Trigonometry     Algebra        None
    Biotech      Microbio        None        None
    Physics        Optics        None        None
    

    Finally, if Dept should be a regular column and all columns should be numbered from 0:

    # Turn the Dept index back into a column, then renumber all columns
    df2 = df2.reset_index()
    df2.columns = range(len(df2.columns))
    print(df2)
             0             1           2           3
    0  Biotech  Cell culture     Bioinfo  Immunology
    1  Biotech  Cell culture  Immunology        None
    2     Math  Trigonometry     Algebra        None
    3  Biotech      Microbio        None        None
    4  Physics        Optics        None        None
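
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the list-comprehension approach. The sample DataFrame is rebuilt from the table in the question. The helper name flags_to_names and its key and fill parameters are made up for illustration and are not part of pandas. The fillna('') step is an addition, on the assumption that the blank cells in the desired result should be empty strings rather than None.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    'Dept': ['Biotech', 'Biotech', 'Math', 'Biotech', 'Physics'],
    'Cell culture': [1, 1, 0, 0, 0],
    'Bioinfo': [1, 0, 0, 0, 0],
    'Immunology': [1, 1, 0, 0, 0],
    'Trigonometry': [0, 0, 1, 0, 0],
    'Algebra': [0, 0, 1, 0, 0],
    'Microbio': [0, 0, 0, 1, 0],
    'Optics': [0, 0, 0, 0, 1],
})


def flags_to_names(frame, key='Dept', fill=''):
    """Hypothetical helper: replace 0/1 columns with the names of the columns set to 1."""
    flags = frame.set_index(key)
    cols = flags.columns.to_numpy()
    # Boolean-mask the array of column names once per row
    rows = [cols[mask].tolist() for mask in flags.eq(1).to_numpy()]
    out = pd.DataFrame(rows, index=flags.index)
    # Shorter rows are padded with missing values; swap them for the fill value (assumption)
    out = out.fillna(fill).reset_index()
    # Renumber the columns 0..n-1, as in the last step of the answer
    out.columns = range(len(out.columns))
    return out


result = flags_to_names(df)
print(result)
```

If you would rather keep the missing values, skip the fillna call. Rows with fewer matches are padded automatically when the DataFrame is built from lists of unequal length.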