Search code examples
pythonpandasdataframepivot-tabledata-analysis

how to bring same category columns under a single header in a pandas dataframe?


I am new to pandas and facing difficulty to get an idea, to solve this problem, can someone help me with a way to approach this.

I have a source dataframe in the below format enter image description here

Here I am sharing the code to generate source dataframe:

df = pd.DataFrame({"Employee ID":[1, 2, 3, 4, 5], "Name":["A", "B", "C", "D", "E"],
               "Departments":["hr", "hr, manager", "manager", "developer", "tester, manager, developer"],
               "Groups":["group-1", "group-2, group-3", "", "", "group-1"]})

print(df)

enter image description here

I would like to convert it to the below format expected output

I have tried with pivot table and other stackover flow example but it's not helping much, can someone help me a way to understand and solve this problem.

Thanks in advance :)


Solution

  • You can use custom function with Series.str.get_dummies, then if need replace values by columns names add numpy.where, but first convert columns for not processing to MultiIndex by DataFrame.set_index:

    def f(x):
        m = x.str.get_dummies(', ').astype(bool)
        a = np.where(m, m.columns, '')
        return pd.DataFrame(a, columns=m.columns, index=x.index)
    
    df1 = df.set_index(['Employee ID','Name'])
    
    df = pd.concat([f(df1[x]) for x in df1.columns], axis=1, keys=df1.columns)
    print (df)
                     Departments                        Groups                  
                       developer  hr  manager  tester  group-1  group-2  group-3
    Employee ID Name                                                            
    1           A                 hr                   group-1                  
    2           B                 hr  manager                   group-2  group-3
    3           C                     manager                                   
    4           D      developer                                                
    5           E      developer      manager  tester  group-1