Search code examples
pythonpandasdataframenumpydata-cleaning

Need specific sets of columns to be converted into a row and the rest of columns to repeat values


I have data in the following format

ID SCHOOL Name1 Name1 Subject1 Name1 Grade1 Name1 Subject2 Name1 Grade2 Name2 Name2 Subject1 Name2 Grade1 Name2 Subject2 Name2 Grade2
1 S1 Mr. ABC Math 6 Science 7 Mr. XYZ Social 8 EVS 9
2 S2 Mr. PQR Math 10 Science 11 Mr. KLM Social 8 EVS 9

Can I transform it in the following format using Python

ID SCHOOL Name Subject Grade
1 S1 Mr. ABC Math 6
1 S1 Mr. ABC Science 7
1 S1 Mr. XYZ Social 8
1 S1 Mr. XYZ EVS 9
2 S2 Mr. PQR Math 10
2 S2 Mr. PQR Science 11
2 S2 Mr. KLM Social 8
2 S2 Mr. KLM EVS 9

Solution

  • there might be a nicer solution but this also works:

    df_1=df[['ID', 'SCHOOL','Name1', 'Name1 Subject1',
           'Name1 Grade1']]
    df_2=df[['ID', 'SCHOOL','Name1', 'Name1 Subject2',
           'Name1 Grade2']]
    
    df_3=df[['ID', 'SCHOOL','Name2', 'Name2 Subject1',
           'Name2 Grade1']]
    
    df_4=df[['ID', 'SCHOOL','Name2', 'Name2 Subject2',
           'Name2 Grade2']]
    
    df_list=[df_1,df_2,df_3,df_4]
    for i in df_list:
        i.columns=['ID','SCHOOL','Name','Subject','Grade']
    final=pd.concat(df_list)
    print(final)
    '''
        ID  SCHOOL  Name    Subject Grade
    0   1   S1      Mr. ABC Math    6
    1   2   S2      Mr. PQR Math    10
    0   1   S1      Mr. ABC Science 7
    1   2   S2      Mr. PQR Science 11
    0   1   S1      Mr. XYZ Social  8
    1   2   S2      Mr. KLM Social  8
    0   1   S1      Mr. XYZ EVS     9
    1   2   S2      Mr. KLM EVS     9
    
    '''