Rearranging data for multi line headers in pandas

I am working with the following data from FWI in pandas

I want to transform it to following form for each country

I have tried using the melt, stack, and pivot function but have not been able to achieve desired result with these multi line headers. Link for the dataset: https://freedomhouse.org/sites/default/files/2021-02/Country_and_Territory_Ratings_and_Statuses_FIW1973-2021.xlsx Can anyone help please?

Solution

You need to read the column headers as a MultiIndex, then unstack the columns (unstack works on the row index, so you have to transpose the dataframe).

df = pd.read_excel('Country_and_Territory_Ratings_and_Statuses_FIW1973-2021.xlsx', 'Territory Ratings, Statuses',skiprows=1, header=[0,1], index_col=0)

The issue with this particular Excel table is that there is a trailing space after some 'CL' header names (i.e. they are 'CL' and 'CL '). You can fix this by re-creating the index like so:

df.columns = pd.MultiIndex.from_arrays([df.columns.get_level_values(0),
                                        df.columns.get_level_values(1).str.strip()])

Then unstack:

result = df.T.unstack(0).T