Search code examples
pythonpandasdataframenesteddata-preprocessing

Multiple header/nested column name in Excel using Pandas


I want to convert/read an excel file that has at least 3 headers. I've searched some solutions about nested columns that can use header=[0,1] to read excel file when it has two row of header. But my file has 3 headers. When I use header=[0,2], it didn't read it as I expected. Below is the example of the data, it has three headers.

enter image description here

But when I read it with df_test = pd.read_excel("testing_column.xlsx", header=[0,2]) It only showed two headers (the first one and the third one)

enter image description here

Does anyone has any suggestion on how to solve this? Thank you!


Solution

  • You were very close to the answer as you just need to include the middle row as well, by specifying 0 and 2 it had accounted for the first and third row as the header, if you want to include the second one as well you can refer to it by index 1:

    df_test = pd.read_excel("testing_column.xlsx", header=[0,1,2])