Search code examples
pythonexcelpandasdataframeheader

Pandas - Use second row as header and keep the first row in dataframe


I have an excel file with multiple rows as a template. enter image description here

I need to keep all rows in the dataframe when read with pandas but the last of these rows must be the header. Here is an example after reading the excel into a df, where the first row has actually onle one field with content (Version=2.0) and the second row (index 0) should be the headers.

   Version=2.0   Unnamed: 1    Unnamed: 2             Unnamed: 3   Unnamed: 4   Unnamed: 5
0  Part #        Manufacturer  Manufacturer Parts #   UPC / ISBN   NE#          Currency

When read with header=[1], the first line is totally omitted, but I need to keep it.

After filling up the dataframe with other data, I want to print the whole dataframe into an excel keeping the first rows and using the last of the template rows as headers as mentioned above.

I want the result to be as follows:

   Version=2.0
   Part #        Manufacturer  Manufacturer Parts #   UPC / ISBN   NE#          Currency
0  data          data          data                   data         data         data
1  data          data          data                   data         data         data

I tried reading the excel with header=[0,1] but when written, the excel file looks like this, using the code: df.to_excel("test.xlsx", sheet_name='Sheet1', index=True, merge_cells=False)

enter image description here

With index=False it does not work at all as multiindex to excel does not work without using the index according to the error message.


Solution

  • Save your first row (Version=2.0) and read_excel with skiprows=1 to skip the first line:

    with pd.ExcelFile('test.xlsx', engine='openpyxl') as xlr:
        version = xlr.book.active['A1'].value
        df = pd.read_excel(xlr, skiprows=1)
    

    Use the same method to save your file:

    with pd.ExcelWriter('test2.xlsx', engine='openpyxl') as xlw:
        df.to_excel(xlw, startrow=1, index=False)
        xlw.book.active['A1'] = version