Search code examples
pandasdataframepandas.excelwriter

python pandas dataframe write multi row header


data1

A  B  C  D  E   <--- columns
a  b  c  d  e
a  b  c  d  e
a  b  c  d  e


result what i want

A  B  C  D  E <--- columns
A  B  C  D  E <--- columns
A  B  C  D  E <--- columns
a  b  c  d  e
a  b  c  d  e
a  b  c  d  e

I searched for this one and it finally failed :)

How do I get good results? I ask for your help. thank you


Thank you verymuch

When converting to Excel, a new empty column was created.

test

a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t   u   v   w   x   y
1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3
4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4
4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4
5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5


dir_path = os.path.dirname(os.path.realpath(__file__))
df = pd.read_excel(dir_path + '/import/sample.xlsx', header=0)
headers = [list(df.columns), list(df.columns), list(df.columns)]
df.columns = pd.MultiIndex.from_arrays(headers)
writer = pd.ExcelWriter(dir_path + '/export/sample.xlsx', options={'strings_to_urls': False}, )
df.to_excel(writer, sheet_name='result')
writer.save()

result

    a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t   u   v   w   x   y
    a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t   u   v   w   x   y
    a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t   u   v   w   x   y
                                <<<<---- empty row  ---->>>>                                                    0   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
1   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
2   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3   3
3   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4
4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4
5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5

I tried to get rid of it but failed.

df = df[df['a'] != ""]

Solution

  • Use MultiIndex.from_arrays:

    df.columns = pd.MultiIndex.from_arrays([df.columns] * 3)
    print (df)
       A  B  C  D  E
       A  B  C  D  E
       A  B  C  D  E
    0  a  b  c  d  e
    1  a  b  c  d  e
    2  a  b  c  d  e
    

    If need custom MultiIndex:

    L = [list('abcde'), list('fghij'), list('klmno')]
    df.columns = pd.MultiIndex.from_arrays(L)
    print (df)
       a  b  c  d  e
       f  g  h  i  j
       k  l  m  n  o
    0  a  b  c  d  e
    1  a  b  c  d  e
    2  a  b  c  d  e
    

    Also is possible use next 2 functions for create MultiIndex:

    EDIT:

    Empty row is bug and solution is here:

    writer = pd.ExcelWriter("file1.xlsx")
    headers = pd.DataFrame(df.columns.tolist()).T
    headers.to_excel(writer, header=False, index=False)
    df.columns = pd.Index(range(len(df.columns)))  
    df.to_excel( writer, header=False, index=False, startrow=len(headers))
    writer.save()