Search code examples
pythonpandasopenpyxlxlwings

How to insert a row of value in a table with python (not Null Values?


I have a table with multiply lines as follows:

table1 col1 col2 col2
row1 1 2 3
row2 3 4 6
row3 4 5 7
row4 5 4 6
row5 6 2 3
row6 7 4 6

I want to change it like this:

table1 col1 col2 col2
row1 1 2 3
row2 3 4 6
table1 col1 col2 col2
row1 4 5 7
row3 5 4 6
table3 col1 col2 col2
row4 6 2 3
row5 7 4 6

namely,just insert a row(title) to separate it,because they belong to different subtables. I have try use insert function to inert a value of title

d.DataFrame(np.insert(df.values, 0, values=["col1", "col1", "col3"], axis=0))

but in a specific column of DataFrame, the type of all values must be the same. I also use xlwings (insert function)and openpyxl (insert_rows function) to insert one row but it seems that they can only insert with a blank value, not specific values. After finish constructing this table, I will use it to set some styles. In excel I just need to copy and paste, is there a flexible way? or inserting maybe not a good way, and just to split and combine tables(with subtitle and keep format)?

addition: [data link][1]

[1]: https://cowtransfer.com/s/a160ccec698a48,you need input code 454008


Solution

  • you can try:

    s=pd.DataFrame([df.columns]*int(len(df)/2),columns=df.columns)
    s.index=pd.Series(s.index+1).cumsum()
    df=pd.concat([df,s]).sort_index().iloc[:-1].reset_index(drop=True)
    

    output of df:

        table1  col1    col2    col2
    0   row1    1       2       3
    1   row2    3       4       6
    2   table1  col1    col2    col2
    3   row3    4       5       7
    4   row4    5       4       6
    5   table1  col1    col2    col2
    6   row5    6       2       3
    7   row6    7       4       6
    

    Update:

    you can try:

    s=pd.DataFrame([df.columns]*int(len(df)/16),columns=df.columns)
    s.index=[x+1 for x in range(16,16*(len(s)+1),16)]
    df=pd.concat([df,s]).sort_index().reset_index(drop=True)
    #If needed to remove last row then use:
    #df=df.iloc[:-1]
    

    Sample dataframe used by me for testing:

    import numpy as np
        
    df=pd.DataFrame()
    df['table1']=[f'row {x}' for x in range(65)]
    df['col1']=np.random.randint(1,10,(65,1))
    df['col2']=np.random.randint(1,10,(65,1))
    df['col3']=np.random.randint(1,10,(65,1))
    df.columns=[*df.columns[:-1],'col2']