python excel pandas dataframe multi-index

How to prevent repeating data for additional indexes in a pandas dataframe

I wanted to see how I can add new data when I include a new index into the excel sheet. This is what I mean:

Picture 1 (What my code currently outputs):

Picture 2 (What I want it to output):

I want to add [1, 2] to row 1, data type 2, and [3, 4] to row 2, data type 2. Here's my code for picture 1:

import pandas as pd

# Create a Pandas dataframe from the data.
df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
                    index=['row 1', 'row 2'],
                    columns=['col 1', 'col 2'])

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')

# Add extra index
new_idx = pd.Index(['data type 1', 'data type 2'])
mi1 = pd.MultiIndex.from_product([df1.index, new_idx])
out1 = df1.reindex(df1.index.repeat(len(new_idx))).set_index(mi1)

# Convert the dataframe to an XlsxWriter Excel object.
out1.to_excel(writer, sheet_name='Sheet1')

# Close the Pandas Excel writer and output the Excel file.
writer.close()

Is there a possible way to do this?

Solution

slice(None) can be used to select all the rows at that level. Since you want to change values where the index is 'data type 2' in level=1, try the following:

# write new values to data type 2 indices
out1.loc[(slice(None), 'data type 2'), :] = [[1, 2], [3, 4]]

So the full code would look like:

# Create a Pandas dataframe from the data.
df1 = pd.DataFrame([['a', 'b'], ['c', 'd']], index=['row 1', 'row 2'], columns=['col 1', 'col 2'])

# # Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')

# Add extra index
new_idx = pd.Index(['data type 1', 'data type 2'])
mi1 = pd.MultiIndex.from_product([df1.index, new_idx])
out1 = df1.reindex(df1.index.repeat(len(new_idx))).set_index(mi1)

# write new values to data type 2 indices
out1.loc[(slice(None), 'data type 2'), :] = [[1, 2], [3, 4]]    # <-- add this line

# # Convert the dataframe to an XlsxWriter Excel object.
out1.to_excel(writer, sheet_name='Sheet1')

writer.close()

Then pd.read_excel('pandas_simple.xlsx', index_col=[0,1]) produces the following frame:

Another way is to change the way you contruct out1; instead of reindexing df1, construct another dataframe using the same "metadata" as df1 and concatenate the two:

out1 = (
    pd.concat({
        'data type 1': df1, 
        'data type 2': pd.DataFrame([[1, 2], [3, 4]], columns=df1.columns, index=df1.index)
    }).swaplevel().sort_index()
)