Search code examples
openpyxlxlsxxlsxwriteroverwriteordereddictionary

How to overwrite in every sheet without modifying other existing content?


I have an ordereddict with cells containing data starting from row 6.

  • I would like to code a loop that allows me to overwrite the string ''random text'' on the A1 position of all the sheets in my excel file without erasing the other content.
  • Is this even possible with xlsxwriter ? Is it possible with openpyxl or any other library?

My code:

# Importing modules
import openpyxl as op
import pandas as pd
import numpy as np
import xlsxwriter
from openpyxl import Workbook, load_workbook

# Defining my file
my_file = r'\machukovich\Desktop\stack.xlsx'

# Loading the file into a dictionary of Dataframes
dfs_my_file = pd.read_excel(my_file, sheet_name=None, skiprows=2)

# The path of the new file I wish to write on
my_new_path = r'\machukovich\Desktop\new.xlsx'

# Create a Pandas Excel writer using XlsxWriter as the engine.
with pd.ExcelWriter(my_new_path, engine="xlsxwriter") as writer:
    for sheet_name, df in dfs_my_file.items():
            df.to_excel(writer, sheet_name=sheet_name, startrow=6, index=False)
            worksheet.write('A1', 'random text')
            worksheet.write(2, 0, 'random text 2')

# Close the Pandas Excel writer and output the Excel file.
writer.close()    
writer.save()

The error I am obtaining:

AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10248/1589633714.py in <module>
     25     for sheet_name, df in dfs_my_file.items():
     26             df.to_excel(writer, sheet_name=sheet_name, startrow=6, index=False)
---> 27             worksheet.write('A1', 'random text')
     28             worksheet.write(2, 0, 'random text 2')
     29 

AttributeError: 'dict' object has no attribute 'write'

My ordereddict:

{'Sheet_1':     ID      Name  Surname  Grade
 0  104  Eleanor     Rigby      6
 1  168  Barbara       Ann      8
 2  450    Polly   Cracker      7
 3   90   Little       Joe     10,
 'Sheet_2':     ID       Name   Surname  Grade
 0  106       Lucy       Sky      8
 1  128    Delilah  Gonzalez      5
 2  100  Christina   Rodwell      3
 3   40      Ziggy  Stardust      7,
 'Sheet_3':     ID   Name   Surname  Grade
 0   22   Lucy  Diamonds      9
 1   50  Grace     Kelly      7
 2  105    Uma   Thurman      7
 3   29   Lola      King      3}

As you can see I have tried by writing some code below the for loop. But I am not achieving nor understanding the error. I'd appreciate so much any explanation or tip beforehand.


Solution

  • It's a little confusing exactly what you're doing, the code seem incomplete and the error unusual;

    The error

    AttributeError: 'dict' object has no attribute 'write'
    

    pointing to the line

    worksheet.write('A1', 'random text')
    

    suggests that worksheet is a dictionary. Not sure why it would be but either way your code does not show how worksheet is defined so really it should be an undefined object.

    So back to the requirement;
    As best I can understand the Excel file 'stack.xlsx' is made up from 3 sheets. For all sheets, Header is on row 3 and data in the range A4:D7.
    The file you want to write to 'new.xlsx' status is unknown however it would appear from the fundamentals of the question that it exists and has sheets populated with some data.
    From the code your are reading all 3 sheets into a dataframe dfs_my_file which you then want to write to 'new.xlsx' as separate dataframes delimited by the sheet name along with the random text.

    This is being done with Pandas to_excel.

    1. If you use Xlsxwriter as the engine, then the file 'new.xlsx' will either be created if it doesn't exist or overwritten if it does. Meaning any existing data will be lost.
    2. If you use Openpyxl as the engine then the file can exist and any existing sheets/data may be retained other than data you are specifically overwriting, i.e if you write to cell 'A1' then any value already in that cell is lost.
      You can also create/overwrite workbooks/sheets with Openpyxl engine too if that was desired using the mode and 'if_sheet_exists` params.

      For clarity I have included examples for both engines below;

    Xlsxwriter I've made a change to your existing code below. 'new.xlsx' can exist or not but as mentioned if exists all existing data will be lost

    # Importing modules
    import pandas as pd
    
    
    # Defining my file
    my_file = r'\machukovich\Desktop\stack.xlsx'
    
    # Loading the file into a dictionary of Dataframes
    dfs_my_file = pd.read_excel(my_file, sheet_name=None, skiprows=2)
    
    # The path of the new file I wish to write on
    my_new_path = r'\machukovich\Desktop\new.xlsx'
    
    ### Create a Pandas Excel writer using XlsxWriter as the engine.
    ### Note Xlsxwiter is the default so doesn't need to be specified
    with pd.ExcelWriter(my_new_path, engine="xlsxwriter") as writer:
        for sheet_name, df in dfs_my_file.items():
            df.to_excel(writer, sheet_name=sheet_name, startrow=6, index=False)
            ### Create the worksheet object (this is an Xlsxwriter object)
            worksheet = writer.sheets[sheet_name]
            worksheet.write('A1', 'random text')
            worksheet.write(2, 0, 'random text 2')
    
    ### Using a Context Manager so there is no need for close or save
    # Close the Pandas Excel writer and output the Excel file.
    # writer.close()  
    # writer.save()
    

    Openpyxl To retain an existing 'new.xlsx' and any data contained within, use the Openpyxl engine;
    Obviously the file 'new.xlsx' must already exist or file not found error will be returned.

    # Importing modules
    import pandas as pd
    from openpyxl.styles import Font
    from openpyxl.styles.borders import Border, Side, BORDER_THICK
    from openpyxl.styles import Font, NamedStyle
    
    
    ### Create a Border format (thick style, outline)
    thick_border = Border(
        left=Side(border_style=BORDER_THICK, color='00000000'),
        right=Side(border_style=BORDER_THICK, color='00000000'),
        top=Side(border_style=BORDER_THICK, color='00000000'),
        bottom=Side(border_style=BORDER_THICK, color='00000000')
    )
    
    ### Create a Named Style (Sets Font size 14 to Black, Bold and thick outline border) 
    black_bold = NamedStyle(name="black_bold")
    black_bold.font = Font(color='00000000', bold=True,  size=14)
    bd_style = Side(border_style='thick')
    border = Border(left=bd_style, top=bd_style, right=bd_style, bottom=bd_style)
    black_bold.border = border
    
    # Defining my file
    my_file = r'\machukovich\Desktop\stack.xlsx'
    
    # Loading the file into a dictionary of Dataframes
    dfs_my_file = pd.read_excel(my_file, sheet_name=None, skiprows=2)
    
    # The path of the new file I wish to write on
    my_new_path = r'\machukovich\Desktop\new.xlsx'
    
    ### Create a Pandas Excel writer using Openpyxl as the engine.
    ### Set mode to a [append] and if_sheet_exists to 'overlay'
    with pd.ExcelWriter(my_new_path,
                        mode="a",
                        engine="openpyxl",
                        if_sheet_exists='overlay') as writer:
        for sheet_name, df in dfs_my_file.items():
            df.to_excel(writer, sheet_name=sheet_name, startrow=6, index=False)
            ### Create the workbook and worksheet objects (these are Openpyxl objects)
            workbook = writer.book
            worksheet = writer.sheets[sheet_name]
    
            ### Add the named style to the workbook so it can be used
            if 'black_bold' not in workbook.named_styles:
                workbook.add_named_style(black_bold)
    
            ### Use a loop for the Header Row 7. The loop syntax means the change is applied 
            ### to all cells in that row Or can set the loop for just the Header cells A7-D7
            for cell in worksheet["7:7"]:
                ### Either apply the named style to each cell in the row 
                # cell.style = 'black_bold'
                ### OR
                ### Apply formats to the cell individually
                # cell.font = Font(name="Tahoma", size=16, color="000000FF")
                # cell.border = thick_border
    
            worksheet['A1'].value = 'random text'
            worksheet.cell(2, 1).value = 'random text 2'
    

    ----Additional Information----
    Added cell formatting for the Header row using Named Style or Cell formatting.
    Note A named style is applied as a style to the workbook and can be selected from the Styles toolbar once added. If the style is modified in code for a workbook that already has the style added it will need to be removed first before the changed style can be added.