Search code examples
pythonexcelxlrdxlwtopenpyxl

Insert row into Excel spreadsheet using openpyxl in Python


I'm looking for the best approach for inserting a row into a spreadsheet using openpyxl.

Effectively, I have a spreadsheet (Excel 2007) which has a header row, followed by (at most) a few thousand rows of data. I'm looking to insert the row as the first row of actual data, so after the header. My understanding is that the append function is suitable for adding content to the end of the file.

Reading the documentation for both openpyxl and xlrd (and xlwt), I can't find any clear cut ways of doing this, beyond looping through the content manually and inserting into a new sheet (after inserting the required row).

Given my so far limited experience with Python, I'm trying to understand if this is indeed the best option to take (the most pythonic!), and if so could someone provide an explicit example. Specifically can I read and write rows with openpyxl or do I have to access cells? Additionally can I (over)write the same file(name)?


Solution

  • Answering this with the code that I'm now using to achieve the desired result. Note that I am manually inserting the row at position 1, but that should be easy enough to adjust for specific needs. You could also easily tweak this to insert more than one row, and simply populate the rest of the data starting at the relevant position.

    Also, note that due to downstream dependencies, we are manually specifying data from 'Sheet1', and the data is getting copied to a new sheet which is inserted at the beginning of the workbook, whilst renaming the original worksheet to 'Sheet1.5'.

    EDIT: I've also added (later on) a change to the format_code to fix issues where the default copy operation here removes all formatting: new_cell.style.number_format.format_code = 'mm/dd/yyyy'. I couldn't find any documentation that this was settable, it was more of a case of trial and error!

    Lastly, don't forget this example is saving over the original. You can change the save path where applicable to avoid this.

        import openpyxl
    
        wb = openpyxl.load_workbook(file)
        old_sheet = wb.get_sheet_by_name('Sheet1')
        old_sheet.title = 'Sheet1.5'
        max_row = old_sheet.get_highest_row()
        max_col = old_sheet.get_highest_column()
        wb.create_sheet(0, 'Sheet1')
    
        new_sheet = wb.get_sheet_by_name('Sheet1')
    
        # Do the header.
        for col_num in range(0, max_col):
            new_sheet.cell(row=0, column=col_num).value = old_sheet.cell(row=0, column=col_num).value
    
        # The row to be inserted. We're manually populating each cell.
        new_sheet.cell(row=1, column=0).value = 'DUMMY'
        new_sheet.cell(row=1, column=1).value = 'DUMMY'
    
        # Now do the rest of it. Note the row offset.
        for row_num in range(1, max_row):
            for col_num in range (0, max_col):
                new_sheet.cell(row = (row_num + 1), column = col_num).value = old_sheet.cell(row = row_num, column = col_num).value
    
        wb.save(file)