Search code examples
exceldataframevariablesdynamicxlsxwriter

Python - XlsxWriter: Referring to columns by column name, not column letter (Hardcode vs. dynamic variable)


I'm wondering if there's a way to refer to columns when using XlsxWriter by their name, and not the column letter or index, in the event that the column index changes but I still want the formatting to apply to the same column, even if its position changes.

I am also exporting the contents of a data frame to an Excel file beforehand so if there's a way to reference the column using Pandas and then assign that reference to a variable that's invoked where the column reference in the XlsxWriter method goes then that could work too.

Instead of this:

df.to_excel(writer, sheet_name='RCM_Output',index = False)

worksheet.data_validation(A1:A500, {'validate': 'list',
                                  'source': ['blocked', 'unblocked']})

It it possible to write some sort of dynamic variable in place of the hardcoded cell references?

df.to_excel(writer, sheet_name='RCM_Output',index = False)

worksheet.data_validation('Block Status':1:500, {'validate': 'list',
                                  'source': ['blocked', 'unblocked']})

Solution

  • The best way to do this is to use the row/col notation form of data_validation() and use Pandas to get the index of the column and the dimensions of the dataframe. Something like this:

    import pandas as pd
    
    
    # Create a Pandas dataframe from some data.
    df = pd.DataFrame(
        {
            "Data 1": [10, 20, 30, 20],
            "Data 2": [10, 20, 30, 20],
            "Block Status": ["unblocked", "blocked", "unblocked", "unblocked"],
            "Data 3": [10, 20, 30, 20],
            "Data 4": [10, 20, 30, 20],
        }
    )
    
    # Create a Pandas Excel writer using XlsxWriter as the engine.
    writer = pd.ExcelWriter("pandas_example.xlsx", engine="xlsxwriter")
    
    # Convert the dataframe to an XlsxWriter Excel object.
    df.to_excel(writer, sheet_name="Sheet1")
    
    # Get the xlsxwriter workbook and worksheet objects.
    workbook = writer.book
    worksheet = writer.sheets["Sheet1"]
    
    # Get the max row from the dimensions of the dataframe.
    row_num = df.shape[0]
    
    # Get the column index from the name. We add 1 to account for the index.
    # If the index is turned off then this can be omitted.
    col_num = 1 + df.columns.get_loc("Block Status")
    
    # Add the data validation based on these dimensions.
    worksheet.data_validation(
        1, col_num, row_num, col_num,
        {
            "validate": "list",
            "source": ["blocked", "unblocked"]
        },
    )
    
    # Close the Pandas Excel writer and output the Excel file.
    writer.close()
    
    

    Output:

    enter image description here