Im comparing two HR related excel sheets. I'm comparing a previous weeks info to a current weeks info to ensure people were promoted as they should be.
Overall it produces an excel sheet that combines both dataframes and highlights the differences of any employee with a promotion or step increase. This works, but I have to go into the excel sheet after and manually make formatting changes before I can provide a "final copy" for others to use.
My issue is there is a function that highlights differences in pay settings between the two dataframes and highlights them. Prior to exporting to_excel, I am trying to make changes to that excel file using some type of formatting. This produces an error because the color_changes = merger_changes.style.apply(highlight, axis = None) produces a Styler, which openpyxl wont work with because its a styler.
I tried different products to get the changed columns to highlight and then create a final excel product using openpyxl that will produce an excel sheet that I dont have to edit after. I keep getting errors because openpyxl wont work with a styler. Am I coding this out of order? Will another product work with a styler rather than openpyxl or am I just missing a block of code to make this all work together?
I want it to not only highlight the differences in the "new step" and "new grade" but then export to an excel sheet that I can adjust the format (column widths and date format) prior to exporting to_excel so the excel sheet that it exports is a final product.
import pandas as pd
import datetime as dt
#new_WFRL
new_WFRL = pd.read_excel("H:/DIR/Human Resources/HR Audits/Raw Files/WFRL 6.24.24.xlsx", usecols=["Empl ID", "Name", "Grade", "Step", "Grade Entry Date", "Step Date", "WGI Due Dt"], parse_dates=["Grade Entry Date", "Step Date", "WGI Due Dt"], date_format="%m/%d/%y", index_col=False).sort_values("Name")
new_WFRL = new_WFRL.add_prefix("New ")
new_WFRL[['New Grade Entry Date', 'New Step Date', 'New WGI Due Dt']] = new_WFRL[['New Grade Entry Date', 'New Step Date', 'New WGI Due Dt']].apply(pd.to_datetime, format='mixed')
#old_WFRL
old_WFRL = pd.read_excel("H:/DIR/Human Resources/HR Audits/Raw Files/WFRL 6.14.24.xlsx", usecols=["Empl ID", "Name", "Grade", "Step", "Grade Entry Date", "Step Date", "WGI Due Dt"], parse_dates=["Grade Entry Date", "Step Date", "WGI Due Dt"], date_format="%m/%d/%y", index_col=False).sort_values("Name")
old_WFRL = old_WFRL.add_prefix("Old ")
old_WFRL[['Old Grade Entry Date', 'Old Step Date', 'Old WGI Due Dt']] = old_WFRL[['Old Grade Entry Date', 'Old Step Date', 'Old WGI Due Dt']].apply(pd.to_datetime, format='mixed')
merged_WFRL = pd.merge(old_WFRL, new_WFRL, how = "outer", left_on = "Old Empl ID", right_on = "New Empl ID")
merged_WFRL
#this is the function that will show a change
def compare_WFRL(df):
if df["Old Grade"] == df["New Grade"] and df["Old Step"]== df["New Step"]:
return 1
else:
return 0
#applies the above function
merged_WFRL["changes"] = merged_WFRL.apply(compare_WFRL, axis =1)
#filters all the rows so changes are the only ones left
merged_changes = merged_WFRL[merged_WFRL["changes"] == 0]
merged_changes
def highlight(df):
grade_mismatch = df['Old Grade'] != df['New Grade']
step_mismatch = df['Old Step'] != df['New Step']
color = df.copy()
color.loc[:, :] = ''
color.loc[grade_mismatch, 'New Grade'] = 'background-color: yellow'
color.loc[step_mismatch, 'New Step'] = 'background-color: red'
return color
merged_changes.pop('changes')
#applies the correct style as designed above in "highlight"
#this creates the dataframe into a styler - openpyxl wont work with stylers
color_changes = merged_changes.style.apply(highlight, axis = None)
#index=False drops the index column when exporting to excel.
color_changes.to_excel("H:/DIR/Human Resources/HR Audits/WFRL Comparison Reports/6.24.24v6.xlsx", index=False)
I would suggest to save the dataframe first using the pandas to_excel() function.
Then using the openpyxl library to format the excel.
import pandas as pd
import datetime as dt
#new_WFRL
new_WFRL = pd.read_excel("H:/DIR/Human Resources/HR Audits/Raw Files/WFRL 6.24.24.xlsx", usecols=["Empl ID", "Name", "Grade", "Step", "Grade Entry Date", "Step Date", "WGI Due Dt"], parse_dates=["Grade Entry Date", "Step Date", "WGI Due Dt"], date_format="%m/%d/%y", index_col=False).sort_values("Name")
new_WFRL = new_WFRL.add_prefix("New ")
new_WFRL[['New Grade Entry Date', 'New Step Date', 'New WGI Due Dt']] = new_WFRL[['New Grade Entry Date', 'New Step Date', 'New WGI Due Dt']].apply(pd.to_datetime, format='mixed')
#old_WFRL
old_WFRL = pd.read_excel("H:/DIR/Human Resources/HR Audits/Raw Files/WFRL 6.14.24.xlsx", usecols=["Empl ID", "Name", "Grade", "Step", "Grade Entry Date", "Step Date", "WGI Due Dt"], parse_dates=["Grade Entry Date", "Step Date", "WGI Due Dt"], date_format="%m/%d/%y", index_col=False).sort_values("Name")
old_WFRL = old_WFRL.add_prefix("Old ")
old_WFRL[['Old Grade Entry Date', 'Old Step Date', 'Old WGI Due Dt']] = old_WFRL[['Old Grade Entry Date', 'Old Step Date', 'Old WGI Due Dt']].apply(pd.to_datetime, format='mixed')
merged_WFRL = pd.merge(old_WFRL, new_WFRL, how = "outer", left_on = "Old Empl ID", right_on = "New Empl ID")
merged_WFRL
#this is the function that will show a change
def compare_WFRL(df):
if df["Old Grade"] == df["New Grade"] and df["Old Step"]== df["New Step"]:
return 1
else:
return 0
#applies the above function
merged_WFRL["changes"] = merged_WFRL.apply(compare_WFRL, axis =1)
#filters all the rows so changes are the only ones left
merged_changes = merged_WFRL[merged_WFRL["changes"] == 0]
# Export DataFrame to Excel
export_file = "H:/DIR/Human Resources/HR Audits/WFRL Comparison Reports/6.24.24v6.xlsx"
merged_changes.to_excel(export_file, index=False)
######################################################################################################
from openpyxl import load_workbook
from openpyxl.styles import PatternFill
# Load the workbook and select the active worksheet
wb = load_workbook(export_file)
ws = wb.active
# Define cell styles
yellow_fill = PatternFill(start_color="FFFF00", end_color="FFFF00", fill_type="solid")
red_fill = PatternFill(start_color="FF0000", end_color="FF0000", fill_type="solid")
# Apply conditional formatting
for row in range(2, len(merged_changes) + 2):
old_grade_cell = ws.cell(row=row, column=1) # Old Grade
new_grade_cell = ws.cell(row=row, column=2) # New Grade
old_step_cell = ws.cell(row=row, column=3) # Old Step
new_step_cell = ws.cell(row=row, column=4) # New Step
# Check for New Grade not equal to Old Grade
if new_grade_cell.value != old_grade_cell.value:
new_grade_cell.fill = yellow_fill
# Check for New Step not equal to Old Step
if new_step_cell.value != old_step_cell.value:
new_step_cell.fill = red_fill
# Save the workbook
wb.save(export_file)
I have assumed only 4 columns to be present in the merged_changes dataframe in the sequence of 'Old Grade', 'New Grade', 'Old Step', 'New Step'