I need to calculate the value which has the similar name in the header. Here i have Bill and Non Bill fields in the column. I need to calculate all the Bill separately and non-bill separately and need to sum in another column Bill Amt Total and Non Bill Amt Total using xlswrriter in python.
Input file
Name | Bill - Php | Non Bill - Php | Bill - JS | Non Bill -JS
Alex | 30 | | 10 |
Ram | 10 | 20 | |
Stephen | | | 20 |
Robert | | 10 | | 10
Mohan | | 20 | 10 |
Output file:
Name | Bill - Php | Non Bill - Php | Bill - JS | Non Bill -JS | Bill Total Amt | Non Bill Total Amt
Alex | 30 | | 10 | | 40 |
Ram | 10 | 20 | | | 10 | 20
Stephen | | | 20 | | | 20
Robert | | 10 | | 10 | | 20
Mohan | | 20 | 10 | | 10 | 20
Simply select columns by string prefix (str.startswith()
) and then perform horizontal sum by df[selected_columns].sum(axis=1)
.
In addition, Pandas
comes with Excel save/load capability, so you don't really need to import xlsxwriter
or openpyxl
.
The data is reproduced in Excel format.
import pandas as pd
import io
import numpy as np
df = pd.read_csv(io.StringIO("""
Name | Bill - Php | Non Bill - Php | Bill - JS | Non Bill -JS
Alex | 30 | | 10 |
Ram | 10 | 20 | |
Stephen | | | 20 |
Robert | | 10 | | 10
Mohan | | 20 | 10 |
"""), sep=r"\|\s*", engine='python')
# cleanup
df.columns = [c.strip() for c in df.columns]
df["Name"] = df["Name"].str.strip()
# save .xlsx
df.to_excel("/mnt/ramdisk/data.xlsx", index=False)
# load .xlsx
df = pd.read_excel("/mnt/ramdisk/data.xlsx")
for prefix in ("Bill", "Non Bill"):
# select the columns to be summed
cols_to_sum = [c for c in df.columns if c.startswith(prefix)]
# new column name
col = f"{prefix} Amt Total"
# sum the selected columns horizontally
df[col] = df[cols_to_sum].sum(axis=1)
# (optional) replace 0 with nan
df[col] = df[col].replace({0.0: np.nan})
# save a new file
df.to_excel("/mnt/ramdisk/out.xlsx", index=False)
Check out the columns added:
print(df.iloc[:,-2:])
# Out[219]:
# Bill Amt Total Non Bill Amt Total
# 0 40.0 NaN
# 1 10.0 20.0
# 2 20.0 NaN
# 3 NaN 20.0
# 4 10.0 20.0