I have a data set consisting of 10 columns and I am currently using the groupby() function to get averages of the first 6 columns and then using groupby() to get min and max values of the other 4 columns. The issue is when I try to put the results into a new excel sheet, I keep getting errors that 'list' object has no attribute 'to_excel'
avg = df.groupby('column1')[['column2' , 'column3', 'column4' ,'column5', 'column6', 'column7' ]].mean()
c8= (df.assign(c8=df['column8'].abs())
.groupby(['column1'])['column8'].agg([('min' , 'min'), ('max', 'max')])
.add_prefix('c8'))
c9= (df.assign(c9=df['column9'].abs())
.groupby(['column1'])['column9'].agg([('min' , 'min'), ('max', 'max')])
.add_prefix('c9'))
c10= (df.assign(c10=df['column'].abs())
.groupby(['column1'])['column10'].agg([('min' , 'min'), ('max', 'max')])
.add_prefix('c10'))
df = [avg , c8, c9, c10]
#print(df)
wb = pd.ExcelWriter('C:filepath.xlsx' , engine='xlsxwriter')
df.to_excel(wb,sheet_name='Results')
wb.save()
Whenever I run any of the functions by themselves, they all work fine, the issue is when I put them together.
What I am expecting is an excel sheet filled with the corresponding data presented like this:
column1 column2 column3 column4 column5 column6 column7 c8min c8max c9min c9max c10min c10max
I would do it this way with GroupBy.agg
in a single shot :
fmin = lambda x: abs(x).min()
fmax = lambda x: abs(x).max()
out = (
df.groupby("column1", as_index=False)
.agg(column2=("column2", "mean"), column3=("column3", "mean"),
column4=("column4", "mean"), column5=("column5", "mean"),
column6=("column6", "mean"), column7=("column7", "mean"), # avg
c8min=("column8", fmin), c8max=("column8", fmax), # c8
c9min=("column9", fmin), c9max=("column9", fmax), # c9
c10min=("column10", fmin), c10max=("column10", fmax)) # c10
)
out.to_excel("C:/filepath.xlsx", sheet_name="Results", index=False)
Or following your approach, you can use concat
to avoid the AttributeError
:
list_dfs = [avg , c8, c9, c10]
(pd.concat(list_dfs, axis=1).reset_index()
.to_excel("C:/filepath.xlsx", sheet_name="Results", index=False))
By the way, why you're assigning the new columns cx
and not using them ? I suppose you want something like :
#Example with c8
c8= (df.assign(c8=df["column8"].abs())
.groupby(["column1"])["c8"].agg([("min" , "min"), ("max", "max")])
.add_prefix("c8"))