I'm new to pandas so apologies if my explanations of things are wrong.
I have a data frame created as follows:
arrays = [array1, array2]
index = pd.MultiIndex.from_arrays(arrays, names = ("name1", "name2"))
df = pd.DataFrame({"name3": array3, "name4": array4}, index=index)
Then I perform a weighted mean, using the indices, using code from the second top answer here.
df2 = df.groupby(df.index).apply(lambda x: np.average(x.name3, weights=x.name4))
print(df2)
The output on the console looks like this:
(1, 2) 3
(4, 5) 6
(7, 8) 9
where (x,y) are the indices that I have grouped by and the number at the end is the weighted mean.
When I export to a .csv file, I get a file that looks like this:
,0
"(1, 2)",3
"(4, 5)",6
"(7, 8)",9
This is not what I want. I would like to get a .csv file that looks like this:
name1,name2,avg
1,2,3
4,5,6
7,8,9
I've tried using reset.index() but this does not work. I want to remove the brackets, quotation marks and the rogue ,0 at the start of the .csv file. How can I do this? Many thanks in advance.
Use, df.groupby
level instead of indices:
df2 = df.groupby(level=df.index.names).apply(lambda x: np.average(x.name3, weights=x.name4))
# save the df2 to csv file
df2.rename('avg').reset_index().to_csv('data.csv', index=False)