I am trying to simply divide two columns element-wise, but for some reason this returns two columns instead of one as I would expect.
I think it has something to do with the fact that I need to create the DataFrame iteratively, so I opted for appending rows one at a time. Here's some testing code:
import pandas as pd
df = pd.DataFrame(columns=['image_name partition zeros ones total'.split()])
# Create a DataFrame
data = {
    'dataset': ['177.png', '276.png', '208.png', '282.png'],
    'partition': ['green', 'green', 'green', 'green'],
    'zeros': [1896715, 1914720, 1913894, 1910815],
    'ones': [23285, 5280, 6106, 9185],
    'total': [1920000, 1920000, 1920000, 1920000]
}
for i in range(len(data['ones'])):
    row = []
    for k in data.keys():
        row.append(data[k][i])
    df = df.append(pd.Series(row, index=df.columns), ignore_index=True)
df_check = pd.DataFrame(data)
df_check["result"] = df_check["zeros"] / df_check["total"]
df["result"] = df["zeros"] / df["total"]
df
If you run this, you'll see that everything works as expected with df_check, but the code fails when it gets to df["result"] = df["zeros"] / df["total"]:
ValueError: Cannot set a DataFrame with multiple columns to the single column result
In fact, if I inspect the result of the division, I see two columns with all missing values:
>>> df["zeros"] / df["total"]
   total  zeros
0    NaN    NaN
1    NaN    NaN
2    NaN    NaN
3    NaN    NaN
Any suggestion why this happens and how to fix it?
The problem is the following line:
df = pd.DataFrame(columns=['image_name partition zeros ones total'.split()])
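The split() method already returns a list, so wrapping its result in [...] produces a nested list, which pandas interprets as a MultiIndex rather than as plain column labels. That is why df["zeros"] comes back as a one-column DataFrame instead of a Series, and dividing two DataFrames with different column labels aligns them into two all-NaN columns. Drop the outer brackets and use the following: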
df = pd.DataFrame(columns='image_name partition zeros ones total'.split())
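To see the difference for yourself, here is a minimal sketch that checks the column type in both cases and then rebuilds a small frame. The row-collecting part is my own suggestion rather than something from the question: DataFrame.append was removed in pandas 2.0, so the sketch gathers rows in a plain list (the names rows and records are just illustrative) and constructs the DataFrame once at the end. Only two of the sample rows are used to keep it short.

```python
import pandas as pd

names = 'image_name partition zeros ones total'.split()

# Nested list -> one-level MultiIndex; flat list -> plain Index.
nested = pd.DataFrame(columns=[names])
flat = pd.DataFrame(columns=names)
nested_is_multi = isinstance(nested.columns, pd.MultiIndex)
flat_is_multi = isinstance(flat.columns, pd.MultiIndex)
print(nested_is_multi, flat_is_multi)

# Collect rows iteratively, then build the DataFrame once.
# (Assumption: this replaces the df.append loop from the question.)
rows = [
    ['177.png', 'green', 1896715, 23285, 1920000],
    ['276.png', 'green', 1914720, 5280, 1920000],
]
records = []
for row in rows:
    records.append(dict(zip(names, row)))
df = pd.DataFrame(records, columns=names)

# With flat column labels, each selection is a Series and the
# element-wise division yields a single column.
df["result"] = df["zeros"] / df["total"]
print(df["result"])
```

Building the frame in one go also tends to be faster than growing it row by row, since each append-style operation copies the whole frame.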