I would like to use a MultiIndex DataFrame to easily select portions of the DataFrame. I created an empty DataFrame as follows:
import pandas as pd
mi = {'input':['a','b','c'],'optim':['pareto','alive']}
mi = pd.MultiIndex.from_tuples([(c,k) for c in mi.keys() for k in mi[c]])
mc = pd.MultiIndex(names=['Generation','Individual'],labels=[[],[]],levels=[[],[]])
population = pd.DataFrame(index=mi,columns=mc)
which seems fine. However, I do not know how to insert a single value to start populating my DataFrame. I tried the following:
population.loc[('optim','pareto'),(0,0)]=True
where I tried to define a new two-level column index (0,0), which led to a NotImplementedError. I also tried with (0,1), which gave a ValueError.
I also tried without any column index:
population.loc[('optim','pareto')]=True
which gave no error, but did not change the DataFrame either. Any help? Thanks in advance.
EDIT To clarify my question, once populated, my DataFrame should look like this:
Generation       1                  2
Individual       1     2     3      4    5    6
input a          1     1     2      ...
      b          1     2     2      ...
      c          1     1     2      ...
optim pareto     True  True  False  ...
      alive      True  True  False  ...
EDIT 2 I found out that what I was doing works if I define my first column at the DataFrame creation. In particular with:
mc = pd.MultiIndex.from_tuples([(0,0)])
I get a first column full of NaN, and I can then add data as I wanted to (also for new columns):
population.loc[('optim','pareto'),(0,1)]=True
I still do not know what is wrong with my first definition...
Even if I do not know why my initial definition was wrong, the following works as expected:
mi = {'input':['a','b','c'],'optim':['pareto','alive']}
mi = pd.MultiIndex.from_tuples([(c,k) for c in mi.keys() for k in mi[c]])
mc = pd.MultiIndex.from_tuples([(0,0)],names=['Generation','Individual'])
population = pd.DataFrame(index=mi,columns=mc)
It looks like the solution was to initialize the columns when creating the DataFrame (here with a single (0,0) column). The created DataFrame is then:
Generation      0
Individual      0
input a       NaN
      b       NaN
      c       NaN
optim pareto  NaN
      alive   NaN
which can then be populated by adding values to the current column or to new columns/rows.
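As a variation on the same idea, here is a minimal sketch that declares several columns up front and then fills and selects individual cells. It assumes the (Generation, Individual) pairs are known in advance; the specific pairs and the name `groups` are made up for illustration, following the layout shown in the question. I have not checked how this compares with adding columns one at a time.

```python
import pandas as pd

# Row index: same two-level layout as in the question.
groups = {'input': ['a', 'b', 'c'], 'optim': ['pareto', 'alive']}
mi = pd.MultiIndex.from_tuples([(g, k) for g, ks in groups.items() for k in ks])

# Assumption: the (Generation, Individual) pairs are known up front.
mc = pd.MultiIndex.from_tuples(
    [(1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (2, 6)],
    names=['Generation', 'Individual'],
)

# dtype=object lets a single column hold both numbers and booleans.
population = pd.DataFrame(index=mi, columns=mc, dtype=object)

# Fill single cells with a full row key and a full column key.
population.loc[('input', 'a'), (1, 1)] = 1
population.loc[('optim', 'pareto'), (1, 1)] = True

# Select portions of the frame by the outer level of either axis.
inputs = population.loc['input']   # rows a, b, c
generation_1 = population[1]       # columns for Generation 1
```

Selecting by the outer level (`population.loc['input']` or `population[1]`) is what makes the two-level layout convenient for slicing by group or by generation.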