In a grouped (by id
) data frame df
id, last_name, first_name
sdfg, Muller, Peter
fgjh, Smith, Rob
fgjh, Smith, Robert
hjsa, Jackson, Michael
I want to add another index person
at the very beginning as an ongoing number, so that we'll get
person, id, last_name, first_name
0, sdfg, Muller, Peter
1, fgjh, Smith, Rob
1, fgjh, Smith, Robert
2, hjsa, Jackson, Michael
I tried to use reset_index(...)
, but this sets an ongoing number regardless of the grouping.
Is it possible at all?
Here's one way to do this:
df.set_index((~df.index.duplicated()).cumsum()-1, append=True).swaplevel()
# last_name first_name
# id
# 0 sdfg Muller Peter
# 1 fgjh Smith Rob
# fgjh Smith Robert
# 2 hjsa Jackson Michael
This approach works by negating the result of df.index.duplicated
and taking the cumulative sum (minus 1) to generate an array that increments each time the value in the index changes. It then adds it to your existing index with set_index
and append=True
, before finally using swaplevel
so that the id
portion of your new MultiIndex is at level 1.