
What is the level parameter in df.reset_index(), and what is it used for?


The pandas.DataFrame.reset_index documentation describes the 'level' parameter as: "Only remove the given levels from the index. Removes all levels by default." But when I ran df.reset_index(level = 0), it appeared to do nothing: the result was the same as calling reset_index() without it.

Below are the results without and with the level parameter.

>>> df.reset_index(inplace=True)

      index    y     proba  y_cap
0       1664  1.0  0.899965      1
1       2099  1.0  0.899828      1
2       1028  1.0  0.899825      1
3       9592  1.0  0.899812      1
4       8324  1.0  0.899768      1
...      ...  ...       ...    ...
10095   8294  1.0  0.500081      1
10096   1630  1.0  0.500058      1
10097   7421  1.0  0.500058      1
10098    805  1.0  0.500047      1
10099   5012  1.0  0.500019      1

>>> df.reset_index(level=0, inplace=True)

       index    y     proba  y_cap
0       1664  1.0  0.899965      1
1       2099  1.0  0.899828      1
2       1028  1.0  0.899825      1
3       9592  1.0  0.899812      1
4       8324  1.0  0.899768      1
...      ...  ...       ...    ...
10095   8294  1.0  0.500081      1
10096   1630  1.0  0.500058      1
10097   7421  1.0  0.500058      1
10098    805  1.0  0.500047      1
10099   5012  1.0  0.500019      1

Also, if I run either of the code blocks below a second time:

>>> df.reset_index(inplace=True)

OR

>>> df.reset_index(level=0, inplace=True)

I get the output below, with a level_0 column added that contains seemingly random values.

       level_0  index    y  proba  y_cap
0            0   1664  1.0    1.0      1
1         3808   2280  1.0    1.0      1
2         3828   6394  1.0    1.0      1
3         3827   3410  1.0    1.0      1
4         3826   4992  1.0    1.0      1
...        ...    ...  ...    ...    ...
10095     7193   5399  1.0    0.0      1
10096     7194   1801  1.0    0.0      1
10097     7195   3777  1.0    0.0      1
10098     7196   3314  1.0    0.0      1
10099    10099   5012  1.0    0.0      1

And if I run the code a third time, it raises the error below:

cannot insert level_0, already exists

Please help me understand the significance of the level parameter and when it is used, and why re-running my code adds the level_0 column.


Solution

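  • The level parameter is only useful for DataFrames with a MultiIndex. On a single-level index, level=0 selects the only level there is, so the result is the same as a plain reset_index().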
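
To make this concrete, here is a minimal sketch on a small made-up frame. The index names "group" and "item" and all the values are assumptions for illustration, not the asker's data. It shows level=0 moving only one level of a MultiIndex into a column, and it also reproduces the level_0 column and the "already exists" error from the question on a single-level index.

```python
import pandas as pd

# Hypothetical two-level index; the names "group" and "item" are made up.
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["group", "item"]
)
df = pd.DataFrame({"y": [1.0, 0.0, 1.0]}, index=idx)

# level=0 moves only the outer level ("group") into a column;
# "item" stays behind as the index.
partial = df.reset_index(level=0)

# Without level, every index level becomes a column
# and the frame gets a fresh 0..n-1 index.
full = df.reset_index()

# Single-level, unnamed index (like the one in the question):
# level=0 is the only level, so both calls give the same result.
single = pd.DataFrame({"y": [1.0, 0.0]}, index=[1664, 2099])
same = single.reset_index(level=0).equals(single.reset_index())

# A second reset finds the name "index" already taken by a column,
# so the old 0..n-1 index is inserted as "level_0" instead.
twice = single.reset_index().reset_index()

# A third reset cannot reuse "level_0" either and raises ValueError.
try:
    twice.reset_index()
    third_error = None
except ValueError as exc:
    third_error = str(exc)

# drop=True discards the old index instead of inserting it as a column.
dropped = single.reset_index().reset_index(drop=True)

print(partial)
print(full)
print(same)
print(twice.columns.tolist())
print(third_error)
print(dropped.columns.tolist())
```

So the values in level_0 are not random: they are whatever the index held just before the second reset. If the frame was sorted or filtered between the two calls, they will look shuffled. Passing drop=True, or avoiding repeated inplace resets, keeps the extra column from appearing.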