I am trying to split a dataset into two separate ones at the minimum of its first column. I used idxmin to find the location of the minimum entry, and then iloc to slice the DataFrame from 0 up to that point.
The error I encounter is:
TypeError: cannot do positional indexing on RangeIndex with these indexers [1 96
dtype: int64] of type Series
An example dataset is as shown:
x y
0 1.000000 6
1 1.000000 2
2 0.999999 5
3 0.999996 3
4 0.999986 4
.. ... ...
196 0.999987 3
197 0.999996 3
198 0.999999 2
199 1.000000 1
200 1.000000 4
The x column starts at 1, decreases to a minimum near zero, and then increases back to 1. I am looking for the row with the smallest x (and its corresponding y) so that I can split the data there.
This is the current code I have written:
data = pd.DataFrame(data)
minimum = pd.DataFrame.idxmin(data)
lower_surface = data.iloc[:minimum]
I understand that the variable minimum will return a location in the DataFrame, and hence I thought I could use iloc to separate the array from the beginning to the minimum point but this is not the case.
You should pick one column as the reference. Called on the whole DataFrame, idxmin returns a Series with one index per column, which cannot be used to slice:
data.idxmin()
x 4
y 199
dtype: int64
You should instead run:
minimum = data['x'].idxmin()
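To make the difference concrete, here is a minimal sketch on a small made-up DataFrame (the values are hypothetical stand-ins, not the asker's data). It shows that the DataFrame-level call gives a Series, while the column-level call gives a single index label:

```python
import pandas as pd

# Hypothetical stand-in data: x falls to a minimum and rises again.
data = pd.DataFrame({"x": [1.0, 0.6, 0.2, 0.5, 0.9], "y": [6, 2, 5, 3, 4]})

per_column = data.idxmin()    # Series: one index label per column
minimum = data["x"].idxmin()  # scalar: index label of the smallest x

print(type(per_column).__name__)
print(minimum)
```

Only the scalar form is usable as a slice bound.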
Also, technically you have to use loc to slice, not iloc, since idxmin returns an index label, not a position.
data.loc[:minimum]
Output:
x y
0 1.000000 6
1 1.000000 2
2 0.999999 5
3 0.999996 3
4 0.999986 4
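Since the goal in the question is to get two datasets, the same label can bound both halves. The following is a sketch on made-up data; the name upper_surface is an assumption added for illustration and does not appear in the question:

```python
import pandas as pd

# Hypothetical stand-in data: x falls to a minimum and rises again.
data = pd.DataFrame({"x": [1.0, 0.6, 0.2, 0.5, 0.9], "y": [6, 2, 5, 3, 4]})

minimum = data["x"].idxmin()  # index label of the smallest x

# Label-based slices include both endpoints, so the minimum row
# appears in both halves.
lower_surface = data.loc[:minimum]
upper_surface = data.loc[minimum:]  # hypothetical name for the second half

print(lower_surface)
print(upper_surface)
```

If the minimum row should belong to only one half, one option is to drop it from the other, for example with upper_surface.iloc[1:].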
If you want to slice with iloc, you have to use numpy.argmin:
import numpy as np
data.iloc[:np.argmin(data['x'])]
The output is, however, slightly different, since iloc excludes the end of the slice:
x y
0 1.000000 6
1 1.000000 2
2 0.999999 5
3 0.999996 3
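With the default RangeIndex, labels and positions happen to coincide, which hides the distinction. A small sketch with a made-up, non-default index (an assumption for illustration only) shows why the label from idxmin belongs with loc and the position from argmin belongs with iloc:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-default index, so labels and positions differ.
data = pd.DataFrame(
    {"x": [1.0, 0.6, 0.2, 0.5, 0.9], "y": [6, 2, 5, 3, 4]},
    index=[10, 11, 12, 13, 14],
)

label = data["x"].idxmin()                        # index label of the minimum
position = int(np.argmin(data["x"].to_numpy()))   # integer position of the minimum

by_label = data.loc[:label]         # label slice, end included
by_position = data.iloc[:position]  # positional slice, end excluded

print(label, position)
print(len(by_label), len(by_position))
```

To make the iloc result include the minimum row as well, slice with position + 1 as the end.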