How to split an array using its minimum entry

I am trying to split a dataset into two separate ones at the minimum of its first column. I used idxmin to find the location of the minimum entry, then iloc to slice the array from 0 up to that point.

The error I encounter is:

TypeError: cannot do positional indexing on RangeIndex with these indexers [1 96

dtype: int64] of type Series

An example dataset is as shown:

            x  y
0    1.000000  6
1    1.000000  2
2    0.999999  5
3    0.999996  3
4    0.999986  4
..        ...  ..
196  0.999987  3
197  0.999996  3
198  0.999999  2
199  1.000000  1
200  1.000000  4

The x column starts at 1, decreases to a minimum near zero, then increases back to 1. I want to find the smallest x (and its corresponding y) and split the data into two parts at that row.

This is the current code I have written:

data = pd.DataFrame(data)

minimum = pd.DataFrame.idxmin(data)

lower_surface = data.iloc[:minimum]

I understand that the variable minimum will return a location in the DataFrame, and hence I thought I could use iloc to separate the array from the beginning to the minimum point but this is not the case.

Solution

You should pick one column as the reference. Calling idxmin on the whole DataFrame returns one index label per column (a Series), which cannot be used as a slice bound:

data.idxmin()

x      4
y    199
dtype: int64

You should instead run:

minimum = data['x'].idxmin()

Also, technically you have to slice with loc, not iloc, since idxmin returns an index label, not a position.

data.loc[:minimum]

Output:

          x  y
0  1.000000  6
1  1.000000  2
2  0.999999  5
3  0.999996  3
4  0.999986  4
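
To get both halves, the same label can serve as the start of a second slice. This is a minimal sketch on a small made-up frame (the values below are illustrative, not the asker's data); note that because loc includes the end label, the minimum row appears in both parts:

```python
import pandas as pd

# Hypothetical stand-in data: x falls to a minimum and rises again.
data = pd.DataFrame({
    "x": [1.0, 0.8, 0.5, 0.1, 0.4, 0.9, 1.0],
    "y": [6, 2, 5, 3, 4, 3, 1],
})

# Index label of the smallest x (a scalar, not a Series).
minimum = data["x"].idxmin()

# Label-based slices: both ends are inclusive with loc.
lower_surface = data.loc[:minimum]   # rows 0..3
upper_surface = data.loc[minimum:]   # rows 3..6

print(lower_surface)
print(upper_surface)
```

If the minimum row should belong to only one half, one option is to drop it from the other with something like upper_surface.iloc[1:].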

If you want to slice with iloc, you need a position rather than a label; numpy.argmin gives you one:

import numpy as np

data.iloc[:np.argmin(data['x'])]

The output is slightly different, however: loc includes the end label of a slice, whereas iloc excludes the end position, so the row holding the minimum (index 4) is dropped:

          x  y
0  1.000000  6
1  1.000000  2
2  0.999999  5
3  0.999996  3
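
The label/position distinction only becomes visible when the index is not the default 0..n-1. The sketch below uses a made-up frame with a hypothetical index starting at 10 to show the two values diverging, and adds 1 to the position so the iloc slice also includes the minimum row:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-default index, so labels != positions.
data = pd.DataFrame(
    {"x": [1.0, 0.8, 0.5, 0.1, 0.4, 0.9, 1.0], "y": [6, 2, 5, 3, 4, 3, 1]},
    index=[10, 11, 12, 13, 14, 15, 16],
)

label = data["x"].idxmin()                        # index label of the minimum
position = int(np.argmin(data["x"].to_numpy()))   # zero-based position of the minimum

by_label = data.loc[:label]              # loc: end label is included
by_position = data.iloc[:position + 1]   # iloc: end is excluded, hence the +1

print(label, position)
print(by_label.equals(by_position))
```

Here label is 13 while position is 3, and the two slices select the same four rows.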