Using iloc to replace a range of rows in the dataframe is not saved when assigning that index range to the original dataframe

I have a dataframe with the following structure:

date    position    quantity    Unit
0   2015-01-01 00:00:00 1   720 MAW

Based on the conditions I set I wish to reorder dates during the daylight savings time in October (1 hour gets added here) based on the "position" column. So I subseted the dataframe and got the index values back:

indexData = data[(data.date.dt.hour >= 2) & (data.date.dt.hour < 3) & (data.date.dt.day == 25) & (data.date.dt.month == 10) & (data.date.dt.year == 2015)].index

So when I do this I can see that the range is ordered correctly (but not yet replaced in the dataframe):

data.iloc[indexData[0]:(indexData[-1])+1].sort_values("position")

date    position    quantity    Unit
28521   2015-10-25 02:00:00 25305   420 MAW
28523   2015-10-25 02:15:00 25306   418 MAW
28524   2015-10-25 02:30:00 25307   415 MAW
28526   2015-10-25 02:45:00 25308   415 MAW
28520   2015-10-25 02:00:00 25309   410 MAW
28522   2015-10-25 02:15:00 25310   407 MAW
28525   2015-10-25 02:30:00 25311   399 MAW
28527   2015-10-25 02:45:00 25312   378 MAW

BUT when I try to assign this range to the original data drame it does not change:

data.iloc[indexData[0]:(indexData[-1])+1] = data.iloc[indexData[0]:(indexData[-1])+1].sort_values("position")

# here I check the range if updated
data[(data.date.dt.hour >= 2) & (data.date.dt.hour < 3) & (data.date.dt.day == 25) & (data.date.dt.month == 10) & (data.date.dt.year == 2015)]

date    position    quantity    Unit
28520   2015-10-25 02:00:00 25309   410 MAW
28521   2015-10-25 02:00:00 25305   420 MAW
28522   2015-10-25 02:15:00 25310   407 MAW
28523   2015-10-25 02:15:00 25306   418 MAW
28524   2015-10-25 02:30:00 25307   415 MAW
28525   2015-10-25 02:30:00 25311   399 MAW
28526   2015-10-25 02:45:00 25308   415 MAW
28527   2015-10-25 02:45:00 25312   378 MAW

Any ideas where I am going wrong or should try?

Solution

Let's assume we have data like the following and we want to sort by position rows only for October (basically we want to swap lines 4 and 5):

date    position    quantity    Unit
0   2015-09-25 02:15:00 25310   407 MAW
1   2015-09-25 02:00:00 25309   410 MAW
2   2015-09-25 02:30:00 25311   399 MAW
3   2015-09-25 02:45:00 25312   378 MAW
4   2015-10-25 02:15:00 25306   418 MAW
5   2015-10-25 02:00:00 25305   420 MAW
6   2015-10-25 02:30:00 25307   415 MAW
7   2015-10-25 02:45:00 25308   415 MAW

Choosing October with some other filters you applied (basically using your logic here):

subset_index = data[(data.date.dt.hour >= 2) & (data.date.dt.hour < 3) & (data.date.dt.day == 25) & (data.date.dt.month == 10) & (data.date.dt.year == 2015)].index.values

subset_index is array([4, 5, 6, 7]) - these are rows we want to sort.

Having extracted index values to sort, you have two options:

Option 1: keep the index and reassign values in the right order(right index).

Just assign the values(numpy array) in the correct order to a dataframe subset:

data.loc[subset_index] = data.loc[subset_index].sort_values("position").values

The result is

date    position    quantity    Unit
0   2015-09-25 02:15:00 25310   407 MAW
1   2015-09-25 02:00:00 25309   410 MAW
2   2015-09-25 02:30:00 25311   399 MAW
3   2015-09-25 02:45:00 25312   378 MAW
4   2015-10-25 02:00:00 25305   420 MAW
5   2015-10-25 02:15:00 25306   418 MAW
6   2015-10-25 02:30:00 25307   415 MAW
7   2015-10-25 02:45:00 25308   415 MAW

As you can see only October rows have been sorted by position, September rows 0 and 1 were left unsorted.

Option 2: sort the index in the right order and reindex the dataframe.

Create a new sorted index order for this subset

reordered_subset_index = data.loc[subset_index].sort_values("position").index.values

reordered_subset_index is array([5, 4, 6, 7]). Now we need to create a correct index order for the whole dataframe:

new_index = data.index.values
new_index[subset_index] = new_index[reordered_subset_index]

new_index is array([0, 1, 2, 3, 5, 4, 6, 7]). This index defines the correct order of rows in our initial dataframe.

So now we can reindex our dataframe and just sort it by our new index:

data.reindex(new_index).sort_index()

The result is the same as in Option 1.