I have the following question. I'm working on a pandas time series that has a histogram column whose values are sometimes positive and sometimes negative. I need to fill a new column with the local max/min value for each time window, where a window runs from the row where the sign last flipped (positive to negative, or vice versa) to the row where it flips again, so the window length is constantly changing. I also need to use a pandas or numpy method for efficiency.
I've been experimenting by making an auxiliary column with df.loc, filled with the row position of the last time the sign changed from positive to negative or vice versa, like this:
df.loc[(df.Histogram.shift(1) > 0) & (df.Histogram < 0), 'LOC'] = df.index.get_loc(df.Histogram)
df.LOC.fillna(method='ffill')
(which resulted in an error), intending to then calculate the differences between these row positions to define the current max/min window and apply df.Histogram.rolling(loc_differences).max(). But I failed to make it work: .rolling only accepts a fixed window size, and I couldn't fill the column with locations in the first place. I suspect there must be a simple solution for this. This is an example of what I'm looking for:
Date Histogram Max/Min Value
01/02/2021 0.2 0.7
02/02/2021 0.3 0.7
03/02/2021 0.7 0.7
04/02/2021 0.2 0.7
05/02/2021 -0.2 -0.5
06/02/2021 -0.5 -0.5
07/02/2021 -0.1 -0.5
08/02/2021 0.4 0.4
09/02/2021 0.3 0.4
10/02/2021 -0.2 -0.2
11/02/2021 0.2 0.7
12/02/2021 0.7 0.7
13/02/2021 0.2 0.7
14/02/2021 0.3 0.7
15/02/2021 0.6 0.7
16/02/2021 0.2 0.7
17/02/2021 -0.2 -0.5
18/02/2021 -0.5 -0.5
19/02/2021 -0.1 -0.5
20/02/2021 0.4 0.4
21/02/2021 0.3 0.4
22/02/2021 -0.2 -0.3
23/02/2021 -0.1 -0.3
24/02/2021 -0.3 -0.3
25/02/2021 -0.1 -0.3
26/02/2021 0.2 0.3
27/02/2021 0.1 0.3
28/02/2021 0.3 0.3
Is there a solution for this? Thanks in advance.
Here's a handy way to split your histogram data into groups of positive/negative values. Each time the grp column increments, the histogram column has changed sign, and all rows with the same grp value belong to the same interval between two sign changes.
df['grp'] = (df.Histogram > 0).astype(int).diff().abs().cumsum().fillna(0)
df.head(10)
Date Histogram grp
0 01/02/2021 0.2 0.0
1 02/02/2021 0.3 0.0
2 03/02/2021 0.7 0.0
3 04/02/2021 0.2 0.0
4 05/02/2021 -0.2 1.0
5 06/02/2021 -0.5 1.0
6 07/02/2021 -0.1 1.0
7 08/02/2021 0.4 2.0
8 09/02/2021 0.3 2.0
9 10/02/2021 -0.2 3.0
The choice of (df.Histogram > 0) is arbitrary and counts zeroes as negative values. A histogram sequence of 0.2, 0, 0.4, 0.3 would yield groups [0, 1, 2, 2], while a sequence of -0.2, 0, -0.4, -0.3 would yield a single group. You'll have to decide whether that is fine for your problem.
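If zeros should instead stay with the preceding run rather than break it, one option is to build the groups from np.sign with zeros forward-filled. A minimal sketch (the hist series here is made up for illustration):

```python
import numpy as np
import pandas as pd

hist = pd.Series([0.2, 0.0, 0.4, 0.3])

# (hist > 0) treats the zero as negative, so the zero starts its own group:
grp = (hist > 0).astype(int).diff().abs().cumsum().fillna(0)
print(grp.tolist())  # [0.0, 1.0, 2.0, 2.0]

# Alternative: take the sign, turn zeros into NaN and forward-fill,
# so a zero inherits the sign of the previous row:
sign = np.sign(hist).replace(0, np.nan).ffill()
grp_alt = sign.diff().ne(0).cumsum()
print(grp_alt.tolist())  # [1, 1, 1, 1] -- one group, the zero joins the run
```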
The .fillna(0) fills the NaN that .diff() produces for the first row. Zero is the right replacement value: if there was a change of signs from the first to the second row, grp would be 1 on row 2, correctly putting row 1 into its own group; if there was no change of signs, grp would be 0 on row 2, correctly grouping it with row 1.
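To see that first-row behaviour concretely, here is a tiny sketch with a sign change right after the first row (the series is made up for illustration):

```python
import pandas as pd

h = pd.Series([0.2, -0.3, -0.1])

# .diff() leaves a NaN on row 0; .fillna(0) assigns that row to group 0,
# and the sign change puts row 1 into group 1
grp = (h > 0).astype(int).diff().abs().cumsum().fillna(0)
print(grp.tolist())  # [0.0, 1.0, 1.0]
```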
You can now calculate minimum/maximum values per group using groupby with named aggregation (passing a dict of output names to a SeriesGroupBy.agg was deprecated and now raises an error):
minmax = df.groupby('grp', as_index=False)['Histogram'].agg(
    hist_min='min', hist_max='max')
df = df.merge(minmax, on='grp')
df.head(10)
Date Histogram grp hist_min hist_max
0 01/02/2021 0.2 0.0 0.2 0.7
1 02/02/2021 0.3 0.0 0.2 0.7
2 03/02/2021 0.7 0.0 0.2 0.7
3 04/02/2021 0.2 0.0 0.2 0.7
4 05/02/2021 -0.2 1.0 -0.5 -0.1
5 06/02/2021 -0.5 1.0 -0.5 -0.1
6 07/02/2021 -0.1 1.0 -0.5 -0.1
7 08/02/2021 0.4 2.0 0.3 0.4
8 09/02/2021 0.3 2.0 0.3 0.4
9 10/02/2021 -0.2 3.0 -0.2 -0.2
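As an aside, the same per-group minima/maxima can be broadcast back onto the original rows with groupby().transform, which skips the separate merge step. A sketch on a made-up frame with the grp column already computed:

```python
import pandas as pd

df = pd.DataFrame({
    'Histogram': [0.2, 0.3, 0.7, 0.2, -0.2, -0.5, -0.1],
    'grp':       [0,   0,   0,   0,   1,    1,    1],
})

# transform returns a Series aligned with df's index: one value per row,
# equal to that row's group aggregate
df['hist_min'] = df.groupby('grp')['Histogram'].transform('min')
df['hist_max'] = df.groupby('grp')['Histogram'].transform('max')
print(df['hist_max'].tolist())  # [0.7, 0.7, 0.7, 0.7, -0.1, -0.1, -0.1]
```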
Finally, you can assemble your desired values using boolean indexing:
df['minmax'] = df.hist_min
df.loc[df.Histogram > 0, 'minmax'] = df.hist_max[df.Histogram > 0]
df.head(10)
Date Histogram grp hist_min hist_max minmax
0 01/02/2021 0.2 0.0 0.2 0.7 0.7
1 02/02/2021 0.3 0.0 0.2 0.7 0.7
2 03/02/2021 0.7 0.0 0.2 0.7 0.7
3 04/02/2021 0.2 0.0 0.2 0.7 0.7
4 05/02/2021 -0.2 1.0 -0.5 -0.1 -0.5
5 06/02/2021 -0.5 1.0 -0.5 -0.1 -0.5
6 07/02/2021 -0.1 1.0 -0.5 -0.1 -0.5
7 08/02/2021 0.4 2.0 0.3 0.4 0.4
8 09/02/2021 0.3 2.0 0.3 0.4 0.4
9 10/02/2021 -0.2 3.0 -0.2 -0.2 -0.2
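The same selection can also be written in one line with np.where, picking hist_max for positive rows and hist_min otherwise. A sketch, again on a made-up frame with the helper columns already in place:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Histogram': [0.2, -0.2, 0.4],
    'hist_min':  [0.2, -0.5, 0.3],
    'hist_max':  [0.7, -0.1, 0.4],
})

# np.where(cond, a, b) selects elementwise from a where cond holds, else b
df['minmax'] = np.where(df.Histogram > 0, df.hist_max, df.hist_min)
print(df['minmax'].tolist())  # [0.7, -0.5, 0.4]
```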
The entire process is vectorized as far as possible, so performance should be decent.