I have a DataFrame with a time series in a single column. The data looks like this chart:
I would like to create a mask that is True whenever the data is equal to or lower than -0.20. It should also be True before and after reaching -0.20, for as long as the data stays negative. This version of the chart is my manual attempt to show (in red) the values where the mask would be True.
I started creating the mask, but I could only make it True while the data is below -0.20: mask = (df['data'] < -0.2)
I couldn't do any better. Does anybody know how to achieve my goal?
One approach is to group the data into consecutive segments that lie entirely below zero, and then check, for each segment, whether it contains any value below -0.2.
See below for a full reproducible example script:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
np.random.seed(167)
df = pd.DataFrame(
{"y": np.cumsum([np.random.uniform(-0.01, 0.01) for _ in range(10 ** 5)])}
)
plt.plot(df)
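# True where the series is negative
below_zero = df["y"] < 0
# give every run of consecutive same-sign values its own integer label
regions = (below_zero != below_zero.shift()).cumsum()
# keep only the rows that belong to negative segments dipping below -0.2
df_interesting = df.groupby(regions).filter(lambda s: s["y"].min() < -0.2)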
# plot individual regions
for _, grp in df.groupby(regions):
    if grp["y"].min() < -0.2:
        plt.plot(grp, color="tab:red", linewidth=5, alpha=0.6)
plt.axhline(0, linestyle="--", color="tab:gray")
plt.axhline(-0.2, linestyle="--", color="tab:gray")
plt.show()
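The script above filters and plots the rows, but the question asked for a boolean mask. If you want the mask itself, one option is to broadcast each segment's minimum back onto its rows with groupby(...).transform("min") and compare that to the threshold. Below is a minimal sketch of that idea on a small hand-made series; the values are made up for illustration, and it uses <= because the question says "equal or lower":

```python
import pandas as pd

# Hypothetical toy data with two negative runs;
# only the first run reaches -0.2 or lower.
df = pd.DataFrame({"y": [0.1, -0.1, -0.25, -0.05, 0.2, -0.1, -0.15, 0.3]})

below_zero = df["y"] < 0
# Give every run of consecutive same-sign values its own integer label.
regions = (below_zero != below_zero.shift()).cumsum()

# Each row gets the minimum of the run it belongs to; runs that never
# go negative have a non-negative minimum, so they can never pass the test.
mask = df["y"].groupby(regions).transform("min") <= -0.2

print(mask.tolist())
# [False, True, True, True, False, False, False, False]
```

The mask has the same index as df, so df[mask] should select the same kind of rows as the filter call in the script (apart from the < versus <= difference at exactly -0.2).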