I have a pandas DataFrame with a column named "Outside Dead Band". The data is recorded at 32 Hz (32 data points per second).
I want to apply the following algorithm.
Find each run of consecutive non-NaN values (the values between NaNs) and measure its duration.
If the duration is more than 2 seconds:
- take the max if the values between the NaNs are positive, and append it to a list named maneuvers;
- take the min if the values between the NaNs are negative, and append it to a list named maneuvers.
If the duration is less than 2 seconds:
- take the max if the values between the NaNs are positive, and append it to a list named gusts;
- take the min if the values between the NaNs are negative, and append it to a list named gusts.
Examples:
Example 1
Data Snippet
NaN
NaN
NaN
NaN
0.935829
NaN
NaN
0.9468344
NaN
0.9352744
NaN
0.9299145
NaN
0.9159902
NaN
0.9189067
0.9447504
NaN
NaN
0.9488161
Expected Outputs
gusts = [0.935829, 0.9468344, 0.9352744, 0.9299145, 0.9159902, 0.9447504, 0.9488161]
Example 2
Data Snippet
NaN
NaN
1.066175
1.108567
1.103931
1.098653
1.094846
1.062542
1.053064
NaN
NaN
0.9460738
0.931207
0.9161806
0.9083371
0.9201323
0.9272887
0.9176005
0.9021356
0.9303108
0.9178913
0.8911541
0.8558757
0.8634101
0.828901
0.8187609
0.8117134
0.8005729
0.7740957
0.7548033
0.7564046
0.7697771
0.7818314
0.7997488
0.8270378
0.8616151
0.8802456
0.9116527
0.9257826
0.9388146
0.945994
0.9453149
0.9454532
0.9426287
0.928901
0.9325082
0.9312031
0.9289232
0.916741
0.9420649
0.9212928
0.922505
0.9238197
0.9236084
0.8717794
0.8492894
0.8158376
0.7905051
0.7699976
0.747136
0.7314162
0.7468339
0.7403114
0.7393804
0.7492437
0.7990298
0.818364
0.8724768
0.947295
0.9460738
0.931207
0.9161806
0.9083371
0.9201323
0.9272887
0.9176005
0.9021356
0.9303108
0.9178913
0.8911541
0.8558757
NaN
NaN
NaN
1.055898
NaN
Expected Outputs
gusts = [1.108567, 1.055898]
maneuvers = [0.947295]
Example 3
Data Snippet
NaN
NaN
-1.066175
-1.108567
-1.103931
-1.098653
-1.094846
-1.062542
-1.053064
NaN
NaN
-0.9460738
-0.931207
-0.9161806
-0.9083371
-0.9201323
-0.9272887
-0.9176005
-0.9021356
-0.9303108
-0.9178913
-0.8911541
-0.8558757
-0.8634101
-0.828901
-0.8187609
-0.8117134
-0.8005729
-0.7740957
-0.7548033
-0.7564046
-0.7697771
-0.7818314
-0.7997488
-0.8270378
-0.8616151
-0.8802456
-0.9116527
-0.9257826
-0.9388146
-0.945994
-0.9453149
-0.9454532
-0.9426287
-0.928901
-0.9325082
-0.9312031
-0.9289232
-0.916741
-0.9420649
-0.9212928
-0.922505
-0.9238197
-0.9236084
-0.8717794
-0.8492894
-0.8158376
-0.7905051
-0.7699976
-0.747136
-0.7314162
-0.7468339
-0.7403114
-0.7393804
-0.7492437
-0.7990298
-0.818364
-0.8724768
-0.947295
-0.9460738
-0.931207
-0.9161806
-0.9083371
-0.9201323
-0.9272887
-0.9176005
-0.9021356
-0.9303108
-0.9178913
-0.8911541
-0.8558757
NaN
NaN
NaN
-1.055898
NaN
Expected Outputs
gusts = [-1.108567, -1.055898]
maneuvers = [-0.947295]
I tried pulling the column out into a list and looping over it with a series of if/else statements, but my logic seems to be wrong. I would really appreciate some help doing this within the DataFrame itself, if possible.

    import numpy

    norm_accel = flight["Outside Dead Band"].tolist()
    gusts = []
    maneuvers = []
    i = 0
    while i < len(norm_accel):
        # NaN never compares equal to anything, so test with numpy.isnan
        if not numpy.isnan(norm_accel[i]):
            # walk j to the end of the current run of non-NaN values;
            # a single isolated value is just a run of length 1
            j = i
            counter = 0
            while j < len(norm_accel) and not numpy.isnan(norm_accel[j]):
                counter += 1
                j += 1
            # 64 samples = 2 seconds at 32 Hz
            if counter >= 64:
                maneuvers.append(max(norm_accel[i:j]))
            else:
                gusts.append(max(norm_accel[i:j]))
            i = j
        i = i + 1

I know that this doesn't account for the max/min condition; I am not sure how to incorporate that.
I would put this into a pandas DataFrame and use the occurrence of NaNs to create an "id" column, which you can then use to do a groupby and calculate the relevant statistics. Assuming "data" is a DataFrame with the values in a "val" column, it could look like:
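
    # each NaN increments the running total, so every run of values between NaNs gets its own id
    data["id"] = data["val"].isna().cumsum()
    # drop only the rows where "val" itself is NaN
    data = data.dropna(subset=["val"])
    grps = data.groupby("id").agg(
        counts=("val", "count"),
        min=("val", "min"),
        max=("val", "max"),
    )
    grps
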
Using your Example 2, this gives you:

        counts       min       max
    id
    2        7  1.053064  1.108567
    4       70  0.731416  0.947295
    7        1  1.055898  1.055898

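To see why the ids in the table skip numbers, here is a small sketch on a hypothetical toy column (the values are made up for the demo, not taken from the question). Each NaN bumps the cumulative sum by one, so all values sitting between two NaNs share the same id, and consecutive NaNs simply leave gaps in the numbering.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column: two runs of values separated by NaNs.
data = pd.DataFrame({"val": [np.nan, np.nan, 1.0, 1.1, np.nan, 0.9, np.nan]})

# isna() is True on NaN rows; cumsum() counts how many NaNs have been seen so far.
data["id"] = data["val"].isna().cumsum()
print(data)

# After dropping the NaN rows, each remaining run carries a single id.
ids = data.dropna(subset=["val"])["id"].tolist()
print(ids)  # [2, 2, 3]
```

The actual id values do not matter; they only need to differ from one run to the next so that groupby can tell the runs apart.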
You can then use simple rules to create your lists:

    import numpy as np
    # max for positive runs, min for negative runs; 64 samples = 2 seconds at 32 Hz
    grps["val"] = np.where(grps["max"] > 0, grps["max"], grps["min"])
    maneuvers = grps.loc[grps.counts >= 64, "val"].tolist()
    gusts = grps.loc[grps.counts < 64, "val"].tolist()

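If it helps, the steps above can be wrapped into one function. This is only a sketch under a few assumptions: the function name split_gusts_maneuvers and the constants are hypothetical, a run of exactly 64 samples is treated as a maneuver (the question does not say which side the 2-second boundary falls on), and a run containing any positive value is treated as positive.

```python
import numpy as np
import pandas as pd

SAMPLE_RATE_HZ = 32                         # from the question
MIN_MANEUVER_SAMPLES = 2 * SAMPLE_RATE_HZ   # 2 seconds -> 64 samples


def split_gusts_maneuvers(values, min_samples=MIN_MANEUVER_SAMPLES):
    """Return (gusts, maneuvers) for a sequence of NaN-separated runs."""
    s = pd.Series(values, dtype="float64")
    run_id = s.isna().cumsum()   # every NaN starts a new id
    valid = s.notna()
    grps = s[valid].groupby(run_id[valid]).agg(["count", "min", "max"])
    # Assumption: a run with a positive max counts as positive and
    # contributes its max; otherwise it contributes its min.
    peak = np.where(grps["max"] > 0, grps["max"], grps["min"])
    is_maneuver = (grps["count"] >= min_samples).to_numpy()
    return peak[~is_maneuver].tolist(), peak[is_maneuver].tolist()


# Made-up data: a short positive run, a 70-sample negative run, one lone value.
nan = np.nan
long_neg = [-0.5] * 70
long_neg[10] = -0.9
values = [nan, 1.0, 1.2, 1.1, nan] + long_neg + [nan, 1.05, nan]

gusts, maneuvers = split_gusts_maneuvers(values)
print(gusts)      # [1.2, 1.05]
print(maneuvers)  # [-0.9]
```

With your DataFrame, the call would presumably be split_gusts_maneuvers(flight["Outside Dead Band"]).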