This is a continuation of a previous question I asked (Detecting specific character in Dataframe), but let me explain it again.
I have a pandas DataFrame (Python) that looks like this:
| | time_s | wow | lat_deg | lon_deg |
---|---|---|---|---|
0 | 0.0 | 0.0 | 35.042628 | -89.978249 |
1 | 2.0 | 0.0 | 35.042628 | -89.978249 |
2 | 4.0 | 0.0 | 35.042628 | -89.978249 |
3 | 6.0 | 0.0 | 35.042628 | -89.978249 |
4 | 8.0 | 1 | 35.042628 | -89.978249 |
5 | 10.0 | 0.0 | 35.042628 | -89.978249 |
6 | 12.0 | 0.0 | 35.042628 | -89.978249 |
7 | 14.0 | 0.0 | 35.042628 | -89.978249 |
8 | 16.0 | 1 | 35.042628 | -89.978249 |
9 | 18.0 | 1 | 35.042628 | -89.978249 |
10 | 20.0 | 0.0 | 35.042628 | -89.978249 |
11 | 22.0 | 0.0 | 35.042628 | -89.978249 |
... | ... | ... | ... | ... |
The `wow` column is defined to hold only the values 0 and 1. Unfortunately, my data has some noise, which makes the total number of entries (rows) larger than it should be: there should be 500, but because of the noise 507 are detected.
Therefore, I want to remove the noise before processing the data.
The original data looks like this:
(...,0,0,0,0,1,1,1,0,1,1,1,1,0,0,0,0,...)
I need to trim the data by deleting each 0 that sits between 1s (1,0,1), so that it becomes:
(...,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,...)
How can I do that?
Assuming you only need to get rid of a 0 that is immediately enclosed by a pair of 1s (i.e. only the 1 0 1 pattern is considered), you can build a boolean mask and use it to filter the rows (plain boolean indexing here; `df.loc[~m]` is equivalent), as follows:
m = (df['wow'] == 0.0) & (df['wow'].shift(1) == 1.0) & (df['wow'].shift(-1) == 1.0)
df[~m]
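To see why the mask picks out only the isolated zeros, here is a minimal sketch on a small hypothetical Series `s` (toy values, not your data) that puts each value next to its previous and next neighbor. `shift(1)` moves the values down one row, so each row sees the row above it; `shift(-1)` does the opposite. The first and last rows get NaN for the missing neighbor, so they can never match.

```python
import pandas as pd

# Hypothetical toy values, only for illustration
s = pd.Series([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])

neighbors = pd.DataFrame({
    "prev": s.shift(1),    # value from the row above (NaN on the first row)
    "cur": s,
    "next": s.shift(-1),   # value from the row below (NaN on the last row)
})

# True only where a 0 sits directly between two 1s
mask = (neighbors["cur"] == 0.0) & (neighbors["prev"] == 1.0) & (neighbors["next"] == 1.0)
print(neighbors.assign(match=mask))
```

Only the row at position 2 is flagged here; the trailing 0 has a 1 above it but nothing below, so it is kept.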
As your sample data doesn't have this scenario, I have modified the data a bit:
print(df)
time_s wow lat_deg lon_deg
0 0.0 0.0 35.042628 -89.978249
1 2.0 0.0 35.042628 -89.978249
2 4.0 0.0 35.042628 -89.978249
3 6.0 0.0 35.042628 -89.978249
4 8.0 1.0 35.042628 -89.978249
5 10.0 0.0 35.042628 -89.978249 <== Matching entry to get rid of
6 12.0 1.0 35.042628 -89.978249
7 14.0 0.0 35.042628 -89.978249 <== Matching entry to get rid of
8 16.0 1.0 35.042628 -89.978249
9 18.0 1.0 35.042628 -89.978249
10 20.0 0.0 35.042628 -89.978249
11 22.0 0.0 35.042628 -89.978249
m = (df['wow'] == 0.0) & (df['wow'].shift(1) == 1.0) & (df['wow'].shift(-1) == 1.0)
df[~m]
time_s wow lat_deg lon_deg
0 0.0 0.0 35.042628 -89.978249
1 2.0 0.0 35.042628 -89.978249
2 4.0 0.0 35.042628 -89.978249
3 6.0 0.0 35.042628 -89.978249
4 8.0 1.0 35.042628 -89.978249
6 12.0 1.0 35.042628 -89.978249
8 16.0 1.0 35.042628 -89.978249
9 18.0 1.0 35.042628 -89.978249
10 20.0 0.0 35.042628 -89.978249
11 22.0 0.0 35.042628 -89.978249
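If you want something you can copy and run, the sketch below rebuilds the modified sample from the `time_s` and `wow` values shown above (the constant `lat_deg`/`lon_deg` columns are left out for brevity) and applies the same mask. The names `cleaned` and `renumbered` and the `reset_index(drop=True)` step are my own additions, in case you want a contiguous index after the rows are dropped.

```python
import pandas as pd

# Modified sample from above; lat_deg/lon_deg omitted since they are constant
df = pd.DataFrame({
    "time_s": [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0],
    "wow":    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0],
})

# A 0 whose previous and next rows are both 1
m = (df["wow"] == 0.0) & (df["wow"].shift(1) == 1.0) & (df["wow"].shift(-1) == 1.0)

cleaned = df[~m]                             # keeps the original index labels
renumbered = cleaned.reset_index(drop=True)  # optional: renumber rows from 0

print(m.sum(), "rows dropped")
print(renumbered)
```

Keep in mind that this handles only a single 0 between two 1s; a run of two or more 0s inside a block of 1s is left untouched.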