I am working on a file with many machinas. Each machina works during a cycle (which goes from -1 to -1) and I have some operation parameters:
1: High power operation 0.5: Low power operation 0: No operation -1: End of operation
I have a list of my machinas consisting of these kind of dataframe (one per machina - this one is a really small sample) :
*Indx* *N°1 Operation* *N°1 Operation length*
1 1 450
1 0.5 84
1 0 48
1 1 4
1 0.5 4
1 1 123
1 0.5 14
1 -1 45
2 1 471
2 0 47
2 0.5 44
2 0 145
2 0.5 78
2 1 71
2 0.5 19
2 0 2
2 -1 45
I would like to get for each group the last row with 1 value (high power operation). My purpose is then to sum the lengths from this last High operation value to end of cycle.
Desired output :
*Indx* *N°1 Operation length*
1 123+14
2 71+19+2
How can I get to that ?
Using dplyr
one way is to filter
the End of operation rows from the data, group_by
Indx
and sum
the Operation2
values which occur between last occurrence when Operation1
is 1 till the last row. We find the last occurrence of 1 using cumsum
value.
library(dplyr)
df %>%
filter(Operation1 != -1) %>%
group_by(Indx) %>%
summarise(Oplength = sum(Operation2[cumsum(Operation1 == 1) ==
max(cumsum(Operation1 == 1))]))
# A tibble: 2 x 2
# Indx Oplength
# <int> <int>
#1 1 137
#2 2 92
Or another way to find last occurrence is using which
and max
df %>%
filter(Operation1 != -1) %>%
group_by(Indx) %>%
summarise(Oplength = sum(Operation2[max(which(Operation1 == 1)) : n()]))
data
df <- structure(list(Indx = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L), Operation1 = c(1, 0.5, 0, 1, 0.5,
1, 0.5, -1, 1, 0, 0.5, 0, 0.5, 1, 0.5, 0, -1), Operation2 = c(450L,
84L, 48L, 4L, 4L, 123L, 14L, 45L, 471L, 47L, 44L, 145L, 78L,
71L, 19L, 2L, 45L)), class = "data.frame", row.names = c(NA, -17L))