I have a data frame looks like :
Date | v1 | v2 |
---|---|---|
2000/8/10 | S | 2600 |
2000/8/10 | S | 500 |
2000/8/10 | S | 3500 |
2001/11/12 | B | 2000 |
2001/11/22 | B | 1000 |
2001/11/22 | S | 1000 |
2001/12/06 | B | 1000 |
2001/12/06 | B | 1500 |
2001/12/06 | B | 1000 |
2001/12/07 | S | 2000 |
2001/12/07 | B | 4000 |
However, I'd like to extract the rows which has "B" and "S" in the same date It looks as below :
Date | v1 | v2 |
---|---|---|
2001/11/22 | B | 1000 |
2001/11/22 | S | 1000 |
2001/12/07 | S | 2000 |
2001/12/07 | B | 4000 |
Does anyone have ideas? That would be grateful ! Cheers
You can select the Date
which has both 'S'
and 'B'
in them.
library(dplyr)
df %>%
group_by(Date) %>%
filter(all(c('S', 'B') %in% v1)) %>%
ungroup
# Date v1 v2
# <chr> <chr> <int>
#1 2001/11/22 B 1000
#2 2001/11/22 S 1000
#3 2001/12/07 S 2000
#4 2001/12/07 B 4000
This logic can also be implemented in base R and data.table
:
#Base R
subset(df, as.logical(ave(v1, Date, FUN = function(x) all(c('S', 'B') %in% x))))
#data.table
library(data.table)
setDT(df)[, .SD[all(c('S', 'B') %in% v1)], Date]
data
df <- structure(list(Date = c("2000/8/10", "2000/8/10", "2000/8/10",
"2001/11/12", "2001/11/22", "2001/11/22", "2001/12/06", "2001/12/06",
"2001/12/06", "2001/12/07", "2001/12/07"), v1 = c("S", "S", "S",
"B", "B", "S", "B", "B", "B", "S", "B"), v2 = c(2600L, 500L,
3500L, 2000L, 1000L, 1000L, 1000L, 1500L, 1000L, 2000L, 4000L
)), row.names = c(NA, -11L), class = "data.frame")