How can I subset the following dataset based on the fact that Numeric_variable
includes 2 for some units (e.g. A,C), and 3 for some other units (e.g. B,D)? I would like to keep all units that have 1,1,1,1,1,1 (e.g. E,F) in each group. Data provided below.
Unit | Numeric_variable
A | 1
A | 1
A | 2
A | 2
A | 2
A | 2
C | 1
C | 1
C | 2
C | 2
C | 2
C | 2
B | 1
B | 1
B | 3
B | 3
B | 3
B | 3
D | 1
D | 1
D | 3
D | 3
D | 3
D | 3
E | 1
E | 1
E | 1
E | 1
E | 1
E | 1
F | 1
F | 1
F | 1
F | 1
F | 1
F | 1
Into:
df1:
Unit | Numeric_variable
A | 1
A | 1
A | 2
A | 2
A | 2
A | 2
C | 1
C | 1
C | 2
C | 2
C | 2
C | 2
E | 1
E | 1
E | 1
E | 1
E | 1
E | 1
F | 1
F | 1
F | 1
F | 1
F | 1
F | 1
df2:
Unit | Numeric_variable
B | 1
B | 1
B | 3
B | 3
B | 3
B | 3
D | 1
D | 1
D | 3
D | 3
D | 3
D | 3
E | 1
E | 1
E | 1
E | 1
E | 1
E | 1
F | 1
F | 1
F | 1
F | 1
F | 1
F | 1
data
Unit <- c("A", "A", "A", "A", "A", "A", "C", "C",
"C", "C", "C", "C", "B", "B", "B", "B", "B", "B", "D", "D", "D",
"D", "D", "D","E", "E", "E", "E", "E", "E","F", "F", "F", "F", "F", "F")
Numeric_variable <- c(1, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 1, 1, 3, 3, 3, 3L, 1, 1, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
df <- data.frame(Unit, Numeric_variable)
We could this almost with the same code. Just and | is.na(x)
to filter
:
library(dplyr)
library(tidyr)
df_all <- df %>%
mutate(x = case_when(Numeric_variable==2 ~ 2,
Numeric_variable==3 ~ 3,
TRUE ~ NA_real_)) %>%
fill(x, .direction = "up")
df_2 <- df_all %>%
filter(x == 2 | is.na(x)) %>%
select(-x)
df_3 <- df_all %>%
filter(x == 3 | is.na(x)) %>%
select(-x)
df_2
Unit Numeric_variable
1 A 1
2 A 1
3 A 2
4 A 2
5 A 2
6 A 2
7 C 1
8 C 1
9 C 2
10 C 2
11 C 2
12 C 2
13 E 1
14 E 1
15 E 1
16 E 1
17 E 1
18 E 1
19 F 1
20 F 1
21 F 1
22 F 1
23 F 1
24 F 1
> df_3
Unit Numeric_variable
1 B 1
2 B 1
3 B 3
4 B 3
5 B 3
6 B 3
7 D 1
8 D 1
9 D 3
10 D 3
11 D 3
12 D 3
13 E 1
14 E 1
15 E 1
16 E 1
17 E 1
18 E 1
19 F 1
20 F 1
21 F 1
22 F 1
23 F 1
24 F 1