I need to filter out (drop) rows with certain index values, i.e. c("b-2022", "e-2022"), from the following example pdata_frame.
data_frame = data.frame(
code = c("b","b","d","e","d") ,
year = c(2021, 2022, 2021, 2022, 2022),
values = c(0,2,1,4,5)
)
library(plm)
pdata_frame <- pdata.frame(data_frame, index = c("code","year"), drop.index = FALSE)
# code year values
# b-2021 b 2021 0
# b-2022 b 2022 2
# d-2021 d 2021 1
# d-2022 d 2022 5
# e-2022 e 2022 4
Currently I use a rather cumbersome approach: I hand-code the conditions on the columns, without using the index at all.
pdata_frame[-which(
(pdata_frame$code == "b" & pdata_frame$year==2022) |
(pdata_frame$code == "e" & pdata_frame$year==2022)), ]
Is there a way to make use of the index for more succinct filtering, something like pdata_frame[-c(2, 5), ]?
You can add a new key column, without using the plm package, and filter on that column.
This code uses base R only:
d <- c("b-2022", "e-2022")
data_frame <- within(data_frame,name <- paste0(code, "-", year))
subset(data_frame, subset = !name %in% d, select = -c(name))
EDIT:
The same filter can be written as a single line, without the helper column:
d <- c("b-2022", "e-2022")
subset(data_frame, subset = ! paste0(code, "-", year) %in% d)
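If you would rather stay with the pdata.frame and use its index, note that the printout in the question shows row names of the form code-year. A minimal sketch that filters on those row names directly could look like the following. It assumes plm's default row names as printed in the question; `filtered` is just an illustrative name.

```r
library(plm)

data_frame <- data.frame(
  code   = c("b", "b", "d", "e", "d"),
  year   = c(2021, 2022, 2021, 2022, 2022),
  values = c(0, 2, 1, 4, 5)
)
pdata_frame <- pdata.frame(data_frame, index = c("code", "year"), drop.index = FALSE)

# Row names to drop; assumes the "code-year" row names shown in the question
d <- c("b-2022", "e-2022")

# Keep every row whose row name is not listed in d
filtered <- pdata_frame[!rownames(pdata_frame) %in% d, ]
```

Compared with -which(...), the logical negation also behaves sensibly when nothing matches: which() then returns an empty vector, and indexing with a negated empty vector drops every row instead of none.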