Tags: r, dataframe, filter, plm

How to use the index to filter rows of a plm pdata.frame in R?


I need to drop rows with certain index values, e.g. c("b-2022", "e-2022"), from the following example pdata_frame.

data_frame <- data.frame(
  code = c("b", "b", "d", "e", "d"),
  year = c(2021, 2022, 2021, 2022, 2022),
  values = c(0, 2, 1, 4, 5)
)

library(plm)
pdata_frame <- pdata.frame(data_frame, index = c("code", "year"), drop.index = FALSE)

#        code year values
# b-2021    b 2021      0
# b-2022    b 2022      2
# d-2021    d 2021      1
# d-2022    d 2022      5
# e-2022    e 2022      4

Currently I spell out the conditions by hand, which is cumbersome and does not use the index at all.

pdata_frame[-which(
  (pdata_frame$code == "b" & pdata_frame$year==2022) |
  (pdata_frame$code == "e" & pdata_frame$year==2022)), ]

Is there a way to use the index for more succinct filtering, something like pdata_frame[-c(2, 5), ]?


Solution

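  • You can build the key as a new column, without using the plm package, and filter on that column.

    This code uses base R only and works on the plain data_frame; pass the result to pdata.frame() afterwards if you need a panel object: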

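    # Index values to drop, in "code-year" form
    d <- c("b-2022", "e-2022")
    # Build a temporary key column from code and year
    data_frame <- within(data_frame, name <- paste0(code, "-", year))
    # Keep rows whose key is not in d, then drop the helper column
    subset(data_frame, subset = !name %in% d, select = -c(name))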
    

    EDIT:

    The helper column is not needed, so this reduces to a single expression:

    d <- c("b-2022", "e-2022")
    subset(data_frame, subset = !paste0(code, "-", year) %in% d)
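
Since the row names printed in the question already have the form code-year, another option is to filter on rownames() directly instead of pasting the key together again. The following is a minimal sketch of that idea in base R. It assumes the row names follow the "code-year" pattern shown in the question, and it builds a plain data frame with equivalent row names (the names panel_like, drop_keys and filtered are made up for illustration) so that it runs without plm installed.

```r
# Rebuild the question's data in base R (no plm needed for this sketch).
data_frame <- data.frame(
  code   = c("b", "b", "d", "e", "d"),
  year   = c(2021, 2022, 2021, 2022, 2022),
  values = c(0, 2, 1, 4, 5)
)

# Sort by code and year and assign "code-year" row names, mimicking the
# row names shown for pdata_frame in the question (an assumption here).
panel_like <- data_frame[order(data_frame$code, data_frame$year), ]
rownames(panel_like) <- paste(panel_like$code, panel_like$year, sep = "-")

# Keys to drop, written exactly as they appear in the row names.
drop_keys <- c("b-2022", "e-2022")

# Keep every row whose row name is not among the keys.
filtered <- panel_like[!rownames(panel_like) %in% drop_keys, ]
print(filtered)
```

The same expression, pdata_frame[!rownames(pdata_frame) %in% drop_keys, ], should carry over to the pdata.frame itself, provided its row names look like the ones printed in the question. Logical negation is also safer than -which(...): if none of the keys match, -which(...) indexes with an empty vector and returns zero rows, whereas the logical form keeps all rows.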