Search code examples
rpivotvlookup

Needing to find out if something exists in one time period but not another in R?


Below is the sample data. A bit clueless on how to attack this. The goal is simply to find the acct-run combinations with naics 454 in 2021 q4 that are present in both 2021 q4 and 2022 q1. Should I do I pivot wider and make the values into columns and from there, filter out those that are NA for the 2022 q1 set?

 year <- c(2021,2021,2021,2021,2021,2021,2021,2022,2022,2022,2022,2022,2022,2022)
 qtr <- c(4,4,4,4,4,4,4,1,1,1,1,1,1,1)
 acct <- c("001234","001245","001256","001267","001278","001314","001314","001234","001245","001256","001200","001201","001314","001314")
 run <- c(1,1,1,1,1,1,2,1,1,1,1,1,1,2)
 naics <- c("454","454","454","451","451","451","451","441","442","441","451","451","451","451")
 employment <- c(120,130,110,500,501,502,503,105,125,109,498,499,500,506)


 test <- data.frame (year,qtr,acct,run,naics,employment)

 Desired result

   acct     run     employment      year    qtr
  001234     1        120          2021     4
  001245     1        130          2021     4
  001256     1        110          2021     4
  001314     1        502          2021     4
  001314     2        503          2021     4
  001234     1        105          2022     1
  001245     1        125          2022     1
  001256     1        109          2022     1
  001314     1        500          2021     4
  001314     2        506          2021     4

Solution

  • Using a grouped filter in dplyr:

    library(dplyr)
    
    test %>% 
      group_by(acct, run) %>% 
      filter(
        any(year == 2021 & qtr == 4 & naics == 454), 
        any(year == 2022 & qtr == 1)
      ) %>% 
      ungroup()
    
    # A tibble: 6 × 6
       year   qtr acct     run naics employment
      <dbl> <dbl> <chr>  <dbl> <chr>      <dbl>
    1  2021     4 001234     1 454          120
    2  2021     4 001245     1 454          130
    3  2021     4 001256     1 454          110
    4  2022     1 001234     1 441          105
    5  2022     1 001245     1 442          125
    6  2022     1 001256     1 441          109