Tags: r, if-statement, join, dplyr, conditional-statements

In R, what's the most efficient way to check whether an object meets a criterion and, if it doesn't, modify it?


I have many cuts of my data that I eventually join together into one large dataset. However, sometimes one of the objects is an error message instead of a data frame because that cut didn't have a large enough sample, and this causes the join code to fail.

Before I do my full_joins, I want a simple way to say "If the length of any of these objects is 1, then give that object, and only that object, this set structure filled with NAs". Is there a simple way to do that other than an if statement for each object? Or, alternatively, is there a way for R to skip over the problematic objects when one is an error message (without relying on any specific characters in the message)? I've used try(), but that doesn't always work, and it sometimes stops before reaching the other joins.

#Here's an example of my data
library(dplyr)
object_1 <- tibble(name = c("Justin", "Corey"), month = c("Jan", "Jan"), score = c(1, 2))

object_2 <- tibble(name = c("Justin", "Corey"), month = c("Feb", "Feb"), score = c(100, 200))

object_3 <- "error message!"

object_4 <- tibble(name = c("Justin", "Corey"), month = c("Apr", "Apr"), score = c(95, 23))

object_5 <- "Another error!!"

#Here's me trying to join them, but it isn't working because of the errors
all_the_objects <- object_1 %>%
  full_join(object_2) %>%
  full_join(object_3) %>%
  full_join(object_4) %>%
  full_join(object_5) 

#Here's a solution that works, but doesn't seem very elegant:

#Each object is checked on its own; chaining these with else if would stop at the
#first match (object_3) and leave object_5 as an error string
if(length(object_1) == 1) {
  object_1 <- tibble(name = NA_character_, month = NA_character_, score = NA_real_)
}
if(length(object_2) == 1) {
  object_2 <- tibble(name = NA_character_, month = NA_character_, score = NA_real_)
}
if(length(object_3) == 1) {
  object_3 <- tibble(name = NA_character_, month = NA_character_, score = NA_real_)
}
if(length(object_4) == 1) {
  object_4 <- tibble(name = NA_character_, month = NA_character_, score = NA_real_)
}
if(length(object_5) == 1) {
  object_5 <- tibble(name = NA_character_, month = NA_character_, score = NA_real_)
}

#Now it'll work
all_the_objects <- object_1 %>%
  full_join(object_2) %>%
  full_join(object_3) %>%
  full_join(object_4) %>%
  full_join(object_5)

Solution

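  • We can collect the objects in a list, replace every element that is a plain vector (such as an error string) with an NA placeholder tibble in a single map() call, and then join the results with reduce: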

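    library(dplyr)
    library(purrr)
    # Gather object_1, object_2, ... from the current environment into a named list
    map(mget(ls(pattern = '^object_\\d+$')),
        # Swap any plain vector (e.g. an error string) for a one-row NA tibble
        ~ if(is.vector(.x)) tibble(name = NA_character_, month = NA_character_,
                                   score = NA_real_) else .x) %>%
      # Join the list elements pairwise, from left to right
      reduce(full_join)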
    

    -output

    # A tibble: 7 × 3
      name   month score
      <chr>  <chr> <dbl>
    1 Justin Jan       1
    2 Corey  Jan       2
    3 Justin Feb     100
    4 Corey  Feb     200
    5 <NA>   <NA>     NA
    6 Justin Apr      95
    7 Corey  Apr      23
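
A possible variation, sketched under a few assumptions: the cuts can be put into an explicit list (here called `cuts`, a made-up name) instead of being looked up with `mget()`, and failed cuts may simply be dropped rather than kept as an NA row. Testing with `is.data.frame()` instead of `is.vector()` may be more robust, because an object returned by `try()` carries a class attribute and so `is.vector()` returns FALSE for it. The second half addresses the "skip over" part of the question with `purrr::possibly()`; `make_cut()` and `safe_cut()` are hypothetical stand-ins for whatever code actually builds each cut.

```r
library(dplyr)
library(purrr)

# Example inputs from the question, collected in an explicit list
# (`cuts` is a hypothetical name, not from the original post)
cuts <- list(
  tibble(name = c("Justin", "Corey"), month = c("Jan", "Jan"), score = c(1, 2)),
  tibble(name = c("Justin", "Corey"), month = c("Feb", "Feb"), score = c(100, 200)),
  "error message!",
  tibble(name = c("Justin", "Corey"), month = c("Apr", "Apr"), score = c(95, 23)),
  "Another error!!"
)

join_cols <- c("name", "month", "score")

# Variant 1: keep only the elements that are data frames, then join them
joined_skip <- cuts %>%
  keep(is.data.frame) %>%
  reduce(full_join, by = join_cols)

# Variant 2: guard the step that builds each cut.
# make_cut() is a hypothetical stand-in that fails for one month.
make_cut <- function(m) {
  if (m == "Mar") stop("not enough sample size")
  tibble(name = c("Justin", "Corey"), month = m, score = c(1, 2))
}

# possibly() returns NULL instead of raising an error
safe_cut <- possibly(make_cut, otherwise = NULL)

joined_safe <- c("Jan", "Feb", "Mar") %>%
  map(safe_cut) %>%
  compact() %>%                      # drop the NULLs left by failed cuts
  reduce(full_join, by = join_cols)
```

Passing `by` explicitly also avoids the "Joining with by = ..." message on every step. If the NA placeholder row is wanted, the `map()` call from the answer above can be used in place of `keep()`.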