R nested functions


I have to calculate the number of missing values per observation in a data set. As there are several variables across multiple time periods, I thought it best to try a function to keep my syntax clean. The first part of looking up the number of missing values works fine:

data$NMISS <- data %>% 
  select('x1':'x4') %>%  
  apply(1, function(x) sum(is.na(x)))

But when I try to turn it into a function I get "Error in select(): ! NA/NaN argument"

library(dplyr)
library(tidyverse)

data <- data.frame(x1 = c(NA, 1, 5, 1),   
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, NA, 4, 9),
                   x4 = c(3, 4, 1, 2))

NMISSfunc <- function (dataFrame,variables) {
  
  dataFrame %>% select(variables) %>% 
    apply(1, function(x) sum(is.na(x)))
  
}

data$NMISS2 <- NMISSfunc(data,'x1':'x4')

I think it doesn't like the : in the range, as it will accept c('x1','x2','x3','x4') but not 'x1':'x4'.

Some of the ranges are over twenty columns so listing them doesn't really provide a solution to keep the syntax neat.

Any suggestions?


Solution

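  • You are right that you can't pass "x1":"x4" into the function. When the argument is evaluated as ordinary R code, : tries to build a numeric sequence, and the strings can't be converted to numbers, hence the NA/NaN error. It only works when written directly inside select() because select() interprets the expression itself. To get this to work tidyverse-style, your variables argument needs to be forwarded to select() unevaluated. Fortunately, the tidyverse has the curly-curly notation {{variables}} for handling exactly this situation: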

    NMISSfunc <- function (dataFrame, variables) {
      
      dataFrame %>% 
        select({{variables}}) %>% 
        apply(1, function(x) sum(is.na(x)))
    }
    

    Now we can use x1:x4 (without quotes) and the function works as expected:

    NMISSfunc(data, x1:x4)
    #> [1] 1 1 0 0
    

    Created on 2022-12-13 with reprex v2.0.2
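
As a possible follow-up to the curly-curly version above, here is a minimal sketch of a variant that counts the missing values with is.na() and rowSums() rather than a row-wise apply(). The name NMISSfunc2 and the cols vector are made up for illustration and are not part of the original answer. It also shows that column names held as strings can still be supplied by wrapping them in all_of(), which may help when the ranges are built programmatically.

```r
library(dplyr)

data <- data.frame(x1 = c(NA, 1, 5, 1),
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, NA, 4, 9),
                   x4 = c(3, 4, 1, 2))

# Hypothetical variant of NMISSfunc: the same {{ }} forwarding,
# but the per-row count comes from rowSums() on the is.na() matrix.
NMISSfunc2 <- function(dataFrame, variables) {
  dataFrame %>%
    select({{ variables }}) %>%
    is.na() %>%
    rowSums()
}

# Unquoted range, as in the answer above
NMISSfunc2(data, x1:x4)

# Column names stored as strings (assumed example vector)
cols <- c("x1", "x2", "x3", "x4")
NMISSfunc2(data, all_of(cols))
```

Both calls should give the same per-row counts as NMISSfunc(data, x1:x4). Whether rowSums() is preferable to apply() here is a judgment call; it mainly avoids calling an R function once per row.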