
Dynamic variable names in svydesign from survey package


I want to add columns to a survey.design object created with the survey package, which can be done as follows:

library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
dclus2 <- transform(dclus1, 
                    api00_b = api00 + 1)

svymean(~ api00, design = dclus2)
#>         mean     SE
#> api00 644.17 23.542
svymean(~ api00_b, design = dclus2)
#>           mean     SE
#> api00_b 645.17 23.542

For a more complex task, I need to create these variable names dynamically from external vectors. The following produces an error, but it illustrates what I want to achieve:

vars <- c("api00_a", "api00_b")
dclus2 <- transform(dclus1, 
                    vars[[2]] = api00 + 1)

How could dynamic names for the new columns be implemented?


Solution

  • R does not accept an expression such as vars[[2]] as an argument name on the left-hand side of the equal sign in a function call, so this cannot work as written. You don't have to use transform, which calls survey:::update.survey.design, though. You can add the new variable directly:

    dclus2 <- dclus1
    dclus2$variables[, vars[[1]]] <- dclus2$variables[, "api00"] + 1
    

    This is equivalent to creating the new variable in the data frame before converting it to a survey.design object, as long as you do not need any survey functions to create the new variable. Following Anthony's comment:

    apiclus2 <- apiclus1
    apiclus2[ , vars[[1]]] <- apiclus2[ , "api00" ] + 1
    dclus_prep_2 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus2, fpc = ~fpc)
    

    You might prefer srvyr, which supports this kind of programming through dplyr's !! and := operators:

    library(srvyr)
    dclus_srvyr_1 <- as_survey_design(.data = apiclus1, 
                                    ids = dnum, 
                                    weights = pw, 
                                    fpc = fpc)
    dclus_srvyr_2 <- mutate(dclus_srvyr_1, 
                        !!vars[[1]] := api00 + 1)
    

    All three versions give the same result:

    lapply(list(dclus2, dclus_prep_2, dclus_srvyr_2), 
      function(design) svymean(~api00_a, design=design))
    [[1]]
              mean     SE
    api00_a 645.17 23.542
    
    [[2]]
              mean     SE
    api00_a 645.17 23.542
    
    [[3]]
              mean     SE
    api00_a 645.17 23.542
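
Since the question mentions building several names from external vectors, the direct-assignment approach could be extended to a loop. The following is only a minimal sketch: the `offsets` vector and the `dclus_loop` name are hypothetical and were made up for illustration, and it assumes each new column is a simple transformation of `api00`.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Hypothetical lookup (not from the original post):
# new column name -> value added to api00
offsets <- c(api00_a = 1, api00_b = 2)

dclus_loop <- dclus1
for (nm in names(offsets)) {
  # [[<- with a character string creates (or overwrites) the column named by nm
  dclus_loop$variables[[nm]] <- dclus_loop$variables[["api00"]] + offsets[[nm]]
}

# Build the formula from the same names rather than typing them out
svymean(reformulate(names(offsets)), design = dclus_loop)
```

`reformulate()` turns the character vector into a one-sided formula (`~api00_a + api00_b`), so both the column names and the formula come from the same vector.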