Is there a way to create a tbl_strata using a svydesign object and tbl_svysummary?


I'm trying to create a stratified table using a categorical variable ("DOMINIO") in the argument "strata" and another categorical variable ("P601A") in the argument "by", with three numerical variables ("I601B2", "I601D2" and "I601Z2") in the argument "include". I'm using a svydesign object because I want the output to be weighted. Nonetheless, the call fails and I'm not sure why.

#Loading the packages
library(gtsummary)
library(survey)

#Reading the data
load(url("https://github.com/cesarpoggi/PRUEBA/blob/main/PRUEBA_STACKOVERFLOW.rda?raw=true"))

#Setting the survey object
dessin2 <- svydesign(id = ~1,
                     data = PANxFAM,
                     weights = ~FACTOR07)

#Creating stratified table
tbl <- dessin2 %>% tbl_strata(
  strata = DOMINIO,
  .tbl_fun = ~ .x %>% tbl_svysummary(
    by = P601A,
    include = c(I601B2, I601D2, I601Z2)))

I know something is wrong, but I can't tell what. I'd be very thankful if someone could tell me what it is and provide a solution.

PS: When I use tbl_svysummary alone (as below), it works. But I need a stratified table by the variable DOMINIO :(

tbl1 = tbl_svysummary(data= dessin2, by= P601A, 
           include = c(I601B2, I601D2, I601Z2), 
           statistic = list(all_continuous() ~ "{mean} ({sd})"), 
           digits = list(all_continuous() ~ c(2, 2)))
tbl1

Solution

  • It looks like the issue is that some combinations of DOMINIO and P601A have too few observations to calculate the summary statistics. In the example below, I removed the groups with too few observations, and it runs without error.

    library(gtsummary)
    library(survey)
    packageVersion("gtsummary")
    #> [1] '1.6.2.9001'
    
    #Reading the data
    load(url("https://github.com/cesarpoggi/PRUEBA/blob/main/PRUEBA_STACKOVERFLOW.rda?raw=true"))
    
    df <- 
      PANxFAM[, c("I601B2", "P601A", "DOMINIO", "FACTOR07")] |> 
      tibble::as_tibble() |> 
      dplyr::group_by(DOMINIO, P601A) |> 
      dplyr::filter(dplyr::n() > 5) |>  # keep only DOMINIO x P601A groups with more than 5 obs
      dplyr::ungroup()
    
    
    #Setting the survey object
    dessin2 <- svydesign(id = ~1, 
                         data = df,
                         weights = ~FACTOR07)
    
    #Creating stratified table
    tbl <- 
      dessin2 %>% 
      tbl_strata(
        strata = DOMINIO,
        .tbl_fun = 
          ~ .x %>% 
          tbl_svysummary(
            by = P601A,
            include = c(I601B2)
          )
      )
    

    Created on 2022-10-24 with reprex v2.0.2
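
To see which DOMINIO and P601A combinations are sparse before building the survey design, one option is to cross-tabulate the unweighted row counts. The sketch below is only an illustration in base R: `toy` is a hypothetical stand-in for PANxFAM (only the column names come from the question), and the cutoff of 5 mirrors the `n() > 5` filter in the answer. Whether 5 is the right threshold for the real data is an assumption.

```r
# Hypothetical stand-in for PANxFAM; only the column names come from the question.
toy <- data.frame(
  DOMINIO = c(rep("A", 20), rep("B", 8)),
  P601A   = c(rep(c("1", "2"), each = 10), rep("1", 7), "2"),
  I601B2  = seq_len(28)
)

# Unweighted number of rows in each DOMINIO x P601A cell.
cell_counts <- as.data.frame(
  table(DOMINIO = toy$DOMINIO, P601A = toy$P601A),
  responseName = "n"
)

# Cells that the n() > 5 filter would drop.
sparse_cells <- cell_counts[cell_counts$n <= 5, ]
print(sparse_cells)

# Base-R counterpart of the grouped filter: attach each row's cell size,
# then keep only rows whose cell has more than 5 observations.
n_in_cell <- ave(seq_len(nrow(toy)), toy$DOMINIO, toy$P601A, FUN = length)
toy_kept  <- toy[n_in_cell > 5, ]
```

On the real data, replacing `toy` with `PANxFAM` should list the combinations responsible for the failure, assuming sparse cells are indeed the cause.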