Tags: r, crosstab, weighted, multiple-choice, expss

Crosstab with multiple choice and weight variable


I am learning to use the excellent "expss" R package.

Is it possible to use this package to build a contingency table between a multiple-choice variable and a categorical variable while applying a weight variable?

The categorical variable is "sex" in this dataframe, and the weight variable is "survey_weight":

library(tibble)

demo <- tribble(
~dummy1, ~dummy2, ~dummy3, ~survey_weight, ~sex,
      1,       0,       0,          1.5,  "male",
      1,       1,       0,          1.5,  "female",
      1,       1,       1,           .5,  "female",
      0,       1,       1,          1.5,  "male",
      1,       1,       1,           .5,  "male",
      0,       0,       1,           .5,  "male",
)
demo 

I need the percentages to be based on the total number of respondents who answered the question, not on the total number of responses.

Thanks in advance!


Solution

    library(expss)
    demo = text_to_columns('
     dummy1   dummy2   dummy3  survey_weight  sex
          1        0        0            1.5  male
          1        1        0            1.5  female
          1        1        1             .5  female
          0        1        1            1.5  male
          1        1        1             .5  male
          0        0        1             .5  male
    ')
    
    
    demo %>% 
        tab_cells(mdset(dummy1 %to% dummy3)) %>%  # 'mdset' marks these columns as a multiple dichotomy set
        tab_cols(sex) %>%  # columns
        tab_weight(survey_weight) %>% # weight
        tab_stat_cpct() %>% # statistic
        tab_pivot() 
    
    # |              |    sex |      |
    # |              | female | male |
    # | ------------ | ------ | ---- |
    # |       dummy1 |    100 | 50.0 |
    # |       dummy2 |    100 | 50.0 |
    # |       dummy3 |     25 | 62.5 |
    # | #Total cases |      2 |  4.0 |
    
    # shorter notation with the same result
    calc_cro_cpct(demo, mdset(dummy1 %to% dummy3), sex, weight = survey_weight)