
Testing the difference between two means in the R survey package


I have a problem testing the difference between two means from a survey with a t-test. I have a repeated health survey from different years, and I want to test whether the difference in means between years is statistically significant. Using the survey package, I was able to get the means and their standard errors with this code:

svyby(~CADL2,by=~ageco+year,design = wcd,FUN= svymean,na.rm=TRUE)

           ageco year CADL2None CADL2Limited se.CADL2None se.CADL2Limited
50-54.2007 50-54 2007 0.8717507   0.12824934   0.02652932      0.02652932
55-59.2007 55-59 2007 0.8919843   0.10801569   0.01662038      0.01662038
60-64.2007 60-64 2007 0.9056865   0.09431355   0.01556955      0.01556955
65-69.2007 65-69 2007 0.8525438   0.14745624   0.02376984      0.02376984
70-74.2007 70-74 2007 0.7534787   0.24652131   0.03419399      0.03419399
75-79.2007 75-79 2007 0.7466576   0.25334237   0.04010796      0.04010796
80-85.2007 80-85 2007 0.5690972   0.43090276   0.06083682      0.06083682
85+.2007     85+ 2007 0.3853919   0.61460811   0.08913058      0.08913058
50-54.2017 50-54 2017 0.7150962   0.28490379   0.13929132      0.13929132
55-59.2017 55-59 2017 0.8720697   0.12793025   0.04088908      0.04088908
60-64.2017 60-64 2017 0.8503688   0.14963123   0.01783197      0.01783197
65-69.2017 65-69 2017 0.7931459   0.20685411   0.01829031      0.01829031
70-74.2017 70-74 2017 0.7764609   0.22353912   0.01895070      0.01895070
75-79.2017 75-79 2017 0.6666032   0.33339681   0.02428625      0.02428625
80-85.2017 80-85 2017 0.5462507   0.45374929   0.03324155      0.03324155
85+.2017     85+ 2017 0.3223467   0.67765331   0.03956227      0.03956227

Now I want to test whether, for example, the mean for age cohort 50-54 in 2007 (0.8717) is significantly different from that in 2017 (0.7151). I tried svyttest, but as I'm quite new to this package, I couldn't get it to work:

svyby(~CADL2,by=~ageco+year,design = wcd,FUN= svyttest,na.rm=TRUE)

Error in formula[[3]] : subscript out of bounds

I tried swapping variables in the formula, but every time I got this error. Do you know how I could get it to work, or how to test the difference between two means some other way?


Solution

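  • svyttest needs a two-sided formula (outcome ~ group). Your one-sided ~CADL2 has no right-hand side, which is why formula[[3]] is out of bounds. Do you want svyby( CADL2 ~ year , ~ ageco , wcd , svyttest , keep.var = FALSE ) ?

    Example with data from ?svyttest: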

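    library(survey)
    data(api)

    # two-stage cluster sample from the api data
    dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

    # t-test on the full design
    svyttest(enroll ~ comp.imp, dclus2)

    # single subset
    svyttest(enroll ~ comp.imp, subset(dclus2, stype == "E"))

    # with svyby: one t-test per level of stype
    svyby(enroll ~ comp.imp, ~stype, dclus2, svyttest, keep.var = FALSE)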
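
One thing to watch for: CADL2 in the question looks like a factor (None/Limited), and svyttest fits a linear model under the hood, so the outcome probably has to be recoded as a 0/1 indicator first. Below is a minimal sketch of that, still on the api data. Here sch.wide (a Yes/No factor) stands in for CADL2, and the variable name met_target is made up for this illustration.

```r
library(survey)
data(api)

# Same two-stage cluster design as in the example above
dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

# sch.wide is a Yes/No factor; recode it as a numeric 0/1 indicator
# (met_target is a hypothetical name used only for this sketch)
dclus2 <- update(dclus2, met_target = as.numeric(sch.wide == "Yes"))

# Compare the proportion between the two comp.imp groups within one subgroup
tt <- svyttest(met_target ~ comp.imp, subset(dclus2, stype == "E"))

tt$estimate  # estimated difference in proportions
tt$p.value   # p-value of the design-based t-test
```

On your design this would presumably become something like svyttest(limited ~ factor(year), subset(update(wcd, limited = as.numeric(CADL2 == "Limited")), ageco == "50-54")). I can't test that without the data, and it assumes year has exactly the two levels 2007 and 2017.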