Tags: r, hypothesis-test, proportions

2 Proportion Z-tests with NSDUH survey data


I'm working with data tables from the 2020 NSDUH. The tables I have access to give me the following information:

  • estimated percentage of people with a substance use disorder
  • estimated number of people with a substance use disorder (for the entire US)
  • sample size
  • standard error

I would like to use a two-proportion z-test to see whether the proportion of people with a substance use disorder differs significantly by type of insurance. In R, prop.test requires the number of "successes", but I don't have the raw data to populate that accurately.

Is there another way to do this, or a different test I should use, that would let me use the already-calculated proportions along with their standard errors?

Here is some sample data:

idd_all <- matrix(c(36280,  0.066,  0.0021, 
                    23690,  0.053,  0.0024,
                    6990,   0.118,  0.0067), 
                  byrow = TRUE, nrow = 3, ncol = 3,
                  dimnames = list(c("total", "medicaid 12+", "private 12+"),
                                  c("sample size", "% IDD", "SE")))
              sample size  est % IDD      SE
total               36280      0.066  0.0021
medicaid 12+        23690      0.053  0.0024
private 12+          6990      0.118  0.0067

Note: there are obviously more insurance types, but we are primarily interested in the difference between people on Medicaid and privately insured individuals.


Solution

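  • I believe the following tests the null hypothesis of equal proportions between medicaid 12+ and private 12+. Note that it treats the first column (labeled "sample size" in the question) as the number of cases in each group, and recovers each group's total by dividing that count by the proportion.

    # Back-calculate each group's total: count / proportion
    total_medicaid <- 23690 / 0.053
    total_private <- 6990 / 0.118
    
    # x = counts treated as "successes", n = back-calculated totals
    x <- c(23690, 6990)
    n <- c(total_medicaid, total_private)
    prop.test(x, n)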
    

    Result

    # 2-sample test for equality of proportions with continuity correction
    # 
    # data:  x out of n
    # X-squared = 3880.4, df = 1, p-value < 2.2e-16
    # alternative hypothesis: two.sided
    # 95 percent confidence interval:
    #   -0.06768921 -0.06231079
    # sample estimates:
    #   prop 1 prop 2 
    # 0.053  0.118
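
Since the question asks specifically about using the published proportions together with their standard errors, here is a minimal sketch of a different approach from the prop.test call above: a z-test computed directly from the estimates and SEs. It assumes the two subgroup estimates are independent, so the variance of the difference is the sum of the squared standard errors. That may not hold exactly under a complex survey design, where the subgroups can share sampling clusters. The variable names (est_diff, se_diff, and so on) are my own and purely illustrative.

```r
# Published estimates and standard errors, taken from the question's table
p_medicaid  <- 0.053
se_medicaid <- 0.0024
p_private   <- 0.118
se_private  <- 0.0067

# ASSUMPTION: the two estimates are independent, so the variance of
# their difference is the sum of the squared standard errors
est_diff <- p_medicaid - p_private
se_diff  <- sqrt(se_medicaid^2 + se_private^2)

# Two-sided z-test of H0: the two proportions are equal
z       <- est_diff / se_diff
p_value <- 2 * pnorm(-abs(z))

# Approximate 95% confidence interval for the difference
ci <- est_diff + c(-1, 1) * qnorm(0.975) * se_diff

print(c(z = z, p_value = p_value))
print(ci)
```

Because the published standard errors already reflect the survey design, this avoids having to reconstruct success counts at all. With these inputs z comes out at roughly -9.13, so the conclusion (a clear difference between the two groups) is the same, although the interval is wider than the one prop.test reports.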