Apply two sample proportion test across rows


I'm trying to conduct a series of two-sample proportion tests, one for each row of a data frame. Here are the first 3 rows, where x is the count of yes responses and n is the total.

df <- data.frame("x1" = c(370,450,490), "x2" = c(150, 970, 120), "n1" = c(1500, 2700, 4500), "n2" = c(3000, 4900, 3200))

I'm using the function "prop.test" which compares two proportions as below:

  test <- prop.test(x = c(370, 150), n = c(1500, 3000), correct = FALSE)

I've tried:

Map(prop.test, x = c(df$x1, df$x2), n = c(df$n1, df$n2), correct = FALSE)

but it is returning output for 6 rows of 1-sample binomial tests instead of output for 3 rows of 2-sample binomial tests. I must be using Map incorrectly. Any ideas?


Solution

  • purrr::pmap() iterates over the rows of a data frame, passing each row's column values to the function as arguments.

    Try:

    library(purrr)
    # ..1 to ..4 are the row's values in column order: x1, x2, n1, n2
    purrr::pmap(df, ~ prop.test(x = c(..1, ..2), n = c(..3, ..4), correct = FALSE))
    

    or, referring to the columns by name:

    purrr::pmap(df, ~ with(list(...), prop.test(x = c(x1, x2), n = c(n1, n2), correct = FALSE)))
    
    
    [[1]]
    
        2-sample test for equality of proportions without continuity correction
    
    data:  c(..1, ..2) out of c(..3, ..4)
    X-squared = 378.44, df = 1, p-value < 2.2e-16
    alternative hypothesis: two.sided
    95 percent confidence interval:
     0.1734997 0.2198336
    sample estimates:
       prop 1    prop 2 
    0.2466667 0.0500000 
    
    
    [[2]]
    
        2-sample test for equality of proportions without continuity correction
    
    data:  c(..1, ..2) out of c(..3, ..4)
    X-squared = 11.22, df = 1, p-value = 0.0008094
    alternative hypothesis: two.sided
    95 percent confidence interval:
     -0.04923905 -0.01334598
    sample estimates:
       prop 1    prop 2 
    0.1666667 0.1979592 
    
    
    [[3]]
    
        2-sample test for equality of proportions without continuity correction
    
    data:  c(..1, ..2) out of c(..3, ..4)
    X-squared = 130.66, df = 1, p-value < 2.2e-16
    alternative hypothesis: two.sided
    95 percent confidence interval:
     0.06015674 0.08262104
    sample estimates:
       prop 1    prop 2 
    0.1088889 0.0375000
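
For reference, the Map attempt in the question returned six tests because c(df$x1, df$x2) builds a single length-6 vector, so each element is paired with one n and tested on its own. The sketch below is one possible base R equivalent of the pmap call, not part of the original answer: it passes the four columns as separate arguments so they are walked in parallel. The names `tests` and `results`, and the choice of fields collected into a summary table, are illustrative assumptions.

```r
df <- data.frame(x1 = c(370, 450, 490), x2 = c(150, 970, 120),
                 n1 = c(1500, 2700, 4500), n2 = c(3000, 4900, 3200))

# Pass the columns as separate arguments so Map() steps through them together,
# one row at a time, instead of over one concatenated vector.
tests <- Map(
  function(x1, x2, n1, n2) prop.test(x = c(x1, x2), n = c(n1, n2), correct = FALSE),
  df$x1, df$x2, df$n1, df$n2
)

# Collect a few fields from each htest object: one summary row per input row.
# (Hypothetical summary layout, chosen for illustration.)
results <- data.frame(
  prop1   = sapply(tests, function(t) t$estimate[[1]]),
  prop2   = sapply(tests, function(t) t$estimate[[2]]),
  p_value = sapply(tests, function(t) t$p.value)
)
results
```

`tests` is a list of three htest objects, like the pmap output above, while `results` keeps only the two estimated proportions and the p-value for each row.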