How to perform Two-proportion Z test on items in a row of a dataframe and append the p value to the dataframe?


I am using R, and have data in a dataframe.

Each row of the dataframe holds the data for one State, split into urban and rural, and I want to run a two-proportion Z-test on each row to compare the rates between the urban and rural populations.

df

State     UrbanPop     RuralPop     UrbanCases   RuralCases
AL         1000         250          200          50
AK         500          50           500          75

The idea is to get a two-proportion Z-test from the data in the first row and from the second row independently, comparing urban and rural within each State.

What I have tried is

df$P_Values <- apply(df,1,function(x) prop.test(x = c(df$UrbanPop, df$UrbanCases), n = c(df$RuralPop, df$RuralCases))$p.value)

I get a warning that the "Chi-squared approximation may be incorrect" for each row, and all the p values appended to the dataframe are equal to zero.

Any help would be greatly appreciated.

Thanks.


Solution

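  • You got x and n wrong: x is "a vector of counts of successes", which matches your *Cases columns, whereas n is the number of trials, which corresponds to your *Pop columns. In addition, the function body refers to df$... instead of its row argument, so every call receives the whole columns and every row gets the same p-value. With x and n assigned correctly and the test run on one row at a time:

    cols <- c("UrbanCases", "RuralCases", "UrbanPop", "RuralPop")
    df$P_Values <- apply(df[, cols], 1, function(r) prop.test(x = r[1:2], n = r[3:4])$p.value)

    Selecting only the numeric columns also keeps apply from coercing each row to character because of State. For the AL row both rates are 0.2 (200/1000 and 50/250), so its p-value is 1. The AK row as posted has more RuralCases (75) than RuralPop (50); prop.test stops with an error on such input, so that row has to be corrected before the call will run on the full dataframe.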
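
Since the question asks for a Z test specifically, here is a minimal sketch of a row-wise pooled two-proportion z-test written out by hand. two_prop_p is a hypothetical helper, not part of base R. Its two-sided p-value should agree with prop.test(..., correct = FALSE), that is, without the continuity correction that prop.test applies by default. Returning NA when cases exceed the population (as in the posted AK row) is an assumption about how to treat bad rows, not something the question specifies.

```r
# Hypothetical helper: pooled two-proportion z-test for one urban/rural pair.
# x1, x2 are case counts; n1, n2 are the matching population sizes.
two_prop_p <- function(x1, n1, x2, n2) {
  # Assumption: rows with more cases than population are flagged with NA
  # instead of stopping the whole computation.
  if (x1 > n1 || x2 > n2) return(NA_real_)
  p_pool <- (x1 + x2) / (n1 + n2)                        # pooled rate under H0
  se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # standard error of the difference
  z <- (x1 / n1 - x2 / n2) / se
  2 * pnorm(-abs(z))                                     # two-sided p-value
}

# The sample data from the question, reproduced as posted.
df <- data.frame(
  State      = c("AL", "AK"),
  UrbanPop   = c(1000, 500),
  RuralPop   = c(250, 50),
  UrbanCases = c(200, 500),
  RuralCases = c(50, 75)
)

# mapply walks the four columns in parallel, one row per call.
df$P_Values <- mapply(two_prop_p,
                      df$UrbanCases, df$UrbanPop,
                      df$RuralCases, df$RuralPop)
```

mapply is used here instead of apply so that the State column never has to be dropped or coerced. The sketch does not handle the degenerate case where the pooled rate is exactly 0 or 1, in which the standard error is zero and the result is NaN.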