r statistics significance

Correct statistical analysis to use in R to determine significance of a ratio in two groups?


I made some example data and an example graph to show what I need to do.

example.label <- c("A","B")
example.percent.good <- c(.75,.6)
example.data <- data.frame(example.label,example.percent.good)
example.data$example.percent.bad <- (1-example.data$example.percent.good)

example.data ##looks like this
  example.label example.percent.good example.percent.bad
1             A                 0.75                0.25
2             B                 0.60                0.40

I then melted the data to be able to graph it using the reshape package.

library(reshape)  ##provides melt()
example.melt <- melt(example.data,id.vars="example.label")

example.melt$labelposition <-
 ifelse(example.melt$variable=="example.percent.bad", example.melt$value/2, 1 - example.melt$value/2)
 ##This just creates where the graph should place the labels

The data then looks like this....

 example.label             variable value labelposition
1             A example.percent.good  0.75         0.625
2             B example.percent.good  0.60         0.700
3             A  example.percent.bad  0.25         0.125
4             B  example.percent.bad  0.40         0.200

I then graphed it using ggplot2. The graph looks like this: [Graph]

What I then need to be able to do is to determine the statistical significance of the difference between those ratios. Obviously, in my actual data, the ratios came from somewhere, but here it is simply percentages in order to streamline the question, so there will be no significance within this particular example.

What is the correct statistical analysis to use to determine whether the difference between these ratios is statistically significant, and how do I accomplish this in R? Basically, is the 75%/25% in label A vs. the 60%/40% in label B statistically significant?

I don't know if this is even the right place to ask this. Thanks!


Solution

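  • Nothing more than what you did can be done with just the percentages.

    You need the raw data (the counts behind each percentage), and then you can test for equality of means or equality of variance between your groups. For a binary good/bad outcome, a test of equal means amounts to a test of equal proportions.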
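
As a rough sketch of what that could look like once the counts are available: the question gives only percentages, so the group sizes below (100 observations per label) are made up purely for illustration, and the result will change with the real sample sizes. With counts in hand, base R's prop.test() compares the two proportions, and fisher.test() is a common alternative when the counts are small.

```r
# Hypothetical counts: the question only gives percentages, so a group
# size of 100 per label is an assumption made for illustration.
good  <- c(A = 75, B = 60)    # number of "good" observations per label
total <- c(A = 100, B = 100)  # total observations per label

# Two-sample test of equal proportions
# (chi-squared test with continuity correction by default)
res.prop <- prop.test(x = good, n = total)
print(res.prop)

# The same data as a 2x2 table of counts, for Fisher's exact test
counts <- rbind(good = good, bad = total - good)
res.fisher <- fisher.test(counts)
print(res.fisher)
```

The p-value in each printout is what tells you whether 75%/25% vs. 60%/40% differ significantly at your chosen level. The same percentages can come out significant with large groups and not significant with small ones, which is why the percentages alone are not enough.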