Tags: r, group-by, probability, percentage

How to get observed frequency of one of four possible outcomes in pairs of participants?


I have a dataset of participants' first and second answers to a binary (e.g., correct or incorrect) question, given before and after an intervention. Participants with disagreeing first answers were paired into groups before giving their second answer. Using R, I need to find the frequencies of the following outcomes, for the paired participants only.

  1. The wrong one changed to correct and the correct one kept their answer.
  2. The correct one changed to incorrect and the wrong one kept their answer.
  3. Both kept their first answer.
  4. Both changed their first answer (i.e., they switched).

The relevant variables are

  • Group number. This is assigned to both individuals and pairs, so only duplicated group numbers represent pairs.
  • 1st and 2nd answer (for each participant).
Grp  1st  2nd  Condition
2    0    0    Solo
3    0    0    Pair
3    1    0    Pair
4    0    0    Solo
5    0    1    Pair
...  ...  ...  ...
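
Since only duplicated group numbers mark pairs, one way to isolate them is to count how often each group number occurs. This is a minimal sketch on hypothetical toy data: the column names `group`, `first.answer` and `second.answer` are assumptions, and the second member of group 5 is invented for illustration.

```r
# Hypothetical toy data mirroring the table above (column names are assumptions;
# the second row of group 5 is made up so that the pair is complete).
data <- data.frame(
  group         = c(2, 3, 3, 4, 5, 5),
  first.answer  = c(0, 0, 1, 0, 0, 1),
  second.answer = c(0, 0, 0, 0, 1, 1)
)

# ave() returns, for every row, the number of rows sharing its group number.
group_size <- ave(data$group, data$group, FUN = length)

# Rows whose group number occurs exactly twice belong to a pair.
paired <- data[group_size == 2, ]
```

Here `paired` keeps only the rows of groups 3 and 5, which can then be analyzed per group.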

My first attempt was to get descriptives for each participant's answers.

describe(data$condition=="pair" & data$first.answer==0 & data$second.answer==1)
describe(data$condition=="pair" & data$first.answer==1 & data$second.answer==0)
describe(data$condition=="pair" & data$first.answer==0 & data$second.answer==0)
describe(data$condition=="pair" & data$first.answer==1 & data$second.answer==1)
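
The four per-participant combinations above could also be counted in one step with a cross-tabulation. A minimal sketch, assuming the columns are named `condition`, `first.answer` and `second.answer`, and that the condition is coded as "Pair" as in the table (the toy data is hypothetical):

```r
# Hypothetical toy data; the column names and the "Pair"/"Solo" coding are assumptions.
data <- data.frame(
  condition     = c("Solo", "Pair", "Pair", "Solo", "Pair", "Pair"),
  first.answer  = c(0, 0, 1, 0, 0, 1),
  second.answer = c(0, 0, 0, 0, 1, 1)
)

# Keep paired participants only.
pairs_only <- subset(data, condition == "Pair")

# 2 x 2 table of first answer (rows) by second answer (columns).
counts <- table(first = pairs_only$first.answer,
                second = pairs_only$second.answer)

# Convert the counts to proportions of all paired participants.
props <- prop.table(counts)
```

This still describes individuals rather than pairs, which is where the analysis gets stuck.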

But when it came time to apply this kind of analysis to groups, I got stuck.

How can I analyze each group (in R) to determine the percentages above?


Solution

  • Casting the data to wide format and then plotting it seems to work. Something like the following:

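    library(dplyr)    # %>%, select, mutate, case_when, group_by, tally
    library(tidyr)    # pivot_wider
    library(ggplot2)  # plotting
    
    # `treatments` is assumed to be a vector of the paired treatment names, defined elsewhere
    widerData <- data %>%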
      select(-participant) %>%
      pivot_wider(names_from = id_in_group,
                  values_from = c(first_answer, second_answer)) %>%
      mutate(
        typology = case_when(
          treatment %in% treatments &
            second_answer_1 + second_answer_2 == 2 ~ 'Both changed to correct',
          treatment %in% treatments &
            second_answer_1 + second_answer_2 == 0 ~ 'Both changed to incorrect',
          treatment %in% treatments &
            second_answer_1 == 0 &
            second_answer_2 == 1 ~ 'Both kept old positions',
          treatment %in% treatments &
            second_answer_1 == 1 & second_answer_2 == 0 ~ 'Position interchange',
          !(treatment %in% treatments)  &
            first_answer_1 == second_answer_1 ~ 'Single old position kept',
          !(treatment %in% treatments)  &
            first_answer_1 != second_answer_1 ~ 'Single position changed'
        )
      )
    
    bar_fun <- function(df) {
      df %>%
        group_by(treatment, typology) %>%
        tally() %>%
        group_by(treatment) %>%
        mutate(freq = n / sum(n)) %>%
        ggplot(aes(x = typology, y = freq, fill = freq)) +
        geom_bar(stat = 'identity', show.legend = FALSE) +
        facet_wrap(~ treatment) +
        theme(axis.text.x = element_text(angle = 90)) +
        # Set the limits here: a separate ylim() would replace this scale
        # and drop the percent labels
        scale_y_continuous(labels = scales::percent_format(accuracy = 1),
                           limits = c(0, 1)) +
        geom_text(
          aes(label = scales::percent_format(accuracy = 1)(freq)),
          position = position_dodge(width = 0.9),
          vjust = -0.25
        ) +
        ylab('Share of participants')
      
    }
    bar_fun(widerData %>% filter(treatment %in% treatments))
    

    [Figure: graph of the frequencies of the four outcomes in one of the two types of pairs]

    See https://rpubs.com/chapkovski/socrates
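
For the four outcomes exactly as worded in the question, an alternative is to summarise each pair directly without reshaping. The following is only a sketch under stated assumptions: 1 codes a correct answer, the two members of a pair always start with different answers, and the column names and toy data are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: groups 2 and 4 are solo, groups 3, 5, 6, 7 are pairs.
# Assumption: 1 = correct, 0 = incorrect.
data <- data.frame(
  group         = c(2, 3, 3, 4, 5, 5, 6, 6, 7, 7),
  first.answer  = c(0, 0, 1, 0, 0, 1, 1, 0, 0, 1),
  second.answer = c(0, 0, 0, 0, 1, 1, 1, 0, 1, 0)
)

outcomes <- data %>%
  group_by(group) %>%
  filter(n() == 2) %>%                     # duplicated group numbers = pairs
  summarise(
    n_changed       = sum(first.answer != second.answer),
    n_correct_after = sum(second.answer),
    .groups = "drop"
  ) %>%
  mutate(outcome = case_when(
    n_changed == 0       ~ "Both kept",
    n_changed == 2       ~ "Both switched",
    # Exactly one member changed, so both now give the same answer.
    n_correct_after == 2 ~ "Wrong one changed to correct",
    n_correct_after == 0 ~ "Correct one changed to incorrect"
  ))

# Share of pairs falling into each outcome, in percent.
outcome_freq <- outcomes %>%
  count(outcome) %>%
  mutate(pct = 100 * n / sum(n))
```

Because the members of a pair disagree at the start, knowing how many of them changed and how many are correct afterwards is enough to tell the four cases apart.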