Tags: r, dataframe, row, data-wrangling

How do I change unique row values into another set of unique row values in a data frame in R?


I have a data frame with experimental results from participants who took my test online. In the data file, each participant is identified by a randomly generated code given to them at the end of the experiment. Since it is cumbersome to identify each person by a random code that looks like gibberish, I would like to replace these codes with readable labels like Participant_1, Participant_2, etc.

So I think I need a piece of code that finds each unique random code in the data file and replaces it with the corresponding participant label. I could not figure out how to do this, and any help would be much appreciated.

Here is a piece of code that shows the output I have vs. the output I want. Note that each participant answered a different number of questions, so I cannot simply assume a fixed number of rows per participant.

Participant_Identifiers <- c(rep("QHDKWEFHWKHFFH", 4), rep("WHWIHFJNWFKWF", 7), rep("HEIFFFBBKQLSD", 3))

Participant_Scores <- c(20, 30, 59, 20, 47, 84, 21, 90, 54, 78, 90, 97, 35, 67)

df <- data.frame("Participant_Identifiers" = Participant_Identifiers,
                 "Participant_Scores" = Participant_Scores)

df

df_I_want <- data.frame("Participant_Identifiers" = c(rep("Participant_1", 4), rep("Participant_2", 7), rep("Participant_3", 3)), 
                       "Participant_Scores" = c(20, 30, 59, 20, 47, 84, 21, 90,54,78,90,97, 35, 67))

df_I_want

Solution

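  • You could use match() with unique(): unique() lists each code once, in order of first appearance, and match() returns each row's position in that list.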

    df$new_col <- paste0("Participant_", match(df$Participant_Identifiers, 
                                         unique(df$Participant_Identifiers)))
    

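    Or, if Participant_Identifiers is a factor (the default for string columns in data.frame() before R 4.0.0), you can convert it to its integer codes. Factor levels are sorted by default, so the numbering would follow the sorted order of the codes rather than their order of appearance. Setting the levels explicitly with unique() keeps the order of appearance, and also works when the column is character:

    df$new_col <- paste0("Participant_", as.integer(factor(df$Participant_Identifiers,
                                         levels = unique(df$Participant_Identifiers))))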
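Putting the match() and unique() approach together with the data from the question, here is a minimal sketch that overwrites the identifier column in place and keeps a lookup table so the original codes can still be recovered. The names codes, idx and key are only illustrative and do not come from the question.

```r
# Rebuild the example data from the question (14 rows, 3 participants).
df <- data.frame(
  Participant_Identifiers = c(rep("QHDKWEFHWKHFFH", 4),
                              rep("WHWIHFJNWFKWF", 7),
                              rep("HEIFFFBBKQLSD", 3)),
  Participant_Scores = c(20, 30, 59, 20, 47, 84, 21, 90, 54, 78, 90, 97, 35, 67),
  stringsAsFactors = FALSE
)

# unique() keeps each code once, in order of first appearance.
codes <- unique(df$Participant_Identifiers)

# match() gives, for every row, the position of its code within `codes`.
idx <- match(df$Participant_Identifiers, codes)

# Lookup table (illustrative name) mapping each original code to its label.
key <- data.frame(
  code  = codes,
  label = paste0("Participant_", seq_along(codes)),
  stringsAsFactors = FALSE
)

# Replace the random codes with the readable labels.
df$Participant_Identifiers <- key$label[idx]

# Rows per participant, to check that the group sizes are unchanged.
table(df$Participant_Identifiers)
```

Keeping key around is optional, but it makes it possible to map a label back to the original code later, for example with key$code[key$label == "Participant_2"].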