
Change values in a variable based on a conditional value in R


I want to change values in my username variable but only when they meet a condition set from the variable chatforum. For example, I want all instances of users called "Alex" from Canadian chatrooms to be relabeled as "AlexCA":

# mock dataset
library(tidyverse)
username <- c("Alex", "Alex", "Alex", "Alex")
id <- c(1001, 1002, 1003, 1001)
chatforum <- c("Canada", "U.S.", "U.K.", "Canada")

# cbind() builds a character matrix, so id ends up as <chr> in the tibble
df <- cbind(username, id, chatforum)
df <- as_tibble(df)
glimpse(df)

df <- df  %>% filter(chatforum=="Canada") %>% 
  mutate(username = replace(username, username == "Alex", "AlexCA"))

Though the code above works, I want the entire dataset returned, with the changes applied. filter() returns only the filtered rows, not the entire dataset.

I was advised to use if_else() or case_when(), but my attempt relabels every user in a Canadian chatforum (for example, Alice) as AlexCA, when I only want the username "Alex" to change when chatforum == "Canada":

df <- df %>% mutate(username = if_else(chatforum=="Canada", "AlexCA", username))

Do you know how I can change the values in my username column only when the username is "Alex" and the chatforum value is "Canada"?


Solution

  • With case_when or ifelse, you can combine multiple conditions that must all be met before the change is applied. So, if chatforum == "Canada" & username == "Alex", then we change the name to AlexCA.

    library(tidyverse)
    
    df %>%
      mutate(username = case_when(
        chatforum == "Canada" & username == "Alex" ~ "AlexCA",
        TRUE ~ username
      ))
    

    Or in base R:

    df$username[df$chatforum == "Canada" & df$username == "Alex"] <- "AlexCA"
    

    Output

      username id    chatforum
      <chr>    <chr> <chr>    
    1 AlexCA   1001  Canada   
    2 Alex     1002  U.S.     
    3 Alex     1003  U.K.     
    4 AlexCA   1001  Canada  
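
    The same two-condition test should also work with the if_else() call from the question. Here is a minimal sketch; it rebuilds the mock data and adds one made-up "Alice" row (id 1004 is an assumption, not in the original data) just to show that other Canadian users are left alone:

    ```r
    library(dplyr)

    # Mock data from the question, plus a hypothetical Alice row for illustration
    df <- tibble(
      username  = c("Alex", "Alex", "Alex", "Alex", "Alice"),
      id        = c("1001", "1002", "1003", "1001", "1004"),
      chatforum = c("Canada", "U.S.", "U.K.", "Canada", "Canada")
    )

    # Both tests must be TRUE for a row to be renamed;
    # every other row keeps its existing username
    result <- df %>%
      mutate(username = if_else(chatforum == "Canada" & username == "Alex",
                                "AlexCA", username))
    ```

    Because nothing is filtered out, result still has every row of df, and only the Canadian Alex rows are relabeled.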
    

    But if you need to do this for a lot of countries, then you might want to create a key or add a new column with the abbreviation you want. For example, you could do something like this, where we strip the punctuation from chatforum, take the first two letters in upper case as an abbreviation, then append it to every username.

    df %>%
      mutate(abrv = toupper(substr(str_replace_all(chatforum, "[[:punct:]]", ""), 1, 2))) %>%
      unite(username, c(username, abrv), sep = "")
    
    #  username id    chatforum
    #  <chr>    <chr> <chr>    
    #1 AlexCA   1001  Canada   
    #2 AlexUS   1002  U.S.     
    #3 AlexUK   1003  U.K.     
    #4 AlexCA   1001  Canada   
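
    If the first two letters don't give the abbreviation you want, the "key" could instead be an explicit lookup table. This is only a sketch: country_key and its values are hypothetical names made up for illustration, and the mock data is rebuilt so the snippet runs on its own.

    ```r
    library(dplyr)

    # Mock data from the question
    df <- tibble(
      username  = c("Alex", "Alex", "Alex", "Alex"),
      id        = c("1001", "1002", "1003", "1001"),
      chatforum = c("Canada", "U.S.", "U.K.", "Canada")
    )

    # Hypothetical key: one row per chatforum with the suffix to append
    country_key <- tibble(
      chatforum = c("Canada", "U.S.", "U.K."),
      abrv      = c("CA", "US", "UK")
    )

    result <- df %>%
      left_join(country_key, by = "chatforum") %>%
      # coalesce() leaves the name unchanged if a chatforum is missing from the key
      mutate(username = paste0(username, coalesce(abrv, ""))) %>%
      select(-abrv)
    ```

    With a key like this you control each suffix directly, and adding a country is just one more row in country_key.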
    

    Or instead of uniting after creating an abbreviation column, you could still use case_when for certain conditions.

    df %>%
      mutate(abrv = toupper(substr(str_replace_all(chatforum, "[[:punct:]]", ""), 1, 2))) %>%
      mutate(username = case_when(
        chatforum == "Canada" & username == "Alex" ~ paste0(username, abrv),
        TRUE ~ username
      ))
    
    #  username id    chatforum abrv 
    #  <chr>    <chr> <chr>     <chr>
    #1 AlexCA   1001  Canada    CA   
    #2 Alex     1002  U.S.      US   
    #3 Alex     1003  U.K.      UK   
    #4 AlexCA   1001  Canada    CA