Is there a way to replace a group of strings with another group of strings via dplyr and stringr?


I have reproduced a simple version of my problem. I essentially want to replace the English words in the statement column with the Spanish equivalent, for all statements.

library(tidyverse)
english <- c('hello','world','my','name', 'is')
spanish <- c('hola','mundo','mi','nombre', 'es')
statement <-c('Hello my name is john doe',' hello world','my name is world','hello john, my world is','jane is my world ')

df <- data.frame(english,spanish,statement)  
df

I tried

df %>% 
  str_replace_all(statement, c(df$english), c(df$spanish))

and

str_replace_all(statement, c(df$english), c(df$spanish))

The second try got me closer to my answer. Only one answer was replaced.


Solution

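  • As a super-simple solution, str_replace_all accepts a named vector, where each name is a pattern and each value is its replacement, and applies every pair to every string. Matching is case-sensitive, so the capitalised "Hello" in the first statement is left as is: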

    library(tidyverse)
    english <- c('hello','world','my','name', 'is')
    spanish <- c('hola','mundo','mi','nombre', 'es')
    statement <-c('Hello my name is john doe',' hello world','my name is world','hello john, my world is','jane is my world ')
    
    names(spanish) <- english
    
    str_replace_all(statement, spanish)
    #> [1] "Hello mi nombre es john doe" " hola mundo"                
    #> [3] "mi nombre es mundo"          "hola john, mi mundo es"     
    #> [5] "jane es mi mundo "
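
If the capitalised "Hello" should be translated too, one possible variation is to build the patterns as regular expressions. The sketch below is an assumption about what is wanted, not part of the original answer: it prefixes each English word with the inline `(?i)` flag (case-insensitive matching) and wraps it in `\\b` word boundaries so that short words such as "is" do not match inside longer words. The names `lookup` and `translated` are made up for illustration, and the replacement is always lower case, so the original capitalisation is not preserved.

```r
library(stringr)

english <- c('hello', 'world', 'my', 'name', 'is')
spanish <- c('hola', 'mundo', 'mi', 'nombre', 'es')
statement <- c('Hello my name is john doe', ' hello world', 'my name is world',
               'hello john, my world is', 'jane is my world ')

# Names are the regex patterns, values are the replacements.
# (?i) makes the match case-insensitive; \\b restricts it to whole words.
lookup <- setNames(spanish, paste0('(?i)\\b', english, '\\b'))

translated <- str_replace_all(statement, lookup)
translated
#> [1] "hola mi nombre es john doe" " hola mundo"
#> [3] "mi nombre es mundo"         "hola john, mi mundo es"
#> [5] "jane es mi mundo "
```

The pairs are applied one after another, so this approach assumes that no Spanish replacement is itself matched by a later English pattern, which holds for the five words here.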