Tags: r, string-matching, stringr, stringi

Extract the last word in a string after a comma if there are multiple words, else the first word


I have data where the values look as follows:

 location<- c("xyz, sss, New Zealand", "USA", "Pris,France")
 id<- c(1,2,3)
 df<-data.frame(location,id)

I would like to extract the country name from the data. The tricky part is that if I extract just the last word, I get only one record (France).

library(stringr)
df$country<- word(df$location,-1)

Any ideas on how to extract the country from this data? The expected output is:

 id  location                      country
  1   xyz, sss, New Zealand        New Zealand
  2   USA                          USA
  3   Pris,France                  France

Solution

  • You can try sub

     # remove everything up to and including the last comma, plus any spaces after it
     df$country <- sub('.*,\\s*', '', df$location)
     df$country
     #[1] "New Zealand" "USA"         "France"   
    

    Or

     library(stringr)
     str_extract(df$location, '\\b[^,]+$')
     #[1] "New Zealand" "USA"         "France"
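
To see why both patterns pick out the text after the last comma, here is a minimal sketch on the sample data using base R only. In the `sub` pattern, `.*,` is greedy and so consumes everything up to the last comma; in the `str_extract` pattern, `[^,]+$` matches a run of non-comma characters anchored at the end of the string, and `\\b` makes the match start at a word boundary rather than at a leading space. The helper `last_field` is a hypothetical name introduced only for illustration (it is not part of the answer above); it does the same job by splitting on commas instead of using a regex.

```r
location <- c("xyz, sss, New Zealand", "USA", "Pris,France")

# Greedy '.*,' runs up to the LAST comma and '\\s*' eats the spaces after it;
# a string with no comma does not match, so sub() returns it unchanged.
via_sub <- sub(".*,\\s*", "", location)

# Hypothetical helper (not from the answer): split on commas, keep the last
# piece, and trim surrounding whitespace. Empty input yields "".
last_field <- function(x) {
  parts <- strsplit(x, ",", fixed = TRUE)
  last <- vapply(parts,
                 function(p) if (length(p) == 0) "" else p[length(p)],
                 character(1))
  trimws(last)
}
via_split <- last_field(location)

print(via_sub)
#[1] "New Zealand" "USA"         "France"
print(identical(via_sub, via_split))
#[1] TRUE
```

The two approaches agree on this sample, but they can differ on inputs with a trailing comma: `sub` returns an empty string for `"a,"`, while `strsplit` drops the trailing empty piece and the helper returns `"a"`.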