grepl in R: Replace character/numeric levels


I'd like to replace my levels dog1 ... dog4 and cat1 ... cat4 with just two levels, DOG and CAT, but when I use grepl my output contains only NAs.

In my code:

x <- rep(c("dog1","dog2","dog3","dog4","cat1","cat2","cat3","cat4"), 2) # Levels
y <- rnorm(16)
d <- data.frame(cbind(x, y))
head(d)

     x                 y
1 dog1 0.906357739138289
2 dog2 0.974674552504268
3 dog3 0.664045049199848
4 dog4 0.911777985232099
5 cat1 0.246575548162824
6 cat2 0.758069789161901


d$x[grepl("dog", d$x)] <- "DOG" 

Warning message:
In `[<-.factor`(`*tmp*`, grepl("dog", d$x), value = c(NA, NA, NA, :
  invalid factor level, NA generated

d$x[grepl("cat", d$x)] <- "CAT"

Warning message:
In `[<-.factor`(`*tmp*`, grepl("cat", d$x), value = c(NA_integer_,  :
  invalid factor level, NA generated

head(d)

     x                 y
1 <NA> 0.906357739138289
2 <NA> 0.974674552504268
3 <NA> 0.664045049199848
4 <NA> 0.911777985232099
5 <NA> 0.246575548162824
6 <NA> 0.758069789161901

My desired output, if the code ran correctly, would be:

head(d)

     x                 y
1 DOG  0.906357739138289
2 DOG  0.974674552504268
3 DOG  0.664045049199848
4 DOG  0.911777985232099
5 CAT  0.246575548162824
6 CAT  0.758069789161901

Solution

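  • The column x is a factor, and "DOG" and "CAT" are not among its existing levels, so the assignment yields NA. You could create the data frame with stringsAsFactors = FALSE so that x stays a character vector (this is already the default from R 4.0.0 onward):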

    d <- data.frame(cbind(x,y), stringsAsFactors=FALSE)
    d$x[grepl("dog", d$x)] <- "DOG"
    d$x[grepl("cat", d$x)] <- "CAT"
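
If you would rather keep x as a factor, another option is to relabel its levels directly instead of assigning into the column. The following is only a minimal sketch, assuming every label is a name followed by trailing digits; the set.seed() call and the use of data.frame(x, y) in place of cbind() are my additions (cbind() would coerce y to character).

```r
# Rebuild the example data; set.seed() is added here only for reproducibility.
set.seed(1)
x <- rep(c("dog1","dog2","dog3","dog4","cat1","cat2","cat3","cat4"), 2)
y <- rnorm(16)

# data.frame(x, y) keeps y numeric; x is made a factor explicitly.
d <- data.frame(x = factor(x), y = y)

# Strip the trailing digits from each level label and upper-case the rest.
# Assigning duplicated labels to levels() merges the matching levels.
levels(d$x) <- toupper(sub("[0-9]+$", "", levels(d$x)))

head(d)
```

Because only the level labels change, no value is ever assigned that is missing from the level set, so the "invalid factor level" warning should not appear.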