Tags: r, string, delete-row, dplyr

How to delete rows that contain special characters in R


summary(housingdata$City)

Output:

         Amsterdam Amsterdam-Zuidoost             Berlín
             14791                167                  1
            Berlin     爱ä¸\u0081å ¡  ì—\u0090ë“ ë²„ëŸ¬
             13641                  4                  1
                NA             Others          Stockholm
                 0               8231                692
              NA's
                46

I tried the following code, but it doesn't seem to work:

housingdata$City[housingdata$City == 'NA'] <- NA
housingdata$City[housingdata$City == '爱ä¸\u0081å'] <- NA
housingdata$City[housingdata$City == 'BerlÃn'] <- NA
housingdata$City[housingdata$City == 'ì—\u0090ë“ ë²„ëŸ¬'] <- NA

Solution

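  • We can use grepl to keep only the rows whose City contains nothing but ASCII letters, underscores, spaces, and hyphens. Rows where City is NA are dropped too, because grepl returns FALSE for missing values: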

    subset(housingdata, grepl('^[A-Za-z_ -]+$', City))
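
To see the filter at work end to end, here is a minimal sketch on a toy data frame. The rows and the Price column are made up for illustration and are not the asker's data; the droplevels step is an addition that assumes City is a factor, as the summary output suggests.

```r
# Toy stand-in for housingdata (hypothetical values, not the real data set).
housingdata <- data.frame(
  City  = factor(c("Amsterdam", "Amsterdam-Zuidoost", "Berl\u00edn",
                   "Berlin", "\u7231\u4e01\u5821", NA, "Others", "Stockholm")),
  Price = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# TRUE only when every character is an ASCII letter, underscore, space or hyphen.
# Missing values yield FALSE, so those rows are filtered out as well.
keep <- grepl("^[A-Za-z_ -]+$", housingdata$City)

cleaned <- housingdata[keep, , drop = FALSE]

# Drop factor levels that no longer occur, so summary() stops listing them
# with a count of 0.
cleaned$City <- droplevels(cleaned$City)

summary(cleaned$City)
```

Without droplevels, the removed city names would still appear in the summary as empty levels. If legitimate names with accents (such as Berlín) should be kept, the pattern would need to be widened, since this one rejects every non-ASCII character.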