summary(housingdata$City)
output --->
           Amsterdam  Amsterdam-Zuidoost              BerlÃn
               14791                 167                   1
              Berlin        爱ä¸\u0081å ¡   ì—\u0090ë“ ë²„ëŸ¬
               13641                   4                   1
                  NA              Others           Stockholm
                   0                8231                 692
                NA's
                  46
I tried the following code, but it doesn't seem to work:
housingdata$City[housingdata$City == 'NA'] <- NA
housingdata$City[housingdata$City == '爱ä¸\u0081å'] <- NA
housingdata$City[housingdata$City == 'BerlÃn'] <- NA
housingdata$City[housingdata$City == 'ì—\u0090ë“ ë²„ëŸ¬'] <- NA
We can use grepl to keep only the rows whose City consists entirely of ASCII letters, spaces, underscores, or hyphens:
subset(housingdata, grepl('^[A-Za-z_ -]+$', City))
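If the goal is to set the garbled entries to NA rather than drop their rows, the same pattern can be used as a logical index. The following is only a minimal sketch: the toy data frame and its Price column are made up for illustration, and it assumes the literal string "NA" is a placeholder that should also become a real missing value.

```r
# Toy stand-in for housingdata; the values are invented for illustration only.
housingdata <- data.frame(
  City = factor(c("Amsterdam", "Berlin", "Berl\u00c3n",
                  "Amsterdam-Zuidoost", "NA", NA, "Stockholm")),
  Price = c(1, 2, 3, 4, 5, 6, 7)
)

# TRUE where City is made up only of ASCII letters, spaces, underscores, hyphens.
# grepl() returns FALSE for entries that are already NA.
ok <- grepl("^[A-Za-z_ -]+$", housingdata$City)

# Assumption: the literal string "NA" is a placeholder, so treat it as missing too.
ok <- ok & !(housingdata$City %in% "NA")

# Blank out everything that failed the check, keeping the rows themselves.
housingdata$City[!ok] <- NA

# Assigning NA leaves the old factor levels behind with a count of 0;
# droplevels() removes the levels that are no longer used.
housingdata$City <- droplevels(housingdata$City)

summary(housingdata$City)
```

Note that the subset() call above also removes the rows where City is already NA, because grepl returns FALSE for missing values. The sketch keeps those rows and only changes the City column.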