I wrote a short piece of R code to extract hashtags from tweets:
m<-c(paste("Hello! #London is gr8. #Wow"," ")) # My tweet
#m<- c("Hello! #London is gr8. #Wow")
x<- unlist(gregexpr("#(\\S+)",m))
#substring(m,x)[1]
subs<-function(x){
return(substring(m,x+1,(x-2+regexpr(" |\\n",substring(m,x)[1]))))
}
tag<- sapply(x, subs)
#x
tag
This code doesn't work unless I append a space to the end of the tweet. What could be the reason? I tried \n as well.
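$ matches the end of a string. Without it, regexpr() finds no space or newline after the last hashtag and returns -1, so substring() gets a stop index smaller than its start index and returns an empty string.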
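A quick way to check this is to run both patterns on the last hashtag alone. This is only an illustrative sketch; the variable names `last_tag`, `no_dollar` and `with_dollar` are made up for the example:

```r
last_tag <- "#Wow"  # what substring(m, x) yields for the final hashtag

# Original pattern: no space or newline follows, so there is no match
no_dollar <- as.integer(regexpr(" |\\n", last_tag))
no_dollar    # -1

# Pattern with $: the empty match at the end of the string is reported
with_dollar <- as.integer(regexpr(" |$", last_tag))
with_dollar  # 5, one past the last character
```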
m<- c("Hello! #London is gr8. #Wow")
subs<-function(x){
return(substring(m,x+1,(x-2+regexpr(" |$",substring(m,x)[1]))))
}
With the rest of your code intact:
> tag
[1] "London" "Wow"
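If you don't need the positions themselves, a shorter route may be to let regmatches() pull out the matched text directly. This is a sketch of an alternative, not a change to your approach; `tweet`, `hits` and `tags` are names chosen for the example:

```r
tweet <- "Hello! #London is gr8. #Wow"

# Extract every '#' followed by a run of non-whitespace characters
hits <- regmatches(tweet, gregexpr("#\\S+", tweet))[[1]]

# Drop the leading '#' from each match
tags <- sub("^#", "", hits)
tags
```

Because the pattern stops at whitespace or at the end of the string, no trailing space is needed. Note that \\S+ also keeps any punctuation attached to a hashtag, so a stricter character class may suit real tweets better.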