How can I remove hashtags, user mentions, and URLs from a tweet? The Twitter4J library (sentiment analysis) does not work properly with these noise words.
Example: Tweet: Hello great morning today #summermorning @evilpriest @holysinner https://goo.le/asxmo/dataload.......
It should look like: Hello great morning today summermorning
Is there a method or utility available in Twitter4J itself, or do we need to write our own? Please guide.
Use a regular expression to strip the # characters before passing the sentence through the sentiment analysis pipeline. Use this:
String withoutHashTweet = originalTweet.replaceAll("[#]", "");
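So "Hello great morning today #summermorning @evilpriest @holysinner " should return: "Hello great morning today summermorning @evilpriest @holysinner " (only the # character is removed; the rest, including the trailing space, is unchanged).
Similarly, replace the # in the pattern with @ to strip the @ sign. Note that this removes only the sign itself, so the username text (evilpriest, holysinner) stays in the tweet.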
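The question asks for mentions and URLs to be dropped entirely, not just their leading signs. A minimal sketch of one way to do that with plain java.util.regex is below. The class name TweetCleaner, the method name clean, and the three patterns are illustrative assumptions, not part of Twitter4J. The patterns are deliberately simple: a URL is taken to be anything starting with http:// or https:// up to the next whitespace, and a mention is @ followed by word characters, so an email address in the text would also be affected.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not a Twitter4J class): strips URLs and @mentions,
// and keeps hashtag words while dropping the '#' sign.
class TweetCleaner {
    // Assumption: a URL is "http://" or "https://" followed by non-whitespace.
    private static final Pattern URL = Pattern.compile("https?://\\S+");
    // Assumption: a mention is '@' followed by letters, digits, or underscores.
    private static final Pattern MENTION = Pattern.compile("@\\w+");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String clean(String tweet) {
        String s = URL.matcher(tweet).replaceAll("");   // drop whole URLs
        s = MENTION.matcher(s).replaceAll("");          // drop whole mentions
        s = s.replace("#", "");                         // keep the tag word, drop '#'
        // Collapse the gaps left behind and trim the ends.
        return WHITESPACE.matcher(s).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String tweet = "Hello great morning today #summermorning "
                + "@evilpriest @holysinner https://goo.le/asxmo/dataload";
        System.out.println(clean(tweet));
    }
}
```

URLs are removed first so that an @ or # inside a link is not treated as a mention or hashtag.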