Tags: python, api, twitter, streaming, tweepy

Excluding link at the end while pulling tweets in tweepy Streaming


I am pulling text or extended_text using tweepy streaming, but every tweet I pull ends with a t.co/randomletters link that leads nowhere. What is it and how do I get rid of it? Here is an example:

 "text": "To make room for more expression, we will now count all emojis as equal—including those with gender‍‍‍ ‍‍and skin tone modifiers https://t.co(forward slash)MkGjXf9aXm"

Please help


Solution

  • These t.co URLs come from Twitter's own link shortener. In my experience with Twitter and tweepy, they are included in a tweet's text whenever the actual tweet contains a URL of some sort, so we can't really avoid receiving them.

    You can remove them after you receive the tweet. This simple regex replaces URLs matching that pattern with an empty string.

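    import re

    # Example input; in practice tweet_text is the text pulled from the stream.
    tweet_text = "Some tweet text https://t.co/MkGjXf9aXm"
    # Escape the dot and keep the result: re.sub returns a new string.
    cleaned_text = re.sub(r' https://t\.co/\w{10}', '', tweet_text)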
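The regex above only matches a link with exactly 10 characters after the slash and a single space before it. If that assumption does not hold for every tweet, a slightly looser pattern may be safer. The following is a minimal sketch, not part of tweepy: the names `TCO_LINK` and `strip_tco_links` are hypothetical, and it assumes you want to drop every t.co link in the text, not only the trailing one.

```python
import re

# Matches any t.co short link (http or https) with an ID of any length,
# together with the whitespace that precedes it.
TCO_LINK = re.compile(r"\s*https?://t\.co/\w+")


def strip_tco_links(text: str) -> str:
    """Return text with every t.co link removed (hypothetical helper)."""
    return TCO_LINK.sub("", text).strip()


example = "Counting all emojis as equal https://t.co/MkGjXf9aXm"
print(strip_tco_links(example))
```

Compiling the pattern once avoids re-parsing it for every tweet, which can matter when processing a continuous stream.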