I am using tweepy to get tweets for one or more hashtags, and then I send them to a black box for some processing. However, tweets containing any URL should not be sent. What would be the most appropriate way of filtering out such tweets?
To go with @Colin's suggestion, this question covers the issue of finding URLs with a regex.
An example code snippet would be:
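import re
# tweet_list is a list of the tweet strings you wish to filter
# raw string, so backslashes such as \w and \d reach the regex engine unchanged
pattern = r'https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+'
# keep only the tweets in which the pattern finds no URL
filtered_tweet_list = [tweet for tweet in tweet_list if not re.search(pattern, tweet)]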
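To try the idea without a live tweepy stream, here is a minimal self-contained sketch. It assumes the tweets have already been reduced to plain text strings; the function name `drop_tweets_with_urls` and the sample tweets are made up for illustration and are not part of tweepy.

```python
import re

# Same pattern as in the snippet above, compiled once for reuse.
# It matches "http://" or "https://" followed by host-like characters.
URL_PATTERN = re.compile(r"https?://(?:[-\w.]|(?:%[\da-fA-F]{2}))+")


def drop_tweets_with_urls(tweets):
    """Return only the tweet texts that contain no http(s) URL.

    Hypothetical helper: assumes `tweets` is an iterable of strings.
    """
    return [text for text in tweets if URL_PATTERN.search(text) is None]


# Made-up sample data, for illustration only.
sample_tweets = [
    "Loving the new release #python",
    "Read this: https://example.com/post #python",
    "No links here, just a hashtag #tweepy",
]

clean = drop_tweets_with_urls(sample_tweets)
print(clean)
```

Note that this pattern only catches links that start with `http://` or `https://`; a bare domain such as `example.com` written without a scheme would slip through, so the pattern may need widening depending on how strict the filter has to be.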