Tags: python, nlp, nltk, sentiment-analysis

Unable to import process_tweet from utils


Thanks for looking into this. I have a Python program that needs process_tweet and build_freqs for an NLP task. nltk is already installed; utils wasn't, so I installed it via pip install utils, but the two functions mentioned above apparently weren't included. The error I got is the standard one:

ImportError: cannot import name 'process_tweet' from
'utils' (C:\Python\lib\site-packages\utils\__init__.py)

What have I done wrong, or is there something missing? I also referred to this Stack Overflow answer, but it didn't help.


Solution

  • You can easily view the source code of any function with ?? in IPython/Jupyter, for example in this case: process_tweet?? (the code below is from the deeplearning.ai NLP course's custom utils library):

    import re
    import string

    from nltk.corpus import stopwords
    from nltk.stem import PorterStemmer
    from nltk.tokenize import TweetTokenizer


    def process_tweet(tweet):
        """Process tweet function.
        Input:
            tweet: a string containing a tweet
        Output:
            tweets_clean: a list of words containing the processed tweet

        """
        stemmer = PorterStemmer()
        stopwords_english = stopwords.words('english')
        # remove stock market tickers like $GE
        tweet = re.sub(r'\$\w*', '', tweet)
        # remove old style retweet text "RT"
        tweet = re.sub(r'^RT[\s]+', '', tweet)
        # remove hyperlinks
        tweet = re.sub(r'https?:\/\/.*[\r\n]*', '', tweet)
        # remove hashtags
        # only removing the hash # sign from the word
        tweet = re.sub(r'#', '', tweet)
        # tokenize tweets
        tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True,
                                   reduce_len=True)
        tweet_tokens = tokenizer.tokenize(tweet)

        tweets_clean = []
        for word in tweet_tokens:
            if (word not in stopwords_english and  # remove stopwords
                    word not in string.punctuation):  # remove punctuation
                stem_word = stemmer.stem(word)  # stemming word
                tweets_clean.append(stem_word)

        return tweets_clean
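
    The utils package you installed with pip install utils is an unrelated PyPI project, which is why it doesn't contain these functions. Save the code above into a file named utils.py in the same folder as your script (and pip uninstall utils so the site-packages copy doesn't shadow it), after which the import works. A quick check with a made-up example tweet:

    from utils import process_tweet

    tweet = "RT @nlp_fan: Great course on #NLP! Check it out https://example.com $AAPL"
    print(process_tweet(tweet))
    # expected output along the lines of: ['great', 'cours', 'nlp', 'check']

    The question also mentions build_freqs, which lives in the same course file. If you don't have it, below is a sketch of what it looks like there (reconstructed from the course material, so treat the details as approximate): it counts how often each processed word occurs with each sentiment label. You can paste it into the same utils.py and drop the from utils import line.

    import numpy as np

    from utils import process_tweet


    def build_freqs(tweets, ys):
        """Build a frequency dictionary.
        Input:
            tweets: a list of tweet strings
            ys: an m x 1 array with the sentiment label (0 or 1) of each tweet
        Output:
            freqs: a dictionary mapping each (word, sentiment) pair to its count
        """
        yslist = np.squeeze(ys).tolist()
        freqs = {}
        for y, tweet in zip(yslist, tweets):
            for word in process_tweet(tweet):
                pair = (word, y)
                freqs[pair] = freqs.get(pair, 0) + 1
        return freqs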