Collecting unique tweets with the REST API

Background: I want to collect only unique tweets. According to comments on Stack Overflow, one way to do this is to create a set.

However, when I try the following code, I get a TypeError: unhashable type. I found some info here: TypeError: unhashable type. I also know I can remove duplicates in MongoDB, where I am storing the tweets, but it would be cleaner to do it before storing.

Question: Is there a way to collect only unique tweets?

results = []
pages = 2
counts = 100

while True:
    for tweet in tweepy.Cursor(api.search, q=keywords, since="2017-07-21",
                               until="2017-07-27", count=counts, lang=language,
                               monitor_rate_limit=True,
                               wait_on_rate_limit=True).pages(pages):
        results.extend(tweet)

    results = set(results)

Solution

It is difficult to say for sure without a concrete example, but here is how a set removes duplicates when its items are hashable:

{ ~ }  » python
>>> results = ["hi", "hello", "hi", "goodbye"]
>>> a = set()
>>> for tweet in results:
...     a.add(tweet)
...
>>> print a
set(['hi', 'hello', 'goodbye'])
>>>

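As you can see above, the set contains only one "hi". A set hashes each element it stores, so every element must be hashable; the TypeError means the items in your list are not hashable the way these strings are.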
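To make the error concrete, here is a minimal sketch of how it can arise. It assumes the collected items are plain dicts, which is only a stand-in because the question does not show what the list actually holds; the names items and texts are made up for illustration.

```python
# Assumption: each item is a plain dict standing in for a raw tweet payload.
items = [{"text": "hi"}, {"text": "hello"}, {"text": "hi"}]

try:
    # set() hashes every element, and dicts are not hashable.
    unique = set(items)
except TypeError as exc:
    error = str(exc)
    print(error)  # unhashable type: 'dict'

# Hashing a string taken from each item works, because str is hashable.
texts = set(item["text"] for item in items)
print(sorted(texts))  # ['hello', 'hi']
```

Any other unhashable element type, such as a list, would fail in the same way.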

OK, following your comments I did a little reverse engineering and determined that each tweet has a text field, which is a string, and that is what you need to add to the set.

So just replace a.add(tweet) with a.add(tweet.text).
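If you want to keep the whole tweet and not just its text, one option is to use the set only to remember which texts have been seen. The following is a rough sketch under assumptions: Tweet is a hypothetical stand-in for the objects tweepy returns (only an id and a text attribute are assumed), and unique_by_text is a helper name made up for this example.

```python
from collections import namedtuple

# Hypothetical stand-in for a tweet object; only .id and .text are assumed.
Tweet = namedtuple("Tweet", ["id", "text"])


def unique_by_text(tweets):
    """Return the tweets whose text has not been seen before, in original order."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet.text not in seen:
            seen.add(tweet.text)   # strings are hashable, so this is safe
            unique.append(tweet)   # keep the full object for storage
    return unique


sample = [Tweet(1, "hi"), Tweet(2, "hello"), Tweet(3, "hi"), Tweet(4, "goodbye")]
print([t.id for t in unique_by_text(sample)])  # [1, 2, 4]
```

Keying on the text treats retweets and copied tweets as duplicates. If you only want to drop the exact same tweet returned twice, keying on the id instead of the text would be the analogous change.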