python selenium firefox twitter webdriver

How to get all tweets with Python and Selenium?


I want to get every tweet a user has ever written. I wrote a simple Python script for this, but when the browser opens the page and I read the tweets from the page source, I only get the tweets from the static page, that is, only what the requested URL loads initially. Twitter pages are dynamic: more tweets are loaded as you scroll down. What I want is for Selenium to scroll the page down by itself and collect the tweets all the way to the end.

This is my code:

from selenium import webdriver
from bs4 import BeautifulSoup

driver_path = "C:\\Users\\Muhammd\\Desktop\\geckodriver.exe"

browser = webdriver.Firefox(executable_path=driver_path)
browser.get("https://twitter.com/ErhanErkut")
soup = BeautifulSoup(browser.page_source, 'html.parser')
tweets = [p.text for p in soup.find_all('p', class_='tweet-text')]
for i in tweets:
    print(i)

Solution

  • I would recommend using the Twitter API instead (note the screen_name and count parameters):

    import twitter
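    # Fill in the credentials of your own Twitter app
    api = twitter.Api(consumer_key='your-twitter app consumer key',
                      consumer_secret='your secret',
                      access_token_key='XXXX',
                      access_token_secret='XXXXXX')
    
    # Uncomment to check that the credentials are accepted
    #print(api.VerifyCredentials())
    
    # Fetch the 20 most recent tweets of the given user
    tweets = api.GetUserTimeline(screen_name="ErhanErkut", count=20)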
    
    print(tweets)
    

    To run the above program, first install:

    pip install python-twitter
    

    After that, create a Twitter app at https://developer.twitter.com/. On the app's page you can see the consumer keys and generate access tokens.
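A single GetUserTimeline call returns only one page of tweets. As a rough sketch of how more of the timeline could be collected, the helper below walks backwards through it with the max_id parameter. The function name fetch_all_tweets and its page_size and max_pages parameters are made up for illustration and are not part of python-twitter. As far as I know, the v1.1 user timeline endpoint caps count at 200 per request and only reaches back roughly 3,200 tweets, so "all tweets" may not be reachable this way for very active accounts.

```python
def fetch_all_tweets(api, screen_name, page_size=200, max_pages=20):
    """Collect a user's timeline page by page (hypothetical helper).

    `api` is expected to behave like a python-twitter `twitter.Api` object,
    i.e. to provide GetUserTimeline(screen_name=..., count=..., max_id=...).
    """
    all_tweets = []
    max_id = None  # None means "start from the newest tweet"
    for _ in range(max_pages):
        page = api.GetUserTimeline(screen_name=screen_name,
                                   count=page_size,
                                   max_id=max_id)
        if not page:
            break  # no older tweets were returned
        all_tweets.extend(page)
        # max_id is inclusive, so step just below the oldest tweet seen so far
        max_id = min(tweet.id for tweet in page) - 1
    return all_tweets
```

With the api object created above, this could be called as fetch_all_tweets(api, "ErhanErkut"). The max_pages limit is only a safety stop so the loop cannot run forever.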

    Twitter Developer API reference

    The Twitter API is also suited to downloading large volumes of tweets.
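If you would rather stay with Selenium, as in the question, one common approach is to scroll to the bottom of the page repeatedly until the page height stops growing, and only then read page_source. The following is a minimal sketch of that idea, not a tested Twitter scraper. The function name scroll_to_bottom and the pause and max_rounds parameters are assumptions chosen for illustration; only execute_script is an actual WebDriver method.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=50):
    """Scroll until the document height stops changing (hypothetical helper).

    `driver` is any Selenium WebDriver instance, e.g. the `browser`
    object from the question. Returns the final page height.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        # Jump to the current bottom of the page to trigger loading more items
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load the next batch
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, assume the end was reached
        last_height = new_height
    return last_height
```

In the question's script, scroll_to_bottom(browser) would go between browser.get(...) and the BeautifulSoup call. The fixed pause is a simplification: on a slow connection the height may not have changed yet when it is checked, so the loop can stop too early.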