Tags: python, selenium-webdriver, browser-automation, pyjamas

Can anyone clarify some options for Python web automation?


I'm trying to make a simple script in Python that will scan a tweet for a link and then visit that link. I'm having trouble deciding which direction to go from here. From what I've researched, it seems I could use Selenium or Mechanize, which can be used for browser automation. Would using these be considered web scraping?

Or

I could learn one of the Twitter APIs, the Requests library, and Pyjamas (which converts Python code to JavaScript) so I can write a simple script and load it as a Google Chrome/Firefox extension.

Which would be the better option?


Solution

  • There are many different ways to approach web automation. Since you're working with Twitter, you could try the Twitter API first (an end-to-end sketch of that route closes this answer). For any other task, there are more options; a short sketch of each follows the list below.

    • Selenium is very useful when you need to click buttons or enter values in forms. Its main drawback is that it opens a separate browser window.

    • Mechanize, unlike Selenium, does not open a browser window and is also good for manipulating buttons and forms. It might need a few more lines to get the job done.

    • Urllib/Urllib2 is what I use. Some people find it a bit hard at first, but once you know what you're doing, it is very quick and gets the job done. It also supports cookies and proxies, and since it is built into the standard library, there is nothing to download.

    • Requests is just as good as urllib, though I don't have as much experience with it. It makes things like adding custom headers straightforward. It's a very good library.
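
    Below are short, hedged sketches of each option; every URL, form field, and element name is a placeholder rather than anything taken from the question.

    Selenium, assuming Selenium 4+ with a driver such as chromedriver available:

        from selenium import webdriver
        from selenium.webdriver.common.by import By

        driver = webdriver.Chrome()                     # opens a real browser window
        driver.get("https://example.com/login")
        driver.find_element(By.NAME, "username").send_keys("me")
        driver.find_element(By.NAME, "password").send_keys("secret")
        driver.find_element(By.NAME, "submit").click()  # click the login button
        print(driver.current_url)
        driver.quit()

    Mechanize, which dates from Python 2 (a Python 3 fork keeps the same interface):

        import mechanize

        br = mechanize.Browser()
        br.set_handle_robots(False)   # skip robots.txt handling for this demo
        br.open("https://example.com/login")
        br.select_form(nr=0)          # pick the first form on the page
        br["username"] = "me"
        br["password"] = "secret"
        response = br.submit()        # no browser window involved
        print(response.geturl())

    Urllib with cookies and a proxy; the answer's urllib2 is the Python 2 spelling, shown here as Python 3's urllib.request:

        import urllib.request
        import http.cookiejar

        cookies = http.cookiejar.CookieJar()
        opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(cookies),
            urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"}),  # optional proxy
        )
        opener.addheaders = [("User-Agent", "Mozilla/5.0")]
        html = opener.open("https://example.com").read()
        print(len(html))

    Requests with a custom header:

        import requests

        resp = requests.get(
            "https://example.com",
            headers={"User-Agent": "Mozilla/5.0"},
            timeout=10,
        )
        resp.raise_for_status()
        print(resp.status_code, len(resp.text))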

    Once you have fetched the page you want, I recommend BeautifulSoup for parsing out the data you need, as sketched below.
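
    A minimal sketch pairing Requests with BeautifulSoup (bs4) to pull every link off a page; the URL is again a placeholder:

        import requests
        from bs4 import BeautifulSoup

        html = requests.get("https://example.com", timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        for a in soup.find_all("a", href=True):   # every anchor tag with an href
            print(a["href"])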

    I hope this leads you in the right direction for web automation.
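
    Since the original goal was to scan a tweet for a link and then visit it, here is a hedged end-to-end sketch using the tweepy library against the v1.1 Twitter API. The credentials and screen name are placeholders, and Twitter's access rules have changed over the years, so treat this as the shape of a solution rather than a drop-in one:

        import tweepy
        import requests

        auth = tweepy.OAuth1UserHandler(
            "CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET"
        )
        api = tweepy.API(auth)

        # Grab the user's most recent tweet and visit every expanded URL
        # listed in its entities.
        tweet = api.user_timeline(screen_name="some_user", count=1)[0]
        for url in tweet.entities.get("urls", []):
            resp = requests.get(url["expanded_url"], timeout=10)
            print(url["expanded_url"], resp.status_code)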