Tags: python, dynamic, web, screen-scraping

Simple Dynamic Web Scraping - Without BeautifulSoup


I'm busy trying to scrape a dynamic website in order to get a URL that I can use to download the server software for a game every time it updates.

The site is "http://craftstud.io/builds", and the text I'm trying to scrape is the part that reads "Server XX.X.X.X".

I really don't want it to get complicated with JavaScript and external modules, so if there is a simple solution, I'm all ears.

I also can't for the life of me get third-party modules such as BeautifulSoup installed (stupid Windows).

Thanks all!


Solution

  • If you want something simple, consider using a regular expression:

    >>> import re
    >>> import urllib2
    >>> html = urllib2.urlopen("http://craftstud.io/builds").read()
    >>> re.search(r"Server \d+\.\d+\.\d+\.\d+", html).group()
    'Server 0.1.24.1'
    

    That said, if you can install BeautifulSoup4 via pip, you'll find plenty of use for it in the future. (Make sure you use pip install BeautifulSoup4 rather than just pip install BeautifulSoup; I installed a copy on a Windows machine a couple of days ago.)
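
For anyone on Python 3, where urllib2 no longer exists, here is a minimal sketch of the same regex approach using only the standard library. The helper names find_server_version and fetch_html are made up for illustration, and decoding the page as UTF-8 is an assumption about the site:

```python
import re
import urllib.request

# Same pattern as in the answer: "Server " followed by four dot-separated numbers.
SERVER_PATTERN = re.compile(r"Server \d+\.\d+\.\d+\.\d+")


def find_server_version(html):
    """Return the first 'Server X.X.X.X' string found in html, or None if absent."""
    match = SERVER_PATTERN.search(html)
    return match.group() if match else None


def fetch_html(url, timeout=10):
    """Download a page and decode it to text (UTF-8 is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # In Python 3, read() returns bytes, so decode before applying a str regex.
        return response.read().decode("utf-8", errors="replace")
```

Calling find_server_version(fetch_html("http://craftstud.io/builds")) should then return the version string if the page still contains it in that form. Returning None instead of calling .group() directly avoids an AttributeError when the page layout changes and nothing matches.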