Get an input's value from page source stored in a Python variable


I have a Python script that makes a request using urllib2 and stores the entire source code of the web page in a variable:

source = urlopen(request).read().decode()

Assume the source variable contains the following HTML input:

<input name="form1" type="hidden" value="value1">

How do I get the value of that input from the variable? Could I have some sample code for doing that?

Edit:

As suggested, should BeautifulSoup code like this work?

soup = BeautifulSoup(source, 'html.parser')
for value in soup.find(name='value1'):
    value = value.get('value')

Solution

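  • Use BeautifulSoup to parse the HTML. The attempt in the question fails because the name argument of find() is the tag name, so soup.find(name='value1') looks for a <value1> tag and returns None. To extract the value attribute, find the input tag whose name attribute is form1 and call get("value") on it: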

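    from bs4 import BeautifulSoup  # BeautifulSoup 4; the class lives in the bs4 package
    import urllib2  # Python 2 only; Python 3 uses urllib.request instead

    request = "http://example.com"
    source = urllib2.urlopen(request).read().decode()
    # Or you can test with:
    # source = "<input name='form1' type='hidden' value='value1'>"
    soup = BeautifulSoup(source, "html.parser")
    # Find the <input> whose name attribute is "form1", then read its value attribute.
    # find() returns None if no such tag exists, so check for that on real pages.
    value = soup.find("input", {"name": "form1"}).get("value")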
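
If installing BeautifulSoup is not an option, the same lookup can be sketched with the standard library alone. The example below is for Python 3 and uses html.parser.HTMLParser; the names InputValueParser and get_input_value are hypothetical helpers made up for this illustration, not part of any library. It returns None instead of raising when the input is missing.

```python
from html.parser import HTMLParser


class InputValueParser(HTMLParser):
    """Collects the value attribute of <input> tags with a given name."""

    def __init__(self, input_name):
        super().__init__()
        self.input_name = input_name
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs for the tag
        attributes = dict(attrs)
        if tag == "input" and attributes.get("name") == self.input_name:
            self.values.append(attributes.get("value"))


def get_input_value(source, input_name):
    """Return the value of the first matching <input>, or None if absent."""
    parser = InputValueParser(input_name)
    parser.feed(source)
    parser.close()
    return parser.values[0] if parser.values else None


source = '<input name="form1" type="hidden" value="value1">'
print(get_input_value(source, "form1"))  # value1
```

For a real page, source would come from the decoded response body, as in the question. BeautifulSoup remains the more convenient choice when more than one kind of element has to be extracted.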