Tags: python, http, web-scraping, python-requests, mechanicalsoup

Form request with mechanicalsoup not showing expected results


I am a novice at web scraping and web technologies in general (though comfortable with Python), and I'd like to understand how to integrate a website's search into a bioinformatics research tool.

Goal: retrieve the output of the form on http://www.lovd.nl/3.0/search

import mechanicalsoup

# Connect to LOVD
browser = mechanicalsoup.StatefulBrowser()
browser.open("http://www.lovd.nl/3.0/search")

# Fill-in the search form
browser.select_form('#websitevariantsearch')
browser["variant"] = "chr15:g.40699840C>T"
browser.submit_selected()

# Display the results
print(browser.get_current_page())

The output is the very same page (http://www.lovd.nl/3.0/search). I also tried plain requests, but I get a different error:

from bs4 import BeautifulSoup
from requests import get, Session

url = "http://www.lovd.nl/3.0/search"
formurl = "http://www.lovd.nl/3.0/ajax/search_variant.php"
client = Session()

# Get the CSRF token from the search page
soup = BeautifulSoup(client.get(url).text, "html.parser")
csrf = soup.select('form input[name="csrf_token"]')[0]['value']

form_data = {
    "search": "",
    "csrf_token": csrf,
    "build": "hg19",
    "variant": "chr15:g.40699840C>T"
}

# Send the form fields to the AJAX endpoint
response = get(formurl, data=form_data)
html = response.content
print(html)

...and this returns only an

alert("Error while sending data.");

The form_data fields were taken from the XHR request (browser developer tools, Network tab).

I can see that the data is sent asynchronously via AJAX, but I do not understand the practical implications of this.

I would appreciate some guidance.


Solution

  • MechanicalSoup does not execute JavaScript. The website you are trying to browse has:

    <form id="websitevariantsearch"
          action=""
          onsubmit="if ...">
    

    The form has no action in the traditional HTML sense; instead, a piece of JavaScript runs on submission. MechanicalSoup won't help here. Selenium may work; see the MechanicalSoup FAQ: http://mechanicalsoup.readthedocs.io/en/stable/faq.html#how-does-mechanicalsoup-compare-to-the-alternatives
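
To illustrate the point about the empty action and the onsubmit handler, here is a minimal sketch of how one might check, before reaching for MechanicalSoup, whether a form looks JavaScript-driven. It uses only the standard library. The names FormInspector, needs_javascript and SAMPLE_HTML are hypothetical, the sample markup is made up for illustration (it is not the LOVD page, and its onsubmit body is only a placeholder), and the check is a rough heuristic rather than a definitive test.

```python
from html.parser import HTMLParser


class FormInspector(HTMLParser):
    """Collect the id, action and onsubmit attributes of every <form> tag."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            attributes = dict(attrs)
            self.forms.append({
                "id": attributes.get("id"),
                "action": attributes.get("action"),
                "onsubmit": attributes.get("onsubmit"),
            })


def needs_javascript(form):
    """Heuristic: an empty or missing action plus an onsubmit handler
    suggests the submission is handled by JavaScript."""
    return not form["action"] and form["onsubmit"] is not None


# Illustrative markup only; the onsubmit body is a placeholder.
SAMPLE_HTML = """
<form id="jsform" action="" onsubmit="return false;">
  <input name="variant">
</form>
<form id="plain" action="/results" method="get">
  <input name="q">
</form>
"""

inspector = FormInspector()
inspector.feed(SAMPLE_HTML)
for form in inspector.forms:
    print(form["id"], "needs JavaScript:", needs_javascript(form))
```

A form flagged this way is one where submitting through MechanicalSoup will most likely just reload the same page, as in the question; a form with a real action can usually be submitted without a JavaScript engine.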