Search code examples
pythonweb-scrapingbeautifulsoupcode-contractsetherscan

How to "webscrape" a site containing a popup window, using python?


I am trying to web scrape a certain part of the etherscan site with python, since there is no api for this functionality. Basically going to this link and one would need to press verify, after doing so a popup comes up which you can see here. What I need to scrape is this part 0x0882477e7895bdc5cea7cb1552ed914ab157fe56 in case the message starts with the message as seen in the picture.

I've written the below python script that starts this off, but I don't know how it's possible to interact further with the site, in order to have that popup come to the foreground and scrape the information. Is this possible to do?

from bs4 import BeautifulSoup
from requests import get

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0','X-Requested-With': 'XMLHttpRequest',}
url = "https://etherscan.io/proxyContractChecker?a=0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"
response = get(url,headers=headers )
soup = BeautifulSoup(response.content,'html.parser')

Thank You


Solution

  • import requests
    from bs4 import BeautifulSoup
    
    
    def Main(url):
        with requests.Session() as req:
            r = req.get(url, headers={'User-Agent': 'Ahmed American :)'})
            soup = BeautifulSoup(r.content, 'html.parser')
            vs = soup.find("input", id="__VIEWSTATE").get("value")
            vsg = soup.find("input", id="__VIEWSTATEGENERATOR").get("value")
            ev = soup.find("input", id="__EVENTVALIDATION").get("value")
            data = {
                '__VIEWSTATE': vs,
                '__VIEWSTATEGENERATOR': vsg,
                '__EVENTVALIDATION': ev,
                'ctl00$ContentPlaceHolder1$txtContractAddress': '0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48',
                'ctl00$ContentPlaceHolder1$btnSubmit': "Verify"
            }
            r = req.post(
                "https://etherscan.io/proxyContractChecker?a=0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48", data=data, headers={'User-Agent': 'Ahmed American :)'})
            soup = BeautifulSoup(r.content, 'html.parser')
            token = soup.find(
                "div", class_="alert alert-success").text.split(" ")[-1]
            print(token)
    
    
    Main("https://etherscan.io/proxyContractChecker")
    

    Output:

    0x0882477e7895bdc5cea7cb1552ed914ab157fe56