Insert a value in a search bar, select the autocomplete result, and read a value with bs4


I am trying to use Beautiful Soup to read a value from a web page. The following steps are necessary:

  1. go to the webpage: url = 'https://www.msci.com/our-solutions/esg-investing/esg-fund-ratings/funds/'

  2. insert the ISIN in the searchbar

  3. select the autocomplete result from the container msci-ac-search-data-dropdown (click)

  4. read the value from the div with class "ratingdata-outercircle esgratings-profile-header-green" to get the text "ratingdata-fund-rating esg-fund-ratings-circle-aaa"

So far I have tried the following:


import requests
from bs4 import BeautifulSoup

isin = 'IE00B4L5Y983'

url = 'https://www.msci.com/our-solutions/esg-investing/esg-fund-ratings/funds/'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

payload = {}
for i in soup.select('form[action="https://www.msci.com/search"] input[value]'):
    payload[i['name']] = i['value']
payload['UQ_txt'] = isin

Solution

  • Try:

    import requests
    from bs4 import BeautifulSoup
    
    isin = "IE00B4L5Y983"
    url = "https://www.msci.com/our-solutions/esg-investing/esg-fund-ratings"
    
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0",
        "X-Requested-With": "XMLHttpRequest",
    }
    
    params = {
        "p_p_id": "esg_fund_ratings_profile",
        "p_p_lifecycle": "2",
        "p_p_state": "normal",
        "p_p_mode": "view",
        "p_p_resource_id": "searchFundRatingsProfiles",
        "p_p_cacheability": "cacheLevelPage",
        "_esg_fund_ratings_profile_keywords": isin,
    }
    
    # Step 1: query the search (autocomplete) resource; the first JSON entry is used below.
    data = requests.get(url, params=params, headers=headers).json()
    
    params = {
        "p_p_id": "esg_fund_ratings_profile",
        "p_p_lifecycle": "2",
        "p_p_state": "normal",
        "p_p_mode": "view",
        "p_p_resource_id": "showEsgFundRatingsProfile",
        "p_p_cacheability": "cacheLevelPage",
        "_esg_fund_ratings_profile_fundShareClassId": data[0]["url"],
    }
    
    headers["Referer"] = "https://www.msci.com/our-solutions/esg-investing/esg-fund-ratings/funds/{}/{}".format(
        data[0]["encodedTitle"], data[0]["url"]
    )
    
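    # Step 2: fetch the fund's profile fragment and read the CSS classes of the rating element.
    soup = BeautifulSoup(
        requests.get(url, params=params, headers=headers).content, "html.parser"
    )
    rating_classes = soup.select_one(".ratingdata-fund-rating")["class"]
    print(rating_classes)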
    

    Prints:

    ['ratingdata-fund-rating', 'esg-fund-ratings-circle-aaa']
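
If only the rating itself is needed, the class list printed above can be reduced to a plain string. The following is a minimal sketch, not part of the original answer: the helper name `extract_rating` is hypothetical, and it assumes that every rating is encoded as a class starting with `esg-fund-ratings-circle-`, which is inferred only from the single class name shown in the output.

```python
def extract_rating(classes, prefix="esg-fund-ratings-circle-"):
    """Return the rating in upper case from a list of CSS class names.

    Assumption: the rating is the part of the class name that follows
    `prefix`. Returns None when no class carries that prefix.
    """
    for name in classes:
        if name.startswith(prefix):
            return name[len(prefix):].upper()
    return None


# Class list copied from the printed output above.
print(extract_rating(["ratingdata-fund-rating", "esg-fund-ratings-circle-aaa"]))
```

Returning `None` instead of raising keeps the caller in control when the page layout changes or the fund has no rating element with the expected class.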