How to get the HTML of a Google search results page using Python?

I am trying to extract the HTML of a Google search results page in Python, using the requests module.

import requests
from bs4 import BeautifulSoup

url = ""

resp = requests.get(url)
soup = BeautifulSoup(resp.text, 'html.parser')
search = soup.find_all('div',class_="yuRUbf")

But I can't find any element with class_="yuRUbf" in the result. I don't think the response contains the page source that I see in the browser. How can I make this work?

I also tried resp.content, and I also tried Selenium, but neither worked.


  • You can use the SelectorGadget Chrome extension to get CSS selectors by clicking on the desired element in your browser (it does not always work well if the page is rendered via JavaScript).

    To collect results from all pages, you can use non-token pagination with a while True loop. The loop keeps running as long as the page contains a button that switches to the next page, matched by the CSS selector ".d6cvqb a[id=pnnext]", and exits when that button is missing:

    if soup.select_one('.d6cvqb a[id=pnnext]'):
        params["start"] += 10
    else:
        break

    You can also exit the loop by limiting the number of search pages:

    if page_num == page_limit:
        break
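
The two exit conditions above can be tried without sending any requests. This is a minimal sketch, not the answer's actual scraper: `paginate`, `fake_fetch`, and `FAKE_PAGES` are hypothetical names, and the fake fetcher stands in for the real request plus the "next page" selector check.

```python
def paginate(fetch_page, page_limit=5, step=10):
    """Collect pages until there is no next page or page_limit is reached."""
    start = 0        # plays the role of params["start"]
    page_num = 0
    pages = []
    while True:
        page_num += 1
        items, has_next = fetch_page(start)
        pages.append(items)
        # exit 1: the page limit has been reached
        if page_num == page_limit:
            break
        # exit 2: there is no "next page" button
        if has_next:
            start += step
        else:
            break
    return pages


# Hypothetical stand-in for real result pages, keyed by the "start" offset.
FAKE_PAGES = {0: ["a", "b"], 10: ["c"], 20: ["d"]}


def fake_fetch(start):
    # Returns the page items and whether a following page exists.
    return FAKE_PAGES[start], (start + 10) in FAKE_PAGES


all_pages = paginate(fake_fetch)               # stops when there is no next page
limited = paginate(fake_fetch, page_limit=2)   # stops at the page limit
```

Checking the page limit before the "next page" test means the loop never advances the offset for a page it is not going to fetch.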

    from bs4 import BeautifulSoup
    import requests, json, lxml

    query = "how to get google search page source code by python"

    params = {
        "q": query,          # query example
        "hl": "en",          # language
        "gl": "uk",          # country of the search, UK -> United Kingdom
        "start": 0,          # result offset, 0 is the first page
        #"num": 100          # maximum number of results to return
    }

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36"
    }

    page_limit = 5           # maximum number of pages to fetch
    page_num = 0

    data = []

    while True:
        page_num += 1
        print(f"page: {page_num}")

        # the request URL is empty in the source and is left as-is
        html = requests.get("", params=params, headers=headers, timeout=30)
        soup = BeautifulSoup(html.text, 'lxml')

        for result in soup.select(".tF2Cxc"):
            title = result.select_one(".DKV0Md").text
            try:
                snippet = result.select_one(".lEBKkf span").text
            except AttributeError:
                # some results have no snippet
                snippet = None
            links = result.select_one(".yuRUbf a")["href"]

            data.append({
              "title": title,
              "snippet": snippet,
              "links": links
            })

        # stop at the page limit or when there is no "next page" link
        if page_num == page_limit:
            break
        if soup.select_one(".d6cvqb a[id=pnnext]"):
            params["start"] += 10
        else:
            break

    print(json.dumps(data, indent=2, ensure_ascii=False))

    Example output:

    [
      {
        "title": "How To Build a Website With Python -",
        "snippet": "Examples of Sites Created Using Python · Google: The most popular search engine in the world uses Python · Instagram: Python was used to create the backend of ...",
        "links": ""
      },
      {
        "title": "Google Search Operators: 40 Commands to Know in 2023 ...",
        "snippet": "30 Mar 2022 — ",
        "links": ""
      },
      {
        "title": "Python From Scratch: Create a Dynamic Website - Code",
        "snippet": "19 Feb 2022 — ",
        "links": ""
      },
      {
        "title": "How to Use Python to Analyze Google Search Results at Scale",
        "snippet": "21 Dec 2020 — ",
        "links": ""
      },
      other results ...
    ]
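
The selector logic from the loop can be tried offline against a small HTML string. This is only a sketch: `SAMPLE_HTML` is hypothetical, heavily simplified markup that reuses the class names from the code above (real Google markup is different and its class names change often), and `parse_results` is an illustrative helper name. It uses the built-in `html.parser` so that lxml is not required, and it checks for a missing snippet explicitly instead of catching an exception.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; the second result has no snippet on purpose.
SAMPLE_HTML = """
<div class="tF2Cxc">
  <div class="yuRUbf"><a href="https://example.com/a"><h3 class="DKV0Md">First title</h3></a></div>
  <div class="lEBKkf"><span>First snippet</span></div>
</div>
<div class="tF2Cxc">
  <div class="yuRUbf"><a href="https://example.com/b"><h3 class="DKV0Md">Second title</h3></a></div>
</div>
"""


def parse_results(html):
    """Extract title, snippet and link from every result container."""
    soup = BeautifulSoup(html, "html.parser")
    parsed = []
    for result in soup.select(".tF2Cxc"):
        # select_one returns None when nothing matches, so test before using .text
        snippet_tag = result.select_one(".lEBKkf span")
        parsed.append({
            "title": result.select_one(".DKV0Md").text,
            "snippet": snippet_tag.text if snippet_tag else None,
            "links": result.select_one(".yuRUbf a")["href"],
        })
    return parsed


results = parse_results(SAMPLE_HTML)
```

If `soup.select(".tF2Cxc")` comes back empty on a real response, the class names have probably changed or the request was blocked, which matches the symptom described in the question.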

    Alternatively, you can use the Google Search Engine Results API from SerpApi. It's a paid API with a free plan. The difference is that it bypasses blocks from Google (including CAPTCHA), so you don't need to create a parser and maintain it.

    Code example:

    from serpapi import GoogleSearch
    from urllib.parse import urlsplit, parse_qsl
    import json, os

    query = "how to get google search page source code by python"

    params = {
      "api_key": "...",                # serpapi key
      "engine": "google",              # serpapi parser engine
      "q": query,                      # search query
      "num": "100"                     # number of results per page (100 per page in this case)
      # other search parameters:
    }

    search = GoogleSearch(params)      # where data extraction happens

    organic_results_data = []
    page_num = 0

    while True:
        results = search.get_dict()    # JSON -> Python dictionary
        page_num += 1

        for result in results["organic_results"]:
            organic_results_data.append({
                "title": result.get("title"),
                "snippet": result.get("snippet"),
                "link": result.get("link")
            })

        if "next_link" in results.get("serpapi_pagination", {}):
            # move to the next page by copying the query parameters of next_link
            search.params_dict.update(dict(parse_qsl(urlsplit(results["serpapi_pagination"]["next_link"]).query)))
        else:
            break

    print(json.dumps(organic_results_data, indent=2, ensure_ascii=False))

    Example output:

    [
      {
        "title": "How To Work with Web Data Using Requests and Beautiful ...",
        "snippet": "This tutorial will go over how to work with the Requests and Beautiful Soup Python packages in order to make use of data from web pages.",
        "link": ""
      },
      {
        "title": "google search - Simply Python",
        "snippet": "I have included part of the code for the noun phrase detection (Under ... Run google search and obtain page source for the images.",
        "link": ""
      },
      {
        "title": "Web Scraping Using Selenium Python - Analytics Vidhya",
        "snippet": "Step 2 – Install Chrome Driver · Step 2 – Install Chrome Driver · Step 3 – Specify search URL",
        "link": ""
      },
      other results ...
    ]
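
The SerpApi code imports `urlsplit` and `parse_qsl`, which are the usual standard-library tools for turning a `next_link` URL from `serpapi_pagination` into request parameters for the next page. A minimal sketch of that step, assuming a made-up `next_link` value (the URL below is hypothetical, not a real API response):

```python
from urllib.parse import urlsplit, parse_qsl

# Parameters of the current request (values are illustrative).
params = {"engine": "google", "q": "python requests", "num": "100"}

# Hypothetical next-page link; a real one would come from the API response.
next_link = "https://example.com/search?engine=google&q=python+requests&start=100&num=100"

# urlsplit isolates the query string, parse_qsl decodes it into (key, value) pairs.
next_params = dict(parse_qsl(urlsplit(next_link).query))

# Merging the pairs into the current parameters prepares the next request.
params.update(next_params)
```

All values come back as strings, so `start` is `"100"` rather than `100`. That is fine for building a query string but worth remembering before doing arithmetic on it.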

    There's also a blog post, "13 ways to scrape any public data from any website", if you want to know more about website scraping.