python, selenium-webdriver, web-scraping

Extract Google reviews from Google Maps


from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
import time

# Specify the URL of the business page on Google Maps
url = 'https://www.google.com/maps/place/FRUYO+MALAYSIA/@2.2916032,111.8210233,17z/data=!4m8!3m7!1s0x31f77f4fb024a7e1:0x468c52dc9e9179c3!8m2!3d2.2916032!4d111.8210233!9m1!1b1!16s%2Fg%2F11p65htbhd?entry=ttu'

# Create an instance of the Chrome driver
driver = webdriver.Chrome()

# Navigate to the specified URL
driver.get(url)

# Wait for the reviews to load
wait = WebDriverWait(driver, 20)  # Increased the waiting time

# Scroll down to load more reviews
body = driver.find_element(By.XPATH, '//body')
num_reviews = len(driver.find_elements(By.CLASS_NAME, 'wiI7pd'))
while True:
    body.send_keys(Keys.END)
    time.sleep(2)  # Adjust the delay based on your internet speed and page loading time
    new_num_reviews = len(driver.find_elements(By.CLASS_NAME, 'wiI7pd'))
    if new_num_reviews == num_reviews:
        # Scroll to the top to ensure all reviews are loaded
        body.send_keys(Keys.HOME)
        time.sleep(2)
        break
    num_reviews = new_num_reviews

# Wait for the reviews to load completely
wait.until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'wiI7pd')))

# Extract the text of each review
review_elements = driver.find_elements(By.CLASS_NAME, 'wiI7pd')
reviews = [element.text for element in review_elements]

# Print the reviews
print(reviews)

# Close the browser
driver.quit()

Hi Everyone,

I need help scraping Google reviews. The code above runs, but it only scrapes the first 8 reviews and never scrolls to the bottom, even though my code tries to scroll down to load more reviews. Does anyone have any idea why? Any help or advice is greatly appreciated!


Solution

  • You're scrolling the wrong element. The reviews sit inside their own scrollable panel, so sending END to body does not load any more of them. To find the element that needs scrolling, look in the Elements tab of Chrome Developer Tools for the element that owns the scrollbar. To be even more sure, copy its CSS selector and run document.querySelector("<selector you just copied>").scrollTop in the Console tab after each scroll to see whether the value changes.

    So instead of doing

    body = driver.find_element(By.XPATH, '//body')
    

    You could locate the scrollable element through XPath

    body = driver.find_element(By.XPATH, "//div[contains(@class, 'm6QErb') and contains(@class, 'DxyBCb') and contains(@class, 'kA9KIf') and contains(@class, 'dS8AEf')]")
    

    Or through the CSS selector

    body = driver.find_element(By.CSS_SELECTOR, "div.m6QErb.DxyBCb.kA9KIf.dS8AEf")
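
Putting this together, here is a minimal sketch of a scroll loop that targets the reviews panel instead of body. It swaps the question's send_keys(Keys.END) for driver.execute_script, which sets the panel's scrollTop directly; that is an alternative technique, not what the answer above prescribes. The helper name scroll_until_stable and its parameters (count_items, pause, max_rounds) are hypothetical and not part of Selenium. The class names come from the question and the answer, look auto-generated, and may well change, so treat the selectors as assumptions to re-check in DevTools.

```python
import time


def scroll_until_stable(driver, container, count_items, pause=2.0, max_rounds=50):
    """Scroll `container` to its bottom until `count_items()` stops growing.

    Hypothetical helper, not part of Selenium. `driver` only needs an
    `execute_script` method, which Selenium's WebDriver provides.
    """
    previous = count_items()
    for _ in range(max_rounds):
        # Move the panel's own scrollbar to the bottom (scrolling <body> does nothing here).
        driver.execute_script(
            "arguments[0].scrollTop = arguments[0].scrollHeight;", container
        )
        time.sleep(pause)  # give lazily loaded reviews time to appear
        current = count_items()
        if current == previous:
            break  # no new items appeared, so assume the end of the list
        previous = current
    return previous


# Usage with the question's script (assumes `driver` and `By` are already set up):
# panel = driver.find_element(By.CSS_SELECTOR, "div.m6QErb.DxyBCb.kA9KIf.dS8AEf")
# total = scroll_until_stable(
#     driver, panel, lambda: len(driver.find_elements(By.CLASS_NAME, "wiI7pd"))
# )
```

The max_rounds cap is there so the loop cannot run forever on a very long review list. A fixed pause is the simplest option; on a slow connection a single pause may be too short and the loop could stop early, so you may need to raise it.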