Tags: python, selenium-webdriver, web-scraping, cookies, web-crawler

Why isn't this website loading cookies properly with Selenium + Python?


I'm doing some web scraping and have run into a problem with my code.

What I want to do is:

  1. Open the website https://solucoes.receita.fazenda.gov.br/Servicos/cnpjreva/cnpjreva_solicitacao.asp
  2. Wait around 40 seconds for the user to solve the captcha MANUALLY.
  3. After that explicit wait, have the code automatically fill in the text input and click the SEND button.
  4. Get the results page, the same as when I do everything by hand.

I think what's happening is this:

  1. I'm not saving the cookies properly (neither the captcha's nor any others).
  2. I'm not restoring the cookies properly after clicking SEND.

I know it isn't working because, when the program clicks SEND, the website says it didn't identify the cookies from hCaptcha or the text input. When I run the same routine entirely by hand, without Selenium, it works. I have already tried changing browsers.

What should I do? Please demonstrate it using my code below:

import time
import pickle
from selenium.webdriver.common.by import By
from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--disable-notifications")
chrome_options.add_argument("--disable-infobars")

driver = webdriver.Chrome(options=chrome_options)

driver.get("https://solucoes.receita.fazenda.gov.br/Servicos/cnpjreva/cnpjreva_solicitacao.asp")

# Waiting for the client to manually fill captcha
time.sleep(45)

cnpj_code = "33.224.254/0001-42"

# Then, it fills the text input
campo_texto = driver.find_element(By.XPATH, "//html//body//div[1]//div[1]//div//div//div//form//div[1]//div[1]//div//div//input")
campo_texto.send_keys(cnpj_code)

# Then, it clicks the button to GO
enter_button = driver.find_element(By.XPATH, "//html//body//div[1]//div[1]//div//div//div//form//div[3]//div//button[1]")
enter_button.click()

time.sleep(10)  # Should show the results, but the page stays the same and the website reports a cookie error

driver.quit()

I tried changing browsers. I expected the results to appear the same way they do when I use the website manually, without Selenium. Instead, I got a cookie error.


Solution

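  • You can fix the problem with undetected-chromedriver, a patched ChromeDriver wrapper designed to avoid triggering anti-bot checks. The cookie error most likely comes from the site detecting an automated browser, not from how the cookies are stored. Install it with: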

    pip install undetected-chromedriver
    

    Here is a cleaner version of the code that uses it:

    # Import necessary libraries
    import undetected_chromedriver as uc
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    import selenium.webdriver.support.expected_conditions as EC
    
    # Initialize an undetected Chrome driver instance
    driver = uc.Chrome()
    
    # Create a WebDriverWait object with a timeout of 30 seconds; every wait below,
    # including the one for the manually solved CAPTCHA, gives up after that long
    wait = WebDriverWait(driver, 30)
    driver.get("https://solucoes.receita.fazenda.gov.br/Servicos/cnpjreva/cnpjreva_solicitacao.asp")
    
    # Wait for the CAPTCHA iframe to appear and switch to it
    frame = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'iframe[title="Widget contendo caixa de seleção para desafio de segurança hCaptcha"]')))
    driver.switch_to.frame(frame)
    
    # Wait for the CAPTCHA box to be checked (indicating manual completion by the user)
    wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div[aria-checked="true"]')))
    
    # Switch back to the default content (out of the CAPTCHA iframe)
    driver.switch_to.default_content()
    
    # Define the CNPJ code to be entered
    cnpj_code = "33.224.254/0001-42"
    
    # Locate the CNPJ input element, enter the CNPJ code, and proceed
    wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "input#cnpj"))).send_keys(cnpj_code)
    # Click the 'Consultar' button
    driver.find_element(By.CSS_SELECTOR, "button.btn.btn-primary").click()
    
    # Wait for the results page to load
    wait.until(EC.presence_of_element_located((By.ID, 'principal')))
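
On the cookie question itself: within a single driver session the browser keeps its own cookies, so there should be nothing to save or restore between loading the page and clicking the button. If you still want to persist cookies between separate runs, a minimal sketch could look like the following. The helper names `save_cookies` and `load_cookies` and the file name `cookies.pkl` are made up for illustration; the only Selenium calls used are `driver.get_cookies()` and `driver.add_cookie()`. This is not expected to get around the hCaptcha check by itself.

```python
import pickle
from pathlib import Path


def save_cookies(driver, path):
    """Write the driver's current cookies to `path` using pickle."""
    Path(path).write_bytes(pickle.dumps(driver.get_cookies()))


def load_cookies(driver, path):
    """Add previously saved cookies to the driver and return how many were added.

    The driver should already be on a page of the cookies' domain,
    because Selenium rejects cookies that belong to a different domain.
    """
    cookies = pickle.loads(Path(path).read_bytes())
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)


# Hypothetical usage with a real driver (needs a browser, so left as comments):
#   driver.get(url); save_cookies(driver, "cookies.pkl")
#   ... in a later run ...
#   driver.get(url); load_cookies(driver, "cookies.pkl"); driver.refresh()
```

The helpers only rely on the two driver methods named above, so they should work the same with `webdriver.Chrome` and `uc.Chrome`.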
    
