I'm doing some web scraping with Selenium and I have a problem with my code.
All I want to do is:
I think what's happening is this:
I know it's not working because when the program clicks SEND, the website says it didn't identify the cookies from hCaptcha or the text input. When I go through the same steps without Selenium (entirely by hand), it works properly. I already tried changing browsers.
What should I do? Please demonstrate the fix using my code below:
import time
import pickle
from selenium.webdriver.common.by import By
from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--disable-notifications")
chrome_options.add_argument("--disable-infobars")
driver = webdriver.Chrome(options=chrome_options)
driver.get("https://solucoes.receita.fazenda.gov.br/Servicos/cnpjreva/cnpjreva_solicitacao.asp")
# Wait for the user to solve the CAPTCHA manually
time.sleep(45)
cnpj_code = "33.224.254/0001-42"
# Then, it fills the text input
campo_texto = driver.find_element(By.XPATH, "//html//body//div[1]//div[1]//div//div//div//form//div[1]//div[1]//div//div//input")
campo_texto.send_keys(cnpj_code)
# Then, it clicks the button to GO
enter_button = driver.find_element(By.XPATH, "//html//body//div[1]//div[1]//div//div//div//form//div[3]//div//button[1]")
enter_button.click()
time.sleep(10)  # Should show the results, but the page stays the same and the website reports a cookie error
driver.quit()
I tried changing browsers. I expected the results to appear the same way they do when I use the website manually (without Selenium). Instead, I got a cookie error.
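The site is most likely detecting the Selenium-driven browser. You can fix the problem with undetected-chromedriver, a patched ChromeDriver built to avoid that kind of detection, which you can install with pip: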
pip install undetected-chromedriver
Here's a cleaner version of your code that uses it:
# Import necessary libraries
import undetected_chromedriver as uc
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Initialize an undetected Chrome driver instance
driver = uc.Chrome()
# Create a WebDriverWait object with a timeout of 30 seconds
wait = WebDriverWait(driver, 30)
driver.get("https://solucoes.receita.fazenda.gov.br/Servicos/cnpjreva/cnpjreva_solicitacao.asp")
# Wait for the CAPTCHA iframe to appear and switch to it
frame = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'iframe[title="Widget contendo caixa de seleção para desafio de segurança hCaptcha"]')))
driver.switch_to.frame(frame)
# Wait for the CAPTCHA box to be checked (indicating manual completion by the user).
# This raises TimeoutException if solving takes longer than the 30-second timeout above.
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div[aria-checked="true"]')))
# Switch back to the default content (out of the CAPTCHA iframe)
driver.switch_to.default_content()
# Define the CNPJ code to be entered
cnpj_code = "33.224.254/0001-42"
# Locate the CNPJ input element, enter the CNPJ code, and proceed
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "input#cnpj"))).send_keys(cnpj_code)
# Click the 'Consultar' button
driver.find_element(By.CSS_SELECTOR, "button.btn.btn-primary").click()
# Wait for the results page to load
wait.until(EC.presence_of_element_located((By.ID, 'principal')))
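For context on why the code above uses WebDriverWait instead of a fixed time.sleep(45): an explicit wait keeps polling a condition and returns as soon as it holds, or gives up once a timeout expires. The sketch below is a simplified, Selenium-free illustration of that polling idea, not Selenium's actual implementation. The names wait_until, WaitTimeout and captcha_solved are hypothetical and exist only for this example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy within the timeout (hypothetical name)."""


def wait_until(condition: Callable[[], Optional[T]], timeout: float = 30.0, poll: float = 0.5) -> T:
    """Call `condition` repeatedly until it returns a truthy value, then return that value."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Condition met: stop immediately instead of sleeping for a fixed period.
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Stand-in for "has the user solved the CAPTCHA yet?"; it becomes true on the third check.
checks = {"count": 0}


def captcha_solved() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(captcha_solved, timeout=5, poll=0.01)
```

With Selenium itself you would keep using WebDriverWait(driver, timeout).until(...), which additionally passes the driver to the condition and raises TimeoutException when the timeout expires.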