Log in to a website, then open it in a browser

I am trying to write Python 3 code that logs in to a website and then opens it in a web browser, so that I can take a screenshot of it. Looking online, I found that I could call webbrowser.open('example.com'). This opens the website, but it cannot log in. Then I found that it is possible to log in to a website using the requests library or urllib, but neither of them seems to provide a way to open a web page.

So how can I log in to a web page and then display it, so that a screenshot of that page can be taken?

Thanks

Solution

Have you considered Selenium? It drives a browser natively as a user would, and its Python client is pretty easy to use.

Here is one of my recent Selenium scripts. It scrapes multiple pages from a website and saves their data to a CSV file:

import os
import time
import csv

from selenium import webdriver
from selenium.webdriver.common.by import By

cols = [
    'ies', 'campus', 'curso', 'grau_turno', 'modalidade',
    'classificacao', 'nome', 'inscricao', 'nota'
]

codigos = [
    96518, 96519, 96520, 96521, 96522, 96523, 96524, 96525, 96527, 96528
]

# Create the output directory if it does not exist yet
os.makedirs('arquivos_csv', exist_ok=True)

options = webdriver.ChromeOptions()
prefs = {
    # 1 = allow multiple automatic downloads
    'profile.default_content_setting_values.automatic_downloads': 1,
    # 2 = block images, so pages load faster
    'profile.managed_default_content_settings.images': 2
}
options.add_experimental_option('prefs', prefs)

# Here you choose a webdriver ("the browser")
browser = webdriver.Chrome(options=options)

# Open the output file once so that the header row is written only once
with open('arquivos_csv/sisu_resultados_usp_final.csv', 'w', newline='') as file:
    dw = csv.DictWriter(file, fieldnames=cols, lineterminator='\n')
    dw.writeheader()
    for codigo in codigos:
        time.sleep(0.1)
        # Here is where I set the URL
        browser.get(f'http://www.sisu.mec.gov.br/selecionados?co_oferta={codigo}')
        # Page-level fields, repeated on every row written for this page
        ies = browser.find_element(By.XPATH, '//div[@class ="nome_ies_p"]').text.strip()
        campus = browser.find_element(By.XPATH, '//div[@class ="nome_campus_p"]').text.strip()
        curso = browser.find_element(By.XPATH, '//div[@class ="nome_curso_p"]').text.strip()
        grau_turno = browser.find_element(By.XPATH, '//div[@class = "grau_turno_p"]').text.strip()
        tabelas = browser.find_elements(By.XPATH, '//table[@class = "resultado_selecionados"]')
        for t in tabelas:
            modalidade = t.find_element(By.XPATH, 'tbody//tr//th[@colspan = "4"]').text.strip()
            aprovados = t.find_elements(By.XPATH, 'tbody//tr')
            # The first two rows of each table are skipped; data rows follow
            for a in aprovados[2:]:
                linha = a.find_elements(By.CLASS_NAME, 'no_candidato')
                classificacao = linha[0].text.strip()
                nome = linha[1].text.strip()
                inscricao = linha[2].text.strip()
                # Convert the decimal comma to a decimal point
                nota = linha[3].text.strip().replace(',', '.')
                dw.writerow({
                    'ies': ies, 'campus': campus, 'curso': curso,
                    'grau_turno': grau_turno, 'modalidade': modalidade,
                    'classificacao': classificacao, 'nome': nome,
                    'inscricao': inscricao, 'nota': nota
                })

browser.quit()

In short, you set preferences, choose a webdriver (I recommend Chrome), point it to the URL, and that's it. The browser opens automatically and starts executing your instructions.

I have tested using it to log in and it works fine, but I have never tried to take a screenshot with it. In theory it should work, since the WebDriver object exposes a save_screenshot() method.
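For the original question, logging in and then taking the screenshot could look roughly like the sketch below. It is untested against any real site and rests on assumptions: the function name login_and_screenshot, the example.com URLs, the credentials, the output file name, and above all the form field names "username" and "password" are hypothetical and must be adapted to the actual login page. The Selenium import lives inside main() only so that the helper can be read and exercised without a browser installed.

```python
def login_and_screenshot(browser, login_url, username, password,
                         target_url, out_path):
    """Log in through a form, open target_url, and save a screenshot.

    `browser` is any Selenium WebDriver, e.g. webdriver.Chrome().
    The field names "username" and "password" are assumptions about
    the login form, not something the site is known to use.
    """
    browser.get(login_url)

    # "name" is the locator strategy that Selenium's By.NAME stands for
    browser.find_element("name", "username").send_keys(username)
    password_field = browser.find_element("name", "password")
    password_field.send_keys(password)
    password_field.submit()  # submits the form that contains this field

    # The session cookies stay in the browser, so the next page is
    # requested as the logged-in user
    browser.get(target_url)

    # save_screenshot returns True on success, False on an I/O error
    return browser.save_screenshot(out_path)


def main():
    # Imported here so the helper above does not need Selenium at import time
    from selenium import webdriver

    browser = webdriver.Chrome()
    try:
        ok = login_and_screenshot(
            browser,
            login_url="https://example.com/login",    # hypothetical URL
            username="my_user",                       # placeholder
            password="my_password",                   # placeholder
            target_url="https://example.com/account", # hypothetical URL
            out_path="page.png",
        )
        print("Screenshot saved" if ok else "Screenshot failed")
    finally:
        browser.quit()
```

Calling main() runs the whole flow in a real Chrome window. On a real site the login may not have finished when the next page is requested, so an explicit wait (for example Selenium's WebDriverWait) may be needed between submitting the form and loading the target page.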