Tags: python, web-scraping, beautifulsoup, python-requests, request

How to expand content in Beautiful Soup using a button


I am trying to scrape this site: link. The problem is that the page URL stays the same even when I click the button that expands the content. I need to scrape all of the news going back to the first post.

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.internazionale.it/tag/la-settimana"

    html = requests.get(url)
    html.raise_for_status()

    s = BeautifulSoup(html.text, 'html.parser')

    results = s.find('div', class_='hentryfeed__container container_full')
    link_articolo = results.find_all('div', class_='box-article-intro')

    for articolo in link_articolo:
        link_articoli = articolo.find('a', href=True)
        print('https://www.internazionale.it' + link_articoli['href'])

This code works for the first page, but the button that expands the content doesn't change the URL, so I need another way to scrape all the news back to the first post.
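
For context, the button-driven expansion can also be reproduced by clicking the button in a real browser. A minimal Selenium sketch, where the selector `a.load-more` is only a hypothetical placeholder for the actual "load more" button in the page markup:

    import time
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import NoSuchElementException

    driver = webdriver.Chrome()
    driver.get("https://www.internazionale.it/tag/la-settimana")

    while True:
        try:
            # "a.load-more" is a hypothetical selector; inspect the page
            # to find the real one for the expand button
            button = driver.find_element(By.CSS_SELECTOR, "a.load-more")
        except NoSuchElementException:
            break
        button.click()
        time.sleep(1)  # give the Ajax call time to append new articles

    # the fully expanded page can now be handed to BeautifulSoup
    html = driver.page_source
    driver.quit()

Driving a real browser is slower than calling the Ajax endpoint directly, which is what the solution below does.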


Solution

  • To get all the links, you can use this example (emulating the Ajax call using requests):

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.internazionale.it/tag/la-settimana"
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    # the stream id identifies this tag's feed in the site's Ajax API
    stream_id = soup.select_one("[data-stream-id]")["data-stream-id"]

    # load the links already present on the first page
    links = []
    for article in soup.select(".box-article__data"):
        links.append("https://www.internazionale.it" + article.a["href"])
        # after the loop this holds the date of the oldest article on the
        # page, which serves as the pagination cursor for the Ajax calls
        data_datetime = article.find_previous(attrs={"data-datetime": True})[
            "data-datetime"
        ].split()[0]

    # load the rest of the links by emulating the "load more" Ajax call
    while True:
        url = f"https://data.internazionale.it/stream_data/items/tag/0/{stream_id}/{data_datetime}.json"
        data = requests.get(url).json()

        # the endpoint returns no items once the first post has been reached
        if not data.get("items"):
            break

        for i in data["items"]:
            links.append("https://www.internazionale.it" + i["url"])
            print(links[-1])

        # advance the cursor to this batch's datetime
        data_datetime = data["datetime"].split()[0]

    # `links` now contains all the links
    

    Prints:

    ...
    
    https://www.internazionale.it/opinione/giovanni-de-mauro/2001/08/02/la-battaglia-di-genova
    https://www.internazionale.it/opinione/giovanni-de-mauro/2001/01/13/astroturf
    https://www.internazionale.it/opinione/giovanni-de-mauro/1999/03/11/tutti-al-centro
    https://www.internazionale.it/opinione/giovanni-de-mauro/1998/05/07/il-futuro-di-israele
    https://www.internazionale.it/opinione/giovanni-de-mauro/1998/04/29/i-nuovi-vicini
    https://www.internazionale.it/opinione/giovanni-de-mauro/1995/12/22/interviste
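
    As a usage note, the collected `links` list can then be persisted for later scraping of the individual articles. A minimal sketch, assuming the `links` variable from the code above (the filename is arbitrary):

        import csv

        # assumes `links` was built by the loop above; one URL per row
        with open("la_settimana_links.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["url"])
            writer.writerows([link] for link in links)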