BeautifulSoup: iterating over 24 letters (from A to Z) fails; reducing the complexity to get a first insight into the dataset

I have a list of insurers in Spain. It is organized into 24 sections on a website; see the following:

Insurers (Spanish): the full list:

It is divided into 24 pages:

The goal: I want to fetch the data from these pages with BS4 and requests and finally save it into a DataFrame. Scraping the list with BeautifulSoup (BS4) and requests in Python seems appropriate; I think the following steps are needed:

a. First, import the necessary libraries: BeautifulSoup, requests, and pandas.
b. Then use the requests library to get the HTML content of each page of interest, i.e. the A to Z pages.
c. Then use BeautifulSoup to parse the HTML content.
d. Next, extract the relevant information (the insurers' names) from the parsed HTML.
e. Finally, store the extracted data in a pandas DataFrame.

But this does not work, not even for the iteration from A to Z:

import requests
from bs4 import BeautifulSoup
import pandas as pd

# Function to scrape insurers from a given URL
def scrape_insurers(url):
    response = requests.get(url)
    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')
        # Extracting insurer names
        insurers = [insurer.text.strip() for insurer in soup.find_all('h3')]
        return insurers
    else:
        print("Failed to retrieve data from", url)
        return []

# Define the base URL
base_url = ""

# List to store all insurers
all_insurers = []

# Loop through each page (A to Z)
for char in range(65, 91):  # ASCII codes for A to Z
    page_url = f"{base_url}#{chr(char)}"
    insurers = scrape_insurers(page_url)
    # Collect the names from this page in the overall list
    all_insurers.extend(insurers)

# Convert the list of insurers to a pandas DataFrame
df = pd.DataFrame({'Insurer': all_insurers})

# Display the DataFrame
print(df)

# Save DataFrame to a CSV file
df.to_csv('insurers_spain.csv', index=False)

This fails with the following output:

Failed to retrieve data from
Failed to retrieve data from
Failed to retrieve data from
Failed to retrieve data from
Failed to retrieve data from

and so on for every letter.
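
One thing worth checking in the loop above: everything after `#` in a URL is a fragment, which the client keeps to itself and never sends to the server. So, assuming the letter is appended as `#A`, `#B`, and so on, all requests ask for the same document. A minimal sketch with the standard library, using a hypothetical `https://example.com/directory/` in place of the real base URL:

```python
from urllib.parse import urldefrag

# Hypothetical placeholder; substitute the real directory URL.
base_url = "https://example.com/directory/"

# Build the per-letter URLs the same way the loop above does.
page_urls = [f"{base_url}#{chr(code)}" for code in range(65, 91)]

# urldefrag() splits off the fragment; what remains is what an HTTP client requests.
requested = {urldefrag(u).url for u in page_urls}

print(len(page_urls))  # number of URLs built
print(requested)       # distinct URLs actually requested
```

If that is the case here, a single request may already return every entry, and the letter only tells the browser which part of the page to show.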

I think it is easier to reduce the complexity first.

I think it is better to take one single URL that I want to visit and test what the request returns. Once that is done, I can evaluate the response, and then use BeautifulSoup to check for specific fields. I should avoid doing three things at once in one step, any of which can obviously go terribly wrong.
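
A minimal sketch of that first step: look at the raw response before any parsing. The helper name `diagnose` is made up for illustration, and the HTTP function is passed in as an argument (for example `requests.get`) so the snippet itself has no dependencies.

```python
def diagnose(url, get, headers=None):
    """Fetch one URL and report what came back, without parsing anything."""
    response = get(url, headers=headers, timeout=10)
    report = {
        "url": url,
        "status_code": response.status_code,
        "ok": response.status_code == 200,
        "content_length": len(response.content),
    }
    # Print each fact on its own line so a failure is easy to spot.
    for key, value in report.items():
        print(f"{key}: {value}")
    return report
```

Called as `diagnose(url, requests.get)`, this shows the status code that makes the `status_code == 200` check fail. Passing a `headers` dict with a browser-like `User-Agent` lets you compare the two results; a different status code in that case would suggest the server treats the default client differently.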

So I do it like this for the first character, A:

import requests
from bs4 import BeautifulSoup

# Function to scrape insurers from a given URL
def scrape_insurers(url):
    response = requests.get(url)
    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')
        # Extracting insurer names
        insurers = [insurer.text.strip() for insurer in soup.find_all('h3')]
        return insurers
    else:
        print("Failed to retrieve data from", url)
        return []

# Define the base URL
base_url = ""

# Define the character we want to fetch data for
char = 'A'

# Construct the URL for the specified character
url = base_url + char

# Fetch and print data for the specified character
insurers_char = scrape_insurers(url)
print(f"Insurers for character '{char}':")
for insurer in insurers_char:
    print(insurer)

But see the output here:

Failed to retrieve data from
Insurers for character 'A':


  • Try:

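    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    url = ""
    # Send a browser-like User-Agent with the request
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"
    }
    soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")

    data = []
    for c in soup.select(".contact-item"):
        # Unwrap inline tags and merge adjacent strings so each field is one text run
        for t in c.select("span, a"):
            t.unwrap()
        c.smooth()
        # The first chunk is the title; the rest are "label: value" pairs
        title, *other = c.get_text(separator="|||", strip=True).split("|||")
        data.append(
            {"Title": title, **{(s := d.split(":", maxsplit=1))[0]: s[1] for d in other}}
        )

    df = pd.DataFrame(data)
    print(df)

    Prints: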


                                                                                          Title                         Tfno.                           Fax                                                         Web                                                                                           Dirección                                          Email
    0                               A.M.A., AGRUPACIÓN MUTUAL ASEGURADORA, MUTUA DE SEGUROS APF                  91 343 47 00                (91) 343 47 68                                                                                       VÍA DE LOS POBLADOS, 3 28033  (MADRID)                                            NaN
    1                                                  ABANCA GENERALES DE SEGUROS Y REASEGUROS         881920742 / 881920744                           NaN                                                         NaN                                                  AV. LINARES RIVAS 30, 3º 15005 A CORUÑA (A CORUÑA)                                            NaN
    2                                     ABANCA VIDA Y PENSIONES DE SEGUROS Y REASEGUROS, S.A.                   981 188 075                           NaN                                                         NaN                                         AVENIDA DE LA MARINA, 1-3ª PLANTA 15001 A CORUÑA (A CORUÑA)                                            NaN
    3                                          ADMIRAL EUROPE COMPAÑIA DE SEGUROS S.A.U. (AECS)                           NaN                           NaN                                                                   RODRÍGUEZ MARÍN, 61 - 1ª PLANTA 28016 MADRID (MADRID)                                            NaN
    4                                    AEGON ESPAÑA, SOCIEDAD ANÓNIMA DE SEGUROS Y REASEGUROS                  91 563 62 22                           NaN                                                VÍA DE LOS POBLADOS, 3 - EDIFICIO 4B - PARQUE EMPRESARIAL CRISTALIA 28033  (MADRID)                                            NaN
    5                                          AGROPELAYO SOCIEDAD DE SEGUROS, SOCIEDAD ANÓNIMA                           NaN                           NaN                                                         NaN                                                             SANTA ENGRACIA, 67 - 69 28010  (MADRID)                                            NaN
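
The dense dictionary expression in the loop can be unpacked into a small helper. This is a sketch rather than the answer's exact code: `parse_contact` is a hypothetical name, the sample string only imitates what `get_text(separator="|||")` might return for one entry, and `str.partition` is swapped in for `split(":", maxsplit=1)` so that a chunk without a colon does not raise an error.

```python
def parse_contact(text, sep="|||"):
    """Turn 'TITLE|||label: value|||label: value' into a flat dict."""
    title, *other = text.split(sep)
    record = {"Title": title.strip()}
    for chunk in other:
        # Split on the first colon only, so values may contain colons themselves.
        label, _, value = chunk.partition(":")
        record[label.strip()] = value.strip()
    return record


# Assumed sample, modelled on one row of the output above.
sample = (
    "ABANCA VIDA Y PENSIONES DE SEGUROS Y REASEGUROS, S.A."
    "|||Tfno.: 981 188 075"
    "|||Dirección: AVENIDA DE LA MARINA, 1-3ª PLANTA 15001 A CORUÑA (A CORUÑA)"
)
print(parse_contact(sample))
```

A list of such dicts can be passed straight to `pd.DataFrame`; keys missing from a record show up as `NaN`, as in the table above.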