How Can I Scrape Event Links and Contact Information from a Website with Python?

I am trying to scrape event links and contact information from the RaceRoster website using Python, requests, pandas, and BeautifulSoup. The goal is to extract the Event Name, Event URL, Contact Name, and Email Address for each event and save the data into an Excel file so we can reach out to these events for business development purposes.

However, the script consistently reports that no event links are found on the search results page, despite the links being visible when inspecting the HTML in the browser. Here’s the relevant HTML for the event links from the search results page:

<a href=""
   rel="noopener noreferrer">
    13th Annual Delaware Tech Chocolate Run 5k
</a>

Steps Taken:

  1. Verified the selector used for the event links: soup.select("")
  2. Checked the response content from the requests.get() call using soup.prettify(). The HTML lacks the event links that are visible in the browser, which suggests the content is loaded dynamically via JavaScript.
  3. Attempted to scrape the data using BeautifulSoup, but I consistently get:

Found 0 events on the page.
Scraped 0 events.
No contacts were scraped.

What I Need Help With:

  • How can I handle this JavaScript-loaded content? Is there a way to scrape it directly, or do I need to use a tool like Selenium?
  • If Selenium is required, how do I properly integrate it with BeautifulSoup for parsing the rendered HTML?

Current Script:

import requests
from bs4 import BeautifulSoup
import pandas as pd

def scrape_event_contacts(base_url, search_url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    }
    event_contacts = []

    # Fetch the main search page
    print(f"Scraping page: {search_url}")
    response = requests.get(search_url, headers=headers)

    if response.status_code != 200:
        print(f"Failed to fetch page: {search_url}, status code: {response.status_code}")
        return event_contacts

    soup = BeautifulSoup(response.content, "html.parser")
    # Select event links with the CSS selector for the event anchors
    event_links = soup.select("")

    print(f"Found {len(event_links)} events on the page.")

    for link in event_links:
        event_url = link['href']
        event_name = link.text.strip()  # Extract Event Name

        try:
            print(f"Scraping event: {event_url}")
            event_response = requests.get(event_url, headers=headers)
            if event_response.status_code != 200:
                print(f"Failed to fetch event page: {event_url}, status code: {event_response.status_code}")
                continue  # Skip this event and move on to the next link

            event_soup = BeautifulSoup(event_response.content, "html.parser")

            # Extract contact name and email
            contact_name = event_soup.find("dd", class_="event-details__contact-list-definition")
            email = event_soup.find("a", href=lambda href: href and "mailto:" in href)

            contact_name_text = contact_name.text.strip() if contact_name else "N/A"
            email_address = email['href'].split("mailto:")[1].split("?")[0] if email else "N/A"

            if contact_name or email:
                print(f"Found contact: {contact_name_text}, email: {email_address}")
                event_contacts.append({
                    "Event Name": event_name,
                    "Event URL": event_url,
                    "Event Contact": contact_name_text,
                    "Email": email_address
                })
            else:
                print(f"No contact information found for {event_url}")
        except Exception as e:
            print(f"Error scraping event {event_url}: {e}")

    print(f"Scraped {len(event_contacts)} events.")
    return event_contacts

def save_to_spreadsheet(data, output_file):
    if not data:
        print("No data to save.")
        return
    df = pd.DataFrame(data)
    df.to_excel(output_file, index=False)
    print(f"Data saved to {output_file}")

if __name__ == "__main__":
    base_url = ""
    search_url = ""
    output_file = "/Users/my_name/Documents/event_contacts.xlsx"

    contact_data = scrape_event_contacts(base_url, search_url)
    if contact_data:
        save_to_spreadsheet(contact_data, output_file)
    else:
        print("No contacts were scraped.")

Expected Outcome:

  • Extract all event links from the search results page.
  • Navigate to each event’s detail page.
  • Scrape the contact name (the dd element with class event-details__contact-list-definition) and email (the mailto: link) from the detail page.
  • Save the results to an Excel file.


  • Use the API endpoint to get the data on upcoming events.

    Here's how:

    import requests
    from tabulate import tabulate
    import pandas as pd
    url = ''
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36',
    }
    events = requests.get(url, headers=headers).json()['data']
    loc_keys = ["address", "city", "country"]
    table = [
        [
            event["name"], event["url"],  # key names assumed; check them against the JSON response
            " ".join([event["location"][key] for key in loc_keys if key in event["location"]])
        ] for event in events
    ]
    columns = ["Name", "URL", "Location"]
    print(tabulate(table, headers=columns))
    df = pd.DataFrame(table, columns=columns)
    df.to_csv('5k_events.csv', index=False, header=True)

    This should print:

    Name                                         URL  Location
    -------------------------------------------  ---  ----------------------------------------------------------------------------------------------------------------------------
    Credit Union Cherry Blossom                       Washington, D.C. Washington United States
    Big Cork Wine Run 5k                              Big Cork Vineyards, 4236 Main Street, Rohrersville, MD 21779, U.S. Rohrersville United States
    3rd Annual #OptOutside Black Friday Fun Run       Grain H2O, Summit Harbour Place, Bear, DE, USA Bear United States
    Ryan's Race 5K walk Run                           Odessa High School, Tony Marchio Drive, Townsend, DE Townsend United States
    13th Annual Delaware  Tech Chocolate Run 5k       Delaware Technical Community College - Charles L. Terry Jr. Campus - Dover, Campus Drive, Dover, DE, USA Dover United States
    Builders Dash 5k                                  Rail Haus - Beer Garden, North West Street, Dover, DE Dover United States
    The Ivy Scholarship 5k                            Hare Pavilion, River Place, Wilmington, DE Wilmington United States
    39th Firecracker 5k Run Walk                      Rockford Tower, Lookout Drive, Wilmington, DE Wilmington United States
    24th Annual John D Kelly Logan House 5k           Kelly's Logan House, Delaware Avenue, Wilmington, DE, USA Wilmington United States
    2nd Annual Scott Trot 5K                          American Legion Post 17, American Legion Road, Lewes, DE Lewes United States


    To get more event data, paginate the API with the l and p query parameters, for example l=10&p=1. Also note that the meta -> hits field holds the number of events found. For your query that's 1465.
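
    The pagination described above could be sketched roughly as follows. This is only a minimal sketch, not the site's documented API: it assumes each response is JSON with a data list and a meta -> hits count, and that l and p are the page size and page number. The helper names iter_events and make_fetcher are hypothetical, and the endpoint URL is whatever you used in the snippet above.

```python
from typing import Callable, Dict, Iterator


def iter_events(fetch_page: Callable[[int, int], Dict], page_size: int = 10) -> Iterator[Dict]:
    """Yield events page by page (p=1, 2, ...) until meta.hits is covered."""
    page = 1
    while True:
        payload = fetch_page(page, page_size)
        events = payload.get("data", [])
        yield from events
        total_hits = payload.get("meta", {}).get("hits")
        # An empty page means there is nothing left to read.
        if not events:
            break
        # If the API reports a total, stop once that many events were requested.
        if total_hits is not None and page * page_size >= total_hits:
            break
        page += 1


def make_fetcher(url: str, headers: Dict[str, str]) -> Callable[[int, int], Dict]:
    """Build a fetch_page callable backed by requests (hypothetical helper)."""
    import requests  # imported here so iter_events can be used without it

    def fetch_page(page: int, page_size: int) -> Dict:
        # Assumption: l is the page size and p is the page number.
        response = requests.get(
            url, headers=headers, params={"l": page_size, "p": page}, timeout=30
        )
        response.raise_for_status()
        return response.json()

    return fetch_page
```

    With a real endpoint you would call something like list(iter_events(make_fetcher(url, headers))) and pass the result to pd.DataFrame. If the URL already carries l and p in its query string, remove them first, since requests appends params to the existing query rather than replacing it. Adding a short time.sleep between pages is a reasonable courtesy when walking all 1465 hits.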