Tags: python, selenium-webdriver, screen-scraping

Finding URL with Selenium not printing correctly


I'm trying to compile a list of links within a page. However, when I print the list, the output looks like a bunch of random numbers instead of URLs:

links = driver.find_elements(By.CSS_SELECTOR, "meta[content*='www.airbnb.com.au/rooms/']")

print(links)

Example output:

[<selenium.webdriver.remote.webelement.WebElement (session="faf70ce53ba59d6f6995883b0edfc006", element="f.CD9BA18B85DC3350F27BC5BF71FF60B2.d.5C798985F0B6BFD4214898A37DFC2A30.e.81")>, <selenium.webdriver.remote.webelement.WebElement (session="faf70ce53ba59d6f6995883b0edfc006", element="f.CD9BA18B85DC3350F27BC5BF71FF60B2.d.5C798985F0B6BFD4214898A37DFC2A30.e.82")>, <selenium.webdriver.remote.webelement.WebElement (session="faf70ce53ba59d6f6995883b0edfc006", element="f.CD9BA18B85DC3350F27BC5BF71FF60B2.d.5C798985F0B6BFD4214898A37DFC2A30.e.83")>, <selenium.webdriver.remote.webelement.WebElement (session="faf70ce53ba59d6f6995883b0edfc006", element="f.CD9BA18B85DC3350F27BC5BF71FF60B2.d.5C798985F0B6BFD4214898A37DFC2A30.e.84")>]

The website I'm trying to scrape:

https://www.airbnb.com.au/s/Gold-Coast--QLD/homes?place_id=ChIJt2BdK0cakWsRcK_e81qjAgM&refinement_paths%5B%5D=%2Fhomes&checkin=2024-12-22&checkout=2024-12-28&date_picker_type=calendar&adults=9&children=2&pets=1&search_type=user_map_move&tab_id=home_tab&query=Gold%20Coast%2C%20QLD&flexible_trip_lengths%5B%5D=one_week&monthly_start_date=2024-08-01&monthly_length=3&monthly_end_date=2024-11-01&search_mode=regular_search&price_filter_input_type=2&price_filter_num_nights=6&channel=EXPLORE&ne_lat=-27.8312213554841&ne_lng=153.85466017727208&sw_lat=-28.314709414353157&sw_lng=153.3574718233147&zoom=10.257561001998951&zoom_level=10.257561001998951&search_by_map=true&price_min=8683&price_max=14459&min_bedrooms=5


Solution

  • print(links)
    

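    This prints the list object itself, so Python shows the repr of each WebElement (its session and element IDs) rather than any data from the page. The URLs are stored in the content attribute of each matched meta element, so read that attribute from every element in the list.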

    Try this:

    for link in links:
        print(link.get_attribute("content"))
    

    Output:

    www.airbnb.com.au/rooms/50961691?adults=9&children=2&pets=1&search_mode=regular_search&check_in=2024-12-22&check_out=2024-12-28&source_impression_id=p3_1720094600_P3E3TFbSLV_F8Hv-&previous_page_section_name=1000
    www.airbnb.com.au/rooms/10732858?adults=9&children=2&pets=1&search_mode=regular_search&check_in=2024-12-22&check_out=2024-12-28&source_impression_id=p3_1720094600_P3xqAumg9A_T_K_Z&previous_page_section_name=1000
    www.airbnb.com.au/rooms/25083963?adults=9&children=2&pets=1&search_mode=regular_search&check_in=2024-12-22&check_out=2024-12-28&source_impression_id=p3_1720094600_P3LxjqyKDwEQn8FV&previous_page_section_name=1000
    www.airbnb.com.au/rooms/1112833302463251442?adults=9&children=2&pets=1&search_mode=regular_search&check_in=2024-12-22&check_out=2024-12-28&source_impression_id=p3_1720094600_P3NtINEuiRiA5r4F&previous_page_section_name=1000
    
    Process finished with exit code 0
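
If the URLs are needed as a Python list rather than printed lines, the same get_attribute("content") call can be wrapped in a small helper. The following is only a sketch and not part of the original answer: the name collect_room_urls, the default "https" scheme, and the de-duplication step are assumptions made for illustration. It relies only on each element having a get_attribute method, which Selenium's WebElement provides.

```python
def collect_room_urls(elements, scheme="https"):
    """Return the unique `content` attribute values of `elements` as absolute URLs.

    `elements` is any iterable of objects with a get_attribute(name) method,
    for example the list returned by driver.find_elements(...).
    """
    urls = []
    seen = set()
    for element in elements:
        value = element.get_attribute("content")
        if not value:
            # get_attribute returns None when the attribute is missing
            continue
        # The values shown above start with "www." and have no scheme,
        # so prepend one (assumed to be https) when it is missing.
        if not value.startswith(("http://", "https://")):
            value = f"{scheme}://{value}"
        # Keep the first occurrence of each URL and preserve page order.
        if value not in seen:
            seen.add(value)
            urls.append(value)
    return urls


# Hypothetical usage with the selector from the question:
# links = driver.find_elements(By.CSS_SELECTOR, "meta[content*='www.airbnb.com.au/rooms/']")
# for url in collect_room_urls(links):
#     print(url)
```

Returning a list keeps the scraping step separate from whatever happens next, such as writing the URLs to a file or visiting each one with driver.get.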