python python-3.x web-scraping python-requests

Failed to extract all the image links for the floor plans using the requests module


I'm trying to scrape the image links for the floor plans shown in the middle of the webpage using the requests module. The links are present in the page source, but they are scattered throughout it, and I can't extract all of them, even with a regex. There are 11 images in total.

import re
import json
import requests

link = 'https://www.livabl.com/abbotsford-bc/jem1'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'Referer': 'https://www.livabl.com/',
}

def get_floor_plan_images(link,headers):
    res = requests.get(link,headers=headers)
    print(res.status_code)
    match = re.search(r"\{\\\"images\\\":(.*?]),",res.text)
    if match:
        image_links = match.group(1)
        return image_links

images = get_floor_plan_images(link,headers)
print(images)

How can I extract all the image links connected to the floorplans using the requests module?


Solution

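  • re.search stops at the first match, so your function only ever returns one images array. re.finditer yields every match in the page source, so I think this is what you need: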

    import re
    import json
    import requests
    
    link = 'https://www.livabl.com/abbotsford-bc/jem1'
    
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
        'Referer': 'https://www.livabl.com/',
    }
    
    def get_floor_plan_images(link,headers):
        res = requests.get(link,headers=headers)
        print(res.status_code)
        # finditer yields one match object per escaped "images" array in the page source
        return re.finditer(r"\{\\\"images\\\":(.*?]),",res.text)
    
    
    for img in get_floor_plan_images(link,headers):
        print(img.group(1))
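
The loop above prints each match as raw escaped text (backslash-quoted strings), not as usable links. If you want a plain Python list instead, something like the following sketch might help. It assumes each captured fragment is a JSON array sitting inside an escaped JSON string, which I have not verified against the live page. extract_image_links and the sample string are made up for illustration; in practice you would pass res.text.

```python
import json
import re

# Same pattern as in the answer: an escaped "images" key followed by an array.
IMAGES_PATTERN = re.compile(r"\{\\\"images\\\":(.*?]),")

def extract_image_links(page_source):
    """Return the de-duplicated items of every escaped "images" array found."""
    links = []
    for match in IMAGES_PATTERN.finditer(page_source):
        raw = match.group(1)  # looks like [\"https://...\",\"https://...\"]
        try:
            # Assumption: the fragment is part of a JSON string literal, so
            # decoding it once as a string removes the backslash escapes...
            unescaped = json.loads('"' + raw + '"')
            # ...and decoding the result yields the actual array.
            items = json.loads(unescaped)
        except json.JSONDecodeError:
            continue  # skip fragments that do not decode cleanly
        if not isinstance(items, list):
            continue
        for item in items:
            if item not in links:  # keep first occurrence, preserve order
                links.append(item)
    return links

# Made-up sample standing in for res.text (not real page data).
sample = (
    r'x{\"images\":[\"https://example.com/a.jpg\",\"https://example.com/b.jpg\"],\"name\":\"A\"}'
    r'y{\"images\":[\"https://example.com/b.jpg\"],\"name\":\"B\"}'
)
print(extract_image_links(sample))
```

Decoding twice, rather than just stripping backslashes, also takes care of other JSON escapes such as \u0026 if they appear in the URLs.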