python-3.x web-scraping beautifulsoup html-parsing

Scrape image metadata from Facebook public posts

This is a follow-up to my earlier question about getting data from Facebook public posts. This time I'm trying to collect image metadata (the image URL). Link posts work fine, but some posts return empty data. I used the approach suggested in the answers to my previous question, but it doesn't work for the example below. I'd appreciate any suggestions!

import requests
from bs4 import BeautifulSoup

link = "https://www.facebook.com/228735667216/posts/10151653129902217"
res = requests.get(link,headers={'User-Agent':'Mozilla/5.0'})
comment = res.text.replace("-->", "").replace("<!--", "")
soup = BeautifulSoup(comment, "lxml")
image = soup.find("div", class_="uiScaledImageContainer _517g")
img = image.find("img", class_="scaledImageFitWidth img")
href = img["src"]
print(href)

Solution

Logging in with requests is not that easy, so I intentionally skipped that library. You can use Selenium on its own, or Selenium in combination with BeautifulSoup, to get the job done.

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

url = "https://www.facebook.com/228735667216/posts/10156284868312217"

chrome_options = webdriver.ChromeOptions()

# Run the browser headless
chrome_options.add_argument("--headless")
# Block the notification prompt that pops up right after login
prefs = {"profile.default_content_setting_values.notifications": 2}
chrome_options.add_experimental_option("prefs", prefs)
# Selenium 4 expects the options under the `options` keyword
driver = webdriver.Chrome(options=chrome_options)

# The first request lands on the login form
driver.get(url)
# Selenium 4 locates elements with find_element(By.ID, ...)
driver.find_element(By.ID, "email").send_keys("your_username")
driver.find_element(By.ID, "pass").send_keys("your_password", Keys.RETURN)
# Load the post again now that the session is logged in
driver.get(url)
soup = BeautifulSoup(driver.page_source, "lxml")
for img in soup.find_all(class_="scaledImageFitWidth"):
    print(img.get("src"))
driver.quit()

The output looks like this (partial):

https://external.fdac17-1.fna.fbcdn.net/safe_image.php?d=AQBjBuP0TBYabtnO&w=540&h=282&url=https%3A%2F%2Fs3.amazonaws.com%2Fprod-cust-photo-posts-jfaikqealaka%2F3065-6e4c325b07b921fdefed4dd727881f8d.jpg&cfs=1&upscale=1&fallback=news_d_placeholder_publisher&_nc_hash=AQCVKXMSqvNiHZik
https://external.fdac17-1.fna.fbcdn.net/safe_image.php?d=AQCJ6RFOF4dY2xTn&w=100&h=100&url=https%3A%2F%2Fcdn.images.express.co.uk%2Fimg%2Fdynamic%2F106%2F750x445%2F1046936.jpg&cfs=1&upscale=1&fallback=news_d_placeholder_publisher_square&_nc_hash=AQAyFxRaZTGV47Se
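The printed addresses point at Facebook's safe_image.php proxy rather than at the image itself. Judging only from the sample output above, the original address seems to travel percent-encoded in the `url` query parameter. That is an assumption drawn from these two lines, not documented behaviour. A minimal sketch of pulling it out with the standard library (the helper name `original_image_url` is made up for illustration):

```python
from urllib.parse import urlparse, parse_qs


def original_image_url(safe_image_url):
    """Return the decoded `url` query parameter, or None if it is absent."""
    # parse_qs splits the query string and percent-decodes each value
    query = parse_qs(urlparse(safe_image_url).query)
    values = query.get("url")
    return values[0] if values else None


# Second line of the sample output above
sample = (
    "https://external.fdac17-1.fna.fbcdn.net/safe_image.php"
    "?d=AQCJ6RFOF4dY2xTn&w=100&h=100"
    "&url=https%3A%2F%2Fcdn.images.express.co.uk%2Fimg%2Fdynamic%2F106%2F750x445%2F1046936.jpg"
    "&cfs=1&upscale=1&fallback=news_d_placeholder_publisher_square"
    "&_nc_hash=AQAyFxRaZTGV47Se"
)

print(original_image_url(sample))
# → https://cdn.images.express.co.uk/img/dynamic/106/750x445/1046936.jpg
```

Returning None when the parameter is missing lets you fall back to the raw `src` value for images that are not served through the proxy.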