python web-scraping data-retrieval

Get the background-image URL from a tag's style attribute


I want to download images from Flickr. So far I have this:

import bs4, requests, re

img = requests.get('https://www.flickr.com/search/?text=nature')
img.raise_for_status()

soup = bs4.BeautifulSoup(img.text)

elem = soup.select('main div div div div')
elem[10]
<div class="view photo-list-photo-view requiredToShowOnServer awake" 
  data-view-signature="photo-list-photo-view__UA_1__engagementModelName_photo-lite-models__excludePeople_true__id_8598154512__interactionViewName_photo-list-photo-interaction-view__isMobile_false__isOwner_false__layoutItem_1__measureAFT_true__model_1__modelParams_1__openAdvanced_false__parentContainer_1__parentSignature_photolist-47t__requiredToShowOnClient_true__requiredToShowOnServer_true__rowHeightMod_1__searchSimilar_true__searchSimilarWithTerm_false__searchTerm_nature__searchType_1__showAdvanced_true__showInteractionBarPlaceholder_false__showSort_true__showTools_true__sortMenuItems_1__unifiedSubviewParams_1__viewType_jst" 
  style="transform: translate(277px, 191px); -webkit-transform: translate(277px, 191px); -ms-transform: translate(277px, 191px); width: 364px; height: 205px; background-image: url(//c1.staticflickr.com/9/8232/8598154512_a4e080002d.jpg)"> 
  <div class="interaction-view"></div>
</div>

Can you help me get the background-image URL from the style attribute of elem[10]?


Solution

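  • To read an element's attributes, use elem[10].attrs, which behaves like a dictionary. The inline CSS is the string stored under its 'style' key.

    Then split that string, or use a regex, to extract the background-image URL.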

    import bs4, requests, re
    
    img = requests.get('https://www.flickr.com/search/?text=nature')
    img.raise_for_status()
    
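    # Name the parser explicitly so bs4 does not warn about guessing one.
    soup = bs4.BeautifulSoup(img.text, 'html.parser')
    
    elem = soup.select('main div div div div')
    # Keep only the background-image declaration, then strip the url( ... ) wrapper.
    style = elem[10].attrs['style']
    value = style.split('background-image:')[-1].split(';')[0].strip()
    url = value[len('url('):-1].strip('\'"')
    # The URL is protocol-relative (starts with //), so prepend only the scheme.
    print('https:' + url)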
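
The regex route mentioned above could look roughly like the sketch below. It works on the style string alone, so it runs without a network request or BeautifulSoup. The names `BACKGROUND_URL`, `background_image_url`, and `page_url` are made up for this illustration and are not part of bs4 or requests; the sample string is a shortened copy of the style attribute from the question.

```python
import re
from urllib.parse import urljoin

# Matches url(...) inside a background-image declaration, with optional quotes.
# (Hypothetical helper name, chosen for this example.)
BACKGROUND_URL = re.compile(
    r"background-image\s*:\s*url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)",
    re.IGNORECASE,
)


def background_image_url(style, page_url="https://www.flickr.com/"):
    """Return the absolute background-image URL found in `style`, or None."""
    match = BACKGROUND_URL.search(style)
    if match is None:
        return None
    # urljoin resolves protocol-relative (//host/path) and relative URLs
    # against the page the markup came from (assumed here to be Flickr).
    return urljoin(page_url, match.group(1))


# Shortened copy of the style attribute shown in the question.
style = (
    "width: 364px; height: 205px; "
    "background-image: url(//c1.staticflickr.com/9/8232/8598154512_a4e080002d.jpg)"
)
print(background_image_url(style))
```

With the soup from the answer, the call would be `background_image_url(elem[10].attrs['style'])`. Returning None when there is no match avoids an exception on elements that have a style attribute but no background image.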