python beautifulsoup imgur

Getting URL of a picture on imgur


I am trying to automate downloading imgur files, and I am using BeautifulSoup to get the image link. To be honest, I am pretty lost on why this doesn't work, since according to my research it should:

    soup = BeautifulSoup("http://imgur.com/ha0WYYQ")
    imageUrl = soup.select('.image a')[0]['href']

The select call above just returns an empty list, so indexing it raises an error. I tried to modify it, but to no avail. Any and all input is appreciated.


Solution

    <div class="post-image">
        <a href="//i.imgur.com/ha0WYYQ.jpg" class="zoom">
            <img src="//i.imgur.com/ha0WYYQ.jpg" alt="Frank in his bb8 costume" itemprop="contentURL">
        </a>
    </div>
    

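This is the markup around the image. The class name "post-image" is a single token and cannot be split, so the selector '.image' does not match it. Select on the full class name instead: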

    imageUrl = soup.select('.post-image a')[0]['href']
    

A shortcut for selecting a single tag:

    imageUrl = soup.select_one('.post-image a')['href']
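
To make the difference between the two selectors concrete, here is a minimal sketch that runs them against the markup quoted above. The HTML string is just that snippet pasted in, not a live imgur page, and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Markup copied from the answer above; a real page would be much larger.
HTML = """
<div class="post-image">
    <a href="//i.imgur.com/ha0WYYQ.jpg" class="zoom">
        <img src="//i.imgur.com/ha0WYYQ.jpg" alt="Frank in his bb8 costume" itemprop="contentURL">
    </a>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# '.image' looks for class="image", which is absent, so the result is empty.
no_match = soup.select(".image a")

# '.post-image' matches the div's actual class.
matches = soup.select(".post-image a")
image_url = matches[0]["href"]

# select_one returns None instead of raising when nothing matches.
missing = soup.select_one(".image a")

print(no_match)
print(image_url)
print(missing)
```

Because select_one returns None on a miss, it is worth checking the result before indexing into it with ['href'].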
    

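BeautifulSoup does not download pages. It parses whatever markup you hand it, so passing "http://imgur.com/ha0WYYQ" parses that string itself rather than the page behind it. To parse a document, pass it into the BeautifulSoup constructor. You can pass in a string or an open filehandle: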

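    from bs4 import BeautifulSoup

    # Parse from an open file handle; naming the parser avoids a warning.
    with open("index.html") as fp:
        soup = BeautifulSoup(fp, "html.parser")

    # Parse from a string of markup.
    soup = BeautifulSoup("<html>data</html>", "html.parser")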
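
Putting the pieces together, the page has to be downloaded first and the response body handed to BeautifulSoup. The following is a rough sketch of that flow, not a tested imgur scraper: fetch_html and extract_image_url are hypothetical helper names, the User-Agent value is an assumption, and the '.post-image a' selector only works if the served HTML still contains the markup shown in this answer (the site may have changed it or may render it with JavaScript).

```python
from urllib.parse import urljoin
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def fetch_html(page_url):
    """Download the page and return its body as bytes (hypothetical helper)."""
    # Some sites reject urllib's default User-Agent; this value is an assumption.
    request = Request(page_url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response:
        return response.read()


def extract_image_url(html, page_url):
    """Return the absolute image URL, or None if the selector finds nothing."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one(".post-image a")
    if link is None or not link.has_attr("href"):
        return None
    # The href is protocol-relative ("//i.imgur.com/..."), so urljoin
    # borrows the scheme from page_url to build a complete URL.
    return urljoin(page_url, link["href"])
```

With page_url set to "http://imgur.com/ha0WYYQ", calling extract_image_url(fetch_html(page_url), page_url) would perform the request and return either a full image URL or None. Keeping the download separate from the parsing also means the selector logic can be checked against a saved copy of the page without touching the network.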