Tags: python, beautifulsoup, python-requests, python-webbrowser

soup.select('.r a') in f'https://google.com/search?q={query}' brings back empty list in Python BeautifulSoup. **NOT A DUPLICATE**


The Situation:

The "I'm Feeling Lucky!" project in the "Automate the Boring Stuff with Python" ebook no longer works with the code the author provided.

Specifically:

linkElems = soup.select('.r a')

What I have done: I've already tried the solution from this Stack Overflow question, without success.

I'm also currently using the same search format.

Code:

    import webbrowser, requests, bs4

    def im_feeling_lucky():
    
        # Make search query look like Google's
        search = '+'.join(input('Search Google: ').split(" "))
  
        # Pull html from Google
        print('Googling...') # display text while downloading the Google page
        res = requests.get(f'https://google.com/search?q={search}&oq={search}')
        res.raise_for_status()

        # Retrieve top search result link
        soup = bs4.BeautifulSoup(res.text, features='lxml')


        # Open a browser tab for each result.
        linkElems = soup.select('.r')  # Returns empty list
        numOpen = min(5, len(linkElems))
        print('Before for loop')
        for i in range(numOpen):
            webbrowser.open(f'http://google.com{linkElems[i].get("href")}')

The Problem:

The linkElems variable is an empty list [], so numOpen is 0, the for loop never runs, and no browser tab is opened.

The Question:

Could somebody please point me to the correct way of handling this, and perhaps explain why it isn't working?


Solution

  • I had the same problem while reading that book and found a solution.

    Replacing

    soup.select('.r a')
    

    with

    soup.select('div#main > div > div > div > a')
    

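    should solve the issue. Presumably the page Google now returns to requests no longer contains any elements with the r class, so the old selector matches nothing.

    The following code worked for me: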

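    import sys
    import webbrowser

    import bs4
    import requests

    print('Googling...')  # display text while downloading the Google page
    # Pass the search terms via params so requests URL-encodes them.
    res = requests.get('https://google.com/search',
                       params={'q': ' '.join(sys.argv[1:])})
    res.raise_for_status()

    # Name the parser explicitly to avoid bs4's "no parser specified" warning.
    soup = bs4.BeautifulSoup(res.text, 'html.parser')

    # Open a browser tab for each of the first five matching links.
    linkElems = soup.select('div#main > div > div > div > a')
    numOpen = min(5, len(linkElems))
    for i in range(numOpen):
        webbrowser.open('http://google.com' + linkElems[i].get('href'))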
    

    The code above takes the search terms from the command-line arguments.
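
The code above prepends 'http://google.com' to each href because the links in the returned page are relative redirect links rather than direct URLs. As a rough sketch, assuming those hrefs have the form '/url?q=<destination>&...' (an observation about the page, not a documented API, so it may change), the destination could be pulled out with the standard library instead of going through the redirect. The function name extract_target and the sample hrefs are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def extract_target(href):
    """Return the destination URL from a redirect-style href, else the href itself.

    Assumption: redirect links look like '/url?q=<destination>&...'.
    """
    parts = urlsplit(href)
    if parts.path == '/url':
        # parse_qs also percent-decodes the value of the 'q' parameter.
        targets = parse_qs(parts.query).get('q')
        if targets:
            return targets[0]
    # Not a redirect link (or no 'q' parameter): leave it untouched.
    return href


# Hypothetical href values, for illustration only.
print(extract_target('/url?q=https://example.com/page&sa=U'))
print(extract_target('https://example.org/'))
```

With a helper like this, the loop body could call webbrowser.open(extract_target(linkElems[i].get('href'))) instead of concatenating strings, provided the href is present and the assumption about the link format holds.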