Trying to use any() to change a word in a url


I am trying to create a program that takes user input of a suburb and then returns a list of good cafes in that suburb.

The webpage I am scraping has a number of good cafe lists for various suburbs but has not got a list for every suburb where I live.

So far I have written code to get a list of suburbs, and code that scrapes the webpage for the "best of" cafes using an f-string. With my old code I had to manually enter every suburb that the website has a page for as an elif branch, like this:

def cafe_search():
    user_suburb = input("What Suburb?")
    if user_suburb == "Thornbury":
        print(get_cafes("thornbury"))
    elif user_suburb == "Northcote":
        print(get_cafes("northcote"))
    elif user_suburb == "Carlton":
        print(get_cafes("carlton"))        

Instead, I am trying to find a way to take a "suburb_list" that I pull from Wikipedia, match it against the user's input, add the match to the f-string expression, and then check whether that suburb has a cafe listing. I am trying to do this with this f-string:

f"https://www.broadsheet.com.au/melbourne/guides/best-cafes-{user_suburb}"

I am trying to use the any() function to do this... not sure how successful that will be? I would be really grateful for any tips. PS, I am pretty new to all of this and this is my first project so my question may be a bit clumsy and my code inefficient, apologies!

#import stuff to open and scrape websites
from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests
from requests import get

#suburbs
#open suburb listing
url_suburbs = "https://en.wikipedia.org/wiki/List_of_Melbourne_suburbs"
html_suburbs = urlopen(url_suburbs)

soup_suburb_list = BeautifulSoup(html_suburbs, 'html.parser')
type(soup_suburb_list)

#grab suburb names
suburbs_containers = soup_suburb_list.select(".mw-parser-output > ul")

suburbs = []
for container in suburbs_containers:
    suburb_list = container.find_all('a')
    for suburb in suburb_list:
        suburbs.append(suburb.text)

#cafes

def get_cafes(user_suburb):
    #open url
    url_cafes = f"https://www.broadsheet.com.au/melbourne/guides/best-cafes-{user_suburb}"
    html_cafes = urlopen(url_cafes)

    #create beautiful soup object for cafes
    soup_cafe_list = BeautifulSoup(html_cafes, 'html.parser')
    type(soup_cafe_list)

    #grab cafe names
    cafe_names = soup_cafe_list.find_all("h2", class_= "venue-title")
    print (cafe_names)

#function to search cafes
def cafe_search():
    user_suburb = input("What Suburb?")
    if user_suburb == any(suburbs):
        print(get_cafes("user_suburb"))

Solution

  • any(mylist) returns True if at least one element of mylist is truthy by Python's rules, so it behaves like an or across the values in mylist. Similarly, all(mylist) behaves like an and across the values. See docs.python.org/3.8/library/functions.html#any

    So in your check:

    if user_suburb == any(suburbs):
    

    any(suburbs) returns True whenever suburbs contains at least one non-empty string. user_suburb is a string, and a string never equals True (or False), so the test always fails.

    For more on how Python tests for truth, see the "Truth Value Testing" section of https://docs.python.org/3.8/library/stdtypes.html, which is worth reading.

    So any() is the wrong tool here, but the in operator tests membership in a list, so change:

    if user_suburb == any(suburbs):
    

    to

    if user_suburb in suburbs:
    

    Case needs to be consistent, because the in operator compares strings case-sensitively. It may be simplest to lowercase everything going into suburbs and to lowercase user_suburb before using in.

    To lowercase the suburbs list, change:

    suburbs.append(suburb.text)
    

    to

    suburbs.append(suburb.text.lower())
    

    and change the check to:

    if user_suburb.lower() in suburbs:
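
Putting the pieces above together, here is a minimal sketch of the membership check feeding the f-string. The three-item suburbs list is sample data standing in for the scraped Wikipedia list, and cafe_url is a hypothetical helper name, not part of the original code. Stripping whitespace from the input and replacing spaces with hyphens for multi-word suburbs are assumptions about what the site expects, so check them against a real guide URL.

```python
# Sample data standing in for the list scraped from Wikipedia (assumption).
suburbs = [name.lower() for name in ["Thornbury", "Northcote", "Carlton"]]

# any() collapses the whole list into a single bool, so comparing a
# string with its result can never be True.
any_result = any(suburbs)
comparison = "thornbury" == any_result


def cafe_url(user_suburb, known_suburbs):
    """Return the guide URL for a known suburb, or None if it is not listed.

    Hypothetical helper: known_suburbs is expected to hold lowercase names.
    """
    key = user_suburb.strip().lower()
    if key not in known_suburbs:
        return None
    # Assumption: multi-word suburbs are hyphenated in the site's URLs.
    slug = key.replace(" ", "-")
    return f"https://www.broadsheet.com.au/melbourne/guides/best-cafes-{slug}"


print(any_result, comparison)
print(cafe_url("Thornbury", suburbs))
print(cafe_url("Atlantis", suburbs))
```

Two other things in the original cafe_search are worth checking. get_cafes("user_suburb") passes the literal text "user_suburb" rather than the variable, so it should be get_cafes(user_suburb). And get_cafes prints its results without returning anything, so wrapping the call in print() will also print None. Since the site does not have a page for every suburb on the Wikipedia list, urlopen may raise urllib.error.HTTPError for a missing page, which you would probably want to catch.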