Tags: python, parsing, beautifulsoup, href

How can I get the href from a row?


I'm building a Telegram bot and need to extract links from HTML. I want to get the href of each match listed on this website: https://www.hltv.org/matches

My code so far is:

    elif message.text == "Matches":
        url_news = "https://www.hltv.org/matches"
        response = requests.get(url_news)
        soup = BeautifulSoup(response.content, "html.parser")
        match_info = []
        match_items = soup.find("div", class_="upcomingMatchesSection")
        print(match_items)
        for item in match_items:
            match_info.append({
                    "link": item.find("div", class_="upcomingMatch").text,
                    "title": item["href"]

            })

I don't know how to get the links from this markup. I'd appreciate any help.


Solution

  • What happens?

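    soup.find() returns a single element: the section that contains the matches, not the matches themselves. Iterating over match_items therefore walks that section's direct children rather than the div.upcomingMatch elements, so there is no match to read a link from.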
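
To make this concrete, here is a minimal offline sketch of what iterating over a find() result yields. The HTML string is a hypothetical, simplified stand-in (the matchDayHeadline span and the match link are made up for illustration); the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; not copied from the real page.
html = """<div class="upcomingMatchesSection">
<span class="matchDayHeadline">Monday</span>
<div class="upcomingMatch"><a href="/matches/1/a-vs-b">A vs B</a></div>
</div>"""

soup = BeautifulSoup(html, "html.parser")
section = soup.find("div", class_="upcomingMatchesSection")

# Iterating a Tag walks its direct children, whitespace strings included.
kinds = [type(child).__name__ for child in section]
print(kinds)

# The match div itself has no href attribute; the link sits on the nested <a>.
match_div = section.find("div", class_="upcomingMatch")
print(match_div.get("href"))  # None
print(match_div.a["href"])
```

So even when a child happens to be a match element, item["href"] fails because the attribute belongs to the nested <a>, not to the div.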

    How to fix?

    Select the upcomingMatch elements themselves and iterate over them:

    match_items = soup.select("div.upcomingMatchesSection div.upcomingMatch")
    

    To get the URL, select the <a> inside each match:

    item.a["href"]
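
One caveat: item.a is None when a match element contains no <a>, and subscripting None raises a TypeError. A slightly more defensive sketch, run against hypothetical, simplified markup (the second match without a link is an assumption made up for illustration), could look like this:

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the real page.
html = """
<div class="upcomingMatchesSection">
  <div class="upcomingMatch"><a href="/matches/1/a-vs-b">A vs B</a></div>
  <div class="upcomingMatch"><span>TBD</span></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

links = []
for item in soup.select("div.upcomingMatchesSection div.upcomingMatch"):
    # find() returns None when the match has no <a> with an href attribute.
    anchor = item.find("a", href=True)
    if anchor is not None:
        links.append(anchor["href"])

print(links)  # ['/matches/1/a-vs-b']
```

Whether the real page ever has matches without a link is not something the question confirms, so treat the guard as optional.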
    

    Example

    from bs4 import BeautifulSoup
    import requests
    
    
    url_news = "https://www.hltv.org/matches"
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'}
    
    response = requests.get(url_news, headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")
    match_info = []
    match_items = soup.select("div.upcomingMatchesSection div.upcomingMatch")
    
    for item in match_items:
        match_info.append({
            "title": item.get_text('|', strip=True),
            "link": item.a["href"],
        })
    match_info
    

    Output

    [{'title': '09:00|bo3|1WIN|K23|Pinnacle Fall Series 2|Odds',
      'link': '/matches/2352066/1win-vs-k23-pinnacle-fall-series-2'},
     {'title': '09:00|bo3|INDE IRAE|Nemiga|Pinnacle Fall Series 2|Odds',
      'link': '/matches/2352067/inde-irae-vs-nemiga-pinnacle-fall-series-2'},
     {'title': '10:00|bo3|OPAA|Nexus|Malta Vibes Knockout Series 3|Odds',
      'link': '/matches/2352207/opaa-vs-nexus-malta-vibes-knockout-series-3'},
     {'title': '11:00|bo3|Checkmate|TBC|Funspark ULTI 2021 Asia Regional Series 3|Odds',
      'link': '/matches/2352092/checkmate-vs-tbc-funspark-ulti-2021-asia-regional-series-3'},
     {'title': '11:00|bo3|ORDER|Alke|ESEA Premier Season 38 Australia|Odds',
      'link': '/matches/2352122/order-vs-alke-esea-premier-season-38-australia'},...]
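
The links in this output are relative to the site root, so a bot that sends them as messages would probably want absolute URLs. A minimal sketch using urljoin from the standard library, with two of the links above as sample input (BASE_URL and the variable names are only illustrative):

```python
from urllib.parse import urljoin

# The page the links were scraped from; used as the base for resolution.
BASE_URL = "https://www.hltv.org/matches"

# Two relative links taken from the output above.
relative_links = [
    "/matches/2352066/1win-vs-k23-pinnacle-fall-series-2",
    "/matches/2352067/inde-irae-vs-nemiga-pinnacle-fall-series-2",
]

# A leading slash makes urljoin resolve each link against the site root.
absolute_links = [urljoin(BASE_URL, link) for link in relative_links]

for link in absolute_links:
    print(link)
```

In the scraping loop, the same call could be applied directly, for example urljoin(url_news, item.a["href"]).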