python web-scraping beautifulsoup python-3.7

BeautifulSoup won't return the real text from the page source


I am trying to scrape football match results from livescore.com using requests and BeautifulSoup. For some reason, instead of the team names and scores, it returns this:

03-12-2019 - __home_team__ - __home_score__ - __away_team__ - __away_score__

My code:

import requests
from bs4 import BeautifulSoup
from datetime import date, timedelta

yesterday = date.today() - timedelta(days=1)
checkDate = '2019-' + yesterday.strftime('%m') + '-'  + yesterday.strftime('%d')
url = 'https://www.livescore.com/soccer/' + checkDate
playDate = yesterday.strftime('%d') + '-'  + yesterday.strftime('%m') + '-2019'

response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')

home = soup.find_all('div', class_='ply tright name')
away = soup.find_all('div', class_='ply name')
hScore = soup.find_all('span', class_='hom')
aScore = soup.find_all('span', class_='awy')

with open('Scores.csv', 'a') as f:
    for h, a, hs, aws in zip(home, away, hScore, aScore):
        f.write(playDate + ',' + h.text + ',' + hs.text + ',' + a.text + ',' + aws.text + '\n')
        print(playDate + ' - ' + h.text + ' ' + hs.text + ' - ' + a.text + ' ' + aws.text)

The source code:

<a href="/soccer/england/premier-league/crystal-palace-vs-afc-bournemouth/6-18427820/" class="match-row scorelink even  " data-type="evt" data-id="soccer-6-18427820" data-stg-id="159">
   <div class="min ">
      <div>
         <span>FT</span> 
         <span class="ico-alert tt hidden">
            <svg class="inc icon-warning">
               <use xlink:href="#icon-warning"></use>
            </svg>
            <span class="tip" data-type="tooltip">Limited coverage</span>
         </span>
      </div>
   </div>
   <div class="ply tright name"><span>Crystal Palace</span></div>
   <div class="sco"> <span class="hom">1</span><span> - </span><span class="awy">0</span> </div>
   <div class="ply name"><span>AFC Bournemouth</span></div>
   <div class="star-container" data-type="star-container">
      <div class=" " data-type="star">
         <svg>
            <use xlink:href="#icon-star"></use>
         </svg>
      </div>
   </div>
</a>

What I've tried:

1.) Getting the 'a' tag (returns nothing)

2.) Using find_all('span', class_ = None) (returns a single space character)

The intended output would be (team names are just examples):

04-12-2019,Chelsea,1,1,Liverpool (for the CSV file)

04-12-2019 - Chelsea 1 - Liverpool 1 (for the print() function)
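
The date strings in the code above hard-code the year 2019, so they would stop matching the site once the year changes. A minimal sketch of building both formats from the date object itself; `date_strings` is a hypothetical helper name, not part of the original code:

```python
from datetime import date, timedelta


def date_strings(day):
    """Return (URL date, output date) for a given date object."""
    url_date = day.strftime('%Y-%m-%d')    # year-month-day, used in the URL
    play_date = day.strftime('%d-%m-%Y')   # day-month-year, used in the output
    return url_date, play_date


yesterday = date.today() - timedelta(days=1)
check_date, play_date = date_strings(yesterday)
url = 'https://www.livescore.com/soccer/' + check_date
print(url, play_date)
```

Because the year comes from the date object, "yesterday" on 1 January resolves to 31 December of the previous year.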


Solution

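  • The match rows appear to be filled in by JavaScript after the page loads, so requests only receives the unrendered template row with the __home_team__ placeholders. You'll have to use Selenium to let the page render in a browser, then pass the rendered HTML to BeautifulSoup.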

    from bs4 import BeautifulSoup
    from datetime import date, timedelta
    from selenium import webdriver
    
    yesterday = date.today() - timedelta(days=1)
    # Build both date strings from the date object instead of hard-coding the year.
    checkDate = yesterday.strftime('%Y-%m-%d')  # URL format: year-month-day
    url = 'https://www.livescore.com/soccer/' + checkDate
    playDate = yesterday.strftime('%d-%m-%Y')   # output format: day-month-year
    
    driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')
    driver.get(url)
    
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    
    home = soup.find_all('div', class_='ply tright name')
    away = soup.find_all('div', class_='ply name')
    hScore = soup.find_all('span', class_='hom')
    aScore = soup.find_all('span', class_='awy')    
    
    
    with open('Scores.csv', 'a') as f:
        for h, a, hs, aws in zip(home, away, hScore, aScore):
            f.write(playDate + ',' + h.text + ',' + hs.text + ',' + a.text + ',' + aws.text + '\n')
            print(playDate + ' - ' + h.text + ' ' + hs.text + ' - ' + a.text + ' ' + aws.text)
    
    driver.quit()  # end the browser session, not just the current window
    

    Output:

    03-12-2019 - Crystal Palace 1 - AFC Bournemouth 0
    03-12-2019 - Burnley 1 - Manchester City 4
    03-12-2019 - Burton Albion 1 - Southend United 1
    03-12-2019 - Eastleigh 0 - Wrexham 2
    03-12-2019 - Farsley Celtic 1 - Brackley Town 1
    03-12-2019 - Hereford 2 - York City  2
    03-12-2019 - Kidderminster Harriers 1 - Gateshead 1
    03-12-2019 - Leamington 3 - Darlington 0
    03-12-2019 - Hungerford Town 1 - Tonbridge Angels 0
    03-12-2019 - Brighton & Hove Albion U21 0 - Newport County * 0
    03-12-2019 - Colchester United 1 - Stevenage 2
    03-12-2019 - Shrewsbury Town 1 - Manchester City Academy * 1
    03-12-2019 - Milton Keynes Dons 2 - Coventry City 0
    03-12-2019 - Port Vale * 2 - Mansfield Town 2
    03-12-2019 - Portsmouth 2 - Northampton Town 1
    03-12-2019 - Salford City 3 - Wolverhampton Wanderers Academy 0
    03-12-2019 - Walsall 3 - Chelsea U21 2
    03-12-2019 - Cremonese 1 - Empoli 0
    03-12-2019 - Genoa 3 - Ascoli 2
    03-12-2019 - Fiorentina 2 - Cittadella 0
    03-12-2019 - Angers 0 - Marseille 2
    03-12-2019 - Bordeaux 6 - Nimes 0
    03-12-2019 - Brest 5 - Strasbourg 0
    03-12-2019 - Lyon 0 - Lille 1
    03-12-2019 - Le Havre 2 - Le Mans 0
    03-12-2019 - Auxerre 1 - Valenciennes 1
    03-12-2019 - Niort 0 - AC Ajaccio 1
    03-12-2019 - Troyes 1 - Rodez 0
    03-12-2019 - Grenoble 1 - Clermont Foot 1
    03-12-2019 - Chateauroux 1 - Sochaux 1
    03-12-2019 - Paris FC 0 - Guingamp 3
    03-12-2019 - Lens 3 - Chambly 0
    03-12-2019 - Orleans 0 - Lorient 4
    03-12-2019 - Royal Antwerp * 3 - Genk 3
    03-12-2019 - Sporting Covilha 1 - Benfica 1
    03-12-2019 - Brora Rangers 1 - Greenock Morton 3
    03-12-2019 - Ayr United 0 - Dunfermline Athletic 1
    03-12-2019 - Stenhousemuir 2 - Elgin City 2
    03-12-2019 - Panetolikos 5 - Ialysos 1
    03-12-2019 - Ergotelis 0 - Trikala 1
    03-12-2019 - Fatih Karagumruk SK 1 - Goztepe 2
    03-12-2019 - Yeni Malatyaspor 3 - Keciorengucu 1
    03-12-2019 - Alanyaspor 5 - Adanaspor 1
    03-12-2019 - Esenler Erokspor 0 - Sivasspor 2
    03-12-2019 - Fenerbahce 4 - Istanbulspor AS 0
    03-12-2019 - Cefn Druids AFC 2 - Cardiff Met University 1
    03-12-2019 - TNS 1 - Carmarthen 0
    03-12-2019 - Glentoran ? - Glenavon ?
    03-12-2019 - Legia Warszawa II 0 - Piast Gliwice 2
    03-12-2019 - Gornik Leczna 0 - Legia Warszawa 2
    03-12-2019 - Sibenik 0 - NK Lokomotiva 4
    03-12-2019 - MTK Budapest 0 - Diosgyori VTK 0
    03-12-2019 - Szeged-Grosics Akademia 0 - Fehervar FC 1
    03-12-2019 - Gaz Metan Medias 1 - FC Voluntari 0
    03-12-2019 - CSM Politehnica Iasi 1 - FC FCSB 2
    03-12-2019 - Beroe 3 - CSKA 1948 4
    03-12-2019 - Slavia Sofia 1 - Botev Plovdiv 2
    03-12-2019 - Bnei Yehuda Tel Aviv FC 1 - Hapoel Raanana FC 1
    03-12-2019 - Maccabi Netanya FC 1 - Hapoel Ironi Kiryat Shmona 0
    03-12-2019 - Hapoel Kfar Saba FC 0 - Hapoel Beer Sheva FC 1
    03-12-2019 - Union 1 - Huracan 0
    03-12-2019 - Club Atletico Platense 2 - Atlanta 1
    03-12-2019 - Club Atletico Mitre 0 - Independiente Rivadavia 0
    03-12-2019 - San Martin San Juan 1 - CA Alvarado 1
    03-12-2019 - Santamarina 2 - Villa Dalmine 1
    03-12-2019 - Atletico Rafaela 2 - Chacarita Juniors 0
    03-12-2019 - Quilmes 1 - Brown de Adrogue 1
    03-12-2019 - Gimnasia Mendoza 0 - San Martin de Tucuman 3
    03-12-2019 - CR Vasco DA Gama RJ 1 - Cruzeiro 0
    03-12-2019 - Royal Pari 0 - San Jose 1
    03-12-2019 - Luqueno 0 - General Diaz 5
    03-12-2019 - CD Motagua 5 - CD Vida 2
    03-12-2019 - Laos U23 0 - Thailand U23 2
    03-12-2019 - Indonesia U23 8 - Brunei U23 0
    03-12-2019 - Singapore U23 0 - Vietnam U23 1
    03-12-2019 - Al Riffa ? - Al Hidd ?
    03-12-2019 - Al-Najma Manama ? - Busaiteen ?
    03-12-2019 - East Riffa ? - Manama Club ?
    03-12-2019 - PSS Sleman 5 - Perseru Badak Lampung 1
    03-12-2019 - Persib Bandung 0 - Persela Lamongan 2
    03-12-2019 - Al Akhdoud 1 - Ohod 2
    03-12-2019 - Al-Wehda 1 - Al Khaleej 0
    03-12-2019 - FC Masr * 0 - El Gounah 0
    03-12-2019 - Al Ahly 3 - Bani Sweef 1
    03-12-2019 - AS Slimane ? - Esperance ?
    03-12-2019 - Etoile Metlaoui ? - Etoile du Sahel ?
    03-12-2019 - __home_team__ __home_score__ - __away_team__ __away_score__
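
The last line of the output above is the page's template row, which is still present in the rendered HTML and gets scraped along with the real matches. A minimal sketch of one way to skip it, assuming the rows have already been extracted as plain strings; `is_placeholder` and `write_results` are hypothetical helper names. It also uses the standard `csv` module, which quotes any field that contains a comma:

```python
import csv
import io
import re

# Template rows contain tokens such as __home_team__ instead of real values.
PLACEHOLDER = re.compile(r'^__\w+__$')


def is_placeholder(value):
    """Return True if value looks like an unfilled template token."""
    return bool(PLACEHOLDER.match(value.strip()))


def write_results(rows, out, play_date):
    """Write (home, home_score, away, away_score) tuples as CSV rows.

    Rows containing a template token are skipped. Returns the number
    of rows written.
    """
    writer = csv.writer(out, lineterminator='\n')
    written = 0
    for row in rows:
        cleaned = [field.strip() for field in row]
        if any(is_placeholder(field) for field in cleaned):
            continue
        writer.writerow([play_date, *cleaned])
        written += 1
    return written


sample = [
    ('Crystal Palace', '1', 'AFC Bournemouth', '0'),
    ('__home_team__', '__home_score__', '__away_team__', '__away_score__'),
]
buffer = io.StringIO()
count = write_results(sample, buffer, '03-12-2019')
print(buffer.getvalue(), end='')
```

In the scraper, the tuples could come from `(h.text, hs.text, a.text, aws.text)` inside the existing `zip` loop, with the open `Scores.csv` file passed as `out`. Matches with a `?` score are kept, since only the template tokens are filtered.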