python web-scraping beautifulsoup data-extraction

How to extract data from the site (corona) with BeautifulSoup?


For my research, I want to save the number of articles for each country from the following site to a file, as pairs of country name and article count. I wrote the code below, but unfortunately it does not work.

http://corona.sid.ir/

!pip install bs4
from bs4 import BeautifulSoup  # this module helps with web scraping
import requests  # this module helps us download a web page
url = 'http://corona.sid.ir/'
data = requests.get(url).text
soup = BeautifulSoup(data, "lxml")  # create a soup object using the variable 'data'
soup.find_all(attrs={"class":"value"})

Result: []


Solution

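  • You are requesting the wrong URL. The elements with class "value" are in the SVG map at http://corona.sid.ir/world.svg, not in the HTML of the home page, so request that file instead: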

    from bs4 import BeautifulSoup  # this module helps with web scraping
    import requests  # this module helps us download a web page
    import pandas as pd

    url = 'http://corona.sid.ir/world.svg'
    data = requests.get(url).text
    soup = BeautifulSoup(data, "lxml")  # create a soup object using the variable 'data'

    rows = []
    # Each element with class "value" holds text of the form "Country: count".
    for each in soup.find_all(attrs={"class": "value"}):
        country, count = each.text.split(':', 1)  # split on the first colon only
        rows.append({'country': country, 'count': count.strip()})

    df = pd.DataFrame(rows)
    

    Output:

    print(df)
                      country count
    0                 Andorra    17
    1    United Arab Emirates   987
    2             Afghanistan    67
    3                 Albania   143
    4                 Armenia    49
    ..                    ...   ...
    179                 Yemen    54
    180               Mayotte     0
    181          South Africa  1938
    182                Zambia   127
    183              Zimbabwe   120
    
    [184 rows x 2 columns]
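
The question also asked for the results to be saved to a file, which the code above stops short of. Below is a minimal sketch of that last step using only the standard library. It assumes each label has the form "Country: count", as the output above suggests. The names `parse_label` and `write_counts` are hypothetical helpers, not part of BeautifulSoup or pandas, and the sample labels are copied from the output above rather than fetched from the site.

```python
import csv
import io


def parse_label(text):
    """Split a 'Country: count' label into a (country, count) tuple."""
    # rpartition splits on the LAST colon, so a colon inside a country
    # name would not break the parse (assumed label format).
    country, sep, count = text.rpartition(":")
    if not sep:
        raise ValueError(f"no ':' separator in {text!r}")
    return country.strip(), int(count.strip())


def write_counts(labels, fileobj):
    """Write a header plus one 'country,count' row per label to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerow(["country", "count"])
    for label in labels:
        writer.writerow(parse_label(label))


# Sample labels in the assumed format, written to an in-memory buffer
# so the sketch runs without network access or touching the disk.
sample = ["Andorra: 17", "United Arab Emirates: 987", "Mayotte: 0"]
buffer = io.StringIO()
write_counts(sample, buffer)
csv_text = buffer.getvalue()
```

To write a real file, pass a file opened with `open("counts.csv", "w", newline="", encoding="utf-8")` instead of the buffer (the file name is only an example), and build `labels` from the `.text` of each element returned by `soup.find_all(attrs={"class": "value"})`. If you already have the DataFrame, `df.to_csv("counts.csv", index=False)` should produce an equivalent file.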