Scraping financial data with BeautifulSoup


I am an absolute beginner trying to extract financial data from a website. The data I'm looking for sits in unordered lists, inside spans that have no class or id. Does anybody know how to handle this with BeautifulSoup? I'd like to load the data into a pandas DataFrame.

<div class="finance__details__right">
        <ul>
            <li>
                <i>Άνοιγμα</i>
                <span data-bind="text: o.extend({priceNumeric: { dependOn: l, precision: 4 }})">15,8100</span>
            </li>
            <li>
                <i>Υψηλό</i>
                <span data-bind="text: hp.extend({priceNumeric: { dependOn: l, precision: 4 }})">16,2400</span>
            </li>
            <li>
                <i>Χαμηλό</i>
                <span data-bind="text: lp.extend({priceNumeric: { dependOn: l, precision: 4 }})">15,8100</span>
            </li>
        </ul>
        <ul>
            <li>
                <i>Όγκος</i>
                <span data-bind="text: tv() > 0? tv.extend({numeric: { precision: 0}})(): '', flashBackground: tv">301.286</span>
            </li>
            <li>
                <i>Τζίρος</i>
                <span data-bind="text: to() > 0? to.extend({numeric: { precision: 0 }})() + ' €': '', flashBackground: to">
4.843.588 €                </span>
            </li>
            <li>
                <i>Πράξεις</i>
                <span data-bind="text: t() > 0? t.extend({numeric: { precision: 0}})(): '', flashBackground: t">1.890</span>
            </li>
        </ul>
            <ul>
                <li>
                    <i>Αγοραστές</i>
                    <span data-bind="text: bs() > 0? b.extend({priceNumeric: l})() + ' x ' + bs.extend({numeric: { precision: 0}})(): '', flashBackground: b">
                    </span>
                </li>
                <li>
                    <i>Πωλητές</i>
                    <span data-bind="text: as() > 0? a.extend({priceNumeric: l})() + ' x ' + as.extend({numeric: { precision: 0}})(): '', flashBackground: a">
16,2400 x 6.884
                    </span>
                </li>
                <li>
                    <i>Κεφαλαιοποίηση</i>
                    <span data-bind="text: cp.extend({numeric: { precision: 0}})() + ' €', flashBackground: cp">2.320.552.455 &euro;</span>
                </li>

This is my code, which doesn't work:

SCRIP = 'ΜΥΤΙΛ'
link = f'https://www.capital.gr/finance/quote/{SCRIP}'
hdr = {'User-Agent':'Mozilla/5.0'}
req = Request(link,headers=hdr)
 
try:
    page=urlopen(req)
    soup = BeautifulSoup(page)
    
    div_html = soup.find('div',{'class': 'finance_details_right'})
    ul_html = div_html.find('ul')
    Κεφαλαιοποίηση = 0.0
        
    for li in ul_html.find_all("li"):
        name_span = li.find('')
        if 'Κεφαλαιοποίηση' in name_span.text: 
            num_span = li.find('span',{'class':''})
            
            Κεφαλαιοποίηση = float(num_span) if (num_span != '') else 0.0
            break
    
    print(f'Κεφαλαιοποίηση - {SCRIP}: {Κεφαλαιοποίηση} Cr')

except:
    print(f'EXCEPTION THROWN: UNABLE TO FETCH DATA FOR {SCRIP}')

I am looking for the Κεφαλαιοποίηση (market capitalization) value, 2.320.552.455, without the euro sign.

Any help is greatly appreciated.

Thank you in advance


Solution

  • The following code will get you that number (if I understood your question) for a number of tickers:

    import requests
    from bs4 import BeautifulSoup
    
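    tickers = ['ΜΥΤΙΛ', 'ΛΑΜΔΑ']
    for t in tickers:
        url = f'https://www.capital.gr/finance/quote/{t}'
        r = requests.get(url)
        soup = BeautifulSoup(r.text, 'lxml')

        # Find the <i> label, step up to its <li>, read the <span> next to it,
        # and keep only the text before the first space (drops the euro sign).
        el = soup.find('i', string='Κεφαλαιοποίηση').parent.find('span').text.split(' ')[0]
        print(t, 'Κεφαλαιοποίηση', el)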
    

    Result:

    ΜΥΤΙΛ Κεφαλαιοποίηση 2.320.552.455
    ΛΑΜΔΑ Κεφαλαιοποίηση 1.126.696.558
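
As for why the attempt in the question probably fails: the class name in the markup uses double underscores (`finance__details__right`), `div_html.find('ul')` only returns the first `<ul>` while Κεφαλαιοποίηση sits in the third, and `float()` is applied to a tag rather than to its text. Below is a minimal sketch that collects every label/value pair instead of a single one. It runs on a trimmed copy of the HTML from the question so that it works offline. The names `SAMPLE_HTML`, `extract_details` and `parse_greek_number` are made up for this illustration, and the number format (dots for thousands, comma for decimals) is an assumption based on the values shown in the question.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (data-bind attributes removed).
SAMPLE_HTML = """
<div class="finance__details__right">
  <ul>
    <li><i>Άνοιγμα</i><span>15,8100</span></li>
    <li><i>Υψηλό</i><span>16,2400</span></li>
  </ul>
  <ul>
    <li><i>Τζίρος</i><span>
4.843.588 €                </span></li>
  </ul>
  <ul>
    <li><i>Κεφαλαιοποίηση</i><span>2.320.552.455 &euro;</span></li>
  </ul>
</div>
"""


def extract_details(html):
    """Return {label: raw value text} for every <li> in the details block."""
    soup = BeautifulSoup(html, "html.parser")
    # Note the double underscores in the class name.
    box = soup.find("div", class_="finance__details__right")
    if box is None:
        return {}
    details = {}
    # find_all on the <div> walks every <ul>, not just the first one.
    for li in box.find_all("li"):
        label, value = li.find("i"), li.find("span")
        if label is not None and value is not None:
            details[label.get_text(strip=True)] = value.get_text(strip=True)
    return details


def parse_greek_number(text):
    """Convert '2.320.552.455 €' or '15,8100' to a float.

    Assumes dots are thousands separators and the comma is the decimal mark.
    """
    cleaned = text.replace("€", "").replace(".", "").replace(",", ".").strip()
    return float(cleaned)


details = extract_details(SAMPLE_HTML)
market_cap = parse_greek_number(details["Κεφαλαιοποίηση"])
print("Κεφαλαιοποίηση", int(market_cap))
```

To get a pandas DataFrame, one option would be to build one such dict per ticker, append each to a list, and pass the list to `pandas.DataFrame`, which yields one row per ticker and one column per label. For the live site, the `r.text` from the code above would take the place of `SAMPLE_HTML`, assuming the page still serves this markup.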