I am trying to extract financial data from Yahoo Finance using Python. Below is a link to an image in which I have circled the data I am trying to retrieve. It shows how the data table is organized, but I do not know where to begin with the information shown in the picture.
This is the image showing where the numbers I'm trying to extract from Yahoo Finance sit in the page source, with the table name and td tags.
I realize that I must somehow use the td tags to find the numbers I need to extract; however, I'm not sure which basic commands I need to use.
This is a link to an example of the data table that I'm trying to scrape
The page you are scraping is rendered by JavaScript, and requests and urllib cannot execute JavaScript. I recommend using selenium together with BeautifulSoup to extract the data.
This is what the page looks like when JavaScript is disabled:
The data you want is at this URL:
I loaded it into bs4; you can extract the data yourself from there:
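import requests, bs4, json
# JSONP endpoint that the page calls to load the income statement table.
r = requests.get('http://financials.morningstar.com/ajax/ReportProcess4HtmlAjax.html?&t=XNAS:AAPL&region=usa&culture=en-US&ops=clear&cur=&reportType=is&period=12&dataType=A&order=asc&columnYear=5&curYearPart=1st5year&rounding=3&view=raw&r=378724&callback=jsonp1482077238548&_=1482077239651')
# Keep only the JSON between the callback's outer parentheses: jsonp1482077238548(...).
js = r.text[r.text.index('(') + 1:r.text.rindex(')')]
# The 'result' field holds the table as an HTML fragment.
html_str = json.loads(js)['result']
soup = bs4.BeautifulSoup(html_str, 'lxml')
# Printing the soup shows the parsed fragment (beginning shown below).
print(soup)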
<div id="baseline" style="display:none">
<div class="left ">
<div class="r_xcmenu rf_table_left">
<div class="rf_header ">
<div class="lbl " currency="USD" fiscalyearend="September" fyenumber="9" id="unitsAndFiscalYear">
<div class="rf_crow1" id="label_i1" style="_height:16px; _float:none;">
<div class="lbl">
<div class="chart_contain_free" id="chart_i1">
<div class="chart_icon">
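The step that removes the callback wrapper from the response is the part most likely to break, so here is a minimal sketch of it as a reusable helper. It assumes the response has the usual JSONP shape `callback({...})`. The name `unwrap_jsonp` and the sample string, including the "Revenue" label, are made up for illustration and are not real data from the site.

```python
import json


def unwrap_jsonp(text):
    """Decode the JSON payload of a JSONP response shaped like callback({...})."""
    start = text.index('(') + 1   # first character after the opening parenthesis
    end = text.rindex(')')        # position of the final closing parenthesis
    return json.loads(text[start:end])


# Hypothetical sample response: same shape as the real one, invented content.
sample = 'jsonp1482077238548({"result": "<div id=\\"label_i1\\">Revenue</div>"})'

payload = unwrap_jsonp(sample)
html_str = payload['result']   # HTML fragment, ready to hand to BeautifulSoup
print(html_str)
```

Slicing between the first `(` and the last `)` does not depend on the callback name, which changes with every request. From there, `html_str` can be passed to `bs4.BeautifulSoup` as in the code above.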