I am trying to parse a table inside an HTML page at this link, but I have not yet found a reliable way to point to the right table, since the page contains a few other tables as well, as shown in the attached image.
I tried the simpler approach of calling pandas.read_html and letting it figure things out, but this returns only the content at the top of the page (I am guessing) and misses everything else.
import pandas as pd
url='https://www.360optimi.com/app/sec/resourceType/benchmarkGraph?resourceSubTypeId=5c9316b28e202b46c92ca518&resourceId=envdecAluminumWindowProfAl&profileId=Saray2016&benchmarkToShow=co2_cml&entityId=5e4eae0f619e783ceb5d0732&indicatorId=lcaForLevels-CO2&stateIdOfProject='
tables = pd.read_html(url)
print(tables[0])
which returns:
0 1 2
0 English Français Deutsch
1 Español Suomi Norsk
2 Nederlands Svenska Italiano
Any idea how I can use the right HTML tags to point to the table of interest?
EDIT: As some of you noted, login credentials are required for the web page (apologies), so I have uploaded the HTML code here.
I have taken the HTML that you provided as input. If you want to use this code on a URL, just extract the HTML of that URL before using this code.
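Because the page sits behind a login, one way to get its HTML into a string is to save the page from the browser and read the file back. The helper below is only a sketch of that step: the function name load_saved_page is hypothetical, and UTF-8 is an assumed encoding for the saved file. The extraction code itself follows after it.

```python
from pathlib import Path


def load_saved_page(path):
    """Return the contents of a saved HTML file as a single string.

    Hypothetical helper: assumes the page was saved from the browser
    and is encoded as UTF-8.
    """
    return Path(path).read_text(encoding="utf-8")
```

The returned string can then be used as html_code_of_your_url, for example load_saved_page("saved_page.html"), where the file name is a placeholder for wherever you saved the page.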
from io import StringIO
from bs4 import BeautifulSoup
import pandas as pd
# html_code_of_your_url is a placeholder: supply the page's HTML as a string
Your_input_html_string = str(html_code_of_your_url)
soup = BeautifulSoup(Your_input_html_string, "html.parser")  # name the parser explicitly to avoid a warning
# The table to extract has the id "resourceBenchmarkTable", so isolate its HTML from the rest of the page
extracted_table_html = str(soup.find("table", id="resourceBenchmarkTable"))
# pd.read_html returns a list of DataFrames; the isolated table is its only entry
table_dataframe = pd.read_html(StringIO(extracted_table_html))[0]
print(table_dataframe)
Output: (Shows only first 5 rows to keep the answer short)
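As a possible alternative to isolating the table with BeautifulSoup first, pandas.read_html accepts an attrs argument that restricts the match to tables carrying the given HTML attributes. The sketch below runs on a small made-up page, not the real one: sample_html and its rows are invented purely for illustration, and only the id "resourceBenchmarkTable" comes from the page in question.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page: a language-switcher table
# followed by the table of interest, identified by its id attribute.
sample_html = """
<html><body>
<table>
  <tr><td>English</td><td>Français</td><td>Deutsch</td></tr>
</table>
<table id="resourceBenchmarkTable">
  <thead>
    <tr><th>Resource</th><th>Value</th></tr>
  </thead>
  <tbody>
    <tr><td>Sample A</td><td>1.5</td></tr>
    <tr><td>Sample B</td><td>2.5</td></tr>
  </tbody>
</table>
</body></html>
"""

# attrs keeps only tables whose attributes match, so the
# language-switcher table is skipped.
tables = pd.read_html(StringIO(sample_html), attrs={"id": "resourceBenchmarkTable"})
benchmark = tables[0]
print(benchmark)
```

With the real page, you would pass its HTML string in place of sample_html. The result is still a list, so the table itself is tables[0].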