Tags: python, selenium-webdriver, web-scraping, css-selectors, webdriver

How to scrape a dynamic HTML table with a different class name for each row and nested elements?


I want to create a dataframe by scraping the table here, which has a different class name for each row and contains nested elements.

table_rows = driver.find_elements(By.CLASS_NAME, "bgColor-white")
for row in table_rows:
    print(row.text)

The code above prints each row as a single string, and I could not separate it into the appropriate columns.


Solution

  • Locate the table element and get its outerHTML, then pass that HTML to the pandas read_html() method to get the dataframe.

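    import time
    from io import StringIO
    import pandas as pd
    from selenium.webdriver.common.by import By

    driver.get("https://www.egp.gov.bt/resources/common/TenderListing.jsp?lang=en_US&langForMenu=en_US&h=t")
    time.sleep(3)  # give the dynamic table time to load
    table = driver.find_element(By.CSS_SELECTOR, "table#resultTable").get_attribute("outerHTML")
    df = pd.read_html(StringIO(table))[0]  # StringIO avoids the literal-HTML deprecation in newer pandas
    print(df)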
    

    console output:

       Sl. No.           Tender ID,  Reference No,  Public Status  ... Type,  Method Publishing Date & Time | Closing Date & Time
    0        1    15183, TSHA-6/Engineering/9/2022-2023/769, Live  ...     NCB,  OTM        03-Mar-2023 15:00 | 14-Mar-2023 15:10
    1        2            15180, STCB/PD/TS/Samtse/2023/213, Live  ...     NCB,  OTM        03-Mar-2023 10:00 | 14-Mar-2023 11:10
    2        3            15160, JNEC/Adm-33/2022-2023, Cancelled  ...     NCB,  OTM        02-Mar-2023 22:00 | 10-Mar-2023 10:30
    3        4          15179,  DAG/DEHSS(07)/2022-2023/148, Live  ...     NCB,  OTM        02-Mar-2023 15:00 | 16-Mar-2023 09:00
    4        5  15181, DCHS/PRP-01/2022-2023/244, Amendment/Co...  ...     NCB,  OTM        02-Mar-2023 09:00 | 13-Mar-2023 10:30
    5        6                  15174, NBC/Adm/06/2022/1198, Live  ...     NCB,  OTM        01-Mar-2023 09:00 | 20-Mar-2023 11:30
    6        7                15161, PDA/adm -35/2022-2023/, Live  ...     NCB,  OTM        27-Feb-2023 16:00 | 10-Mar-2023 11:00
    7        8  15169,  MD/Dz.EHSS-20/2022-2023/5179, Amendmen...  ...     NCB,  OTM        27-Feb-2023 14:30 | 10-Mar-2023 14:00
    8        9                                 15157, nofp2, Live  ...     NCB,  OTM        21-Feb-2023 09:00 | 08-Mar-2023 11:30
    9       10   15158, MD/DES-20/2022-2023/5095, Being processed  ...     NCB,  OTM        21-Feb-2023 02:00 | 02-Mar-2023 10:00
    
    [10 rows x 6 columns]
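
Some cells in the output above still hold several nested values joined together (ID, reference number and status in one column, two timestamps in another). A minimal sketch of splitting them with pandas follows. It runs on two sample rows copied from the console output instead of the live page. The column labels and the new names tender_id, reference_no, status, publishing and closing are assumptions for illustration, and the real headers may differ in spacing.

```python
import pandas as pd

# Sample rows copied from the console output above. The column labels are
# assumptions: the headers on the live page may differ in spacing.
df = pd.DataFrame({
    "Tender ID, Reference No, Public Status": [
        "15183, TSHA-6/Engineering/9/2022-2023/769, Live",
        "15160, JNEC/Adm-33/2022-2023, Cancelled",
    ],
    "Publishing Date & Time | Closing Date & Time": [
        "03-Mar-2023 15:00 | 14-Mar-2023 15:10",
        "02-Mar-2023 22:00 | 10-Mar-2023 10:30",
    ],
})

id_col, date_col = df.columns

# The first comma ends the ID and the last comma starts the status;
# whatever sits between them is treated as the reference number.
pattern = (
    r"^\s*(?P<tender_id>\d+)\s*,\s*"
    r"(?P<reference_no>.*?)\s*,\s*"
    r"(?P<status>[^,]*?)\s*$"
)
tidy = df[id_col].str.extract(pattern)  # all three columns stay strings

# Split "publishing | closing" on the bar and parse each half as a datetime.
dates = df[date_col].str.split("|", n=1, expand=True)
fmt = "%d-%b-%Y %H:%M"
tidy["publishing"] = pd.to_datetime(dates[0].str.strip(), format=fmt)
tidy["closing"] = pd.to_datetime(dates[1].str.strip(), format=fmt)

print(tidy)
```

With the real dataframe, the same steps should apply once id_col and date_col are set to the actual column labels reported by df.columns.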