c# xpath html-agility-pack

Deriving XPath for accessing table elements


I am trying to scrape the table from this website using HtmlAgilityPack in a C# console app.

I can scrape the stock names in column 2 (e.g. EDAP TMS ADR (EDAP)), but I cannot work out the correct XPath for any of the values in the Price, Chg, and Chg% columns.

For example, my XPath for the names column works perfectly:

"//*[@id=\"column0\"]//div//table//tr//td//a"

What would be the XPath for the Price, Chg, Chg% columns? Can you help me understand how you would derive it?


Solution

  • Here is an XPath that returns the desired column, located by its header text.

    For Price, this selects the price cell in the 4th table row (the header row counts as row 1):

    //div[@class='mdcNarrowM']//table//tr[4]/td[count(ancestor::table[1]//tr[1]/td[.='Price']/preceding-sibling::td)+1]
    

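    General form (replace the row number and column name as needed; tested against every column in that table). The count(...) expression finds the header cell in the first row whose text equals the column name, counts the td cells before it, and adds 1 to get that column's 1-based position, which then indexes the td in the chosen row: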

    //div[@class='mdcNarrowM']//table//tr[row_number_goes_here]/td[count(ancestor::table[1]//tr[1]/td[.='column name goes here']/preceding-sibling::td)+1]
    

    To get the Price cell from every row except the header row, use this XPath:

    //div[@class='mdcNarrowM']//table//tr[not(td[@class='colhead'])]/td[count(ancestor::table[1]//tr[1]/td[.='Price']/preceding-sibling::td)+1]
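
For anyone who wants to try these expressions from C#, here is a minimal sketch with HtmlAgilityPack. It is only an illustration under assumptions: the inline sample table, its rows, and its header captions are invented stand-ins for the real page, and `ColumnXPath`, `ForCell`, `ForColumn`, and `Read` are hypothetical helper names. The only things taken from the answer are the XPath patterns and the `mdcNarrowM` and `colhead` class names.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ColumnXPath
{
    // 1-based position of the header cell whose text equals columnName.
    // Note: a columnName containing an apostrophe would break this literal.
    private static string ColumnIndex(string columnName) =>
        "count(ancestor::table[1]//tr[1]/td[.='" + columnName + "']/preceding-sibling::td)+1";

    // One cell: the given row (header row is row 1) in the named column.
    public static string ForCell(int row, string columnName) =>
        "//div[@class='mdcNarrowM']//table//tr[" + row + "]/td[" + ColumnIndex(columnName) + "]";

    // The named column from every row that has no 'colhead' cell.
    public static string ForColumn(string columnName) =>
        "//div[@class='mdcNarrowM']//table//tr[not(td[@class='colhead'])]/td[" + ColumnIndex(columnName) + "]";

    // Runs an XPath and returns the trimmed text of each matching cell.
    public static List<string> Read(HtmlDocument doc, string xpath)
    {
        var values = new List<string>();
        HtmlNodeCollection cells = doc.DocumentNode.SelectNodes(xpath);
        if (cells == null) return values; // SelectNodes returns null when nothing matches
        foreach (HtmlNode cell in cells)
            values.Add(HtmlEntity.DeEntitize(cell.InnerText).Trim());
        return values;
    }

    // Invented sample data; the real page's markup and captions may differ.
    public const string SampleHtml = @"
<div class='mdcNarrowM'><table>
<tr><td class='colhead'></td><td class='colhead'>Issue</td><td class='colhead'>Price</td><td class='colhead'>Chg</td><td class='colhead'>% Chg</td></tr>
<tr><td>1</td><td><a href='#'>Alpha Corp (ALP)</a></td><td>10.00</td><td>0.50</td><td>5.26</td></tr>
<tr><td>2</td><td><a href='#'>Beta Inc (BET)</a></td><td>20.50</td><td>1.25</td><td>6.49</td></tr>
<tr><td>3</td><td><a href='#'>Gamma Ltd (GAM)</a></td><td>3.25</td><td>0.10</td><td>3.17</td></tr>
</table></div>";

    public static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(SampleHtml);

        // Price in the 4th table row (third data row of the sample).
        Console.WriteLine(string.Join(", ", Read(doc, ForCell(4, "Price"))));

        // Every Chg value, header row excluded.
        Console.WriteLine(string.Join(", ", Read(doc, ForColumn("Chg"))));
    }
}
```

Because the column is found by its caption rather than a hard-coded index, the same helpers should keep working if columns are reordered, as long as the header text matches exactly (including any whitespace inside the header cell).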