XPath YQL get only specific columns


Hi, I have an HTML page that I want to query ("scrape") using YQL. I want to get the text of only four columns from the table on that page, and I don't know how to express that in XPath.

I located one of the cells by right-clicking it in Chrome, choosing Inspect Element, and then Copy XPath. This is the result I got for that one cell:

//*[@id="partsTable"]/tbody/tr[1]/td[8]/text()

So that is the expression for the 1st row and the 8th column. What I actually want is the content of columns 5, 6, 8, and 9 for all rows. I don't know whether that can be written easily in XPath.

Thanks a lot for the help. (I am absolutely new to XPath, so an explanation would be appreciated.)


Solution

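  • You can match several positions at once with a syntax similar to SQL's IN. Comparing against a sequence like this is XPath 2.0 syntax:

    [position() = (5, 6, 8, 9)]

    The expression you copied contains tr[1], which restricts it to the first row. Drop that index to match every row, so your full expression would be:

    //*[@id="partsTable"]/tbody/tr/td[position() = (5, 6, 8, 9)]/text()

    If the processor only supports XPath 1.0, which has no sequences, spell out each position instead:

    //*[@id="partsTable"]/tbody/tr/td[position() = 5 or position() = 6 or position() = 8 or position() = 9]/text()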
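
To check which cells a column selection like this should return for every row, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not a YQL query: the sample markup and the `select_columns` helper are hypothetical (only the `partsTable` id comes from the question), and because `xml.etree.ElementTree` supports just a small XPath subset, each column is fetched with a simple positional predicate such as `td[5]` rather than a single combined expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; only the id "partsTable" is taken from the question.
SAMPLE = """
<html><body>
<table id="partsTable">
  <tbody>
    <tr><td>a1</td><td>a2</td><td>a3</td><td>a4</td><td>a5</td><td>a6</td><td>a7</td><td>a8</td><td>a9</td></tr>
    <tr><td>b1</td><td>b2</td><td>b3</td><td>b4</td><td>b5</td><td>b6</td><td>b7</td><td>b8</td><td>b9</td></tr>
  </tbody>
</table>
</body></html>
"""

WANTED = (5, 6, 8, 9)  # 1-based column positions, as in XPath


def select_columns(markup, positions=WANTED):
    """Return, for every table row, the text of the cells at the given positions."""
    root = ET.fromstring(markup.strip())
    # No index on tr, so every row of the table is visited.
    rows = root.findall(".//*[@id='partsTable']/tbody/tr")
    result = []
    for row in rows:
        cells = []
        for pos in positions:
            cell = row.find(f"td[{pos}]")  # positional predicate, 1-based
            if cell is not None:  # rows with fewer cells simply contribute less
                cells.append((cell.text or "").strip())
        result.append(cells)
    return result


if __name__ == "__main__":
    for row_cells in select_columns(SAMPLE):
        print(row_cells)
```

The cells are grouped per row here, whereas a single XPath expression ending in text() returns one flat list of text nodes in document order. Note also that ElementTree needs well-formed markup, so real-world HTML would likely need a more tolerant parser.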